Mon 8 Jul - Thu 11 Jul 2019
09:30 - 16:30

Venue: Bioinformatics Training Room, Craik-Marshall Building, Downing Site

Provided by: Bioinformatics


Bookings cannot be made on this event (Event is not taking bookings).


Variant Discovery with GATK4


This workshop will focus on the core steps involved in calling germline short variants, somatic short variants, and copy number alterations with the Broad’s Genome Analysis Toolkit (GATK), using “Best Practices” developed by the GATK methods development team. A team of methods developers and instructors from the Data Sciences Platform at Broad will give talks explaining the rationale, theory, and real-world applications of the GATK Best Practices. You will learn why each step is essential to the variant-calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset. If you are an experienced GATK user, you will gain a deeper understanding of how the GATK works under the hood and how to improve your results further, especially with respect to the latest innovations.

The hands-on tutorials for learning GATK tools and commands will be on Terra, a new platform developed at Broad in collaboration with Verily Life Sciences for accessing data, running analysis tools and collaborating securely and seamlessly. (If you’ve heard of or been a user of FireCloud, think of Terra as the new and improved user interface for FireCloud that makes doing research easier than before!)

  • Day 1: Introductory topics and hands-on tutorials. We will start off with introductory lectures on sequencing data, preprocessing, variant discovery, and pipelining. Then you will get hands-on with a recreation of a real variant discovery analysis in Terra.
  • Day 2: Germline short variant discovery. Through a combination of lectures and hands-on tutorials, you will learn about germline single nucleotide variants and indels, joint calling, variant filtering, genotype refinement, and callset evaluation.
  • Day 3: Somatic variant discovery. In a format similar to Day 2, you will learn about somatic single nucleotide variants and indels, Mutect2, and somatic copy number alterations.
  • Day 4: Pipelining and performing your analysis end-to-end in Terra. On the final day, you will learn how to write your own pipelining scripts in the Workflow Description Language (WDL) and execute them with the Cromwell workflow management system. You will also be introduced to additional tools that help you do your analysis end-to-end in Terra.
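
To give a flavour of what pipelining means in practice, the sketch below orders a handful of analysis steps by their dependencies, which is broadly the kind of bookkeeping a workflow engine does before running tasks. It is a plain Python illustration only: it is not WDL, it does not describe how Cromwell is implemented, and the step names and dependency table are assumptions chosen for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names (not GATK tool names).
# Each step maps to the set of steps that must finish before it can start.
DEPENDENCIES = {
    "map_reads": set(),
    "mark_duplicates": {"map_reads"},
    "recalibrate_bases": {"mark_duplicates"},
    "call_variants_per_sample": {"recalibrate_bases"},
    "joint_genotype": {"call_variants_per_sample"},
    "filter_variants": {"joint_genotype"},
}


def execution_order(dependencies):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for number, step in enumerate(execution_order(DEPENDENCIES), start=1):
        print(f"{number}. {step}")
```

A real workflow would also declare the inputs and outputs of each task, so that the engine can infer these dependencies rather than have them listed by hand.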

Please note that this workshop is focused on human data analysis. Most of the material presented applies equally to non-human data, and we will address some questions regarding the adaptations needed for analysis of non-human data, but we will not go into much detail on those points.

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register interest via the link here.

Target audience
  • The course is aimed primarily at mid-career scientists – especially those whose formal education likely included statistics, but who have not perhaps put this into practice since.
  • Graduate students, Postdocs and Staff members from the University of Cambridge, Affiliated Institutions and other external Institutions or individuals
  • Please be aware that these courses are only free for registered University of Cambridge students. All other participants will be charged a registration fee in some form. Registration fees and further details regarding the charging policy are available here.
  • Further details regarding eligibility criteria are available here

Prerequisites
  • Familiarity with the basic terms and concepts of genetics and genomics.
  • Basic familiarity with the command line environment is required.
  • Sufficient UNIX experience might be obtained from one of the many UNIX tutorials available online.

Number of sessions: 4

# Date Time Venue
1 Mon 8 Jul 09:30 - 16:30 Bioinformatics Training Room, Craik-Marshall Building, Downing Site
2 Tue 9 Jul 09:30 - 16:30 Bioinformatics Training Room, Craik-Marshall Building, Downing Site
3 Wed 10 Jul 09:30 - 16:30 Bioinformatics Training Room, Craik-Marshall Building, Downing Site
4 Thu 11 Jul 09:30 - 16:30 Bioinformatics Training Room, Craik-Marshall Building, Downing Site

Topics covered

Bioinformatics, Data handling, Data mining, Data visualisation, Genomics, Sequence variations


After this course you should be able to:

  • Understand the overall variant discovery workflow rationale and requirements
  • Understand key methods and functionalities in light of the latest research
  • Understand key differences between germline and somatic variant discovery approaches
  • Apply analysis tools and Best Practices workflows to a real data set
  • Interpret analysis results and troubleshoot common problems
  • Write and execute WDL analysis pipelines

During this course you will learn about:

  • Pre-processing of high-throughput sequencing data
  • Variant discovery (germline and somatic short variants, somatic CNV)
  • Germline variant filtering and evaluation
  • Pipelining strategies
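
As a rough illustration of how the germline steps chain together, the sketch below builds (but does not run) GATK4 command lines for per-sample calling with HaplotypeCaller in GVCF mode, consolidation with GenomicsDBImport, and joint genotyping with GenotypeGVCFs. The helper function, file names, workspace name and interval are hypothetical, and the flags are written from general knowledge of GATK4 rather than taken from the course materials, so check them against the documentation for your installed version.

```python
import shlex


def joint_calling_commands(reference, bams, workspace, interval, output_vcf):
    """Build GATK4 command lines for per-sample calling and joint genotyping.

    Nothing is executed; the commands are returned as lists of arguments.
    """
    commands = []
    gvcfs = []

    # One HaplotypeCaller run per sample, emitting a GVCF.
    for bam in bams:
        gvcf = bam.rsplit(".", 1)[0] + ".g.vcf.gz"
        gvcfs.append(gvcf)
        commands.append(
            ["gatk", "HaplotypeCaller", "-R", reference,
             "-I", bam, "-O", gvcf, "-ERC", "GVCF"]
        )

    # Consolidate all per-sample GVCFs into one GenomicsDB workspace.
    import_command = ["gatk", "GenomicsDBImport",
                      "--genomicsdb-workspace-path", workspace,
                      "-L", interval]
    for gvcf in gvcfs:
        import_command += ["-V", gvcf]
    commands.append(import_command)

    # Joint genotyping across the cohort from the workspace.
    commands.append(
        ["gatk", "GenotypeGVCFs", "-R", reference,
         "-V", f"gendb://{workspace}", "-O", output_vcf]
    )
    return commands


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for command in joint_calling_commands(
        "ref.fasta", ["mother.bam", "father.bam"], "cohort_db",
        "chr20", "cohort.vcf.gz",
    ):
        print(shlex.join(command))
```

Building the commands as argument lists keeps file names with spaces safe and makes it easy to hand each one to a job runner later.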

Presentations, demonstrations and practicals


Day 1 Topics
09:30 - 09:45 Opening remarks
09:45 - 10:15 Introduction to Sequence data / pre-processing workflow
10:15 - 10:45 Introduction to Germline variant discovery Best Practices workflows
10:45 - 11:15 Tea/coffee break
11:15 - 11:45 Introduction to Somatic variant discovery Best Practices workflows
11:45 - 12:15 Introduction to pipelining with WDL & Cromwell & Terra
12:15 - 12:30 Closing question time
12:30 - 13:30 Lunch (not provided)
13:30 - 13:55 Mapping
13:55 - 14:20 Marking duplicates
14:20 - 14:45 Base recalibration (BQSR)
14:45 - 15:15 Tea/coffee break
15:15 - 16:30 Hands-on IGV + GATK4 basics
Day 2 Topics
09:30 - 09:45 Recap of germline variant discovery Best Practices
09:45 - 10:15 HaplotypeCaller
10:15 - 10:45 Joint-calling with GenomicsDB + GenotypeGVCFs
10:45 - 11:15 Tea/coffee break
11:15 - 12:30 Hands-on joint-calling
12:30 - 13:30 Lunch (not provided)
13:30 - 14:00 Filtering with VQSR
14:00 - 14:30 Genotype Refinement
14:30 - 15:00 Callset Evaluation
15:00 - 15:30 Tea/coffee break
15:30 - 16:30 Hands-on filtering approaches
Day 3 Topics
09:30 - 09:45 Recap of somatic variant discovery Best Practices
09:45 - 10:30 Somatic SNVs and indels with Mutect2
10:30 - 11:00 Tea/coffee break
11:00 - 12:30 Hands-on Mutect2
12:30 - 13:30 Lunch (not provided)
13:30 - 14:00 Somatic CNVs with GATK CNV
14:00 - 15:15 Hands-on GATK CNV
15:15 - 15:45 Tea/coffee break
15:45 - 16:15 Preview of upcoming methods: germline CNV and SV
16:15 - 16:30 Open question time
Day 4 Topics
09:30 - 09:45 WDL/Cromwell 101
09:45 - 10:45 Hands-on WDL/Cromwell
10:45 - 11:15 Tea/coffee break
11:15 - 12:30 Self-paced WDL exercises
12:30 - 13:30 Lunch (not provided)
13:30 - 13:45 Terra 101
13:45 - 14:45 Hands-on Terra Part 1
14:45 - 15:15 Tea/coffee break
15:15 - 16:30 Hands-on Terra Part 2

Registration Fees
  • Free for registered University of Cambridge students
  • £50/day for all University of Cambridge staff, including postdocs, temporary visitors (students and researchers) and participants from Affiliated Institutions. Please note that these charges are recovered by us at the Institutional level.
  • It remains the participant's responsibility to acquire prior approval from the relevant group leader, line manager or budget holder to attend the course. It is requested that people booking only do so with the agreement of the relevant party, as costs will be charged back to your Lab Head or Group Supervisor.
  • £50/day for all other academic participants from external Institutions and charitable organizations. These charges must be paid at registration.
  • £100/day for all Industry participants. These charges must be paid at registration.
  • Further details regarding the charging policy are available here.



Frequency: Once a year

Theme: Specialized Training
