Mon 16 Jul - Thu 19 Jul 2018
09:30 - 16:30

Venue: Bioinformatics Training Room, Craik-Marshall Building, Downing Site

Provided by: Bioinformatics


Booking

Bookings cannot be made on this event (Event is completed).


Other dates:

No more events




Variant Discovery with GATK4


Description

This workshop will focus on the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. You will learn why each step is essential to the variant discovery process, what operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

In the course of this workshop, we highlight key functionalities such as the germline GVCF workflow for joint variant discovery in cohorts, somatic variant discovery using Mutect2, and copy number variation discovery using GATK CNV. All analyses are demonstrated using GATK version 4. Finally, we demonstrate the use of pipelining tools to assemble and execute GATK workflows.
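As a rough illustration of the germline GVCF workflow mentioned above, the following Python sketch assembles (but does not run) the three GATK4 command lines typically involved: per-sample calling with HaplotypeCaller in GVCF mode, consolidation with GenomicsDBImport, and joint genotyping with GenotypeGVCFs. The function name, file names, interval and workspace path are hypothetical placeholders, not course materials, and real analyses usually need further arguments.

```python
from typing import Dict, List


def gvcf_workflow_commands(
    reference: str,
    bams: Dict[str, str],
    interval: str,
    workspace: str,
    output_vcf: str,
) -> List[List[str]]:
    """Build GATK4 command lines for a minimal germline GVCF workflow.

    All paths and sample names are supplied by the caller; nothing is executed.
    """
    commands: List[List[str]] = []
    gvcfs: List[str] = []

    # Step 1: call each sample separately, emitting a GVCF per sample.
    for sample, bam in sorted(bams.items()):
        gvcf = f"{sample}.g.vcf.gz"  # hypothetical output naming scheme
        gvcfs.append(gvcf)
        commands.append([
            "gatk", "HaplotypeCaller",
            "-R", reference,
            "-I", bam,
            "-O", gvcf,
            "-ERC", "GVCF",
            "-L", interval,
        ])

    # Step 2: consolidate the per-sample GVCFs into a GenomicsDB workspace.
    import_cmd = [
        "gatk", "GenomicsDBImport",
        "--genomicsdb-workspace-path", workspace,
        "-L", interval,
    ]
    for gvcf in gvcfs:
        import_cmd += ["-V", gvcf]
    commands.append(import_cmd)

    # Step 3: joint-genotype the cohort from the consolidated workspace.
    commands.append([
        "gatk", "GenotypeGVCFs",
        "-R", reference,
        "-V", f"gendb://{workspace}",
        "-O", output_vcf,
    ])
    return commands


if __name__ == "__main__":
    example = gvcf_workflow_commands(
        reference="ref.fasta",
        bams={"sampleA": "sampleA.bam", "sampleB": "sampleB.bam"},
        interval="chr20",
        workspace="cohort_db",
        output_vcf="cohort.vcf.gz",
    )
    for cmd in example:
        print(" ".join(cmd))
```

Building the commands as argument lists keeps the sketch testable without a GATK installation; each list could be passed to `subprocess.run` on a machine where `gatk` is available.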

The workshop covers basic genomics, all currently supported Best Practices pipelines as well as pipelining with WDL/Cromwell/FireCloud. This includes the logic of the major pipelines, file formats and data transformations involved, and hands-on operation of the tools using goal-oriented exercises.

  • Day 1: Introduction to Genomics, GATK Best Practices and Pipelining
  • Day 2: Germline short variant discovery (SNPs + Indels)
  • Day 3: Somatic variant discovery (SNVs + Indels + CNVs)
  • Day 4: Writing pipelines with WDL and running them in FireCloud

Please note that this workshop is focused on human data analysis. Most of the material presented applies equally to non-human data, and we will address some questions regarding the adaptations needed for analysis of non-human data, but we will not go into much detail on those points.

Please note that if you are not eligible for a University of Cambridge Raven account, you will need to book or register your interest by linking here.

Target audience
  • The course is aimed primarily at mid-career scientists, especially those whose formal education likely included statistics but who may not have put it into practice since.
  • Graduate students, Postdocs and Staff members from the University of Cambridge, Affiliated Institutions and other external Institutions or individuals
  • Please be aware that these courses are only free for University of Cambridge students. All other participants will be charged a registration fee in some form. Registration fees and further details regarding the charging policy are available here.
  • Further details regarding eligibility criteria are available here
Prerequisites
  • Familiarity with the basic terms and concepts of genetics and genomics.
  • Basic familiarity with the command line environment is required.
  • Sufficient UNIX experience can be obtained from one of the many UNIX tutorials available online.
Sessions

Number of sessions: 4

# | Date | Time | Venue | Trainers
1 | Mon 16 Jul | 09:30 - 16:30 | Bioinformatics Training Room, Craik-Marshall Building, Downing Site | Geraldine Van der Auwera, Eric Banks, Kate Voss, Takuto Sato, Soo Hee Lee
2 | Tue 17 Jul | 09:30 - 16:30 | Bioinformatics Training Room, Craik-Marshall Building, Downing Site | Geraldine Van der Auwera, Eric Banks, Kate Voss, Takuto Sato, Soo Hee Lee
3 | Wed 18 Jul | 09:30 - 16:30 | Bioinformatics Training Room, Craik-Marshall Building, Downing Site | Geraldine Van der Auwera, Eric Banks, Kate Voss, Takuto Sato, Soo Hee Lee
4 | Thu 19 Jul | 09:30 - 16:30 | Bioinformatics Training Room, Craik-Marshall Building, Downing Site | Geraldine Van der Auwera, Eric Banks, Kate Voss, Takuto Sato, Soo Hee Lee
Topics covered

Bioinformatics, Data handling, Data mining, Data visualisation, Genomics, Sequence variations

Objectives

After this course you should be able to:

  • Understand the overall variant discovery workflow rationale and requirements
  • Understand key methods and functionalities in light of the latest research
  • Understand key differences between germline and somatic variant discovery approaches
  • Apply analysis tools and Best Practices workflows to a real data set
  • Interpret analysis results and troubleshoot common problems
  • Write and execute WDL analysis pipelines
Aims

During this course you will learn about:

  • Pre-processing of high-throughput sequencing data
  • Variant discovery (germline and somatic short variants, somatic CNV)
  • Germline variant filtering and evaluation
  • Pipelining strategies
Format

Presentations, demonstrations and practicals

Timetable

Day 1 Topics
9:30-9:45 Opening remarks
9:45-10:15 Introduction to Sequence data / pre-processing workflow
10:15-10:45 Introduction to Germline variant discovery Best Practices workflows
10:45-11:15 Tea/coffee break
11:15-11:45 Introduction to Somatic variant discovery Best Practices workflows
11:45-12:15 Introduction to pipelining with WDL & Cromwell & FireCloud
12:15-12:30 Closing question time
12:30-13:30 Lunch (not provided)
13:30-13:55 Mapping
13:55-14:20 Marking duplicates
14:20-14:45 Base recalibration (BQSR)
14:45-15:15 Tea/coffee break
15:15-16:30 Hands-on IGV + GATK4 basics
Day 2 Topics
9:30-9:45 Recap of germline variant discovery Best Practices
9:45-10:15 HaplotypeCaller
10:15-10:45 Joint-calling with GenomicsDB + GenotypeGVCFs
10:45-11:15 Tea/coffee break
11:15-12:30 Hands-on joint-calling
12:30-13:30 Lunch (not provided)
13:30-14:00 Filtering with VQSR
14:00-14:30 Genotype Refinement
14:30-15:00 Callset Evaluation
15:00-15:30 Tea/coffee break
15:30-16:30 Hands-on filtering approaches
Day 3 Topics
9:30-9:45 Recap of somatic variant discovery Best Practices
9:45-10:30 Somatic SNVs and indels with Mutect2
10:30-11:00 Tea/coffee break
11:00-12:30 Hands-on Mutect2
12:30-13:30 Lunch (not provided)
13:30-14:00 Somatic CNVs with GATK CNV
14:00-15:15 Hands-on GATK CNV
15:15-15:45 Tea/coffee break
15:45-16:15 Preview of upcoming methods: germline CNV and SV
16:15-16:30 Open question time
Day 4 Topics
9:30-9:45 WDL/Cromwell 101
9:45-10:45 Hands-on WDL/Cromwell
10:45-11:15 Tea/coffee break
11:15-12:30 Self-paced WDL exercises
12:30-13:30 Lunch (not provided)
13:30-13:45 FireCloud 101
13:45-14:45 Hands-on FireCloud Part 1
14:45-15:15 Tea/coffee break
15:15-16:30 Hands-on FireCloud Part 2
Registration Fees
  • Free for University of Cambridge students
  • £50/day for all University of Cambridge staff, including postdocs, and participants from Affiliated Institutions. Please note that these charges are recovered by us at the Institutional level.
  • It remains the participant's responsibility to obtain prior approval from the relevant group leader, line manager or budget holder to attend the course. Please book only with the agreement of the relevant party, as costs will be charged back to your Lab Head or Group Supervisor.
  • £50/day for all other academic participants from external Institutions and charitable organizations. These charges must be paid at registration.
  • £100/day for all Industry participants. These charges must be paid at registration.
  • Further details regarding the charging policy are available here
Duration

4 days

Frequency

Once a year

Related courses
Theme
Specialized Training
