Introduction to RNA-seq data analysis
The aim of this course is to familiarize the participants with the primary analysis of RNA-seq data.
This course starts with a brief introduction to RNA-seq and discusses quality control issues. Next, we will present the alignment step, quantification of expression and differential expression analysis. For downstream analysis, we will focus on tools available through the Bioconductor project for manipulating and analysing bulk RNA-seq data.
Please note that if you are not eligible for a University of Cambridge Raven account, you will need to book or register your interest using the link here.
- Graduate students, Postdocs and Staff members from the University of Cambridge, Affiliated Institutions and other external Institutions or individuals
- Please be aware that these courses are only free for University of Cambridge students. All other participants will be charged a registration fee in some form. Registration fees and further details regarding the charging policy are available here.
- Further details regarding eligibility criteria are available here
- Basic experience of command line UNIX
- Sufficient UNIX experience might be obtained from one of the many UNIX tutorials available online.
- Basic knowledge of the R syntax
- For a real beginner's introduction to R see here. More advanced R instructions can be found at Quick-R or An Introduction to R.
Number of sessions: 3
# | Date | Time | Venue | Trainers |
---|---|---|---|---|
1 | Mon 3 Sep 2018 | 09:30 - 17:30 | Bioinformatics Training Room, Craik-Marshall Building | G.E. Parada-Gonzalez, Xiaopei Su, Ashley Sawle |
2 | Tue 4 Sep 2018 | 09:30 - 17:30 | Bioinformatics Training Room, Craik-Marshall Building | G.E. Parada-Gonzalez, Oscar Rueda, Xiaopei Su, Abigail Edwards |
3 | Wed 5 Sep 2018 | 09:30 - 17:30 | Bioinformatics Training Room, Craik-Marshall Building | Ashley Sawle, Dr S. Ballereau, Abigail Edwards |
Bioinformatics, Functional genomics, Data visualisation, Transcriptomics, Data handling, Data mining, RNA-seq
After this course you should be able to:
- Design your RNA-seq experiments properly, taking into account the advantages and limitations of RNA-seq assays
- Assess the quality of your datasets
- Perform alignment and quantification of expression through different tools and pipelines
- Know which tools are available in Bioconductor for RNA-seq data analysis and understand the basic object types they use
- Produce a list of differentially expressed genes from an RNA-seq experiment
During this course you will learn about:
- RNA sequencing technology and considerations on experiment design
- Quality control of raw sequencing reads: FastQC and FASTX-Toolkit
- Read alignment to a reference genome: HISAT2
- Extract information from SAM/BAM files: samtools
- Transcriptome reconstruction: StringTie
- Transcriptome merging, comparison and quantification: gffcompare and StringTie
- Genome alignment and read count quantification: STAR
- Lightweight alignment and quantification: Salmon
- Sources of variation in RNA-seq data
- Differential expression analysis using edgeR and DESeq
- Annotation resources in Bioconductor
- Identifying over-represented gene sets among a list of differentially expressed genes
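As a rough illustration of how the command-line stages in the list above fit together, the following Python sketch assembles one command per step for a single paired-end sample. It only builds and prints the command lines; nothing is executed. The sample name, file names, index locations and output directory are hypothetical placeholders, and the options shown are a small, commonly used subset rather than the settings used in the course.

```python
import shlex


def build_commands(sample, index_dir, annotation_gtf, out_dir):
    """Return one argument list per pipeline step for a paired-end sample.

    Every path below is a hypothetical placeholder chosen for illustration.
    """
    r1 = f"{sample}_1.fastq.gz"
    r2 = f"{sample}_2.fastq.gz"
    sam = f"{out_dir}/{sample}.sam"
    bam = f"{out_dir}/{sample}.sorted.bam"
    return [
        # Quality control reports for both read files
        ["fastqc", "-o", out_dir, r1, r2],
        # Spliced alignment; --dta tailors the output for transcript assemblers
        ["hisat2", "--dta", "-x", f"{index_dir}/genome",
         "-1", r1, "-2", r2, "-S", sam],
        # Coordinate-sort the alignments, then index the sorted BAM file
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Reference-guided transcriptome reconstruction
        ["stringtie", bam, "-G", annotation_gtf,
         "-o", f"{out_dir}/{sample}.gtf"],
        # Lightweight quantification; "-l A" asks Salmon to infer the library type
        ["salmon", "quant", "-i", f"{index_dir}/salmon", "-l", "A",
         "-1", r1, "-2", r2, "-o", f"{out_dir}/{sample}_salmon"],
    ]


if __name__ == "__main__":
    for cmd in build_commands("sampleA", "index", "genes.gtf", "results"):
        print(shlex.join(cmd))
```

Keeping each command as a list of arguments means the same lists could later be passed to `subprocess.run` without shell quoting problems, once the placeholder paths are replaced with real ones.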
Presentations, demonstrations and practicals
Day 1 | Topics |
09:30 - 10:30 | Lecture: Introduction to RNA-seq |
10:30 - 10:45 | Tea/coffee break |
10:45 - 12:00 | Practical: RNA-seq analysis - QC and adapter removal |
12:00 - 13:00 | Lunch (not provided) |
13:00 - 15:00 | Practical: RNA-seq analysis - alignment and transcriptome reconstruction |
15:00 - 15:15 | Tea/coffee break |
15:15 - 17:30 | Practical: RNA-seq analysis - transcriptome merging, comparison and quantification |
Day 2 | |
09:30 - 10:00 | Practical: RNA-seq analysis - Alignment and quantification with STAR |
10:00 - 10:15 | Tea/coffee break |
10:15 - 12:00 | Practical: RNA-seq analysis - Salmon quasi-alignment |
12:00 - 13:00 | Lunch (not provided) |
13:00 - 15:00 | Lecture/practical: Linear Models and Statistics for Differential Expression |
15:00 - 15:15 | Tea/coffee break |
15:15 - 17:30 | Lecture/practical: Linear Models and Statistics for Differential Expression |
Day 3 | |
09:30 - 11:00 | Importing and Preprocessing in R |
11:00 - 11:15 | Tea/coffee break |
11:15 - 12:30 | Differential Expression |
12:30 - 13:30 | Lunch (not provided) |
13:30 - 15:00 | Annotation and Visualisation of Differential Expression |
15:00 - 15:15 | Tea/coffee break |
15:15 - 17:00 | Gene set analysis and Gene Ontology testing |
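The idea behind the final session, testing whether a gene set is over-represented among differentially expressed genes, can be sketched with a one-sided hypergeometric test. This is a minimal illustration in plain Python, not the method used by any particular Bioconductor package; the function name and the counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_in_set, n_de, n_overlap):
    """One-sided hypergeometric p-value for gene set over-representation.

    n_universe: number of genes tested for differential expression
    n_in_set:   number of those genes that belong to the gene set
    n_de:       number of differentially expressed genes
    n_overlap:  number of differentially expressed genes in the gene set

    Returns the probability of seeing at least n_overlap set members
    when n_de genes are drawn at random from the universe.
    """
    max_overlap = min(n_in_set, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 15000 genes tested, 200 in the gene set,
    # 500 differentially expressed, 20 of which are in the set.
    print(enrichment_p_value(15000, 200, 500, 20))
```

A small p-value suggests the set contains more differentially expressed genes than random sampling would explain. In practice many gene sets are tested at once, so the resulting p-values would need correcting for multiple testing.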
- Free for University of Cambridge students
- £ 50/day for all University of Cambridge staff, including postdocs, and participants from Affiliated Institutions. Please note that these charges are recovered by us at the Institutional level
- It remains the participant's responsibility to acquire prior approval from the relevant group leader, line manager or budget holder to attend the course. It is requested that people booking only do so with the agreement of the relevant party as costs will be charged back to your Lab Head or Group Supervisor.
- £ 50/day for all other academic participants from external Institutions and charitable organizations. These charges must be paid at registration
- £ 100/day for all Industry participants. These charges must be paid at registration
- Further details regarding the charging policy are available here
3
2 times a year
- Introduction to high-throughput sequencing data analysis
- Analysis of small RNA-seq data
- Single-cell RNA-seq analysis (ONLINE LIVE TRAINING)
- Analysis of DNA Methylation using Sequencing (IN-PERSON)
- Introduction to genome variation analysis using NGS