Analysis of expression Proteomics data in R (IN-PERSON)

Description

This workshop focuses on expression proteomics, which aims to characterise the protein diversity and abundance in a particular system. You will learn about the bioinformatic analysis steps involved when working with this kind of data, using several dedicated proteomics packages from the Bioconductor project for the R programming language. We will use real-world datasets obtained from label-free quantitation (LFQ) as well as tandem mass tag (TMT) mass spectrometry. We cover the basic data structures used to store and manipulate protein abundance data, quality control and filtering of the data, and several visualisations. Finally, we include statistical analysis of differential abundance across sample groups (e.g. control vs. treated) and further evaluation and biological interpretation of the results via gene ontology analysis. By the end of this workshop you should have the skills to make sense of expression proteomics data, from start to finish.

If you do not have a University of Cambridge Raven account please book or register your interest here.

Additional information

♿ The training room is located on the first floor and there is currently no wheelchair or level access.
Our courses are only free for registered University of Cambridge students. All other participants will be charged according to our charging policy.
Attendance will be taken on all courses and a charge is applied for non-attendance, including for University of Cambridge students. After you have booked a place, if you are unable to attend any of the live sessions, please email the Bioinfo Team.
Further details regarding eligibility criteria are available here.
Guidance on visiting Cambridge and finding accommodation is available here.

Target audience

The course is aimed at proteomics practitioners and data analysts/bioinformaticians who would like to learn how to use R to analyse proteomics data.
Familiarity with mass spectrometry or proteomics in general is desirable, but not essential, as we will walk through a typical MS experiment and its data as part of learning about the tools.

Prerequisites

Basic understanding of mass spectrometry.
- Watch this iBiology video for an excellent overview.
A working knowledge of R and the tidyverse (course registration page).
- If you are not able to attend this prerequisite course, please work through our R materials ahead of the course.
Familiarity with other Bioconductor data classes, such as those used for RNA-seq analysis, is useful but not required.

Topics covered

Bioinformatics, Biology, Data handling, Data visualisation, Proteomics, Bioconductor

Objectives

During this course you will learn about:

How mass spectrometry can be used to quantify protein abundance and some of the methods used for peptide quantitation.
The bioinformatics steps involved in processing and analysing expression proteomics data.
How to assess the quality of your data, deal with missing values and summarise peptide-level data to the protein level.
How to perform differential expression analysis to compare protein abundances between different groups of samples.

Aims

After this course, you should be able to:

Import data into R/Bioconductor, starting from the files produced by third-party software such as Proteome Discoverer, MaxQuant and FragPipe.
Manipulate protein expression data using dedicated data structures that are used to store these multi-dimensional datasets.
Produce several visualisations to help assess the quality of your data and explore and communicate your results.
Recognise the importance of data normalisation and the methods used to achieve it.
Find differentially expressed proteins between groups of samples and annotate the results using gene ontology analysis.

Format

Presentations and practicals

Timetable

Day	Time	Topics
Day 1	09:30 - 09:40	Welcome
	09:40 - 10:15	Introduction
	10:15 - 11:15	Import and infrastructure
	11:15 - 11:30	Break
	11:30 - 12:30	Data cleaning: filtering
	12:30 - 13:30	Lunch (not provided)
	13:30 - 15:00	Data cleaning: FDR and missing data
	15:00 - 15:15	Break
	15:15 - 17:00	Data normalisation and aggregation

Day 2	09:30 - 11:00	Exploration and visualisation of protein data
	11:00 - 11:15	Break
	11:15 - 12:30	Statistical analysis
	12:30 - 13:30	Lunch (not provided)
	13:30 - 15:00	Statistical analysis: diagnostics, interpretation and visualisation
	15:00 - 15:15	Break
	15:15 - 17:00	[if time allows] GO analysis

Registration fees

Free for registered University of Cambridge students
£ 60/day for all University of Cambridge staff, including postdocs, temporary visitors (students and researchers) and participants from Affiliated Institutions. Please note that these charges are recovered by us at the Institutional level
It remains the participant's responsibility to acquire prior approval from the relevant group leader, line manager or budget holder to attend the course. It is requested that people booking only do so with the agreement of the relevant party as costs will be charged back to your Lab Head or Group Supervisor.
£ 60/day for all other academic participants from external Institutions and charitable organisations. These charges must be paid at registration
£ 120/day for all Industry participants. These charges must be paid at registration
Further details regarding the charging policy are available here

Frequency

Twice per year

Theme

Bioinformatics
