Bioinformatics Training - Bioinformatics courses

Analysis of single cell RNA-seq data (IN-PERSON) Thu 16 May 2024 09:30 Finished

Recent technological advances have made it possible to obtain genome-wide transcriptome data from single cells using high-throughput sequencing (scRNA-seq). Even though scRNA-seq makes it possible to address problems that are intractable with bulk RNA-seq data, analysing scRNA-seq is also more challenging.

In this course we will survey the existing problems as well as the computational and statistical frameworks available for the analysis of scRNA-seq data.

If you do not have a University of Cambridge Raven account please book or register your interest here.

Additional information

♿ The training room is located on the first floor and there is currently no wheelchair or level access.
Our courses are only free for registered University of Cambridge students. All other participants will be charged according to our charging policy.
Attendance will be taken on all courses and a charge is applied for non-attendance, including for University of Cambridge students. After you have booked a place, if you are unable to attend any of the live sessions, please email the Bioinfo Team.
Further details regarding eligibility criteria are available here.
Guidance on visiting Cambridge and finding accommodation is available here.

Analysis of small RNA-seq data

Tue 2 May 2017 09:30 Finished

This course focuses on methods for the analysis of small non-coding RNA data obtained from high-throughput sequencing (HTS) applications (small RNA-seq). During the course, approaches to the investigation of all classes of small non-coding RNAs will be presented, in all organisms.

Day 1 will focus on the analysis of microRNAs and day 2 will cover the analysis of other types of small RNAs, including Piwi-interacting (piRNA), small interfering (siRNA), small nucleolar (snoRNA) and tRNA-derived (tsRNA).

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book by linking here.

An Introduction to Biological Networks & their Visualization (Webinar) Fri 10 Jul 2020 10:00 Finished

This webinar is an Introduction to Biological Networks, their types, and applications. It will include two of the most commonly used open source Network Visualisation Platforms (R-igraph and Cytoscape) with step-wise protocols for creating and visualising your own data as a network. It will present some of the major layout algorithms, visual styles and tips for effective visualisation, with examples from biology revealing how these can improve analysis and provide insights.

The webinar will be presented in the form of a lecture as well as a tutorial with step-wise screenshots that enable listeners to emulate simple Network creation and analysis. Please note that this is a webinar and not a coding exercise. Links to publicly available resources and hands-on tutorials will be shared with you for further reading and practice.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

An Introduction to Data Exploration, Experimental Design, and Biomarker Expression Analysis using JMP Software Tools Thu 10 Oct 2019 13:00 Finished

Through the use of real world examples and the JMP, JMP Pro, and JMP Genomics software, we will cover best practices used in both industry and academia today to visually explore data, plan biological experiments, detect differential expression patterns, find signals in next-generation sequencing data and easily discover statistically appropriate biomarker profiles and patterns.

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

An introduction to long-read sequencing

Thu 13 Feb 2020 09:30 Finished

Analysis of whole genome data unearths a multitude of variants of different classes, which need to be filtered, annotated and validated to arrive at a causative variant for a disease. Current short-read sequences, whilst excellent at identifying single nucleotide variants and short insertions/deletions, struggle to correctly map structural variants (SVs). Long-read sequencing technologies offer improvements in the characterisation of genetic variation and regions that are difficult to assess with short-read sequences.

The aim of this course is to familiarise participants with long read sequencing technologies, their applications and the bioinformatics tools used to assemble this kind of data. Lectures will introduce this technology and provide insight into methods for the analysis of genomic data, while the hands-on sessions will allow participants to run analysis pipelines, focusing on data generated by the Oxford Nanopore Technologies (ONT) platform.

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

An Introduction to Machine Learning (ONLINE LIVE TRAINING) Mon 17 Jul 2023 09:30 Finished

THIS COURSE IS NOT RETURNING IN ITS CURRENT FORM. PLEASE CHECK OUR WEBSITE FOR MORE INFORMATION.

Machine learning gives computers the ability to learn without being explicitly programmed. It encompasses a broad range of approaches to data analysis with applicability across the biological sciences. Lectures will introduce commonly used algorithms and provide insight into their theoretical underpinnings. In the practicals students will apply these algorithms to real biological data-sets using the R language and environment.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

An Introduction to MATLAB for biologists (ONLINE LIVE TRAINING) Mon 12 Jul 2021 09:30 Finished

PLEASE NOTE The Bioinformatics Team are presently teaching as many courses live online, with tutors available to help you work through the course material on a personal copy of the course environment. We aim to simulate the classroom experience as closely as possible, with opportunities for one-to-one discussion with tutors and a focus on interactivity throughout.

This course aims to give you an introduction to the basics of MATLAB. During the two-day course we will use a practical approach to give you the confidence to start using MATLAB in your own work. In particular, we will show you how to write your own scripts and functions and how to use pre-written functions. We will also explore the many ways in which help is available to MATLAB users. In addition, we will cover basic computer programming in MATLAB to enable you to write more efficient scripts.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

An Introduction to R: Software For Statistical Analysis Mon 26 Oct 2015 13:30 Finished

The aim of this course is to introduce participants to the basics of statistical analysis and the open source statistical software R, a free software environment for statistical computing and graphics.

Participants will actively use R throughout the course, during which they will be introduced to principles of statistical thinking and interpretation by example, exercises and discussion about a range of problems. The examples will be used to present a variety of statistical concepts and techniques, with no focus on any specific discipline.

Important information: We have 12 configured laptops for use at the workshop. After these laptops have been allocated, participants will either need to share, or bring their own. These laptops will be allocated to the first individuals to express an interest in using them. When booking, please indicate under "Special requirements" if you wish to use one of the 12 laptops or bring your own. Participants bringing their own laptop will be given instructions on what software to install.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register interest by linking here.

An Introduction to Solving Biological Problems with PERL Mon 4 Dec 2017 09:30 Finished

This course is aimed at those new to programming and provides an introduction to programming using Perl.

During this course you will learn the basics of the Perl programming language, including how to store data in Perl’s standard data structures such as arrays and hashes, and how to process data using loops, functions, and many of Perl’s built in operators. You will learn how to write and run your own Perl scripts and how to pass options and files to them. The course also covers sorting, regular expressions, references and multi-dimensional data structures.

The course will be taught using the online Learning Perl materials created by Sofia Robb of the University of California Riverside.

The course website providing links to the course materials is here.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book by linking here.

An Introduction to Solving Biological Problems with R Tue 11 Jun 2019 09:30 Finished

Please note that this course has been discontinued and has been replaced by the Introduction to R for biologists.

R is a highly-regarded, free, software environment for statistical analysis, with many useful features that promote and facilitate reproducible research.

In this course, we give an introduction to the R environment and explain how it can be used to import, manipulate and analyse tabular data. After the course you should feel confident to start exploring your own dataset using the materials and references provided.

The course website providing links to the course materials is here.

Please note that although we will demonstrate how to perform statistical analysis in R, we will not cover the theory of statistical analysis in this course. Those seeking an in-depth explanation of how to perform and interpret statistical tests are advised to see the list of Related courses. Moreover, those with some programming experience in other languages (e.g. Python, Perl) might wish to attend the follow-on Data Analysis and Visualisation in R course.

This event is supported by the BBSRC Strategic Training Awards for Research Skills (STARS) grant (BB/P022766/1).

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Bacterial Genome Assembly and Annotation in Galaxy

Thu 8 Jun 2017 09:30 Finished

The workshop will cover the basics of de novo genome assembly using a small genome example. This includes project planning steps, selecting fragment sizes, initial assembly of reads into fully covered contigs, and then assembling those contigs into larger scaffolds that may include gaps. The end result will be a set of contigs and scaffolds with sufficient average length to perform further analysis on, including genome annotation. This workshop will use tools and methods targeted at small genomes. The basics of assembly and scaffolding presented here will be useful for building larger genomes, but the specific tools and much of the project planning will be different.

This workshop will also introduce genome annotation in the context of small genomes. We’ll begin with genome annotation concepts, and then introduce resources and tools for automatically annotating small genomes. The workshop will finish with a review of options for further automatic and manual tuning of the annotation, and for maintaining it as new assemblies or information becomes available.

This session will include an introduction to the Galaxy platform.

This event is co-organized with EMBL-ABR and the Genomics Virtual Lab. Course materials can be found here.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book by linking here.

Basic statistics and data handling Wed 28 Feb 2018 09:30 Finished

This three day course is intended to open doors to applying statistics - whether directly increasing skills and personally undertaking analyses, or by expanding knowledge towards identifying collaborators. The end goal is to drive confident engagement with data analysis and further training - increasing the quality and reliability of interpretation, and putting that interpretation and subsequent presentation into the hands of the researcher. Each day of the course will deliver a mixture of lectures, workshops and hands-on practicals – and will focus on the following specific elements.

Day 1 focuses on basic approaches and the computer skills required to do downstream analysis. Covering: Basic skills for data manipulation in R. How to prepare your data effectively. Principles of experimental design and how this influences analysis.

On day 2, participants will explore the core concepts of statistics – so that they can begin to see how they can be applied to their own work, and to also help with better critical evaluation of the work of others. Covering: Basic statistics concepts and practice: power, variability, false discovery, t-test, effect size, simulations to understand what a p-value means.
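As an illustration of the kind of simulation that can make a p-value concrete, here is a minimal sketch of a permutation test in Python. It is not taken from the course materials; the function name, its parameters and the example numbers are assumptions made for illustration. The idea is to shuffle the group labels many times and count how often a difference in means at least as large as the observed one arises by chance alone.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration; not part of any course material.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values mimics a world with no group effect.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    control = [4.1, 3.9, 4.3, 4.0, 4.2]   # made-up measurements
    treated = [4.9, 5.1, 4.8, 5.3, 5.0]   # made-up measurements
    print(permutation_p_value(control, treated))
```

A small result means that label shuffling rarely reproduces the observed difference; two identical groups give a value of 1.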

On day 3 we will continue to explore core concepts of statistics, focusing on linear regression and multiple testing correction.
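The listing does not say which multiple testing correction is covered. As one common example, the sketch below implements the Benjamini-Hochberg adjustment in plain Python. The function name and the example p-values are assumptions for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Hypothetical helper for illustration; the course may use a different
    correction method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # made-up p-values
    print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a bigger one.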

Course materials are available here.

This event is supported by the BBSRC Strategic Training Awards for Research Skills (STARS) grant (BB/P022766/1).

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book by linking here.

Big Data and Cloud Computing

Fri 1 Jun 2018 09:30 Finished

Recent advances in genomics, proteomics, imaging and other technologies, have resulted in data being generated at a faster rate than they can be meaningfully analysed. In this course we will show you how cloud computing can be used to meet the challenges of storage, management and analysis of big data. The first half of the course will introduce cloud infrastructure technologies. The second half will cover tools for collaborative working, resource management, and creation of workflows. The instructors will demonstrate how they are using cloud computing in their own research.

N.B. If you sign up for this course, you will be automatically registered for an AWS educate account, which will provide you with sufficient AWS credits to complete the course exercises. If you decide to continue using cloud computing after the course, you will need to either purchase more credits or apply for a grant from programs like: AWS Cloud Credits for Research, Microsoft Azure for Research or Google Cloud Platform Education Grants.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book by linking here.

BIIT web-tools for high-throughput data analysis from ELIXIR-Estonia

Thu 12 May 2016 09:30 Finished

In this course we will introduce web-based, open source tools to analyse and interpret high-throughput biological data.

The main focus will be g:Profiler - a toolset for finding the most significant functional groups for a given gene or protein list; MEM - a query engine for mining hundreds of public gene expression datasets to find the genes most co-expressed with a query gene; and ClustVis - a web tool for visualizing clustering of multivariate data using Principal Component Analysis (PCA) plots and heatmaps.

MEM and g:Profiler are ELIXIR-Estonia node services.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book by linking here.

Bioinformatics for Principal Investigators Mon 16 Sep 2019 09:30 Finished

The aim of this workshop is to provide principal investigators with an introduction to the challenges of working with biological data and to the best practices, and tools, needed to perform bioinformatics research effectively and reproducibly.

On day 1, we will cover the importance of experimental design, discuss the challenges associated with (i) the analysis of high-throughput sequencing data (utilising RNA-seq as a working example) and (ii) the application of machine learning algorithms, as well as issues relating to reusability and reproducibility.

On day 2, we will put into practice concepts from day 1, running an RNA-seq data analysis pipeline, going from raw reads through differential expression analysis and the interpretation of downstream analysis results. Challenges encountered at each step of the analytical pipeline will be discussed. Please note that day 2 is optional.

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Biological data analysis using the InterMine API (Online)

Tue 17 Nov 2020 13:00 Finished

PLEASE NOTE The Bioinformatics Team are presently teaching as many courses live online. We aim to simulate the classroom experience as closely as possible, with opportunities for one-to-one discussion with tutors and a focus on interactivity throughout.

InterMine is a freely available open-source data warehouse built specifically for the integration and analysis of complex biological data sets.

InterMine-based data analysis platforms are available for many organisms including mouse, rat, budding yeast, plants (over 87 plant genomes), nematodes, fly, zebrafish, Hymenoptera, Planaria, and more recently human.

Genomic and proteomic data within InterMine databases includes pathways, gene expression, interactions, sequence variants, GWAS, regulatory data and protein expression. InterMine provides sophisticated query and visualisation tools both through a web interface and a powerful web service API, with multiple language bindings including Python and R.

This course will focus on programmatic access to InterMine through the API, and InterMine searches will be done using Python and R scripts. The exercises will mainly use the fly, human and mouse databases, but the course is applicable to anyone working with data for which an InterMine database is available (a comprehensive list of InterMine databases is available here).

This event is organised alongside a half day course on Biological data analysis using the InterMine User Interface. More information on this event is available here.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Biological data analysis using the InterMine User Interface (Online) Mon 16 Nov 2020 13:00 Finished

PLEASE NOTE The Bioinformatics Team are presently teaching as many courses live online. We aim to simulate the classroom experience as closely as possible, with opportunities for one-to-one discussion with tutors and a focus on interactivity throughout.

InterMine is a freely available open-source data warehouse built specifically for the integration and analysis of complex biological data.

InterMine-based data analysis platforms are available for many organisms including mouse, rat, budding yeast, plants (over 87 plant genomes), nematodes, fly, zebrafish, Hymenoptera, Planaria, and more recently human.

Genomic and proteomic data within InterMine databases includes pathways, gene expression, interactions, sequence variants, GWAS, regulatory data and protein expression. InterMine provides sophisticated query and visualisation tools both through a web interface and a powerful web service API, with multiple language bindings including Python and R.

This course will focus on the InterMine web interface and will introduce participants to all aspects of the user interface, starting with some simple exercises and building up to more complex analysis encompassing several analysis tools and comparative analysis across organisms. The exercises will mainly use the fly, human and mouse databases, but the course is applicable to anyone working with data for which an InterMine database is available (a comprehensive list of InterMine databases is available here.)

This event is organised alongside a half day course on Biological data analysis using the InterMine API. More information on this event is available here.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Biological Imaging Data Management for Facility Managers

Wed 13 Jul 2016 09:30 Finished

The Open Microscopy Environment (OME) is an open-source software project that develops tools that enable access, analysis, visualization, sharing and publication of biological image data.

OME has three components:

OME-TIFF, standardised file format and data model;
Bio-Formats, a software library for reading proprietary image file formats; and
OMERO, a software platform for image data management and analysis.

In this one day course, we will present the OMERO platform, and show how Facility Managers can use it to manage users, groups, and their microscopy, HCS and digital pathology data.

Help pages on 'Using OMERO for Facility Managers' can be found here.

This course is organized alongside a one day course on Biological Imaging Data Management for Life Scientists. More information on this event is available here.

This course will be delivered by members of the OMERO team. The OME project is supported by BBSRC and Wellcome Trust.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Biological Imaging Data Management for Life Scientists

Thu 7 Dec 2017 09:30 Finished

The Open Microscopy Environment (OME) is an open-source software project that develops tools that enable access, analysis, visualization, sharing and publication of biological image data.

OME has three components:

OME-TIFF, standardised file format and data model;
Bio-Formats, a software library for reading proprietary image file formats; and
OMERO, a software platform for image data management and analysis.

In this one day course, we will present the OMERO platform, and show how to import, organise, view, search, annotate and publish imaging data. Additionally, we will briefly introduce how to use a variety of processing tools with OMERO.

This course is organized alongside a one day course on Biological Imaging Data Processing for Data Scientists. More information on this event is available here.

This course will be delivered by members of the OMERO team. The OME project is supported by BBSRC and Wellcome Trust.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Biological Imaging Data Processing for Data Scientists

Fri 8 Dec 2017 09:30 Finished

The Open Microscopy Environment (OME) is an open-source software project that develops tools that enable access, analysis, visualization, sharing and publication of biological image data.

OME has three components:

OME-TIFF, standardised file format and data model;
Bio-Formats, a software library for reading proprietary image file formats; and
OMERO, a software platform for image data management and analysis.

In this one day course, we will present the OMERO platform, and show how to transition from manual data processing to automated processing workflows. We will introduce how to write applications against the OMERO API, how to integrate a variety of processing tools with OMERO and how to automatically generate output ready for publication.

This course is organized alongside a one day course on Biological Imaging Data Management for Life Scientists. More information on this event is available here.

This course will be delivered by members of the OMERO team. The OME project is supported by BBSRC and Wellcome Trust.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Building Computational Pipelines with Snakemake (IN PERSON)

Fri 26 May 2023 09:30 Finished

High-throughput data analyses usually involve many data processing steps, including the use of a range of command line tools and scripts to transform, filter, aggregate and visualise data. Each tool may require a specific set of inputs and options to be defined and, as we chain multiple tools together, this can become challenging to manage. As analysis pipelines become more complex and with the ever-increasing amounts of data being collected in research, reproducible and scalable automatic workflow management becomes increasingly important.

The Snakemake workflow management system is a tool to create reproducible and scalable data analysis pipelines/workflows. Workflows are described via a human-readable, Python-based language. They can be seamlessly scaled to server, cluster, grid and cloud environments, without the need to modify the workflow definition. Finally, Snakemake workflows can entail a description of the required software, which will be automatically deployed to any execution environment.
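To give a feel for the idea behind rule-based workflows, the sketch below works out an execution order from rules that declare their input and output files. It is plain Python written for illustration only: it is neither Snakemake syntax nor Snakemake's implementation, and the rule names, file names and the `build_order` function are all hypothetical.

```python
def build_order(rules, target, existing=()):
    """Return rule names in an order that produces `target`.

    `rules` maps a rule name to an (inputs, outputs) pair of file-name lists.
    `existing` lists files that are already present and need no rule.
    Toy illustration only; not how Snakemake itself is implemented.
    """
    # Map each output file to the rule that produces it.
    producer = {out: name for name, (_, outs) in rules.items() for out in outs}
    order = []
    visiting = set()

    def visit(path):
        if path in existing:
            return  # file is already available, nothing to run
        if path not in producer:
            raise ValueError(f"no rule produces {path!r}")
        name = producer[path]
        if name in order:
            return  # rule already scheduled
        if name in visiting:
            raise ValueError(f"cyclic dependency at rule {name!r}")
        visiting.add(name)
        for dependency in rules[name][0]:
            visit(dependency)  # schedule upstream rules first
        visiting.discard(name)
        order.append(name)

    visit(target)
    return order


if __name__ == "__main__":
    # Hypothetical three-step pipeline.
    rules = {
        "count": (["aligned.bam"], ["counts.tsv"]),
        "align": (["trimmed.fastq", "genome.fa"], ["aligned.bam"]),
        "trim": (["raw.fastq"], ["trimmed.fastq"]),
    }
    print(build_order(rules, "counts.tsv", existing={"raw.fastq", "genome.fa"}))
    # prints ['trim', 'align', 'count']
```

Asking for the final file is enough: the order of the upstream steps follows from the declared inputs and outputs, not from the order in which the rules were written.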

With over 500k downloads on Bioconda, and over 2k citations, Snakemake is a widely used and accepted standard for reproducible data science that has powered numerous research goals and publications.

This 1-day workshop will cover the principles for building workflows using Snakemake, as well as more advanced strategies to fully customise, automate and scale your analysis.

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Complex Network Analysis for Biologists (ONLINE LIVE TRAINING) Thu 9 Dec 2021 09:30 Finished

PLEASE NOTE The Bioinformatics Team are presently teaching many courses live online, with tutors available to help you work through the course material on a personal copy of the course environment. We continue to monitor advice from the UK government and the University of Cambridge on resuming in-person teaching back in the training room.

Complex natural systems permeate many aspects of everyday life—including human intelligence, social media, biomedicine, agriculture, economics, even our personal and professional relationships. The past decade has seen intensification of research into structural and dynamical properties of complex networks. This course will introduce the basic principles of network theory, and hands-on DIY Network analysis using Cytoscape, one of the most widely used global platforms for construction and analysis of biomolecular networks such as gene regulatory interactions, protein complexes, hydrogen-bonding meshwork in active sites and neuronal networks. The aim is to conceptualize your own textual, tabular or genomic datasets as networks, and to understand how simple topological features can help to decipher complex properties of systems and processes.
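As a small illustration of what "simple topological features" can look like, the sketch below computes node degree and the local clustering coefficient from an edge list in plain Python. The course itself uses Cytoscape; this code, its function names and the example edges are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree(adjacency):
    """Number of neighbours of every node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def clustering_coefficient(adjacency, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))


if __name__ == "__main__":
    # Made-up interaction list: a triangle A-B-C with D attached to C.
    edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    network = build_adjacency(edges)
    print(degree(network))
    print(clustering_coefficient(network, "C"))
```

In this toy network, node C has the highest degree (3) but a lower clustering coefficient than A or B, because its neighbour D is not connected to the others.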

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.

Core Statistics Mon 9 Nov 2020 10:00 Finished

PLEASE NOTE that this course will be taught live online, with demonstrators available to help you throughout if you have any questions. All lecture components will be recorded and uploaded to the course Moodle page so that you will be able to access that information even if technical or time zone restrictions mean that you aren't able to join us for the live sessions.

This virtually delivered course is intended to provide a strong foundation in practical statistics and data analysis using the R or Python software environments. The underlying philosophy of the course is to treat statistics as a practical skill rather than as a theoretical subject and as such the course focuses on methods for addressing real-life issues in the biological sciences.

There are three core goals for this course:

Use R or Python confidently for statistics and data analysis
Be able to analyse datasets using standard statistical techniques
Know which tests are and are not appropriate

Both R and Python are free software environments that are suitable for statistical and data analysis.

In this course, we explore classical statistical analysis techniques starting with simple hypothesis testing and building up to linear models and power analyses. The focus of the course is on practical implementation of these techniques and developing robust statistical analysis skills rather than on the underlying statistical theory.

After the course you should feel confident to be able to select and implement common statistical techniques using R or Python and moreover know when, and when not, to apply these techniques.

Core Statistics using R (IN-PERSON) Wed 10 Jul 2024 09:30 [Places]

This award winning course is intended to provide a strong foundation in practical statistics and data analysis using the R software environment. The underlying philosophy of the course is to treat statistics as a practical skill rather than as a theoretical subject and as such the course focuses on methods for addressing real-life issues in the biological sciences.

There are three core goals for this course:

Use R confidently for statistics and data analysis
Be able to analyse datasets using standard statistical techniques
Know which tests are and are not appropriate

R is an open source programming language so all of the software we will use in the course is free.

In this course, we explore classical statistical analysis techniques starting with simple hypothesis testing and building up to linear models and power analyses. The focus of the course is on practical implementation of these techniques and developing robust statistical analysis skills rather than on the underlying statistical theory.

After the course you should feel confident to be able to select and implement common statistical techniques using R and moreover know when, and when not, to apply these techniques.

If you do not have a University of Cambridge Raven account please book or register your interest here.

Additional information

♿ The training room is located on the first floor and there is currently no wheelchair or level access.
Our courses are only free for registered University of Cambridge students. All other participants will be charged according to our charging policy.
Attendance will be taken on all courses and a charge is applied for non-attendance, including for University of Cambridge students. After you have booked a place, if you are unable to attend any of the live sessions, please email the Bioinfo Team.
Further details regarding eligibility criteria are available here.
Guidance on visiting Cambridge and finding accommodation is available here.

Core Statistics using R (ONLINE LIVE TRAINING) Wed 8 Sep 2021 14:00 Finished

The Bioinformatics Team are presently teaching this course live online, with tutors available to help you throughout if you have any questions. We continue to monitor advice from the UK government and the University of Cambridge on resuming in-person teaching in our training room.

This award winning virtually delivered course is intended to provide a strong foundation in practical statistics and data analysis using the R software environment. The underlying philosophy of the course is to treat statistics as a practical skill rather than as a theoretical subject and as such the course focuses on methods for addressing real-life issues in the biological sciences.

There are three core goals for this course:

Use R confidently for statistics and data analysis
Be able to analyse datasets using standard statistical techniques
Know which tests are and are not appropriate

R is an open source programming language so all of the software we will use in the course is free.

In this course, we explore classical statistical analysis techniques starting with simple hypothesis testing and building up to linear models and power analyses. The focus of the course is on practical implementation of these techniques and developing robust statistical analysis skills rather than on the underlying statistical theory.

After the course you should feel confident to be able to select and implement common statistical techniques using R and moreover know when, and when not, to apply these techniques.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register your interest by linking here.
