Working on HPC clusters (ONLINE LIVE TRAINING)
Knowing how to use High Performance Computing (HPC) systems is crucial for fields such as bioinformatics, big data analysis, image processing, machine learning, parallel task execution, and other high-throughput applications.
In this introductory course, you will learn the fundamentals of HPC: what it is and how to use it effectively. We will cover best practices for working with HPC systems, explain the roles of "login" and "compute" nodes, outline the typical filesystem organisation on HPC clusters, and introduce job scheduling with the widely used SLURM scheduler.
This hands-on workshop is designed to be accessible to researchers from various backgrounds, providing numerous opportunities to practice and apply the skills you acquire.
As an optional session for those interested, we will also introduce the (free) HPC facilities available at Cambridge University (the course is not otherwise Cambridge-specific).
If you do not have a University of Cambridge Raven account please book or register your interest here.
- Our courses are only free for registered University of Cambridge students. All other participants will be charged according to our charging policy.
- Attendance will be taken on all courses and a charge is applied for non-attendance. After you have booked a place, if you are unable to attend any of the live sessions, please email the Bioinfo Team.
- Further details regarding eligibility criteria are available here.
- This course is aimed at students and researchers of any background.
- We assume no prior knowledge of what a HPC is or how to use it.
- It may be particularly useful for those who have attended other Facility Bioinformatics Training Courses and now need to process their data on a Linux server. It will also benefit those who run computationally demanding analyses or simulations on their personal computers and would like to learn how to adapt these to run on a HPC.
- A working knowledge of the UNIX command line is absolutely essential (course registration page).
- If you are not able to attend this prerequisite course, please work through our Unix command line materials ahead of the course (up to section 8).
Number of sessions: 3
# | Date | Time | Venue | Trainers |
---|---|---|---|---|
1 | Mon 21 Oct | 09:30 - 13:00 | Bioinformatics Training Facility - Online LIVE Training | Lajos Kalmar, Raquel Manzano-Garcia, Dr Bajuna Salehe, Victor Flores, Shanlin Rao |
2 | Tue 22 Oct | 09:30 - 13:00 | Bioinformatics Training Facility - Online LIVE Training | Lajos Kalmar, Raquel Manzano-Garcia, Dr Bajuna Salehe, Shanlin Rao |
3 | Wed 23 Oct | 09:30 - 13:00 | Bioinformatics Training Facility - Online LIVE Training | Lajos Kalmar, Raquel Manzano-Garcia, Dr Bajuna Salehe, Shanlin Rao |
High Performance Computing (HPC), remote servers, shell scripting, SLURM
During this course you will learn about:
- What is a HPC and how does it differ from a regular computer?
- How do I access and work on a HPC?
- How do I run jobs on a HPC?
- How can I run many similar jobs in parallel?
- How can I access, install and manage software on a HPC?
After this course you should be able to:
- Describe what a HPC is and how it is generally organised.
- Distinguish between a login and a compute node.
- Connect to a HPC and navigate through its filesystem using the command-line.
- Move files in/out of the HPC using FileZilla or alternative command-line tools.
- Edit script files directly on a remote server using Visual Studio Code.
- Describe the role of a Job Scheduler and what resources to consider when running jobs.
- Use the SLURM job scheduler to run analysis on the HPC.
- Customise the use of SLURM and take advantage of its inbuilt “job arrays” feature to parallelise similar jobs.
- Obtain an account on the Cambridge University HPC server, and apply the knowledge learned here to use it effectively for your own work.
Presentations, demonstrations and practicals.
- Participants must have their own computers to work on and a stable internet connection for the duration of the course.
- We encourage you to use your own computer for this course. This is so you can leave the course prepared to work on a HPC that you may have access to in your institution.
- Please install the necessary software following our setup instructions. If you have any issues installing the software, please get in touch with us some time before the course.
Day | Time | Topics |
---|---|---|
Day 1 | 09:30 - 09:40 | Welcome |
Day 1 | 09:40 - 10:45 | Introduction to HPC |
Day 1 | 10:45 - 11:00 | Break |
Day 1 | 11:00 - 11:30 | Connecting to a HPC cluster |
Day 1 | 11:30 - 13:00 | Using the SLURM job scheduler |
Day 2 | 09:30 - 10:30 | Using the SLURM job scheduler (cont.) |
Day 2 | 10:30 - 11:30 | Managing software |
Day 2 | 11:30 - 11:45 | Break |
Day 2 | 11:45 - 13:00 | Parallelising jobs with arrays |
Day 3 | 09:30 - 10:15 | Parallelising jobs with arrays (cont.) |
Day 3 | 10:15 - 11:15 | Job dependencies |
Day 3 | 11:15 - 11:30 | Break |
Day 3 | 11:30 - 12:00 | Moving files |
Day 3 | 12:00 - 13:00 | HPC resources at Cambridge University |
- Free for registered University of Cambridge students
- £60/day for all University of Cambridge staff, including postdocs, temporary visitors (students and researchers) and participants from Affiliated Institutions. Please note that these charges are recovered by us at the Institutional level.
- It remains the participant's responsibility to obtain prior approval from the relevant group leader, line manager or budget holder before attending the course. Please book only with the agreement of the relevant party, as costs will be charged back to your Lab Head or Group Supervisor.
- £60/day for all other academic participants from external Institutions and charitable organizations. These charges must be paid at registration.
- £120/day for all Industry participants. These charges must be paid at registration.
- Further details regarding the charging policy are available here
1.5
A number of times per year