Cambridge Digital Humanities - Cambridge Digital Humanities courses

Computer Vision: A critical introduction

Tue 25 May 2021 10:00 Finished

Machine vision systems can potentially help humanities researchers see historical and cultural image collections differently, and could provide tools to answer new research questions. This CDH Basics session provides an introductory overview of basic tasks in machine vision, such as Image Classification, Object Detection and Image Captioning within a critical framework highlighting the challenges of algorithmic bias and the limits of automation as a method for humanistic enquiry.

Creating Databases from Historical Sources (Workshop)

Mon 25 Feb 2019 11:00 Finished

This workshop will examine strategies for transforming a variety of sources into structured digital data, ranging from crumbling manuscripts to printed documents and books.

(Critical) Machine Vision for the Humanities [remote delivery]

Tue 9 Jun 2020 15:00 Finished

Leonardo Impett, Cambridge Digital Humanities

Application forms should be returned to CDH Learning (learning@cdh.cam.ac.uk) by Friday 22 May 2020. Successful applicants will be notified by 26 May 2020.

This course will introduce graduate students, early-career researchers, and professionals in the humanities to the technologies of image recognition and machine vision, including recent developments in machine vision research in the past half-decade. The course will seek to combine a technical understanding of how machine vision systems work, with a detailed understanding of the possibilities they open to research and study in the humanities, and with a critical exploration of the social, political and ideological dimensions of machine vision.

Learning outcomes

By the end of the course, students should be able to:

Understand the basic tasks of machine vision, such as Image Classification, Object Detection, Image-to-Image Translation, Image Captioning, Image Segmentation etc.
Understand the fundamental technical operations of image processing and machine vision: the pixel encoding of images, Gaussian and convolutional filters
Explore critical aspects of machine vision in a technically-informed way: e.g. the problems in algorithmic bias brought about by featureless convolutional networks
Develop and run their own simple machine vision and image processing pipelines, in a visual programming language compiling to Python
Understand the potential synergies and limitations of machine vision applications in humanities research and cultural heritage institutions
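
As a rough illustration of the pixel encoding and Gaussian and convolutional filters named in the outcomes above, the following Python sketch blurs a tiny greyscale image. It is not course material: the 5x5 image, the helper names `gaussian_kernel` and `convolve2d`, and the edge-handling choice are all assumptions made for the example.

```python
import math

def gaussian_kernel(size=3, sigma=1.0):
    """Build a normalised size x size Gaussian kernel (weights sum to 1)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def convolve2d(image, kernel):
    """Slide the kernel over every pixel and take a weighted sum of its neighbours.

    Positions beyond the border reuse the nearest edge pixel (an assumed choice).
    The kernel is not flipped, which makes no difference for symmetric
    kernels such as the Gaussian.
    """
    height, width = len(image), len(image[0])
    half = len(kernel) // 2
    output = []
    for r in range(height):
        row = []
        for c in range(width):
            acc = 0.0
            for kr, kernel_row in enumerate(kernel):
                for kc, weight in enumerate(kernel_row):
                    rr = min(max(r + kr - half, 0), height - 1)
                    cc = min(max(c + kc - half, 0), width - 1)
                    acc += weight * image[rr][cc]
            row.append(acc)
        output.append(row)
    return output

# Pixel encoding: a 5x5 greyscale image stored as rows of integers,
# with one bright pixel (255) on a black background (0).
image = [[0] * 5 for _ in range(5)]
image[2][2] = 255

kernel = gaussian_kernel(3, 1.0)
blurred = convolve2d(image, kernel)
```

After the filter is applied, the single bright pixel is spread over its neighbours: the centre value drops while the surrounding values rise, which is the smoothing effect a Gaussian filter is used for.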

Data Presentation and Preservation

Tue 28 Jan 2020 11:30 Finished

The afterlife of your research data forms a vitally important part of your research project. Research funders and academic journal publishers are often strongly committed to the re-use of data and are reluctant to fund or publish research where datasets are not accessible for the purposes of peer review or further use. Yet the push for open data exists in tension with the expectations of data protection law which requires transparency from researchers about how long they will retain personal data. This session will explore good practice in data sharing and archiving as well as introducing sources of further information and advice within the University on this topic.

Data Wrangling (Workshop)

Mon 4 Feb 2019 14:00 Finished

Garbage in, garbage out! Your output is as good or as bad as your input. Data collected from online sources is often dirty and messy. Discover how to clean and organise your data. After transforming raw data into a structured dataset, you will be ready to perform data analysis.

Delving into Massive Digital Archives - finding lost, forgotten and neglected texts (Guided Project)

Mon 12 Oct 2020 11:00 Finished

Application forms (https://www.cdh.cam.ac.uk/file/cdhdelvingintomassivedaapplicationdocx) should be returned to CDH Learning (learning@cdh.cam.ac.uk) by Tuesday 6 October 2020. Successful applicants will be notified by Thursday 8 October 2020.

Massive digital archives such as the Internet Archive offer researchers tantalising possibilities for the recovery of lost, forgotten and neglected literary texts. Yet the reality can be very frustrating due to limitations in the design of the archives and the tools available for exploring them. This programme supports researchers in understanding the issues they are likely to encounter and developing practical methods for delving into massive digital archives.

Digital Archival Photography | in-depth

Tue 9 Jan 2024 10:30 Finished

Following the introductory Methods Workshops, held on 21st November 2023, this session will focus on how to apply the principles to the projects chosen by the participants. This will cover learning a practical approach to taking images fit for purpose in any conditions with available resources. It may also address more advanced imaging topics such as image stitching, Optical Character Recognition, Multispectral Imaging, or photogrammetry if these are of interest to the participants. It will also be an opportunity to visit the Digital Content Unit at Cambridge University Library.

Digital Data Collection and Wrangling

Tue 14 Jan 2020 11:30 Finished

This session addresses the technical and ethical aspects of digital data collection and wrangling – two fundamental stages in the lifecycle of a digital research project. Participants will be introduced to online data sources and practices of internet-mediated data collection, including retrieving data from social media platforms. As data collected from online sources is often dirty and messy, we will also provide a short practical introduction to the process of transforming raw data into a clean and structured dataset using free and open-source software.

Digital Data Collection (Workshop)

Mon 28 Jan 2019 14:00 Finished

This session is a primer on digital data collection. The goal is to become familiar with online data sources and practices of internet-mediated data collection, including retrieving data from social media platforms.

Digital Data Legacy: Share, Disseminate, Preserve (Workshop)

Mon 18 Feb 2019 14:00 Finished

The shelf-life of your dataset dictates the longevity of your findings. Sharing your data and assuring its integrity is a fundamental part of a digital research project. In this session we will discuss the principles of open data, channels for data dissemination and the fundamentals of data preservation.

Digital Mapping for Historians

Wed 26 Jun 2019 09:30 Finished

This intensive workshop will provide an overview of a range of applications of digital mapping in historical research projects and introduce GIS tools and software.

Digital Research Design and Data Ethics

Tue 24 Nov 2020 10:00 Finished

This CDH Basics session explores the lifecycle of a digital research project across the stages of design, data capture, transformation, analysis, presentation and preservation. It also introduces tactics for embedding ethical research principles and practices at each stage of the research process.

Digital Research Design, Methods and Ethics (Workshop)

Mon 21 Jan 2019 14:00 Finished

Find out how to shape a digital research project from scratch. This session will introduce the building blocks of online research design, from the several methodologies available to conduct the research to the ethical guidelines that should underpin our projects.

Doing Qualitative Research Online

Mon 1 Feb 2021 14:00 Finished

What happens to practices of qualitative research when interactions between researcher and research subject are largely mediated? From observations of users’ interactions on social media platforms, to interviews conducted through WhatsApp or Skype, digital communications offer both opportunities and challenges for qualitative research in a wide range of disciplines across the Social Sciences and Humanities. This methods workshop will explore a wide range of topics including:

Establishing trust and credibility
Engaging with digital gatekeepers
Navigating blurred boundaries between ‘private’ and ‘public’
Re-conceptualising ‘researchers’, the ‘research field’ and ‘research subjects’
Identity, anonymity and visibility - implications for research practice
Mitigation strategies: from data parsimony to creative obfuscation
Self-care for researchers in online research
Embedding ethical research practice across the project lifecycle

The workshop will take place over two sessions. The first, an introductory seminar and discussion led by Dr Anne Alexander on 1 February, will be followed by a short reflective piece of work in which participants assess their own research design and identify areas where they feel they need further help and advice. The second session on 8 February will be participant-led, including small group and plenary discussions exploring strategies for dealing with challenges identified by participants.

Participants should set aside around 1 hour between the two sessions to complete and submit their self-assessment.

Participants are strongly encouraged to attend the CDH Basics session Privacy, information security and consent: a guide for researchers with Dr Anne Alexander on 26 January in advance of the Methods Workshop.

Evolve your Python Code into a Workflow for Text-based Research [cancelled - Covid-19]

Wed 20 May 2020 13:00 CANCELLED

We are currently reformatting our Learning programme for remote teaching; this will require some rescheduling, so bookings will reopen and new sessions will be created for online courses as soon as possible. In the interim we would encourage you to register your interest so as to be notified of the new schedule. We hope to run many of our courses online, but this is dependent on staff availability and resources, so we may have to postpone or cancel some sessions.

This workshop will develop your coding practice from testing ideas to creating an efficient workflow for your code, data and analysis. If you are using Jupyter Notebooks (but even if you’re not) this workshop will demonstrate how to better manage your code using good programming practices, and how to package your code into a program that is more reliable and is easier and quicker to run on large amounts of data.

Required preparation (instructions provided): Python 3 installed on laptop; a text editor or IDE installed on laptop; git installed on laptop and signed up for GitHub; a short internet-based exercise in working with the command line.

Explaining Complexity: Using Animation, Illustration and Interactive Media to Communicate Research

Mon 20 May 2019 13:00 Finished

Dr Nathan Crilly and Chih-Chun Chen explore the challenges of communicating complex ideas to diverse audiences through a variety of digital media formats. Three case studies will be reported from an EPSRC-funded research project which sought to clarify and communicate the nature of complex system design and its relationship to emerging technologies. For example, the project studied the way in which technologists working in Synthetic Biology and Swarm Robotics conceptualise and address the complexity of the systems they are designing. Outputs from the project include:

A 35-page ‘primer’ on the subject of complexity (now with over 6000 downloads)
A three-minute animated movie discussing the subjectivity of complexity (now with 2500 views)
An interactive website (implemented by Dr Chen since she has programming skills) that generates annotated bibliographies for complexity resources tailored to a user’s interests (launched in March 2019)

Dr Crilly and Dr Chen will discuss the process of engaging with media partners, including working with science communication agencies, animators and film-makers, and reflect on what they learned from the process and what they would do differently in future.

Film-making for Beginners

Sat 1 Dec 2018 09:30 Finished

Learn to think visually and communicate using sound and film: participants will be introduced to the language of film, shot types, camera movements, framing, basic rules of camera use, how to tell a story, and editing in the Phoenix Training Suite.

Film-making for Beginners (Level 2)

Mon 24 Jun 2019 09:30 Finished

Learn to think visually and to communicate using sound and film. Participants will be introduced to the language of film, shot types, camera movements, framing, basic rules of camera use, how to tell a story, and editing. Some prior knowledge of filming is required. Please see the CDH website for more details (www.cdh.cam.ac.uk).

First steps in coding with Jupyter Notebooks

Tue 9 Feb 2021 10:00 Finished

This CDH Basics session is aimed at researchers who have never done any coding before. We will explore basic principles and approaches to writing and adapting code, using the popular programming language Python as a case study. Participants will also gain familiarity with using Jupyter Notebooks, an open-source web application which allows users to create and share documents containing live code alongside visualisations and narrative text.

First Steps in Coding with R

Mon 19 Feb 2024 14:00 Finished

Convenor: Dr Estara Arrant (Cambridge University Library)

This session is aimed at researchers with minimal coding experience or who have not done any coding but have data they want to explore and visualise. However, you do not need to have a full set of data to benefit from this class. You will learn the fundamentals of conducting a basic analysis of Humanities-related data in the R language, including prepping and tidying data and generating useful graphs which communicate information about your research to others. You will also gain a basic overview of the R programming language, which will provide you with principles that you can take forward to learn more advanced data analysis methods. The software we will use (RStudio) is free to download and is compatible with most computers. We will provide installation support and guidance. You will need your own laptop.

First Steps in Version Control with GitHub

Mon 26 Feb 2024 14:00 Finished

Please note this workshop has limited spaces, and a pre-course questionnaire is in place. Please complete it before the session.

Version control helps you to write code for your research more sustainably and collaboratively, in line with best practices for open research. You might use code for collecting, analysing or visualising your data or something else. Everyone who codes in some way can benefit from learning about version control for their daily workflow.

This workshop will cover the importance of version control when developing code and foster a culture of best practices in FAIR (Findable, Accessible, Interoperable, Reusable) code development. We will take you through the basic use of GitHub to help you store, manage, and track changes to your code and develop code collaboratively with others.

Designed with beginners in mind, this workshop caters to those who have not yet delved into Git or GitHub. While prior knowledge of a programming language (e.g., R or Python) would be beneficial, it is not a prerequisite.

From Blog to Book

Thu 10 Oct 2019 14:00 Finished

Blogging as a digital means of research communication seems so simple: with free, easy-to-use platforms we’re all just a few clicks away from setting one up. But having set a blog up, the difficult work begins. Who are you talking to? What are you trying to achieve? How will you generate your content? How will the people you want to talk to find it? How are you going to keep it going alongside your research and teaching commitments? Will it make any difference to anything? And will you ever be able to transform any of this work into a scholarly publication that ‘counts’?

This session will be an interactive conversation between Julie Blake, Cambridge Digital Humanities Methods Fellow and Connie Ruzich, University Professor of English at Robert Morris University, Pittsburgh, USA. Connie’s Behind Their Lines blog started in 2014 during a Fulbright Scholarship at Exeter University to research First World War poetry in the context of the Centenary Commemorations. She became interested in the lost and neglected poetry of the First World War and began blogging about her ‘finds’. Five years later, she has had almost 400,000 visits to her blog, she maintains a lively dialogue with public and academic audiences including via Twitter and she is in the final stages of completing a monograph about this material with Bloomsbury Academic.

We’ll discuss the highs and lows of Connie’s research blogging experience, the surprises, the pitfalls and the lessons learned through hard-won experience. We’ll try to answer all the questions listed above, and participants will be invited to join in with their own questions.

From Corpus to Context: Word Embeddings as a Digital Humanities Research Methodology

Wed 10 May 2023 13:00 Finished

Speaker: Mark Algee-Hewitt, Associate Professor of English and Director of the Stanford Literary Lab.

About this Methods workshop

At the heart of many of the current computational models of language usage, from generative A.I. to recommendation engines, are large language models that relate hundreds of thousands, or millions, of words to each other based on shared contexts. Mysterious products of complex modelling algorithms, these objects raise a number of practical (and ethical) questions for Humanities scholars: How are these language models created? What kinds of relationships does their math encode? How do biases in the corpus affect the model? And how can we effectively use them to answer humanities-based questions?

In this workshop, we will explore these questions using a medium-sized language embedding model trained on a corpus of novels. Using approachable code in the R software environment, participants will learn how to manipulate a model, assess similarities and differences within it, visualise relationships between words and even train their own embeddings.
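
To give a sense of what assessing similarity within an embedding model can mean in practice, here is a minimal sketch of cosine similarity between word vectors. It is written in Python rather than the R used in the workshop, and the three-dimensional vectors and the helper names `cosine_similarity` and `most_similar` are invented for illustration; they do not come from the workshop's model or code.

```python
import math

# Hypothetical 3-dimensional vectors invented for illustration only;
# real embeddings are learned from a corpus and have far more dimensions.
embeddings = {
    "ship": [0.9, 0.1, 0.0],
    "boat": [0.8, 0.2, 0.1],
    "letter": [0.1, 0.9, 0.3],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, vectors):
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = most_similar("ship", embeddings)
for other, score in ranking:
    print(f"ship ~ {other}: {score:.3f}")
```

With these toy vectors, "boat" ranks above "letter" for the query "ship" because its vector points in nearly the same direction. In a trained model the same calculation reflects the contexts that words share in the corpus, which is also how biases in the corpus can surface in the similarities.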

Game Design: an introduction for researchers [remote delivery]

Wed 17 Jun 2020 16:00 Finished

Emma Reay is a third-year PhD researcher at the University of Cambridge and an associate lecturer at Anglia Ruskin University. Her current project explores depictions of children in videogames, and her research interests include representation studies, children's digital media, gaming and education, and playful activism.

Adam Dixon is a game designer and writer who makes both physical and digital games. He has worked on everything from big public games that involve running around cities to narrative video games about learning scientific skills. Much of his work has involved working with museums and research organisations such as the Wellcome Trust, Science Museum, Nottingham Trent University and the V&A. This has included designing games, using play for public research engagement and, most recently, teaching teenagers to create digital games for Wellcome Collection’s Play Well exhibition. Outside of that he works on and releases his own games, including roleplaying games, LARPs and interactive fiction.

Applications (https://www.cdh.cam.ac.uk/file/cdhgamedesign201920applicationdocx-0) should be returned to CDH Learning (learning@cdh.cam.ac.uk) by Wednesday 10 June 2020. Successful applicants will be notified by 15 June 2020.

This online course will introduce participants to the practice of game design. It will explore the different ways that digital and analogue games are designed, particularly how you can design with intent to communicate a mood, theme or message. Participants will learn game design skills – such as boxing-in, design documents and prototyping – alongside opportunities to test them out by creating their own short games. Examples will focus on game design in research-related contexts, including using games as part of your research process and to communicate research outcomes to diverse audiences.

The sessions focus on game design, how to shape mechanics and play experiences, so no technical skills are needed. Participants will create their short games using both non-digital tools and simple, free software that will be taught in the sessions.

Topics covered:

Game design basics
A chance to play and consider thoughtful games
Boxing in
Planning games
Making games
Bitsy and Twine
Playtesting and iteration

Format

The course will be delivered online, with live teaching sessions taking place on Zoom.

Weds 17 June, 4pm BST: Introduction (45 minutes)
Weds 24 June, 4pm BST: Game play feedback (45 minutes)
Weds 1 July, 4pm BST: Game design seminar (45 minutes)
Weds 15 July, 4pm BST: Final session (60 minutes with break)

A CRASSH blog post written for the originally scheduled session may be of interest: http://www.crassh.cam.ac.uk/blog/post/Play-as-Research-Practice

Game Design Workshop [cancelled - industrial action]

Mon 2 Dec 2019 09:30 CANCELLED

This two-day intensive workshop will introduce participants to the practice of game design. It will explore the different ways that digital and analogue games are designed, particularly how you can design with intent to communicate a mood, theme or message. Participants will learn game design skills – such as boxing-in, design documents and prototyping – alongside opportunities to test them out by creating their own short games.

The sessions focus on game design, how to shape mechanics and play experiences, so no technical skills are needed. Participants will create their short games using both non-digital tools and simple, free software that will be taught in the session.

Course participants will be selected via an application process. Once a provisional place is booked, an application form will be issued for completion and return by 1 November 2019. Once the applications have been reviewed, places will be confirmed directly in the week beginning 18 November 2019.

Generative Adversarial Networks Experimentation Lab

Tue 11 Dec 2018 11:30 Finished

This workshop will discuss prospective methods and approaches for critically engaging with the images of people created through Generative Adversarial Networks, using design experiments as provocations to expand debate about notions of ‘realism’ and ‘authenticity’ in an era where human and machine vision are ever more systematically intertwined.

Ghost fictions (Guided project)

Mon 26 Oct 2020 14:00 Finished

Application forms should be returned to CDH Learning (learning@cdh.cam.ac.uk) by Tuesday 13 October 2020. Successful applicants will be notified by 15 October 2020.

This CDH Guided Project series, which also includes a Methods Workshop, will explore the generation of ‘synthetic’ texts using neural networks.

The release of OpenAI’s GPT-2 and GPT-3 language models in 2019 and 2020 has shown that predictive algorithms trained on very large general datasets can generate ‘synthetic’ texts, perform machine translation tasks, rudimentary reading comprehension, question answering and summarisation automatically without needing large amounts of task-specific training. These ‘ghostwritten’ texts have provoked wide attention in the media.

Researchers have experimented with prompting GPT-3 to write short stories, answer philosophical questions and apparently propose potential medical treatments, although GPT-3 had some difficulty with the question “how many eyes does a horse have?”. The Guardian ‘commissioned’ an op-ed from GPT-3.

Through interactive hands-on sessions and demonstrations we will explore synthetic text production and look at how ideas about the distinction between ‘fact’, ‘fiction’ and ‘non-fiction’ are shaping the reception of this emerging technology. Our aim is to stimulate deeper critical engagement with machine learning by humanities researchers and to encourage more public debate about the role of AI in culture and society.

We invite applications from early career researchers and others at the University of Cambridge to join a small project team for four online sessions during the Guided Project phase in October-November. Participants will need to commit to joining the live sessions and to set aside at least 3-4 hours of work on a small-scale individual project during the course. We are interested in assembling an interdisciplinary group of researchers drawing on insights from across humanities, social science and technology disciplines. Prior knowledge of programming, computer science or Machine Learning is not required.

Humanities Data: a basic introduction

Tue 13 Oct 2020 10:00 Finished

This CDH Basics session will explain what data is, and what ‘humanities data’ looks like (via a behind-the-scenes tour of the Digital Library). This session covers good practice around file formats, version control and the principles of data curation for individual researchers.

Interaction with Machine Learning

Mon 1 Feb 2021 10:00 Finished

Application forms should be returned to CDH Learning (learning@cdh.cam.ac.uk) by Thursday 7 January 2021. We will review applications on a rolling basis and applicants will be notified at the latest by the end of Monday 11 January.

This CDH Guided Project aims to provide humanities, arts and social science researchers with an overview of current theory and practice in the design of human-computer interaction in the age of AI and equip the participants with analytical tools necessary for a critical investigation of contemporary design with AI/ML. Looking closely at interactions between humans and emerging AI systems, the workshop will also explore the potential for interaction between humanities scholars and computer scientists in the process of development and assessment of new solutions.

Lectures and practical research design sessions in Interaction with Machine Learning taught by Professor Alan Blackwell and Advait Sarkar (Microsoft Research) as part of an optional course for Part III and MPhil Computer Science students will form the anchoring element of the Project. These will allow researchers without a Computer Science background to explore how key challenges in AI design are being addressed within the field of interaction design, as well as identify areas in which humanities methodologies and approaches could be adopted to improve the production process, by making it more fair, critical, and socially-aware.

Participants will also take part in three workshops specifically tailored to humanities and social science researchers and will be supported in developing a mini research project investigating how humans interact with systems based on computational models. The projects may include:

probing an already existing dataset, system, or user interface from a critical perspective
developing an idea for new interaction design based on critical applications of ML/AI.

Please note: no prior practical experience or knowledge of programming is required to take part in the Project; however, some awareness of how AI systems work will be beneficial.

Minimum time commitment:

8 weekly online lectures led by Professor Alan Blackwell (Computer Science and Technology) and Advait Sarkar (Microsoft Research). Weekly from 26 January, 2-4pm (with the last hour as an optional session for Guided Project participants).
3 x 1.5 hour specialist workshops for humanities and social science participants led by Tomasz Hollanek and Anne Alexander (CDH)
1.5 hour project showcase and final discussion

Participants are encouraged to set aside additional time to work on their projects between sessions. A Moodle email forum and drop-in ‘clinic’ style support sessions will be available during the Guided Project.

Lecture topics and dates

Current research themes in intelligent user interfaces (26 January, 2pm)
Program synthesis (2 February, 2pm)
Mixed initiative interaction (9 February, 2pm)
Interpretability / explainable AI (16 February, 2pm)
Labelling as a fundamental problem (23 February, 2pm)
Machine learning risks and bias (2 March, 2pm)
Visualisation and visual analytics (9 March, 2pm)
Research presentations by Computer Science Students (16 March, 2pm)

Workshop themes

AI critique, humanities methodologies and user interface design (1 February, 10-11.30am)
Recommender systems (1 March, 10-11.30am)
Machine vision (8 March, 10-11.30am)
Project presentations and discussion (15 March, 10-11.30am)

Objectives

By the end of the course participants should:

be familiar with current state of the art in intelligent interactive systems
understand the human factors that are most critical in the design of such systems
be able to evaluate evidence for and against the utility of novel systems
be able to apply critical methodologies to current interaction design practices
understand the interplay between ML/AI research and humanities approaches

Introduction to Archival Photography workshop [cancelled re Covid-19]

Wed 10 Jun 2020 11:00 CANCELLED

This session focusses on providing photography skills for those undertaking archival research. Dr Oliver Dunn has experience spanning more than 10 years digitising written and printed historical sources for major university research projects in the humanities and social sciences. The focus is very much on low-tech approaches and small budgets. We’ll consider best uses of smartphones, digital cameras and tripods.

Introduction to Exhibit.so platform

Thu 28 Jul 2022 10:00 Finished

In this workshop, led by Ed Silverton from Mnemoscene and introduced by Andy Corrigan from Cambridge Digital Library, you will learn about the various features of the exhibit.so platform.

Cambridge Digital Humanities (CDH) is working with Mnemoscene to develop a local instance of the Exhibit tool that will be available to University of Cambridge users.

Exhibit is a tool for visual storytelling developed by Mnemoscene and supported by the Esmée Fairbairn Collections Fund. It is an easy-to-use tool for creating captivating interactive stories and quizzes with Cultural Heritage content, and is now also publicly available at https://www.exhibit.so/. Built using the Universal Viewer, it enables users to load images or 3D objects from any IIIF-supporting online catalogue to tell stories within and across collections.

No prior knowledge of IIIF or Exhibit required!

Outcomes

At the end of the workshop attendees will be able to:

Identify the key features of Exhibit
Identify how to source existing IIIF manifests or add new ones to Exhibit
Create stories, quizzes, and kiosks in Exhibit
Embed your Exhibit on your website

Introduction to MorphoSource

Thu 28 Jul 2022 14:00 Finished

Cambridge Digital Humanities is working with MorphoSource to offer an introduction to its platform. In this workshop you will be introduced to the MorphoSource platform, which is a repository for researchers, curators, and everybody to find, view, download, and upload 3D scans and data of natural history, scientific specimens, and cultural objects.

Contributions come from museums, researchers, scholars and specialists to share findings, increase impact, and improve access to material for scientific discovery, sharing, and the advancement of human knowledge.

The workshop will cover:

The main features of the platform
Usage most relevant to the cultural heritage sector
Using the site - searching, exploring, referencing
Contributing data
Embedding content

The workshop has a GLAM focus and is more about safely storing and providing access to complex visual data than about storytelling, although it still touches on aspects of engagement. It may also be of interest to those in STEM areas working with 3D or complex visual data, or in scholarly communications and data repositories.

Introduction to Text-Mining with Python 1

Tue 30 Apr 2019 11:00 Finished

This session will introduce basic methods for reading and processing text files in Python. We will walk through an example that reads in a large text corpus, splits it into tokens (words) and sentences, removes unwanted words (stopwords), counts the words (frequency analysis), and visualises results. We will talk about the 5 steps of text mining and what resources to use when learning text mining for your research in your own time. No prior knowledge of Python is required, and no installations will be needed. We will use web services available in your browser to follow along.
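The steps described above (splitting into sentences and tokens, removing stopwords, counting frequencies, and visualising the result) can be sketched with the Python standard library alone. This is a minimal illustration, not the course's own material: the sample text, the stopword list and the function names are all assumptions made for the example, and real projects would typically use a fuller stopword list and a plotting library.

```python
import re
from collections import Counter

# Hypothetical sample text and stopword list, chosen only for illustration.
TEXT = "The cat sat on the mat. The dog sat on the log. A cat and a dog met."
STOPWORDS = {"the", "on", "a", "and"}


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text and extract runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, stopwords):
    """Count the tokens that are not in the stopword list."""
    return Counter(t for t in tokenize(text) if t not in stopwords)


sentences = split_sentences(TEXT)
frequencies = word_frequencies(TEXT, STOPWORDS)

# A crude text "visualisation": one bar of asterisks per word.
for word, count in frequencies.most_common(3):
    print(f"{word:<5} {'*' * count}")
```

The regular expressions here are deliberately simple; they will mis-split abbreviations such as "Dr." and ignore accented characters, which is one reason dedicated text-mining libraries are usually preferred for research corpora.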

Introduction to Text-Mining with Python 2

Tue 7 May 2019 11:00 Finished

This session will introduce topic modelling. Topic modelling is looking for clusters of words that summarise the meaning of documents. We will talk about how to choose what sort of text mining you might want for your research. Some knowledge of Python is required, as gained from 'Introduction to Text-Mining with Python 1', or equivalent. No installations will be needed; we will use web services available in your browser to follow along with the examples.

Introduction to Text-mining with Python [remote delivery]

Thu 30 Apr 2020 11:00 Finished

This online session will introduce basic methods for reading and processing text files in Python with Jupyter Notebooks. We'll discuss why you might wish to do text-mining, and whether coding with Python is the right choice for you. We'll run through the 5 steps of text-mining, and start to walk through an example that reads in a text corpus, splits it into words and sentences (tokens), removes unwanted words (stopwords), counts the tokens (frequency analysis), and visualises results.

This initial session is one hour long and will be delivered remotely by video conferencing. During the session we will cover the essentials of working with the Jupyter Notebooks provided so that you can carry on working through the materials in your own time. The first session will be followed by a second, optional Q&A session for troubleshooting issues and recapping essentials.

Required preparation: A short internet-based exercise in working with variables and text in Python will be sent out one week prior to the session. You will also get instructions on how to find the materials we will be using and how to log onto the video conferencing platform. Please make sure you have some time to prepare properly so that we can concentrate on teaching during the remote session.

Introduction to the Command Line

Tue 5 Dec 2023 11:00 Finished

This session introduces the command line, sometimes also known as the shell or the terminal, to humanities researchers. No prior knowledge of the command line or programming of any kind is required or expected from attendees.

A basic understanding of how to use the command line provides a step change in how productive you can be when working with data or text files, particularly large numbers of files or very large files, which can be hard to manipulate in a graphical interface. Some tools and programs can only be used from the command line, and this session aims to give you the confidence to work with them. In the session we primarily look at seven George Eliot novels and a comparative set of seven Dickens novels (about 3.4 million words in total) but this session should be of use to any humanities researchers working with text collections and the principles have far broader applicability.

We'll focus on running programs which come pre-installed on Mac and Linux, and which can be easily added to Windows. We'll combine these programs in productive ways, discuss how to discover and use the options for each, how to send results to files, and how to work efficiently on the command line so you don't have to retype or remember everything you've done.

Machine Reading the Archive 2020 - end of programme workshop [see eventbrite]

Mon 15 Jun 2020 11:30 Finished

We are currently reformatting our Learning programme for remote teaching; this will require some rescheduling, so bookings will reopen and new sessions will be created for online courses as soon as possible. In the interim, we encourage you to register your interest so that you are notified of the new schedule. We hope to run many of our courses online, but this depends on staff availability and resources, so we may have to postpone or cancel some sessions.

This public workshop will mark the end of the 2020 programme of Machine Reading the Archive, a digital methods development programme organised by Cambridge Digital Humanities with the support of the Researcher Development Fund.

It will showcase the digital archive projects created by our cohort of project participants as well as invited contributions from leading experts in the field.

Mapping the Past [remote delivery]

Fri 22 May 2020 11:00 Finished

This intensive workshop is split into two online chats and two 1-hour sessions. Participants will learn to collect geospatial data from historical sources and process it using geographical information systems ranging from Google Earth to QGIS.

The first online session introduces research techniques for collecting, arranging and mapping geospatial data from historical sources, and is taught by Dr Oliver Dunn. His session is split into two parts: Part A will introduce both online sessions by showing some of our own research that makes use of Google Earth, 3D Maps in Excel, and historical GIS. In Part B you will be asked to locate a set of Scotland’s historical lighthouses on historical maps online and map their location and other attributes in Google Earth and 3D Maps.

The second online session introduces students to mapping humanities data using QGIS, a free GIS (Geographical Information System) software platform. Course participants will need to download and install QGIS on their laptops before 5 June. On 1 June there will be further details about downloading QGIS, a chat forum where we can discuss why you might wish to use GIS and whether GIS is the right choice for you, and a release of course teaching materials. On 5 June you will be taken through the map creation process step-by-step. This session will be taught by Max Satchell.

Methods Fellows Series | Database Design for NonProgrammers

Tue 19 Apr 2022 10:00 Finished

Do you need a database for your data? Or could you store the data in standalone files? Which database paradigm should you consider? What are the consequences of these choices for your work routine? How do you navigate all of this with minimal or no programming experience?

These and more are the questions we will address in the course. We aim to provide a gentle introduction to databases and database paradigms, with examples that help explain the differences between the most common database packages and guide researchers to design suitable solutions for their data problems.

Methods Fellows Series | Digital Design of Musical Scores

Tue 21 Jun 2022 11:00 POSTPONED

These workshops will offer participants the ability to re-think the graphic design of a musical score and will work with a novel set of principles to modify the spacing, layout, and position of its notes and signs for intelligibility purposes and/or artistic purposes.

In previous experimental research, Arild has found that musical scores with modified engraving, spacing, and layout rules can —at least in certain practices and for certain repertoires— elicit more fluent and precise readings than conventional scores. The abstraction of informational units and of discourse structure from a score seems to be enhanced by his approach of separating and redistributing notation symbols and other visual materials using a digital (quantifiable, taxonomic) hierarchy of divisions comparable to what is nowadays conventionally applied in (Western) language texts. This seems to be facilitating the decoding and apprehension of information, affecting the conversion of notation into performance; it is also being investigated at present in terms of academic and artistic impact.

Participants will be able to use the flexibility and manageability of digital production to introduce a radically new conception of the visual structuring of a musical score: Arild proposes to go beyond the mere reproduction of analogical models with digital tools; for that, participants will be experimenting with novel flexible spacing, layout and visual structuring cues that could be enhancing, in music reading, the integrative and abstractive processes that fluent readers already use in language (we do not read sequentially letter by letter; good readers group, prioritise and predict the symbols presented to them). This approach is intrinsically digital, as it is based on being able to use the symbols of a score in a modular, movable, and experimental manner —and in this context 'experimental' would naturally include heuristic or intuitive manipulations by the score users. Arild's view is that a novel conception of music notation should include the possibility of re-organising the materials, allowing the user at either end (creator or reader) to group, separate, highlight and grade visually the symbols present in a score.

Methods Fellows Series | Digital Humanities: Exploring critical, intersectional and decolonial methods

Mon 30 May 2022 14:00 Finished

Isabelle Higgins, Methods Fellow - Cambridge Digital Humanities

This Methods Fellows' Workshop Series event aims to encourage participants to think critically and reflexively about the nature of digital humanities research. It will explore (both individually and collectively) the function and effect of critical, intersectional and decolonial research methods and their impact on research fields, participants and research outputs.

For each seminar, participants will be provided with a reading list that will contain both core introductory texts and additional readings. They will be expected to do 30 minutes of reading ahead of each seminar. The seminars themselves will be a mix of presentations, small group discussion and the study of specific empirical cases.

Throughout the seminars we will collectively assemble a shared bibliography of academic texts and other digital resources. Participants will also be encouraged to bring and share examples and challenges from their own research.

To increase space for discussion and critical reflection, participants will be encouraged to form small working groups, focused on the seminar theme they find most productive, and to connect with their working group for a 30-minute call to reflect on their chosen seminar outside of the scheduled four hours of teaching. There will be the option to feed back on these discussions to the wider group, deepening our shared understanding of the content covered in the course. Isabelle will also hold virtual office hours following the seminar series. In these ways and others, the series will aim to cater for those new to this area of research, as well as for scholars who are already working in digital humanities.

Key topics covered in the sessions will include:

Seminar 1: Digital Humanities in Social and Historical Context: Considering what and how we research

We will focus on placing digital humanities, as a discipline, in the context of its emergence. Disciplinary Sociology, for example, is increasingly grappling with its colonial past (Meghji, 2020). What happens when we examine the history and context of digital humanities? McIlwain (2020) reminds us of the historical ties between the development of computational technology and the surveillance of Black bodies. Yet digital humanities research has also sought to challenge the legal, social and political power exercised through digital systems (Selwyn, 2019). Does contextualising our methods change how we approach them?

Seminar 2: Critical approaches to Digital Environments: Affordances, Interfaces, AI, Algorithms

We will draw on the vast range of work produced by race critical code scholars, which helps us to explore the assumptions and inequalities that are coded into the software we study (or use to conduct our studies). Ruha Benjamin (2016a:150) reminds us to ask of digital technology: 'who and what is fixed in place – classified, corralled, and/or coerced, to enable innovation?' How does a consideration of encoded digital inequalities affect our methodologies?

Seminar 3: Critical Engagement with User Generated Content: Beyond content & discourse analysis

We will draw on critical theories that draw attention to the digital and social constructs and conventions that shape the production of user-generated content, with Brock's (2018) Critical Techno-Cultural Discourse Analysis as one such methodological contribution. We'll explore what happens to our research when we broaden our methodological framing, considering the type of content produced by users and how it is produced, who is producing it, and what governs this production.

Seminar 4: Looking forward: Our roles as researchers in Digital Humanities

We will pay attention to the growing calls from a range of cross-disciplinary scholars who invite us to actively consider the impact of our methods on the future. We'll explore different notions of methodological responsibility and innovation, from the speculative (Benjamin, 2016b), to the caring (de la Bellacasa, 2011), to the adaptive and inductive (Markham & Buchanan, 2012). What happens when we place our research into its broader context and consider how our methods will shape the future of our discipline?

Methods Fellows Series | Give me five! Principles of Data Visualisation

Wed 23 Mar 2022 16:00 Finished

This course demystifies the principles of data visualisation and the practice of creating graphs in Python. It helps trainees understand and reflect on how good data visualisation can be achieved under “5 Principles”, and extends their use of Python from analysis to visualisation. It is aimed at students and staff who are interested in or use data visualisation in research or outreach, have basic Python knowledge, and hope to explore data visualisation in Python. It is delivered as a 4-hour workshop (on Zoom), plus c. 2 hours of self-paced preparation and post-class exercises and 1 hour of asynchronous Q&A, combining theory, case studies, peer interaction and practical work. We first introduce key concepts and problems in data visualisation, then move to case studies and group discussion on visualisation principles and how to visualise data better in practice; finally, following a demonstration, we use Python to visualise data and go through types of graphs.

Methods Fellows Series | Making Sense of Statistics: An Introduction to Understanding and Conducting Statistical Analyses

Thu 24 Feb 2022 14:00 Finished

Itamar Shatz - Methods Fellow CDH

This course will introduce participants to key concepts in statistical analyses, including statistical significance, effect sizes, and linear models. The goal is to give participants the basic tools that they need in order to understand the use of statistical methods by others and to use these methods effectively in their own research. We will focus on an intuitive and practical understanding of statistical analyses, rather than on the mathematical details underlying them. As such, the course will be accessible for those without a quantitative background, although it will help to have knowledge of basic descriptive statistics (e.g., mean and standard deviation).

The course will cover (approximately) the following topics:

Session 1: statistical significance and statistical tests (including hypothesis testing, p-values, statistical power, t-test, and chi-square test).
Session 2: effect sizes, correlation, confidence intervals, and outliers.
Session 3: linear regression (including simple/multiple regression, residuals, beta coefficients, and R-Squared).
Session 4: linear regression continued (including test statistics, standard errors, centering, interaction, categorical predictors, linear models, and assumption testing).
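As a small illustration of the regression topics listed above (simple regression, residuals and R-squared), the following sketch fits an ordinary least squares line using only the Python standard library. The data values and function names are hypothetical, invented for this example; the course itself may use different tools and data.

```python
from statistics import mean

# Hypothetical data: hours of study (x) and test score (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [52.0, 55.0, 61.0, 64.0, 68.0]


def simple_linear_regression(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y that is explained by the fitted line."""
    y_bar = mean(y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)      # unexplained variation
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)  # total variation
    return 1 - ss_res / ss_tot


intercept, slope = simple_linear_regression(x, y)
r2 = r_squared(x, y, intercept, slope)
print(f"score = {intercept:.1f} + {slope:.1f} * hours, R-squared = {r2:.3f}")
```

The slope is read as the expected change in the outcome for a one-unit change in the predictor, and R-squared as the proportion of variation in the outcome accounted for by the line.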

Methods Fellows Series | Medieval Logic and Computational Methods

Tue 29 Mar 2022 10:30 Finished

This course looks at how modern computational techniques in logic can be used to approach questions in the history of logic, while also reflecting on the differences and similarities between historical and modern approaches to logic.

Historically, the course will focus on the approaches of two authors, Ibn Sina (11th century) and John Buridan (14th century), to modal logic, the branch of logic that deals with possibility, necessity, and contingency. Using these two authors and their discussions of logic as a starting point, we will look at how their logical systems can be represented and formalised using contemporary computational methods, and reflect on the similarities and differences between historical approaches to analysing validity and their relationship to modern notions of algorithms.

The overarching aim of the course is to develop the framework that allows us to computationally show that Buridan and Ibn Sina are working with the same modal logic under two different presentations.

Methods Fellows Series | Remote Capture: On-Site Archival Photography

Mon 16 May 2022 11:00 Finished

This course will be of interest to academics at all levels (including PhD students) who travel to remote locations (including small libraries worldwide) to access their primary material (often pamphlets and hand-written ephemera) which they are interested in digitising not only for their own scholarly appraisal, but also as a means of enabling access to the wider academic community. We will go step-by-step through preparation of materials, cataloguing systems, rigs and illumination, tethered photography using Lightroom, smartphone lenses and Halide, and packaging and checksums. We will also be discussing theoretical and ethical questions around decolonisation, reparation, and handling of Black and Indigenous heritage.
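The checksum step mentioned above can be sketched as follows: record a checksum for every captured file before packaging, then recompute after transfer to confirm nothing changed. This is a minimal sketch using Python's standard `hashlib` module; the function names, the file name and the choice of SHA-256 are assumptions for illustration, not the workflow taught in the course.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large image files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file name in the folder to its SHA-256 checksum."""
    return {
        p.name: sha256_of_file(p)
        for p in sorted(Path(folder).iterdir())
        if p.is_file()
    }


def verify(folder, manifest):
    """Return names of files whose current checksum differs from the manifest."""
    current = build_manifest(folder)
    return [name for name, checksum in manifest.items()
            if current.get(name) != checksum]


# Demonstration with a throwaway folder standing in for a day's captures.
with tempfile.TemporaryDirectory() as folder:
    capture = Path(folder) / "page_001.raw"  # hypothetical file name
    capture.write_bytes(b"image bytes would go here")
    manifest = build_manifest(folder)
    unchanged = verify(folder, manifest)     # no mismatches expected
    capture.write_bytes(b"corrupted in transit")
    changed = verify(folder, manifest)       # the altered file is flagged
```

In practice the manifest would be saved alongside the images (for example as a text file) so that anyone receiving the package can rerun the verification.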

Methods Fellows Series | Social Network Analysis

Tue 8 Mar 2022 14:00 Finished

Thomas Cowhitt, Methods Fellow - Cambridge Digital Humanities

This Methods Fellow's Workshop Series event will introduce users to social network analysis in R. Participants will be asked to generate their own relational dataset. We will then use several R packages to visualize and interpret relational data. By the conclusion of this course, users will be able to construct a relational dataset, load and clean this dataset in R, and generate static network diagrams and reports on descriptive network statistics.

Methods Fellows Series | Visualising Data Clearly

Wed 4 May 2022 14:00 Finished

If you've ever collected some data but weren't sure how to go about visualising it in a way that could help you uncover new insights, or if you've struggled to present data in a way that helped others understand your findings, this course is intended for you.

We'll talk about how to select the right visualisation for your data, discuss the pros and cons of different approaches, and get hands-on experience displaying information in clear and compelling ways. We'll also discuss broader issues surrounding visualisation science, such as common ways that visualisations are misinterpreted and how to avoid them, and controversies around what counts as best practice in visual communication.

In addition to the weekly online sessions, participants are expected to spend around two hours per week applying the skills learnt to gain greater fluency and enable us to 'workshop' each other's visualisations.

Your participation will also benefit if you have the chance to take our "Give me 5! Principles of Data Visualisation", which is scheduled for 23rd & 30th March. However, attending this workshop is not a prerequisite, so please do not be deterred if you miss the dates.

Methods Fellow Workshop: Audible knowledge: soundscapes, podcasts and digital audio scholarship

Wed 9 Jun 2021 11:00 Finished

Dr Peter McMurray (CDH Methods Fellow)

With the rise of web-based scholarship and affordable digital audio equipment, artists and researchers are increasingly turning to audio formats as a way to share their work with a larger audience and to cultivate new forms of knowledge rooted in listening. This workshop will offer an introduction to digital audio recording and editing (using Reaper, a digital audio workstation which can be downloaded/used for free on an extended trial basis). We will focus particularly on the editing choices for soundscape composition and podcasting, and participants will have the opportunity to produce a short audio piece over the course of the workshop.

Methods Fellow Workshop: Digital Interreligious Encounters in Urban Contexts (DIEUX)

Mon 22 Feb 2021 14:00 Finished

Applications for this workshop have now closed.

As religious services and communities have shifted online so too have scholars of religion. But at what cost? These sessions raise some of the epistemological and ethical issues of doing fieldwork in a digital environment from an inclusive anthropological perspective with a close-up on a particular case study in each session.

The first session considers conducting virtual ethnography, what is gained and what is lost, with a focus on ethnography with Orthodox Jewish populations. The second session assesses digital surveys of religious communities and their attitudes, e.g. what the 'bean-counters' might miss (and strategies to avoid this). Finally, the third session problematizes the ethical tensions in online studies of community media, with a particular focus on French Muslim media, which is already heavily surveilled.

The sessions are intended to develop researcher knowledge and explore cross-cutting issues that concern a broad spectrum of humanities and social science-based scholarship, serving as:

a forum for the critical discussion of digital methods and epistemologies,

a place to learn more about specific case studies particularly in the UK and France, and

an assembly of early research minds in the throes of a related or relevant project themselves who wish to share and learn from one another.

Methods Fellow Workshop: Exploring (literary) texts with corpus linguistics tools

Mon 1 Mar 2021 14:00 Finished

Applications for this workshop have now closed.

The corpus linguistic approach to language is based on collections of electronic texts. It uses software to search for and quantify various linguistic phenomena that make up patterns, which it then compares within and across texts based on their frequency. Corpus stylistics applies tools and methods from corpus linguistics to stylistic research. It mainly focuses on literary texts, whether individual texts or corpora. Corpora here are usually principled collections of texts, for example a collection of texts by one author, or texts from a specific period. Corpus stylistics focuses both on more general patterns and meanings that are observable across corpora and on patterns and meanings in one individual text. In terms of the quantitative approaches it employs, corpus stylistics is in many ways similar to work that is referred to as ‘distant reading’ and also ‘cultural analytics’. These approaches emphasise the gains that we get from looking at texts from a “distance”, i.e., in large quantities. For corpus stylistics, it is the relationship between quantitative and qualitative that is central. Therefore, research in corpus stylistics often deals with much smaller, “cleaner” data sets, so that the qualitative step in the analysis is more manageable.

This workshop aims to introduce the basic corpus linguistic techniques and methods for working with literary and other texts. It aims:

To provide an introduction to corpus linguistics in relation to digital humanities approaches;
To develop a critical understanding of how the representativeness of data used in quantitative research may influence results;
To critically examine the relationship between quantitative and qualitative textual analyses;
To provide a practical toolkit for computational textual analysis.
