
Bioinformatics for T-Cell immunology



This site is a repository for the group projects given during the course Bioinformatics for T-Cell immunology, 11-15/07/2022, at EMBL-EBI, Cambridge, UK.

The projects use publicly available and realistic data sets of T cells to perform common tasks in the analysis of single-cell-RNA-seq data: integration, clustering and differential gene expression. The projects focus on demonstrating key methodological aspects of single-cell data analysis rather than on addressing a particular biological question.


This repository hosts three standalone projects:

  1. Integration of single-cell data from patients developing arthritis irAE under ICI

  2. Fine-grained clustering of single-cell data of melanoma immune/stroma cells

  3. Differential gene expression of stimulated CD4+ T single-cell data with single-cell and pseudobulk methods

The course material for these projects can be found in the following github repository (under the folder projects):

Download the github repository by typing in the terminal: git clone

or by clicking the Download ZIP button and then decompressing the downloaded folder.

The markdown text file under the folder projects explains the directory structure. The project notebooks are under the folder reports. The conda environment yaml file at projects/workflow/envs/tools.yaml lists the software packages and respective versions required to reproduce the project notebooks. These can be installed with conda (or mamba) by running, from the root of the projects folder: conda env create -f workflow/envs/tools.yaml (you may need to add the line name: course at the top of the yaml file).
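The installation step above can be sketched as a small shell script. This is only an illustration under stated assumptions: the helper name ensure_env_name is hypothetical, and the environment name course is taken from the note above. The helper adds a name line to the yaml file only when the file does not already have one; the conda commands are left as comments so they can be run by hand from the root of the projects folder.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: prepend "name: <env>" to a conda yaml file,
# but only if the file has no top-level "name:" key yet.
ensure_env_name() {
  local yaml_file="$1" env_name="$2"
  if ! grep -q '^name:' "$yaml_file"; then
    local tmp
    tmp="$(mktemp)"
    # Write the name line first, then the original file content.
    { printf 'name: %s\n' "$env_name"; cat "$yaml_file"; } > "$tmp"
    # Copy back in place so the original file permissions are kept.
    cat "$tmp" > "$yaml_file"
    rm -f "$tmp"
  fi
}

# Intended use, from the root of the projects folder:
#   ensure_env_name workflow/envs/tools.yaml course
#   conda env create -f workflow/envs/tools.yaml
#   conda activate course
```

Calling the helper a second time leaves the file unchanged, so it is safe to re-run before each attempt to create the environment.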


Each group picks just one of the projects.

The timeline for one project is outlined below:

  1. 30min for project introduction on day 2

  2. 1.5h for group project work on day 3

  3. 1.5h for group project work on day 4

  4. 1.5h for group project work and wrap-up on day 5

  5. 1h for the group presentation (all groups) on day 5

Target audience

Scientists who want to learn key concepts in the analysis of single-cell-RNA-seq data, such as integration, clustering and differential gene expression.


The course projects are delivered as R markdown notebooks, which can be reproduced with basic knowledge of the R programming language. A few lines of Python are also called directly from R using reticulate. Participants may benefit from intermediate knowledge of R, and from familiarity with Seurat and SingleCellExperiment objects and functionality, to explore some analyses in more depth.

Project lead

António Sousa (ENLIGHT-TEN+ PhD student at the Medical Bioinformatics Centre, TBC, University of Turku & Åbo Akademi)



All the data used in each project notebook was made public elsewhere by the respective authors and is properly referenced in each project (links are provided in each project notebook). The data and tools chosen to address the topic(s) of each project notebook reflect only my personal experience and knowledge, and were chosen to highlight particular aspects that I consider important. The results generated and explored within each project notebook serve only to give a brief introduction to the topics addressed in each project; they do not aim, at any point, to reproduce or question either the approaches taken or the main findings published along with the data sets used herein.


This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 955321.


This work is licensed under a Creative Commons Attribution 4.0 International License.
