This single-cell RNA-seq integration task describes an example of an integration analysis involving different biological conditions: resting (named CTRL
) versus interferon stimulated (named STIM
) human peripheral blood mononuclear cells (PBMCs). The data set with cells coming from both biological conditions was retrieved from the SeuratData
(v.0.2.2.9001):
ifnb (resting and ifnb stimulated data sets):
ifnb
SeuratData
data set (v.3.1.0): IFNB-Stimulated and Control PBMCs
no. of cells: 13,999 cells (6,548 in CTRL
and 7,451 in STIM
)
The identity of the data set - CTRL
or STIM
- for every cell was saved in the Seurat
meta.data
column variable stim
. The ground-truth cell identities were also provided in the column variable seurat_annotations
but avoid checking them until the end of this notebook to make these analyses more interesting.
The analyses performed in this notebook rely in the Seurat
R package (v.5.1.0).
Import the main packages used in this notebook: Seurat
(v.5.1.0), SeuratWrappers
(v.0.3.2 - integration wrappers for Seurat), dplyr
(v.1.1.4 - wrangling data), patchwork
(v.1.2.0 - visualization), scIntegrationMetrics
(v 1.1 - compute LISI integration metrics).
## Import packages
library("dplyr") # data wrangling
library("Seurat") # scRNA-seq analysis
library("patchwork") # viz
library("SeuratWrappers") # integration wrappers
library("scIntegrationMetrics") # compute LISI integration metrics
Create output directories to save intermediate results, figures, tables and R objects.
## Output directories
res.dir <- file.path("../results", "ifnb_stimulated_task", c("plots", "tables", "objects"))
for (folder in res.dir) if (!dir.exists(folder)) dir.create(path = folder, recursive = TRUE)
(5 min)
AIM: Import and explore the Seurat object data.
Import the ifnb stimulated Seurat
R object ifnb.rds
located in the folder data
.
# Import data
data.dir <- "../data"
seu <- readRDS(file = file.path(data.dir, "ifnb.rds"))
You may want to down sample this data set depending on the amount of RAM memory you have. The ifnb
data set has 13,999 cells.
## Downsample data set
downsample <- TRUE # replace to FALSE in case you don't want to down sample
prop.down <- 0.2 # proportion of cells to down sample per batch: 20% of the cells
if (downsample) {
no.cells.batch <- ceiling(table(seu$stim) * 0.2) # CTRL = 1310 and STIM = 1491
cell.idx.batch <- split(x = colnames(seu), f = seu$stim) # split into a list the cell names per batch
cell.idx.batch.down <- lapply(X = setNames(names(cell.idx.batch), names(cell.idx.batch)), FUN = function(x) {
set.seed(123)
sample(x = cell.idx.batch[[x]], size = no.cells.batch[[x]], replace = FALSE)
}) # downsample each batch cell names
cell.idx.downsample <- do.call(c, cell.idx.batch.down) # join cell name labels from the two batches into one character vector
seu <- subset(seu, cells = cell.idx.downsample)
}
gc()
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 3442951 183.9 6551764 350 6366457 340.1
## Vcells 11924832 91.0 44423802 339 45869572 350.0
cat("No. of cells downsampled for `CTRL` was:", table(seu$stim)[1], "\n")
## No. of cells downsampled for `CTRL` was: 1310
cat("No. of cells downsampled for `STIM` was:", table(seu$stim)[2], "\n")
## No. of cells downsampled for `STIM` was: 1491
Explore quickly the Seurat
seu
object.
## Explore Seurat object
# Print Seurat object
seu
## An object of class Seurat
## 14053 features across 2801 samples within 1 assay
## Active assay: RNA (14053 features, 0 variable features)
## 2 layers present: counts, data
# Structure
str(seu)
## Formal class 'Seurat' [package "SeuratObject"] with 13 slots
## ..@ assays :List of 1
## .. ..$ RNA:Formal class 'Assay5' [package "SeuratObject"] with 8 slots
## .. .. .. ..@ layers :List of 2
## .. .. .. .. ..$ counts:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
## .. .. .. .. .. .. ..@ i : int [1:1969604] 20 27 37 64 65 83 87 131 139 175 ...
## .. .. .. .. .. .. ..@ p : int [1:2802] 0 877 1439 2190 2961 3541 4124 4743 5301 5833 ...
## .. .. .. .. .. .. ..@ Dim : int [1:2] 14053 2801
## .. .. .. .. .. .. ..@ Dimnames:List of 2
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. ..@ x : num [1:1969604] 1 1 1 1 1 2 1 1 1 1 ...
## .. .. .. .. .. .. ..@ factors : list()
## .. .. .. .. ..$ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
## .. .. .. .. .. .. ..@ i : int [1:1969604] 20 27 37 64 65 83 87 131 139 175 ...
## .. .. .. .. .. .. ..@ p : int [1:2802] 0 877 1439 2190 2961 3541 4124 4743 5301 5833 ...
## .. .. .. .. .. .. ..@ Dim : int [1:2] 14053 2801
## .. .. .. .. .. .. ..@ Dimnames:List of 2
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. ..@ x : num [1:1969604] 1 1 1 1 1 2 1 1 1 1 ...
## .. .. .. .. .. .. ..@ factors : list()
## .. .. .. ..@ cells :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
## .. .. .. .. .. ..@ .Data: logi [1:2801, 1:2] TRUE TRUE TRUE TRUE TRUE TRUE ...
## .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. .. .. .. .. ..$ : chr [1:2801] "AAACATACATTTCC.1" "AAACGCTGTTCCAT.1" "AAACGGCTGGTCAT.1" "AAACTTGACTGGAT.1" ...
## .. .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. .. .. ..$ dim : int [1:2] 2801 2
## .. .. .. .. .. ..$ dimnames:List of 2
## .. .. .. .. .. .. ..$ : chr [1:2801] "AAACATACATTTCC.1" "AAACGCTGTTCCAT.1" "AAACGGCTGGTCAT.1" "AAACTTGACTGGAT.1" ...
## .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. ..@ features :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
## .. .. .. .. .. ..@ .Data: logi [1:14053, 1:2] TRUE TRUE TRUE TRUE TRUE TRUE ...
## .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. .. .. .. .. ..$ : chr [1:14053] "AL627309.1" "RP11-206L10.2" "LINC00115" "NOC2L" ...
## .. .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. .. .. ..$ dim : int [1:2] 14053 2
## .. .. .. .. .. ..$ dimnames:List of 2
## .. .. .. .. .. .. ..$ : chr [1:14053] "AL627309.1" "RP11-206L10.2" "LINC00115" "NOC2L" ...
## .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. ..@ default : int 1
## .. .. .. ..@ assay.orig: chr(0)
## .. .. .. ..@ meta.data :'data.frame': 14053 obs. of 0 variables
## .. .. .. ..@ misc : list()
## .. .. .. ..@ key : chr "rna_"
## ..@ meta.data :'data.frame': 2801 obs. of 5 variables:
## .. ..$ orig.ident : chr [1:2801] "IMMUNE_CTRL" "IMMUNE_CTRL" "IMMUNE_CTRL" "IMMUNE_CTRL" ...
## .. ..$ nCount_RNA : num [1:2801] 3017 1269 1484 2185 1520 ...
## .. ..$ nFeature_RNA : int [1:2801] 877 562 751 771 580 583 619 558 532 666 ...
## .. ..$ stim : chr [1:2801] "CTRL" "CTRL" "CTRL" "CTRL" ...
## .. ..$ seurat_annotations: Factor w/ 13 levels "CD14 Mono","CD4 Naive T",..: 1 3 7 4 3 1 6 1 2 4 ...
## ..@ active.assay: chr "RNA"
## ..@ active.ident: Factor w/ 2 levels "IMMUNE_CTRL",..: 1 1 1 1 1 1 1 1 1 1 ...
## .. ..- attr(*, "names")= chr [1:2801] "AAACATACATTTCC.1" "AAACGCTGTTCCAT.1" "AAACGGCTGGTCAT.1" "AAACTTGACTGGAT.1" ...
## ..@ graphs : list()
## ..@ neighbors : list()
## ..@ reductions : list()
## ..@ images : list()
## ..@ project.name: chr "ifnb"
## ..@ misc : list()
## ..@ version :Classes 'package_version', 'numeric_version' hidden list of 1
## .. ..$ : int [1:3] 5 0 2
## ..@ commands : list()
## ..@ tools : list()
# Check meta.data
head(seu@meta.data)
## orig.ident nCount_RNA nFeature_RNA stim seurat_annotations
## AAACATACATTTCC.1 IMMUNE_CTRL 3017 877 CTRL CD14 Mono
## AAACGCTGTTCCAT.1 IMMUNE_CTRL 1269 562 CTRL CD4 Memory T
## AAACGGCTGGTCAT.1 IMMUNE_CTRL 1484 751 CTRL T activated
## AAACTTGACTGGAT.1 IMMUNE_CTRL 2185 771 CTRL CD16 Mono
## AAACTTGAGGAAAT.1 IMMUNE_CTRL 1520 580 CTRL CD4 Memory T
## AAAGACGAAACAGA.1 IMMUNE_CTRL 1667 583 CTRL CD14 Mono
# Check how many cells per data set
table(seu$stim)
##
## CTRL STIM
## 1310 1491
# Check no. of genes
nrow(seu)
## [1] 14053
# Check no. of cells
ncol(seu)
## [1] 2801
(7 min)
AIM: See how much the two data sets overlap each other in the low dimensional reductions.
Run the standard Seurat
upstream workflow to jointly compute a PCA and UMAP for the datasets:
NormalizeData()
: log1p-normalization with a scaling factor of 10K
FindVariableFeatures()
: identification of 2K HVG
ScaleData()
: standardization of the 2K HVG
RunPCA()
: computation of a PCA with the 2K HVG standardized
RunUMAP()
: computation of a UMAP using the first dims
of the previously computed PCA
## Joint analysis
# Standard Seurat upstream workflow
seu <- NormalizeData(seu)
seu <- FindVariableFeatures(seu)
seu <- ScaleData(seu)
seu <- RunPCA(seu)
seu <- RunUMAP(seu, dims = 1:30, reduction = "pca", reduction.name = "umap.unintegrated")
Plot the PCA and UMAP side-by-side below.
## Plot jointly dimreds
pca.unint <- DimPlot(seu, reduction = "pca", group.by = "stim")
umap.unint <- DimPlot(seu, reduction = "umap.unintegrated", group.by = "stim")
pca.unint + umap.unint
(5 min)
AIM: Check if cells from different datasets share well-known cell-specific markers.
Plot below some cell-specific PBMC cell type markers. Feel free to add other genes you might be interested in checking.
## Joint celltype markers
# List of PBMC cell markers
markers.plot <- list(
# "pbmc" = c("CD3D", "CREM", "HSPH1", "SELL", "GIMAP5", "CACYBP", "GNLY", "NKG7", "CCL5",
# "CD8A", "MS4A1", "CD79A", "MIR155HG", "NME1", "FCGR3A", "VMO1", "CCL2", "S100A9",
# "HLA-DQA1", "GPR183", "PPBP", "GNG11", "HBA2", "HBB", "TSPAN13", "IL3RA", "IGJ",
# "PRSS57"),
"pbmc" = c("CD3D", "NKG7", "CD8A", "MS4A1", "CD79A", "FCGR3A")
)
# Plot
pbmc.markers.unint.plot <- FeaturePlot(seu, features = markers.plot$pbmc, split.by = "stim",
max.cutoff = 3, cols = c("grey", "red"),
reduction = "umap.unintegrated", ncol = 4,
pt.size = 0.1)
## Plot jointly celltype markers
# Print
pbmc.markers.unint.plot
(15 min)
AIM: Check the number of differentially expressed genes for dataset-specific clusters shared between datasets.
Split the Seurat
object into a list of two Seurat
objects (one per dataset) and run the standard Seurat
workflow for each. After calculating the PCA, run FindNeighbors()
and FindClusters()
sequentially to perform graph-based clustering for each dataset, in order to determine the dataset-specific cluster markers.
## Independent sample analysis
# Split Seurat object into two batch on 'stim' label identity
seu.list <- SplitObject(object = seu, split.by = "stim")
# Standard Seurat upstream workflow
seu.list <- lapply(X = seu.list, FUN = function(x) {
x <- NormalizeData(x)
x <- FindVariableFeatures(x)
x <- ScaleData(x)
x <- RunPCA(x)
x <- FindNeighbors(x, dims = 1:15, reduction = "pca")
x <- FindClusters(x, resolution = 0.8, cluster.name = "unintegrated_clusters")
x <- RunUMAP(x, dims = 1:15, reduction = "pca", reduction.name = "umap.unintegrated")
})
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 1310
## Number of edges: 48026
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.8103
## Number of communities: 10
## Elapsed time: 0 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 1491
## Number of edges: 57060
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.8153
## Number of communities: 8
## Elapsed time: 0 seconds
Plot UMAPs for both datasets highlighting the Seurat
clusters found for each.
## Plot independent sample analysis clusters
umap.ind.samp.unint <- lapply(X = seu.list, FUN = function(x) {
DimPlot(x, reduction = "umap.unintegrated", group.by = "unintegrated_clusters", pt.size = 0.1, label = TRUE)
})
umap.ind.samp.unint$CTRL + umap.ind.samp.unint$STIM
Compute the differentially expressed genes for every cluster in every dataset and retrieve only the upregulated genes for every cluster. Then pick the top 50 upregulated genes per cluster based on log2 fold-change, among those you were statistically significant, i.e., FDR<0.05, and calculate the intersection of cluster genes between datasets.
## Independent sample analysis: DGE
# Differential gene expression analysis per cluster
dge.markers.unint <- lapply(X = seu.list, FUN = function(x) {
FindAllMarkers(object = x, assay = "RNA", slot = "data",
logfc.threshold = 0.25, min.pct = 0.25,
min.cells.feature = 10, only.pos = TRUE)
})
# Pick the top50 upregulated genes per cluster based on log2FC
top50.up.cluster <- lapply(X = dge.markers.unint, FUN = function(x) {
x %>%
filter(p_val_adj<0.05) %>%
group_by(cluster) %>%
arrange(desc(avg_log2FC)) %>%
slice_head(n=50) %>%
split(., .$cluster)
})
# Check intersection of top50 marker genes between clusters across batches
shared.genes <- list()
for (i in names(top50.up.cluster$CTRL)) {
for (ii in names(top50.up.cluster$STIM)) {
shared.genes[[paste0("CTRL", i)]][[paste0("STIM", ii)]] <- intersect(top50.up.cluster$CTRL[[i]]$gene,
top50.up.cluster$STIM[[ii]]$gene)
}
}
# Table with number of genes shared between CTRL vs STIM clusters for the top50 upregulated genes per cluster
shared.genes.table <- as.data.frame(
lapply(X = shared.genes, FUN = function(x) {
unlist(lapply(x, length))
})
)
Print the confusion matrix of cluster markers shared between datasets.
## Print table
knitr::kable(shared.genes.table)
CTRL0 | CTRL1 | CTRL2 | CTRL3 | CTRL4 | CTRL5 | CTRL6 | CTRL7 | CTRL8 | CTRL9 | |
---|---|---|---|---|---|---|---|---|---|---|
STIM0 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 10 |
STIM1 | 0 | 27 | 7 | 2 | 0 | 6 | 1 | 0 | 1 | 0 |
STIM2 | 0 | 14 | 30 | 6 | 0 | 5 | 3 | 0 | 0 | 0 |
STIM3 | 0 | 1 | 0 | 0 | 0 | 21 | 2 | 7 | 7 | 0 |
STIM4 | 0 | 3 | 8 | 30 | 0 | 2 | 1 | 2 | 0 | 0 |
STIM5 | 0 | 0 | 0 | 0 | 25 | 1 | 0 | 1 | 1 | 0 |
STIM6 | 0 | 3 | 2 | 0 | 0 | 0 | 21 | 1 | 0 | 0 |
STIM7 | 1 | 0 | 0 | 1 | 1 | 10 | 0 | 22 | 4 | 0 |
Plot the previous table as a heatmap.
## Plot independent sample analysis clusters
ComplexHeatmap::Heatmap(matrix = as.matrix(shared.genes.table), name = "Shared gene no.",
cluster_columns = FALSE, cluster_rows = FALSE)
(10 min)
AIM: Check if datasets share cell types by predicting cell type labels for both datasets.
This exercise requires to run CellTypist. CellTypist
is a python package that can be run using python, the command-line or online through their website. For convenience, run CellTypist
online.
First export the Seurat
R object as anndata
h5ad
python-compatible object with the function zellkonverter::writeH5AD()
by running the R code chunk below. This will create a file named ifnb_celltypist.h5ad
under the directory: results/cross_tissue_task/objects
. Next, go to the CellTypist
website: https://www.celltypist.org/. Put your own e-mail address. Select the model Immune_All_Low.pkl
which comprises a model for annotation of immune cells. Allow majority voting. Finally, upload the file ifnb_celltypist.h5ad
.
## Automatic cell annotation
file.name <- file.path(res.dir[3], "ifnb_celltypist.h5ad")
cat("Exporting Seurat object as '.h5ad' format to:", gsub("\\../", "", file.name), "\n")
## Exporting Seurat object as '.h5ad' format to: results/ifnb_stimulated_task/objects/ifnb_celltypist.h5ad
zellkonverter::writeH5AD(sce = as.SingleCellExperiment(seu), file = file.name, X_name = "logcounts")
You should receive an e-mail with a download link with the result. Download the result - predictions.tar.gz
- and put the result into the directory: results/cross_tissue_task/tables
. In alternative, you can substitute the url below (because it’s only valid for 7 days) by copying and pasting the link you received in your own e-mail and replace the variable FALSE
by TRUE
for the variable use.url
.
Plot the predicted labels for both data sets.
## Plot labels from CellTypist
# Download predictions
use.url <- FALSE # if you wanna use the url, replace the url by the url you received in your e-mail and replace FALSE by TRUE
if (use.url) { # download: url only valid for 7 days
url <- "https://celltypist.cog.sanger.ac.uk/uploads/17baab62-d10e-4de6-a4ed-b71073518329/predictions.tar.gz?AWSAccessKeyId=C068AUIY7F6SNEJUTEPA&Signature=Ss%2FNyQOcte7QS06j8tnl2s2YDVQ%3D&Expires=1720194680"
download.file(url = url, destfile = file.path(res.dir[2], "predictions.tar.gz"))
}
# Decompress the file with predictions
untar(tarfile = file.path(res.dir[2], "predictions.tar.gz"), exdir = res.dir[2])
# Add predictions to Seurat object
seu@meta.data[,c("predicted_labels", "over_clustering", "majority_voting")] <- read.table(file = file.path(res.dir[2],
"predicted_labels.csv"),
header = TRUE, sep = ",", row.names = 1)
# Plot predictions
DimPlot(object = seu, reduction = "umap.unintegrated", group.by = "majority_voting",
split.by = "stim", pt.size = 0.1, label = TRUE)
(10 min)
AIM: Compare different integration methods.
First, split
the layers of data by batch before performing integration. Then, apply the standard Seurat
workflow. Finally, call the function IntegrateLayers()
to integrate the datasets. In this function you can specify the method you want to run by providing the integration method function.
Seurat
provides three methods: CCA (CCAIntegration
), RPCA (RPCAIntegration
) and Harmony (HarmonyIntegration
). In addition, other methods can be called by using functions from SeuratWrappers
such as: FastMNN (FastMNNIntegration
) or scVI (scVIIntegration
) among others. Harmony (from the harmony
R package), FastMNN (from the batchelor
R package) and scVI (python package installed with conda) need to be installed independently from Seurat
.
Run the R chunk code below to run the integration methods: CCA, RPCA, Harmony and FastMNN (you can try to run scVI if you’ve it installed in your system). Join the layers back after integration to project the integrated data onto UMAP. The UMAP highlights the stim
and ground-truth seurat_annotations
labels.
## Perform integration
# Split layers for integration
seu[["RNA"]] <- split(x = seu[["RNA"]], f = seu$stim)
# Standard workflow
seu <- NormalizeData(seu)
seu <- FindVariableFeatures(seu)
seu <- ScaleData(seu)
seu <- RunPCA(seu)
# Integrate layers
int.methods <- c("CCA" = "CCAIntegration", "RPCA" = "RPCAIntegration",
"Harmony" = "HarmonyIntegration", "FastMNN" = "FastMNNIntegration",
"scVI" = "scVIIntegration")
for (m in names(int.methods)[1:4]) {
cat("\nRunning integration method", m, "...\n")
int.dimred <- paste0("integrated.", m)
umap.dimred <- paste0("umap.", m)
# Integration
if (m=="scVI") {
seu <- IntegrateLayers(object = seu, method = get(eval(substitute(int.methods[m]))),
orig.reduction = "pca",
new.reduction = int.dimred,
conda_env = "~/miniconda3/envs/scvi-env", # substitute this by your installation
verbose = TRUE)
} else {
seu <- IntegrateLayers(object = seu, method = get(eval(substitute(int.methods[m]))),
orig.reduction = "pca",
new.reduction = int.dimred,
verbose = TRUE)
}
}
##
## Running integration method CCA ...
##
## Running integration method RPCA ...
##
## Running integration method Harmony ...
##
## Running integration method FastMNN ...
# Re-join layers after integration
seu[["RNA"]] <- JoinLayers(seu[["RNA"]])
# Run UMAP for every integration method
int.umaps.plots <- list()
for (m in names(int.methods)[1:4]) {
cat("\nRunning UMAP for", m, "integrated result...\n")
int.dimred <- paste0("integrated.", m)
umap.dimred <- paste0("umap.", m)
seu <- RunUMAP(seu, dims = 1:30, reduction = int.dimred, reduction.name = umap.dimred)
int.umaps.plots[[m]] <- DimPlot(object = seu, reduction = umap.dimred, group.by = c("stim", "seurat_annotations"),
combine = FALSE, label.size = 2)
}
##
## Running UMAP for CCA integrated result...
##
## Running UMAP for RPCA integrated result...
##
## Running UMAP for Harmony integrated result...
##
## Running UMAP for FastMNN integrated result...
# Save Seurat object
saveRDS(object = seu, file = file.path(res.dir[3], "seu_integrated.rds"))
(15 min)
AIM: Assess integration qualitatively and quantitatively through dimensional reduction visualizations and LISI scores.
Plot the integrated embeddings below highlighting the stim
and ground-truth seurat_annotations
labels.
## Assess integration by printing the plots using the "batch" and "cell_type" (ground-truth) labels
wrap_plots(c(int.umaps.plots$CCA, int.umaps.plots$RPCA, int.umaps.plots$Harmony, int.umaps.plots$FastMNN),
ncol = 2, byrow = TRUE)
Run the code below to compute the i/cLISI scores for every integrated embedding with the function getIntegrationMetrics()
from the package scIntegrationMetrics
(read more about the meaning of these metrics here).
## Assess quantitatively integration with scIntegrationMetrics
# Calculate metrics
int.mthds.names <- paste0("integrated.", names(int.methods)[1:4])
names(int.mthds.names) <- int.mthds.names
metrics <- list()
for (m in int.mthds.names) {
key <- gsub("integrated.", "", m)
cat("Computing i/cLISI metrics for integration method:", gsub("integrated.", "", key), "\n")
metrics[[key]] <- getIntegrationMetrics(seu, meta.label = "seurat_annotations", meta.batch = "stim",
method.reduction = m, metrics = c("iLISI", "norm_iLISI",
#"CiLISI", "CiLISI_means",
"norm_cLISI", "norm_cLISI_means"))
}
## Computing i/cLISI metrics for integration method: CCA
## Computing i/cLISI metrics for integration method: RPCA
## Computing i/cLISI metrics for integration method: Harmony
## Computing i/cLISI metrics for integration method: FastMNN
# Join metrics
metrics <- as.data.frame(do.call(cbind, metrics))
Print the result below.
# Print table
knitr::kable(metrics)
CCA | RPCA | Harmony | FastMNN | |
---|---|---|---|---|
iLISI | 1.761276 | 1.805624 | 1.589445 | 1.329591 |
norm_iLISI | 0.7612756 | 0.8056235 | 0.5894454 | 0.329591 |
norm_cLISI | 0.9402503 | 0.9362085 | 0.9329568 | 0.9393513 |
norm_cLISI_means | 0.8956541 | 0.8914974 | 0.8859415 | 0.907111 |
## R packages and versions used in these analyses
sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] scIntegrationMetrics_1.1 SeuratWrappers_0.3.2 patchwork_1.2.0
## [4] Seurat_5.1.0 SeuratObject_5.0.2 sp_2.1-4
## [7] dplyr_1.1.4
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.4 spatstat.explore_3.2-7
## [3] reticulate_1.38.0 R.utils_2.12.3
## [5] tidyselect_1.2.1 htmlwidgets_1.6.4
## [7] BiocParallel_1.28.3 grid_4.1.0
## [9] Rtsne_0.17 ScaledMatrix_1.2.0
## [11] zellkonverter_1.4.0 munsell_0.5.1
## [13] codetools_0.2-18 ica_1.0-3
## [15] future_1.33.2 miniUI_0.1.1.1
## [17] batchelor_1.10.0 withr_3.0.0
## [19] spatstat.random_3.2-3 colorspace_2.1-0
## [21] progressr_0.14.0 filelock_1.0.3
## [23] Biobase_2.54.0 highr_0.11
## [25] knitr_1.47 rstudioapi_0.13
## [27] stats4_4.1.0 SingleCellExperiment_1.16.0
## [29] ROCR_1.0-11 tensor_1.5
## [31] listenv_0.9.1 MatrixGenerics_1.6.0
## [33] labeling_0.4.3 harmony_1.2.0
## [35] GenomeInfoDbData_1.2.7 polyclip_1.10-6
## [37] farver_2.1.2 basilisk_1.6.0
## [39] parallelly_1.37.1 vctrs_0.6.5
## [41] generics_0.1.3 xfun_0.45
## [43] R6_2.5.1 doParallel_1.0.17
## [45] GenomeInfoDb_1.30.1 clue_0.3-65
## [47] rsvd_1.0.5 DelayedArray_0.20.0
## [49] bitops_1.0-7 spatstat.utils_3.0-5
## [51] cachem_1.1.0 assertthat_0.2.1
## [53] promises_1.3.0 scales_1.3.0
## [55] gtable_0.3.5 beachmat_2.10.0
## [57] globals_0.16.3 goftest_1.2-3
## [59] klippy_0.0.0.9500 spam_2.10-0
## [61] rlang_1.1.4 GlobalOptions_0.1.2
## [63] splines_4.1.0 lazyeval_0.2.2
## [65] spatstat.geom_3.2-9 BiocManager_1.30.23
## [67] yaml_2.3.8 reshape2_1.4.4
## [69] abind_1.4-5 httpuv_1.6.15
## [71] tools_4.1.0 ggplot2_3.5.1
## [73] jquerylib_0.1.4 RColorBrewer_1.1-3
## [75] BiocGenerics_0.40.0 ggridges_0.5.6
## [77] Rcpp_1.0.12 plyr_1.8.9
## [79] sparseMatrixStats_1.6.0 zlibbioc_1.40.0
## [81] purrr_1.0.2 RCurl_1.98-1.14
## [83] basilisk.utils_1.6.0 deldir_2.0-4
## [85] pbapply_1.7-2 GetoptLong_1.0.5
## [87] cowplot_1.1.3 S4Vectors_0.32.4
## [89] zoo_1.8-12 SummarizedExperiment_1.24.0
## [91] ggrepel_0.9.5 cluster_2.1.2
## [93] magrittr_2.0.3 data.table_1.15.4
## [95] RSpectra_0.16-1 scattermore_1.2
## [97] ResidualMatrix_1.4.0 circlize_0.4.16
## [99] lmtest_0.9-40 RANN_2.6.1
## [101] fitdistrplus_1.1-11 matrixStats_1.1.0
## [103] mime_0.12 evaluate_0.24.0
## [105] xtable_1.8-4 RhpcBLASctl_0.23-42
## [107] fastDummies_1.7.3 IRanges_2.28.0
## [109] gridExtra_2.3 shape_1.4.6.1
## [111] compiler_4.1.0 tibble_3.2.1
## [113] KernSmooth_2.23-20 crayon_1.5.3
## [115] R.oo_1.26.0 htmltools_0.5.8.1
## [117] mgcv_1.8-35 later_1.3.2
## [119] tidyr_1.3.1 DBI_1.2.3
## [121] ComplexHeatmap_2.15.4 MASS_7.3-54
## [123] Matrix_1.6-5 permute_0.9-7
## [125] cli_3.6.3 R.methodsS3_1.8.2
## [127] parallel_4.1.0 dotCall64_1.1-1
## [129] igraph_2.0.3 GenomicRanges_1.46.1
## [131] pkgconfig_2.0.3 dir.expiry_1.2.0
## [133] scuttle_1.4.0 plotly_4.10.4
## [135] spatstat.sparse_3.1-0 foreach_1.5.2
## [137] bslib_0.7.0 XVector_0.34.0
## [139] stringr_1.5.1 digest_0.6.36
## [141] sctransform_0.4.1 RcppAnnoy_0.0.22
## [143] vegan_2.6-6.1 spatstat.data_3.1-2
## [145] rmarkdown_2.27 leiden_0.4.3.1
## [147] uwot_0.2.2 DelayedMatrixStats_1.16.0
## [149] shiny_1.8.1.1 rjson_0.2.21
## [151] lifecycle_1.0.4 nlme_3.1-152
## [153] jsonlite_1.8.8 BiocNeighbors_1.12.0
## [155] viridisLite_0.4.2 limma_3.50.3
## [157] fansi_1.0.6 pillar_1.9.0
## [159] lattice_0.20-44 fastmap_1.2.0
## [161] httr_1.4.7 survival_3.2-11
## [163] glue_1.7.0 remotes_2.5.0
## [165] png_0.1-8 iterators_1.0.14
## [167] presto_1.0.0 stringi_1.8.4
## [169] sass_0.4.9 RcppHNSW_0.6.0
## [171] BiocSingular_1.10.0 irlba_2.3.5.1
## [173] future.apply_1.11.2