This single-cell RNA-seq integration task is an example of cross-tissue integration: human peripheral blood mononuclear cells (PBMCs) with human pancreatic islets. Both data sets were retrieved from the SeuratData R package (v.0.2.2.9001):
PBMCs: pbmc3k SeuratData data set (v.3.1.4): 3k human PBMCs from 10X Genomics
no. of cells: 2,700
Pancreas: panc8 SeuratData data set (v.3.0.2): 8 human pancreas data sets across five technologies (only included the data set indrop1)
no. of cells (for indrop1): 1,937
The data set of origin of every cell (pbmc or pancreas) is stored in the batch column of the Seurat meta.data. The ground-truth cell identities are also provided in the cell_type column, but avoid checking them until the end of this notebook to keep the analyses more interesting.
The analyses performed in this notebook rely on the Seurat R package (v.5.1.0).
Import the main packages used in this notebook: Seurat (v.5.1.0), SeuratWrappers (v.0.3.2 - integration wrappers for Seurat), dplyr (v.1.1.4 - wrangling data), patchwork (v.1.2.0 - visualization), scIntegrationMetrics (v.1.1 - compute LISI integration metrics).
## Import packages
library("dplyr") # data wrangling
library("Seurat") # scRNA-seq analysis
library("patchwork") # viz
library("SeuratWrappers") # integration wrappers
library("scIntegrationMetrics") # compute LISI integration metrics
Create output directories to save intermediate results, figures, tables and R objects.
## Output directories
res.dir <- file.path("../results", "cross_tissue_task", c("plots", "tables", "objects"))
for (folder in res.dir) if (!dir.exists(folder)) dir.create(path = folder, recursive = TRUE)
(3 min)
AIM: Import and explore the Seurat object data.
Import the cross-tissue Seurat R object pbmc3k_panc8.rds located in the folder data.
# Import data
data.dir <- "../data"
seu <- readRDS(file = file.path(data.dir, "pbmc3k_panc8.rds"))
Quickly explore the Seurat object seu.
## Explore Seurat object
# Print Seurat object
seu
## An object of class Seurat
## 35686 features across 4637 samples within 1 assay
## Active assay: RNA (35686 features, 0 variable features)
## 2 layers present: data, counts
# Structure
str(seu)
## Formal class 'Seurat' [package "SeuratObject"] with 13 slots
## ..@ assays :List of 1
## .. ..$ RNA:Formal class 'Assay5' [package "SeuratObject"] with 8 slots
## .. .. .. ..@ layers :List of 2
## .. .. .. .. ..$ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
## .. .. .. .. .. .. ..@ i : int [1:6009421] 29 73 80 148 163 184 186 227 229 230 ...
## .. .. .. .. .. .. ..@ p : int [1:4638] 0 779 2131 3260 4220 4741 5522 6304 7094 7626 ...
## .. .. .. .. .. .. ..@ Dim : int [1:2] 35686 4637
## .. .. .. .. .. .. ..@ Dimnames:List of 2
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. ..@ x : num [1:6009421] 1 1 2 1 1 1 1 41 1 1 ...
## .. .. .. .. .. .. ..@ factors : list()
## .. .. .. .. ..$ counts:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
## .. .. .. .. .. .. ..@ i : int [1:6009421] 29 73 80 148 163 184 186 227 229 230 ...
## .. .. .. .. .. .. ..@ p : int [1:4638] 0 779 2131 3260 4220 4741 5522 6304 7094 7626 ...
## .. .. .. .. .. .. ..@ Dim : int [1:2] 35686 4637
## .. .. .. .. .. .. ..@ Dimnames:List of 2
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. .. ..$ : NULL
## .. .. .. .. .. .. ..@ x : num [1:6009421] 1 1 2 1 1 1 1 41 1 1 ...
## .. .. .. .. .. .. ..@ factors : list()
## .. .. .. ..@ cells :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
## .. .. .. .. .. ..@ .Data: logi [1:4637, 1:2] TRUE TRUE TRUE TRUE TRUE TRUE ...
## .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. .. .. .. .. ..$ : chr [1:4637] "pbmc_AAACATACAACCAC" "pbmc_AAACATTGAGCTAC" "pbmc_AAACATTGATCAGC" "pbmc_AAACCGTGCTTCCG" ...
## .. .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. .. .. ..$ dim : int [1:2] 4637 2
## .. .. .. .. .. ..$ dimnames:List of 2
## .. .. .. .. .. .. ..$ : chr [1:4637] "pbmc_AAACATACAACCAC" "pbmc_AAACATTGAGCTAC" "pbmc_AAACATTGATCAGC" "pbmc_AAACCGTGCTTCCG" ...
## .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. ..@ features :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
## .. .. .. .. .. ..@ .Data: logi [1:35686, 1:2] TRUE TRUE TRUE TRUE TRUE TRUE ...
## .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. .. .. .. .. ..$ : chr [1:35686] "AL627309.1" "AP006222.2" "RP11-206L10.2" "RP11-206L10.9" ...
## .. .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. .. .. ..$ dim : int [1:2] 35686 2
## .. .. .. .. .. ..$ dimnames:List of 2
## .. .. .. .. .. .. ..$ : chr [1:35686] "AL627309.1" "AP006222.2" "RP11-206L10.2" "RP11-206L10.9" ...
## .. .. .. .. .. .. ..$ : chr [1:2] "counts" "data"
## .. .. .. ..@ default : int 1
## .. .. .. ..@ assay.orig: chr(0)
## .. .. .. ..@ meta.data :'data.frame': 35686 obs. of 0 variables
## .. .. .. ..@ misc : list()
## .. .. .. ..@ key : chr "rna_"
## ..@ meta.data :'data.frame': 4637 obs. of 5 variables:
## .. ..$ orig.ident : chr [1:4637] "pbmc3k" "pbmc3k" "pbmc3k" "pbmc3k" ...
## .. ..$ nCount_RNA : num [1:4637] 2419 4903 3147 2639 980 ...
## .. ..$ nFeature_RNA: int [1:4637] 779 1352 1129 960 521 781 782 790 532 550 ...
## .. ..$ cell_type : chr [1:4637] "Memory CD4 T" "B" "Memory CD4 T" "CD14+ Mono" ...
## .. ..$ batch : chr [1:4637] "pbmc" "pbmc" "pbmc" "pbmc" ...
## ..@ active.assay: chr "RNA"
## ..@ active.ident: Factor w/ 2 levels "pbmc3k","indrop": 1 1 1 1 1 1 1 1 1 1 ...
## .. ..- attr(*, "names")= chr [1:4637] "pbmc_AAACATACAACCAC" "pbmc_AAACATTGAGCTAC" "pbmc_AAACATTGATCAGC" "pbmc_AAACCGTGCTTCCG" ...
## ..@ graphs : list()
## ..@ neighbors : list()
## ..@ reductions : list()
## ..@ images : list()
## ..@ project.name: chr "pbmc3k_panc8"
## ..@ misc : list()
## ..@ version :Classes 'package_version', 'numeric_version' hidden list of 1
## .. ..$ : int [1:3] 5 0 2
## ..@ commands : list()
## ..@ tools : list()
# Check meta.data
head(seu@meta.data)
## orig.ident nCount_RNA nFeature_RNA cell_type batch
## pbmc_AAACATACAACCAC pbmc3k 2419 779 Memory CD4 T pbmc
## pbmc_AAACATTGAGCTAC pbmc3k 4903 1352 B pbmc
## pbmc_AAACATTGATCAGC pbmc3k 3147 1129 Memory CD4 T pbmc
## pbmc_AAACCGTGCTTCCG pbmc3k 2639 960 CD14+ Mono pbmc
## pbmc_AAACCGTGTATGCG pbmc3k 980 521 NK pbmc
## pbmc_AAACGCACTGGTAC pbmc3k 2163 781 Memory CD4 T pbmc
# Check how many cells per data set
table(seu$batch)
##
## pancreas pbmc
## 1937 2700
# Check no. of genes
nrow(seu)
## [1] 35686
# Check no. of cells
ncol(seu)
## [1] 4637
(7 min)
AIM: See how much the two data sets overlap in the low-dimensional reductions.
Run the standard Seurat upstream workflow to jointly compute a PCA and UMAP for the datasets:
NormalizeData(): log1p-normalization with a scaling factor of 10K
FindVariableFeatures(): identification of the 2K highly variable genes (HVG)
ScaleData(): standardization of the 2K HVG
RunPCA(): computation of a PCA with the standardized 2K HVG
RunUMAP(): computation of a UMAP using the principal components selected with dims (here 1:30) from the previously computed PCA
## Joint analysis
# Standard Seurat upstream workflow
seu <- NormalizeData(seu)
seu <- FindVariableFeatures(seu)
seu <- ScaleData(seu)
seu <- RunPCA(seu)
seu <- RunUMAP(seu, dims = 1:30, reduction = "pca", reduction.name = "umap.unintegrated")
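To make the first steps of this workflow concrete, the sketch below reproduces log1p-normalization, standardization and PCA in base R on a toy count matrix. The matrix values and the object names (counts, norm, scaled, pca) are invented for illustration. It is a simplified view of what the Seurat functions do: it skips, for instance, the selection of highly variable genes and the clipping of scaled values that ScaleData() applies by default.

```r
# Toy count matrix: 4 genes (rows) x 3 cells (columns); values are made up
counts <- matrix(c(1, 0, 3, 6,
                   2, 2, 0, 4,
                   0, 5, 5, 0),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# Log-normalization: divide by the total counts per cell, multiply by the
# scaling factor (10,000) and apply log1p()
scale.factor <- 1e4
lib.size <- colSums(counts)
norm <- log1p(sweep(counts, 2, lib.size, "/") * scale.factor)

# Standardization: center and scale every gene (row) across cells
scaled <- t(scale(t(norm)))

# PCA on the standardized matrix, with cells as observations
pca <- prcomp(t(scaled), center = FALSE, scale. = FALSE)
dim(pca$x)  # cells x principal components
```

After normalization every cell has the same total on the original scale (expm1(norm) sums to 10,000 per cell), which is what makes cells with different sequencing depths comparable.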
Plot the PCA and UMAP side-by-side below.
## Plot jointly dimreds
pca.unint <- DimPlot(seu, reduction = "pca", group.by = "batch")
umap.unint <- DimPlot(seu, reduction = "umap.unintegrated", group.by = "batch")
pca.unint + umap.unint
(5 min)
AIM: Check if cells from different datasets share well-known cell-specific markers.
Plot some PBMC-specific and pancreas-specific cell type markers below. Feel free to add other genes you are interested in checking.
## Joint celltype markers
# List of PBMC and (some) pancreatic cell markers
markers.plot <- list(
# "pbmc" = c("CD3D", "CREM", "HSPH1", "SELL", "GIMAP5", "CACYBP", "GNLY", "NKG7", "CCL5",
# "CD8A", "MS4A1", "CD79A", "MIR155HG", "NME1", "FCGR3A", "VMO1", "CCL2", "S100A9",
# "HLA-DQA1", "GPR183", "PPBP", "GNG11", "HBA2", "HBB", "TSPAN13", "IL3RA", "IGJ",
# "PRSS57"),
"pbmc" = c("CD3D", "NKG7", "CD8A", "MS4A1", "CD79A", "FCGR3A"),
"pancreas" = c("REG1A", "PPY", "SST", "GHRL", "VWF", "SOX10")
)
# Plot
pbmc.markers.unint.plot <- FeaturePlot(seu, features = markers.plot$pbmc, split.by = "batch",
max.cutoff = 3, cols = c("grey", "red"),
reduction = "umap.unintegrated", ncol = 4, pt.size = 0.1)
pancreas.markers.unint.plot <- FeaturePlot(seu, features = markers.plot$pancreas, split.by = "batch",
max.cutoff = 3, cols = c("grey", "red"),
reduction = "umap.unintegrated", ncol = 4, pt.size = 0.1)
## Plot jointly celltype markers
# Print
pbmc.markers.unint.plot
pancreas.markers.unint.plot
(15 min)
AIM: Check the number of differentially expressed genes for dataset-specific clusters shared between datasets.
Split the Seurat object into a list of two Seurat objects (one per dataset) and run the standard Seurat workflow for each. After calculating the PCA, run FindNeighbors() and FindClusters() sequentially to perform graph-based clustering for each dataset, in order to determine the dataset-specific cluster markers.
## Independent sample analysis
# Split the Seurat object into two objects based on the 'batch' label
seu.list <- SplitObject(object = seu, split.by = "batch")
# Standard Seurat upstream workflow
seu.list <- lapply(X = seu.list, FUN = function(x) {
x <- NormalizeData(x)
x <- FindVariableFeatures(x)
x <- ScaleData(x)
x <- RunPCA(x)
x <- FindNeighbors(x, dims = 1:15, reduction = "pca")
x <- FindClusters(x, resolution = 0.8, cluster.name = "unintegrated_clusters")
x <- RunUMAP(x, dims = 1:15, reduction = "pca", reduction.name = "umap.unintegrated")
})
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 2700
## Number of edges: 108625
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.8180
## Number of communities: 9
## Elapsed time: 0 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 1937
## Number of edges: 65621
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.8659
## Number of communities: 12
## Elapsed time: 0 seconds
Plot UMAPs for both datasets highlighting the Seurat clusters found for each.
## Plot independent sample analysis clusters
umap.ind.samp.unint <- lapply(X = seu.list, FUN = function(x) {
DimPlot(x, reduction = "umap.unintegrated", group.by = "unintegrated_clusters", pt.size = 0.1, label = TRUE)
})
umap.ind.samp.unint$pbmc + umap.ind.samp.unint$pancreas
Compute the differentially expressed genes for every cluster in every dataset and keep only the upregulated genes. Then pick the top 50 upregulated genes per cluster based on log2 fold-change, among those that were statistically significant, i.e., adjusted p-value < 0.05 (the p_val_adj column, which Seurat computes with a Bonferroni correction), and calculate the intersection of cluster genes between datasets.
## Independent sample analysis: DGE
# Differential gene expression analysis per cluster
dge.markers.unint <- lapply(X = seu.list, FUN = function(x) {
FindAllMarkers(object = x, assay = "RNA", slot = "data",
logfc.threshold = 0.25, min.pct = 0.25,
min.cells.feature = 10, only.pos = TRUE)
})
# Pick the top50 upregulated genes per cluster based on log2FC
top50.up.cluster <- lapply(X = dge.markers.unint, FUN = function(x) {
x %>%
filter(p_val_adj<0.05) %>%
group_by(cluster) %>%
arrange(desc(avg_log2FC)) %>%
slice_head(n=50) %>%
split(., .$cluster)
})
# Check intersection of top50 marker genes between clusters across batches
shared.genes <- list()
for (i in names(top50.up.cluster$pbmc)) {
for (ii in names(top50.up.cluster$pancreas)) {
shared.genes[[paste0("pbmc", i)]][[paste0("pancreas", ii)]] <- intersect(top50.up.cluster$pbmc[[i]]$gene,
top50.up.cluster$pancreas[[ii]]$gene)
}
}
# Table with number of genes shared between pbmc vs pancreas clusters for the top50 upregulated genes per cluster
shared.genes.table <- as.data.frame(
lapply(X = shared.genes, FUN = function(x) {
unlist(lapply(x, length))
})
)
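The pairwise comparison above can also be written compactly with nested sapply() calls. The sketch below uses two invented lists of marker genes (markers.a and markers.b are hypothetical stand-ins for the top 50 genes per cluster, not real results) and counts the genes shared by every pair of clusters.

```r
# Hypothetical top marker genes per cluster (toy lists, not real results)
markers.a <- list("0" = c("CD3D", "IL7R", "LTB"),
                  "1" = c("MS4A1", "CD79A", "VIM"))
markers.b <- list("0" = c("INS", "IAPP", "VIM"),
                  "1" = c("GCG", "TTR"),
                  "2" = c("VIM", "LTB", "COL1A1"))

# Number of shared genes for every pair of clusters:
# rows = clusters of data set B, columns = clusters of data set A
shared.n <- sapply(markers.a, function(a) {
  sapply(markers.b, function(b) length(intersect(a, b)))
})
dimnames(shared.n) <- list(paste0("b", names(markers.b)),
                           paste0("a", names(markers.a)))
shared.n
```

The result is a plain integer matrix, so it can be passed directly to a table or heatmap function without converting nested lists first.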
Print the table with the number of cluster markers shared between datasets.
## Print table
knitr::kable(shared.genes.table)
| | pbmc0 | pbmc1 | pbmc2 | pbmc3 | pbmc4 | pbmc5 | pbmc6 | pbmc7 | pbmc8 |
|---|---|---|---|---|---|---|---|---|---|
| pancreas0 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 0 |
| pancreas1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| pancreas2 | 0 | 0 | 0 | 1 | 1 | 1 | 2 | 0 | 0 |
| pancreas3 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| pancreas4 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| pancreas5 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| pancreas6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| pancreas7 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| pancreas8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pancreas9 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 2 |
| pancreas10 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 |
| pancreas11 | 0 | 0 | 2 | 2 | 2 | 3 | 7 | 3 | 0 |
Plot the previous table as a heatmap.
## Plot shared cluster markers as a heatmap
ComplexHeatmap::Heatmap(matrix = as.matrix(shared.genes.table), name = "Shared gene no.",
cluster_columns = FALSE, cluster_rows = FALSE)
(10 min)
AIM: Check if datasets share cell types by predicting cell type labels for both datasets.
This exercise requires running CellTypist. CellTypist is a Python package that can be run from Python, from the command line or online through its website. For convenience, run CellTypist online.
First export the Seurat R object as an anndata h5ad Python-compatible object with the function zellkonverter::writeH5AD() by running the R code chunk below. This will create a file named pbmc3k_panc8_celltypist.h5ad under the directory: results/cross_tissue_task/objects. Next, go to the CellTypist website: https://www.celltypist.org/. Enter your e-mail address. Select the model Immune_All_Low.pkl, which is a model for the annotation of immune cells. Enable majority voting. Finally, upload the file pbmc3k_panc8_celltypist.h5ad.
## Automatic cell annotation
file.name <- file.path(res.dir[3], "pbmc3k_panc8_celltypist.h5ad")
cat("Exporting Seurat object as '.h5ad' format to:", gsub("\\../", "", file.name), "\n")
## Exporting Seurat object as '.h5ad' format to: results/cross_tissue_task/objects/pbmc3k_panc8_celltypist.h5ad
zellkonverter::writeH5AD(sce = as.SingleCellExperiment(seu), file = file.name, X_name = "logcounts")
You should receive an e-mail with a download link to the result. Download the result (predictions.tar.gz) and put it into the directory: results/cross_tissue_task/tables. Alternatively, replace the URL below (it is only valid for 7 days) with the link you received by e-mail and set the variable use.url to TRUE.
Plot the predicted labels for both data sets.
## Plot labels from CellTypist
# Download predictions
use.url <- FALSE # to download from a URL, replace the URL below with the one you received by e-mail and set this to TRUE
if (use.url) { # download: url only valid for 7 days
url <- "https://celltypist.cog.sanger.ac.uk/uploads/9cd807d8-69de-4b7c-a827-70e51c4a8b4a/predictions.tar.gz?AWSAccessKeyId=C068AUIY7F6SNEJUTEPA&Signature=XY4lL4D%2FWnuuBhV5%2BDOCimkyxSk%3D&Expires=1720176188"
download.file(url = url, destfile = file.path(res.dir[2], "predictions.tar.gz"))
}
# Decompress the file with predictions
untar(tarfile = file.path(res.dir[2], "predictions.tar.gz"), exdir = res.dir[2])
# Add predictions to Seurat object
seu@meta.data[,c("predicted_labels", "over_clustering", "majority_voting")] <- read.table(file = file.path(res.dir[2],
"predicted_labels.csv"),
header = TRUE, sep = ",", row.names = 1)
# Plot predictions
DimPlot(object = seu, reduction = "umap.unintegrated", group.by = "majority_voting",
split.by = "batch", pt.size = 0.1, label = TRUE)
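Besides the plot, a cross-tabulation of predicted labels by batch gives a quick numeric view of which labels appear in both data sets. The sketch below uses an invented data frame (meta and its values are hypothetical, not CellTypist output); with the real object, the same calls could be applied to the majority_voting and batch columns of seu@meta.data. Because the selected model only contains immune cell types, labels assigned to pancreatic cells should probably be read with caution.

```r
# Hypothetical per-cell annotations (toy values, not CellTypist output)
meta <- data.frame(
  batch = c("pbmc", "pbmc", "pbmc", "pancreas", "pancreas", "pancreas"),
  majority_voting = c("B cells", "T cells", "T cells",
                      "Epithelial cells", "Epithelial cells", "T cells")
)

# Number of cells per predicted label (rows) and batch (columns)
label.by.batch <- table(meta$majority_voting, meta$batch)

# Fraction of each batch assigned to each label (columns sum to 1)
label.frac <- prop.table(label.by.batch, margin = 2)
round(label.frac, 2)

# Labels predicted in both batches
shared.labels <- rownames(label.by.batch)[rowSums(label.by.batch > 0) == 2]
shared.labels
```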
(10 min)
AIM: Compare different integration methods.
First, split the layers of data by batch before performing integration. Then, apply the standard Seurat workflow. Finally, call the function IntegrateLayers() to integrate the datasets. In this function, you specify the method to run by providing the corresponding integration method function.
Seurat provides several methods, including CCA (CCAIntegration), RPCA (RPCAIntegration) and Harmony (HarmonyIntegration). In addition, other methods can be called by using functions from SeuratWrappers, such as FastMNN (FastMNNIntegration) or scVI (scVIIntegration), among others. Harmony (from the harmony R package), FastMNN (from the batchelor R package) and scVI (Python package installed with conda) need to be installed independently from Seurat.
Run the R code chunk below to run the integration methods CCA, RPCA, Harmony and FastMNN (you can also try scVI if it is installed on your system). Join the layers back after integration and project each integrated embedding onto a UMAP. The UMAPs highlight the batch and ground-truth cell_type labels.
## Perform integration
# Split layers for integration
seu[["RNA"]] <- split(x = seu[["RNA"]], f = seu$batch)
# Standard workflow
seu <- NormalizeData(seu)
seu <- FindVariableFeatures(seu)
seu <- ScaleData(seu)
seu <- RunPCA(seu)
# Integrate layers
int.methods <- c("CCA" = "CCAIntegration", "RPCA" = "RPCAIntegration",
"Harmony" = "HarmonyIntegration", "FastMNN" = "FastMNNIntegration",
"scVI" = "scVIIntegration")
for (m in names(int.methods)[1:4]) {
cat("\nRunning integration method", m, "...\n")
int.dimred <- paste0("integrated.", m)
umap.dimred <- paste0("umap.", m)
# Integration
if (m=="RPCA") {
seu <- IntegrateLayers(object = seu, method = get(eval(substitute(int.methods[m]))),
orig.reduction = "pca",
new.reduction = int.dimred,
k.weight = 50, # otherwise it aborts
verbose = TRUE)
} else if (m=="scVI") {
seu <- IntegrateLayers(object = seu, method = get(eval(substitute(int.methods[m]))),
orig.reduction = "pca",
new.reduction = int.dimred,
conda_env = "~/miniconda3/envs/scvi-env", # substitute this by your installation
verbose = TRUE)
} else {
seu <- IntegrateLayers(object = seu, method = get(eval(substitute(int.methods[m]))),
orig.reduction = "pca",
new.reduction = int.dimred,
verbose = TRUE)
}
}
##
## Running integration method CCA ...
##
## Running integration method RPCA ...
##
## Running integration method Harmony ...
##
## Running integration method FastMNN ...
# Re-join layers after integration
seu[["RNA"]] <- JoinLayers(seu[["RNA"]])
# Run UMAP for every integration method
int.umaps.plots <- list()
for (m in names(int.methods)[1:4]) {
cat("\nRunning UMAP for", m, "integrated result...\n")
int.dimred <- paste0("integrated.", m)
umap.dimred <- paste0("umap.", m)
seu <- RunUMAP(seu, dims = 1:30, reduction = int.dimred, reduction.name = umap.dimred)
int.umaps.plots[[m]] <- DimPlot(object = seu, reduction = umap.dimred, group.by = c("batch", "cell_type"),
combine = FALSE, label.size = 2, pt.size = 0.1)
}
##
## Running UMAP for CCA integrated result...
##
## Running UMAP for RPCA integrated result...
##
## Running UMAP for Harmony integrated result...
##
## Running UMAP for FastMNN integrated result...
# Save Seurat object
saveRDS(object = seu, file = file.path(res.dir[3], "seu_integrated.rds"))
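The integration loop above looks up each method function from its name stored in a named character vector. The sketch below shows the same pattern in isolation with base R's match.fun(). The two toy functions and the vector toy.methods are hypothetical stand-ins and have nothing to do with the actual Seurat integration functions.

```r
# Hypothetical stand-ins for integration functions (not Seurat functions)
ToyIntegrationA <- function(x) x - mean(x)
ToyIntegrationB <- function(x) x / max(abs(x))

# Named character vector mapping a short label to a function name
toy.methods <- c("A" = "ToyIntegrationA", "B" = "ToyIntegrationB")

x <- c(2, 4, 6)
results <- list()
for (m in names(toy.methods)) {
  fun <- match.fun(toy.methods[[m]])  # resolve the function from its name
  results[[m]] <- fun(x)
}
results
```

Keeping the names in one vector makes it easy to add or skip a method by editing a single line, which is how the loop above restricts itself to the first four entries.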
(15 min)
AIM: Assess integration qualitatively and quantitatively through dimensional reduction visualizations and LISI scores.
Plot the integrated embeddings below highlighting the batch and ground-truth cell_type labels.
## Assess integration by printing the plots using the "batch" and "cell_type" (ground-truth) labels
wrap_plots(c(int.umaps.plots$CCA, int.umaps.plots$RPCA, int.umaps.plots$Harmony, int.umaps.plots$FastMNN),
ncol = 2, byrow = TRUE)
Run the code below to compute the i/cLISI scores for every integrated embedding with the function getIntegrationMetrics() from the package scIntegrationMetrics (read more about the meaning of these metrics here).
## Assess quantitatively integration with scIntegrationMetrics
# Calculate metrics
int.mthds.names <- paste0("integrated.", names(int.methods)[1:4])
names(int.mthds.names) <- int.mthds.names
metrics <- list()
for (m in int.mthds.names) {
key <- sub("integrated.", "", m, fixed = TRUE)
cat("Computing i/cLISI metrics for integration method:", key, "\n")
metrics[[key]] <- getIntegrationMetrics(seu, meta.label = "cell_type", meta.batch = "batch",
method.reduction = m, metrics = c("iLISI", "norm_iLISI",
#"CiLISI", "CiLISI_means",
"norm_cLISI", "norm_cLISI_means"))
}
## Computing i/cLISI metrics for integration method: CCA
## Computing i/cLISI metrics for integration method: RPCA
## Computing i/cLISI metrics for integration method: Harmony
## Computing i/cLISI metrics for integration method: FastMNN
# Join metrics
metrics <- as.data.frame(do.call(cbind, metrics))
Print the result below.
# Print table
knitr::kable(metrics)
| | CCA | RPCA | Harmony | FastMNN |
|---|---|---|---|---|
| iLISI | 1.100594 | 1.02563 | 1.004797 | 1.035199 |
| norm_iLISI | 0.1005943 | 0.02562982 | 0.004796592 | 0.03519914 |
| norm_cLISI | 0.9452894 | 0.9424586 | 0.9591123 | 0.9403612 |
| norm_cLISI_means | 0.917104 | 0.9022 | 0.932591 | 0.9084225 |
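To help read these numbers, the sketch below computes an inverse Simpson's index, the quantity LISI is built on, for two invented neighbourhoods. This is a simplification: it counts labels in a fixed set of neighbours without weights, whereas the actual LISI computation uses distance-weighted neighbourhoods. The function inverse_simpson() and the toy label vectors are hypothetical. The rescaling at the end is consistent with the table above, where norm_iLISI equals iLISI minus 1 for two batches.

```r
# Inverse Simpson's index of the labels found in one cell's neighbourhood
inverse_simpson <- function(labels) {
  p <- table(labels) / length(labels)  # proportion of each label
  1 / sum(p^2)
}

# Hypothetical neighbourhoods of 10 cells each
mixed <- rep(c("pbmc", "pancreas"), each = 5)  # both batches equally represented
unmixed <- rep("pbmc", 10)                     # a single batch

ilisi.mixed <- inverse_simpson(mixed)
ilisi.unmixed <- inverse_simpson(unmixed)

# Rescale to [0, 1] given the number of batches
n.batches <- 2
norm.ilisi <- (c(mixed = ilisi.mixed, unmixed = ilisi.unmixed) - 1) / (n.batches - 1)
norm.ilisi
```

Under this reading, an iLISI close to 1 means that the neighbourhood of a typical cell contains cells from a single batch. Values close to 1 are plausible here, since blood and pancreas are expected to share few cell types.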
## R packages and versions used in these analyses
sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] scIntegrationMetrics_1.1 SeuratWrappers_0.3.2 patchwork_1.2.0
## [4] Seurat_5.1.0 SeuratObject_5.0.2 sp_2.1-4
## [7] dplyr_1.1.4
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.4 spatstat.explore_3.2-7
## [3] reticulate_1.38.0 R.utils_2.12.3
## [5] tidyselect_1.2.1 htmlwidgets_1.6.4
## [7] BiocParallel_1.28.3 grid_4.1.0
## [9] Rtsne_0.17 ScaledMatrix_1.2.0
## [11] zellkonverter_1.4.0 munsell_0.5.1
## [13] codetools_0.2-18 ica_1.0-3
## [15] future_1.33.2 miniUI_0.1.1.1
## [17] batchelor_1.10.0 withr_3.0.0
## [19] spatstat.random_3.2-3 colorspace_2.1-0
## [21] progressr_0.14.0 filelock_1.0.3
## [23] Biobase_2.54.0 highr_0.11
## [25] knitr_1.47 rstudioapi_0.13
## [27] stats4_4.1.0 SingleCellExperiment_1.16.0
## [29] ROCR_1.0-11 tensor_1.5
## [31] listenv_0.9.1 MatrixGenerics_1.6.0
## [33] labeling_0.4.3 harmony_1.2.0
## [35] GenomeInfoDbData_1.2.7 polyclip_1.10-6
## [37] farver_2.1.2 basilisk_1.6.0
## [39] parallelly_1.37.1 vctrs_0.6.5
## [41] generics_0.1.3 xfun_0.45
## [43] R6_2.5.1 doParallel_1.0.17
## [45] GenomeInfoDb_1.30.1 clue_0.3-65
## [47] rsvd_1.0.5 DelayedArray_0.20.0
## [49] bitops_1.0-7 spatstat.utils_3.0-5
## [51] cachem_1.1.0 assertthat_0.2.1
## [53] promises_1.3.0 scales_1.3.0
## [55] gtable_0.3.5 beachmat_2.10.0
## [57] globals_0.16.3 goftest_1.2-3
## [59] klippy_0.0.0.9500 spam_2.10-0
## [61] rlang_1.1.4 GlobalOptions_0.1.2
## [63] splines_4.1.0 lazyeval_0.2.2
## [65] spatstat.geom_3.2-9 BiocManager_1.30.23
## [67] yaml_2.3.8 reshape2_1.4.4
## [69] abind_1.4-5 httpuv_1.6.15
## [71] tools_4.1.0 ggplot2_3.5.1
## [73] jquerylib_0.1.4 RColorBrewer_1.1-3
## [75] BiocGenerics_0.40.0 ggridges_0.5.6
## [77] Rcpp_1.0.12 plyr_1.8.9
## [79] sparseMatrixStats_1.6.0 zlibbioc_1.40.0
## [81] purrr_1.0.2 RCurl_1.98-1.14
## [83] basilisk.utils_1.6.0 deldir_2.0-4
## [85] pbapply_1.7-2 GetoptLong_1.0.5
## [87] cowplot_1.1.3 S4Vectors_0.32.4
## [89] zoo_1.8-12 SummarizedExperiment_1.24.0
## [91] ggrepel_0.9.5 cluster_2.1.2
## [93] magrittr_2.0.3 data.table_1.15.4
## [95] RSpectra_0.16-1 scattermore_1.2
## [97] ResidualMatrix_1.4.0 circlize_0.4.16
## [99] lmtest_0.9-40 RANN_2.6.1
## [101] fitdistrplus_1.1-11 matrixStats_1.1.0
## [103] mime_0.12 evaluate_0.24.0
## [105] xtable_1.8-4 RhpcBLASctl_0.23-42
## [107] fastDummies_1.7.3 IRanges_2.28.0
## [109] gridExtra_2.3 shape_1.4.6.1
## [111] compiler_4.1.0 tibble_3.2.1
## [113] KernSmooth_2.23-20 crayon_1.5.3
## [115] R.oo_1.26.0 htmltools_0.5.8.1
## [117] mgcv_1.8-35 later_1.3.2
## [119] tidyr_1.3.1 DBI_1.2.3
## [121] ComplexHeatmap_2.15.4 MASS_7.3-54
## [123] Matrix_1.6-5 permute_0.9-7
## [125] cli_3.6.3 R.methodsS3_1.8.2
## [127] parallel_4.1.0 dotCall64_1.1-1
## [129] igraph_2.0.3 GenomicRanges_1.46.1
## [131] pkgconfig_2.0.3 dir.expiry_1.2.0
## [133] scuttle_1.4.0 plotly_4.10.4
## [135] spatstat.sparse_3.1-0 foreach_1.5.2
## [137] bslib_0.7.0 XVector_0.34.0
## [139] stringr_1.5.1 digest_0.6.36
## [141] sctransform_0.4.1 RcppAnnoy_0.0.22
## [143] vegan_2.6-6.1 spatstat.data_3.1-2
## [145] rmarkdown_2.27 leiden_0.4.3.1
## [147] uwot_0.2.2 DelayedMatrixStats_1.16.0
## [149] shiny_1.8.1.1 rjson_0.2.21
## [151] lifecycle_1.0.4 nlme_3.1-152
## [153] jsonlite_1.8.8 BiocNeighbors_1.12.0
## [155] viridisLite_0.4.2 limma_3.50.3
## [157] fansi_1.0.6 pillar_1.9.0
## [159] lattice_0.20-44 fastmap_1.2.0
## [161] httr_1.4.7 survival_3.2-11
## [163] glue_1.7.0 remotes_2.5.0
## [165] png_0.1-8 iterators_1.0.14
## [167] presto_1.0.0 stringi_1.8.4
## [169] sass_0.4.9 RcppHNSW_0.6.0
## [171] BiocSingular_1.10.0 irlba_2.3.5.1
## [173] future.apply_1.11.2