Principal Component Analysis — RunPCA • Coralysis

Perform principal component analysis using assays or the joint probability matrix as input.

Usage

RunPCA.SingleCellExperiment(
  object,
  assay.name,
  p,
  scale,
  center,
  threshold,
  pca.method,
  return.model,
  select.icp.tables,
  features,
  dimred.name
)

# S4 method for class 'SingleCellExperiment'
RunPCA(
  object,
  assay.name = "joint.probability",
  p = 50,
  scale = TRUE,
  center = TRUE,
  threshold = 0,
  pca.method = "irlba",
  return.model = FALSE,
  select.icp.tables = NULL,
  features = NULL,
  dimred.name = "PCA"
)

Arguments

object: A SingleCellExperiment object.
assay.name: Name of the assay on which to compute PCA. One of assayNames(object) or "joint.probability" (default). Use "joint.probability" to obtain an integrated embedding after running RunParallelDivisiveICP. Alternatively, provide one of the assays in assayNames(object) before performing integration to assess whether the data requires integration.
p: A positive integer denoting the number of principal components to calculate and select. Default is 50.
scale: A logical specifying whether the probabilities should be standardized to unit-variance before running PCA. Default is TRUE.
center: A logical specifying whether the probabilities should be centered before running PCA. Default is TRUE.
threshold: Threshold for filtering out ICP runs before PCA. Runs whose lower terminal projection accuracy is below the threshold are excluded. Default is 0.
pca.method: A character specifying the PCA method. One of "irlba" (default), "RSpectra" or "stats". If the method is "irlba", set a seed beforehand to ensure reproducibility.
return.model: A logical specifying whether the PCA model should be returned. Default is FALSE. Only implemented for pca.method = "stats". If TRUE, pca.method is coerced to "stats".
select.icp.tables: Select the ICP cluster probability tables used to perform PCA. By default NULL, i.e., all are used, except if the ICP tables were obtained with the function RunParallelDivisiveICP, in which case the ICP tables used correspond to the last round of divisive clustering for every epoch. Otherwise, a vector of integers should be given.
features: A character vector of feature names matching row.names(object) to select before computing PCA. Only used if assay.name is one of the assays in assayNames(object); otherwise it is ignored.
dimred.name: Dimensional reduction name given to the returned PCA. By default "PCA".
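
The effect of center, scale and p can be illustrated with base R alone. The sketch below is not the Coralysis implementation: it assumes that stats::prcomp is a reasonable stand-in for pca.method = "stats", and it uses the iris measurements in place of an assay or a probability matrix.

```r
# Toy input: 150 observations x 4 numeric variables (base R 'iris' data)
x <- as.matrix(iris[, 1:4])

# center = TRUE subtracts each column mean; scale = TRUE divides each column
# by its standard deviation, so every variable ends up with unit variance
x.std <- scale(x, center = TRUE, scale = TRUE)

# PCA with base R; 'rank.' caps the number of components, analogous to 'p'
# (assumption: this only mimics what RunPCA does, it is not its internal code)
p <- 2
pca <- stats::prcomp(x, center = TRUE, scale. = TRUE, rank. = p)

# Scores matrix: one row per observation, one column per retained component
dim(pca$x)
```

Without scaling, variables measured on larger scales dominate the leading components, which is why standardizing is usually a sensible default.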

Value

An object of class SingleCellExperiment.
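
A minimal sketch of how a named embedding can be read back from a SingleCellExperiment object. It assumes that RunPCA stores its result among the reduced dimensions under dimred.name; to stay self-contained, the embedding here is computed with stats::prcomp and stored by hand, and sce.toy is a hypothetical object name.

```r
suppressPackageStartupMessages(library("SingleCellExperiment"))

# Toy object with features in rows and samples in columns
sce.toy <- SingleCellExperiment(assays = list(logcounts = t(as.matrix(iris[, 1:4]))))
colnames(sce.toy) <- paste0("samp", seq_len(ncol(sce.toy)))

# Stand-in for the RunPCA() result: a samples x components matrix
# (assumption: RunPCA saves a comparable matrix under 'dimred.name')
pca <- stats::prcomp(t(logcounts(sce.toy)), center = TRUE, scale. = TRUE, rank. = 2)
reducedDim(sce.toy, "PCA") <- pca$x

# List the stored embeddings and extract one by name
reducedDimNames(sce.toy)
emb <- reducedDim(sce.toy, "PCA")
dim(emb)
```

Using a different dimred.name for each run would keep several embeddings in the same object side by side.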

Examples

# Import package
suppressPackageStartupMessages(library("SingleCellExperiment"))

# Create toy SCE data
batches <- c("b1", "b2")
set.seed(239)
batch <- sample(x = batches, size = nrow(iris), replace = TRUE)
sce <- SingleCellExperiment(assays = list(logcounts = t(iris[,1:4])),  
                            colData = DataFrame("Species" = iris$Species, 
                                               "Batch" = batch))
colnames(sce) <- paste0("samp", 1:ncol(sce))

# Prepare SCE object for analysis
sce <- PrepareData(sce)
#> Converting object of `matrix` class into `dgCMatrix`. Please note that Coralysis has been designed to work with sparse data, i.e. data with a high proportion of zero values! Dense data will likely increase run time and memory usage drastically!
#> 4/4 features remain after filtering features with only zero values.

# Multi-level integration (parameters chosen for illustration only; use the defaults in practice)
set.seed(123)
sce <- RunParallelDivisiveICP(object = sce, batch.label = "Batch", 
                              k = 2, L = 25, C = 1, train.k.nn = 10, 
                              train.k.nn.prop = NULL, use.cluster.seed = FALSE,
                              build.train.set = FALSE, ari.cutoff = 0.1, 
                              threads = 2)
#> 
#> Initializing divisive ICP clustering...
#> 
  |======================================================================| 100%
#> 
#> Divisive ICP clustering completed successfully.
#> 
#> Predicting cell cluster probabilities using ICP models...
#> Prediction of cell cluster probabilities completed successfully.
#> 
#> Multi-level integration completed successfully.

# Integrated PCA
set.seed(125) # to ensure reproducibility for the default 'irlba' method
sce <- RunPCA(object = sce, assay.name = "joint.probability", p = 10)
#> Divisive ICP: selecting ICP tables multiple of 1

# Plot result 
cowplot::plot_grid(PlotDimRed(object = sce, color.by = "Batch", 
                              legend.nrow = 1),
                   PlotDimRed(object = sce, color.by = "Species", 
                              legend.nrow = 1), ncol = 2)