Guided start of single-cell TI with Totem

GuidedStart

This tutorial is a detailed starting guide with Totem, a user-friendly tree-shaped single-cell trajectory inference (TI) method (Smolander, Junttila, and Elo 2022). In the quick start guide it is shown the application of Totem’s TI method under a scenario where one is interested in inferring cell trajectory after running an integration and/or clustering analysis and the data has already been projected onto the low-dimensional space, such as PCA and UMAP. In this detailed starting guide, the aim is to start from the count matrix to trajectory inference providing along the way a better description and exploration of the intermediate results generated.

To this end, Totem’s TI was applied to the same data set analyzed in the quick start guide, a CD34+ bone marrow cells (replicate 1) data set that was used as an example in the Palantir’s TI publication (Setty et al. 2019) and publicly available at Human Cell Atlas Portal (link: https://data.humancellatlas.org/explore/projects/091cf39b-01bc-42e5-9437-f419a66c8a45). The data provided comprises tSNE projection, cluster identities,palantir pseudotime, transformed log-normalized and raw data (the anndata object can be found at human_cd34_bm_rep1.h5ad). The human_cd34_bm_rep1.h5ad anndata h5ad object was downloaded, parsed and distributed as a RDS SingleCellExperiment class object at data/human_cd34_bm_rep1.rds. The R script used for this object is provided at scripts/download_h5ad_to_SCE_rds_script.R.

The content of this tutorial is enumerated and described below.

Preprocessing: importing data to R and preparing it to Totem
Feature selection: selecting 2K highly variable features for dimensional reduction with scran
Dimensionality reduction: dimensionality reduction for clustering (PCA) and visualization (UMAP) with Totem
Clustering and MST: running CLARA k-medoids clustering algorithm and multiple spanning trees (MSTs) with Totem
Cell connectivity: visualize and interpret cell connectivity estimated by Totem
Select clustering: selecting and visualizing top six clustering and MST results with Totem
Smooth MST: smoothing the selected MSTs with the principal curves algorithm with Totem
Define a root: define the most probable root for the inferred trajectory
Pseudotime: visualize Totem’s pseudotime and compare it with Palantir’s

(1) Preprocessing

Load the following R packages.

# Load packages
library("dplyr")
library("Totem")
library("scater")
library("ggplot2")
library("SingleCellExperiment")

What does every package?

dplyr v.1.0.10 (Wickham et al. 2022): data wrangling, e.g., piping data with %>%
ggplot2 v.3.3.6 (Wickham 2016): data visualization
Totem v.0.99.2 (Smolander, Junttila, and Elo 2022): trajectory inference
scater v.1.26.1 (McCarthy et al. 2017): Bioconductor scRNA toolkit - used for low-dimensional visualization
SingleCellExperiment v.1.20.0 (Amezquita et al. 2020): Bioconductor single-cell data object - R class object used to interact with Totem

Set seed to keep the reproducibility of the analyses generated.

# Set seed
set.seed(1204)

Import the RDS SingleCellExperiment class object storing the scRNA-seq data set with the function readRDS (file available at: data/human_cd34_bm_rep1.rds).

Totem conveniently uses the SingleCellExperiment (SCE for short) R package and class to store all the information that generates in its workflow. This can be accessed through: metadata(sce)$totem (which returns a list of objects generated in the Totem workflow).

A SCE object can store several gene expression assays (raw, log counts, etc), metadata (cell/gene) and dimensional reduction data. Please have a look into the Chapter 4 The SingleCellExperiment class of the OSCA book to learn more about the SingleCellExperiment class and functionality.

## Import the scRNA-seq SCE object
sce <- readRDS(file = "../data/human_cd34_bm_rep1.rds")

Totem requires only log normalized data as input which can be obtained by using common single-cell processing software tools such as Seurat (Hao et al. 2021), scanpy (Wolf, Angerer, and Theis 2018) or directly calculated in R or python.

How do I get log normalized data from counts?

Assuming you have counts from single-cell gene expression data, you have at least the following three alternatives to perform log normalization:

R without external tools: applying basic R functions
- apply() R function can be used to apply the formula log1p(x/sum(x)*10000) (natural log, with 1 pseudocount, of the relative gene expression scaled by 1e4) to every column, i.e., cell, in the counts matrix

## Log normalization
counts <- assay(altExp(sce, "raw"), "X")
log.counts <- apply(counts, 2, function(x) log1p(x/sum(x)*10000)) # log1p normalization w/ 10K scaling factor

Seurat: a R package available from https://satijalab.org/seurat/
- use the Seurat function NormalizeData(..., normalization.method="LogNormalize", scale.factor = 10000) (see docs)
scanpy: a python package available from https://scanpy.readthedocs.io/en/stable/
- use the scanpy function scanpy.pp.normalize_total(..., target_sum=10000) (see docs)

Before starting the workflow, we remove the tSNE from the SCE object and we exclude any potential non expressed gene from our SCE object.

## Remove tSNE from SCE object
reducedDim(sce, "tsne") <- NULL

## Prepare data for Totem: remove non-expressed genes
sce <- PrepareTotem(object = sce)

##

## The data matrix has 14651 features.

## The data matrix has 5780 cells.

## Please note that the filtering of poor-quality cells needs to be 
##           done before Totem analysis. In addition, the cells need to be part of 
##           a single, connected, tree-shaped trajectory 
##           (not multiple disconnected trajectories.

##

## 14624/14651 features remain after filtering genes with only zero values.

##

The sce object contains the assays counts, logcounts, scaled, the cell metadata clusters, palantir_pseudotime, palantir_diff_potential, cell_types_long, cell_types_short with 14624 genes and 5780 cells.

(2) Feature selection

The next step consists in feature selection. This is not mandatory but highly recommended due to the following reasons:

the differential expression of some genes may reflect technical or bias effects rather than biological signal which might be masked upon selection of top n highly variable genes
removes uninformative genes such as lowly abundant or showing an invariant gene expression across different cells
increases the computational speed of downstream steps due to the reduced dimensionality

This step can be performed with any software tool of your choice, such as Seurat or scanpy. The only required information is a list of highly variable genes (HVG) that needs to be provided to Totem. Below we provide a solution for the selection of 2K HVG using the software R package scran.

## Selection of HVG w/ scran R package
var.genes <- scran::modelGeneVar(sce)
hvg <- scran::getTopHVGs(var.genes, n = 2000)

The object hvg is a character vector with the names of top 2K HVG (MPO, IGLL1, HIST1H4C, ELANE, SPINK2, LYZ, …).

(3) Dimensionality reduction

The next step in the workflow consists in dimensionality reduction for clustering purposes downstream of this stage.

The user can use any low-dimensional representation method of its choice that represents the data, such as PCA (Principal Component Analysis), UMAP (Uniform Manifold Approximation and Projection for Dimension Reduction), t-SNE ( t-distributed Stochastic Neighbourhood Embedding) or LMDS (landmark Multi-Dimensional Scaling).

Next We will use PCA, but the Totem vignette provides some tips if you want to use other method (see vignette).

You can skip this step if you have your own low-dimensional representation already (see the next paragraph). In this step, We perform PCA for the HVG determined in the previous step (by giving dim.red.features = hvg). In case you do not want to use HVG, just change dim.red.features = hvg to dim.red.features = NULL.

## Dimensionality reduction
sce <- RunDimRed(object = sce,
                 dim.red.method = "pca",
                 dim.red.features = hvg,
                 dim.reduction.par.list = list(ndim=50))

## using features in the 'dim.red.features' argument

## using 'pca' as the dimensionality reduction method

In case you have already your own low-dimensional representation, just add it to the SCE object

## Add low-dimensional representation to SCE object
own_dim_red <- reducedDim(sce) # substitute this line by importing your own dimensional result (define class as matrix (rows x cols: cells x latent dimensions))
reducedDim(sce, type = "pca") <- own_dim_red # type can be 'pca', 'umap' whatever you want - this is the name given to the dimensional result

The PCA can be visualized together with the elbow plot highlighting the standard deviation of every component to help us to decide the numbers of components that should be selected for clustering and UMAP projection (which can rely on the PCA).

The code below calculates the standard deviation for each principal component (PC) and, then, it uses the dplyr and ggplot2 functions to plot the results. In addition is also plotted the top 20 PCs with the cells highlighted by cell type.

## Inspect PCA variance

# Elbow plot
elbow.plt <- reducedDim(sce, "pca") %>% 
  apply(X = ., MARGIN = 2, FUN = function(x) sd(x)) %>%
  as.data.frame(.) %>% 
  `colnames<-`("Standard Deviation") %>% 
  mutate("PCs" = factor(1:nrow(.), levels = 1:nrow(.))) %>% 
  ggplot(data = ., mapping = aes(x = PCs, y = `Standard Deviation`)) + 
  geom_point() + 
  theme_bw()

# PCA plot
pca.plt <- reducedDim(sce, "pca") %>% 
  as.data.frame(.) %>%
  mutate("Cell_ID" = row.names(.)) %>% 
  cbind(., colData(sce)) %>% 
  ggplot(data = ., mapping = aes(x=comp_1, y=comp_2, color=cell_types_short)) + 
  geom_point() + 
  labs(x = "PC1", y = "PC2") + 
  scale_color_manual(values=as.character(unlist(metadata(sce)$cluster_colors)[!duplicated(levels(sce$cell_types_short))])) + 
  theme_bw()

# Box plot: top 20 PCs w/ cell types highlight by PC
pca.scores.plt <- reducedDim(sce, "pca") %>% 
  as.data.frame(.) %>% 
  mutate("Cell_ID"=row.names(.)) %>% 
  cbind(., colData(sce)) %>% 
  tidyr::pivot_longer(., cols=comp_1:comp_20, names_to="PCs", values_to ="Scores") %>% 
  mutate("PCs" = factor(PCs, levels = paste0("comp_", 1:20))) %>% 
  ggplot(data= ., aes(x=PCs, y=Scores)) + 
  geom_boxplot(outlier.shape = NA) + 
  geom_jitter(aes(color=cell_types_short), size=0.25) + 
  scale_color_manual(values=as.character(unlist(metadata(sce)$cluster_colors)[!duplicated(levels(sce$cell_types_short))])) + 
  theme_bw() + 
  theme(axis.text.x = element_text(angle = 35, hjust = 1, vjust = 1))

# Plot altogether 
cowplot::plot_grid(cowplot::plot_grid(elbow.plt, pca.plt, ncol=2), pca.scores.plt, ncol=1)

Let’s select the first 6 PCs which seem to hold most of the biological variation present in this data set.

## Pick PCs
pick.pcs <- 1:6
reducedDim(sce, "pca") <- reducedDim(sce, "pca")[, pick.pcs ]

We need a 2-dimensional representation rather than PCA for visualization purposes. Adequate methods are UMAP, t-SNE or MDS. Any of these can be directly provided by the user in case they generated previously any or generated de novo as follows with the Totem’s function RunDimRed(). Here it is computed the popular UMAP (dim.red.method = "umap") using the 2K HVG (dim.red.features = hvg) as it was done for the PCA. In addition, it was provided a list of parameters for the dyndimred package: ndim=2 (compute only two UMAP dimensions) and pca_components = 6 (use only the first 6 PCs to compute the UMAP). The function returns the UMAP dimensional reduction to the sce object at reducedDim(sce, "umap").

## UMAP dimensional reduction for visualization
set.seed(123)
sce <- RunDimRed(object = sce, 
                 dim.red.method = "umap", 
                 dim.red.features = hvg, 
                 dim.reduction.par.list = list(ndim=2, pca_components = 6))

## using features in the 'dim.red.features' argument

## using 'umap' as the dimensionality reduction method

## Loading required namespace: uwot

dim_red <- reducedDim(sce, "umap")

The result can be visualized below with the scran function plotReducedDim().

## Visualize 'Group' in UMAP projection
plotReducedDim(object = sce, dimred = "umap", colour_by = "cell_types_short", point_size=0.5) + 
  scale_color_manual(name= "Cell Types", 
                     values=as.character(unlist(metadata(sce)$cluster_colors)[!duplicated(levels(sce$cell_types_short))])) + 
  theme_void()

## Scale for 'colour' is already present. Adding another scale for 'colour',
## which will replace the existing scale.

reducedDim(sce, "umap") <- NULL # remove UMAP from 'sce' object

(4) Clustering (& MST)

Clustering is one of the main steps in the Totem workflow and the one that takes more time and computational resources. Totem is use the 6-dimensional latent representations of PCA are used to cluster the data set with the k-medoids (CLARA) algorithm using an expected number of clusters varying between 3-20 (k.range = 3:20), removing clusters with <5 cells (min.cluster.size = 5) and running the clustering algorithm 10K times (N.clusterings = 10000). These are the default parameters. After filtering out clustering results due to min.cluster.size, the total number of clustering results will be lower than N.clusterings. At this stage, it is also estimated a MST (Minimum Spanning Tree) for each clustering result.

## Clustering scRNA-seq data w/ CLARA (k-medoids)
set.seed(123)
sce <- RunClustering(object = sce, k.range = 3:20,
                     min.cluster.size = 5, N.clusterings = 10000)

## using pcaumap as the dim.reduced data

## number of clustering results: 9975

(5) Cell connectivity

Cell connectivity is a novel concept introduced in Totem. For a given clustering result and the respective MST (Minimum Spanning Trees) , the connectivity of a cell cluster consists in the ratio of its connections to other clusters by the total number of clusters. The cell connectivity is scaled by the maximum value and averaged across a set of clustering and MST results (by default 10K). Higher the value farther the distance for the leafs/ending points of the trajectory and, thus, more likely to represent, e.g., branching points (see Fig.4 (Smolander, Junttila, and Elo 2022)).

The cell connectivity unit can be used for two purposes:

support the user decision about the most likely trajectory
selecting clustering results method (see next section)

## Visualization of cell connectivity
VizCellConnectivity(object = sce, viz.dim.red = dim_red)

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by expression

According to the expected low values of cell connectivity corresponds to terminally differentiated cell states, i.e., CLP, EP, MoP, DCP, and high values to HSC, HMP or MyP, imature cell populations that are branching into other populations.

(6) Select clusterings

The clustering can be selected based on 5 methods described in detail in the Totem vignette (see vignette). You can also see the function documentation by typing the following in the R console: ?SelectClusterings.

Here, We selected the top 6 models with the method 3 which was demonstrated to perform well compared with other methods (Smolander, Junttila, and Elo 2022). Method 3 relies on the average of two metrics to select the top 6, i.e., selection.N.models=6, best clusterings, i.e., the clustering results that are more congruent and agree better with the tree structure:

VRC (Variance Ratio Criterion) or Calinski-Harabasz score: it measures the within- and between-cluster dispersion
cell connectivity: see above

## Select best clusters for MST calculation
sce <- SelectClusterings(sce, selection.method = 3,
                         selection.N.models = 6,
                         selection.stratified = FALSE,
                         prior.clustering = NULL)

Below it appears highlighted the 6 selected clusterings and their respective MSTs.

## Visualize selected clusters
select.clusters <- ReturnTrajNames(sce)
VizMST(object = sce, clustering.names = select.clusters, viz.dim.red = dim_red)

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

(7) Smoothing MSTs

Smoothing MSTs using the principle curves slingshot algorithm (Street et al. 2018) resulting in directory trajectories randomly rooted and discrete pseudotime along the lineage. The user can provide in the next step the root.

## Smoothing MSTs selected w/ Slingshot
sce <- RunSmoothing(sce)

Visualize smoothed MSTs below.

## Visualize smoothed MSTs
smooth.msts.names <- ReturnTrajNames(sce)
VizSmoothedTraj(object = sce,
                traj.names = smooth.msts.names,
                viz.dim.red = dim_red,plot.pseudotime = FALSE)

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

## Coloring by milestone

## Using milestone_percentages from trajectory

From the top 6 smoothed MST trajectories presented, the only difference relies on the MoP and DCP lineages and the branching from these lineages to the CLP.

The CLP needs to diverge before the MyP and MoP and DCP should diverge from MyP. Thus the clustering/MST result 8.135 seems to meet these expectations. I will be selected below as the elected trajectory.

(8) Define a root

At this stage the root can be defined be giving the cluster number. Let’s define cluster 2 as root.

## Define the root of the cell trajectory
select.traj <- "8.135"
root.cluster <- 2
sce <- ChangeTrajRoot(object = sce, traj.name = select.traj, root.cluster = root.cluster)

(9) Pseudotime

Pseudotime can be highlighted in the low dimensional projection below.

cowplot::plot_grid(
  VizSmoothedTraj(object = sce,
                  traj.names = select.traj,
                  viz.dim.red = dim_red, plot.pseudotime = FALSE),
  VizSmoothedTraj(object = sce,
                  traj.names = select.traj,
                  viz.dim.red = dim_red, plot.pseudotime = TRUE), 
  ncol=2
)

## Coloring by milestone

## Using milestone_percentages from trajectory

## Pseudotime not provided, will calculate pseudotime from root milestone

## Warning in dynwrap::calculate_pseudotime(trajectory): Trajectory is not rooted.
## Add a root to the trajectory using dynwrap::add_root(). This will result in an
## error in future releases.

## root cell or milestone not provided, trying first outgoing milestone_id

## Using '2' as root

Pseudotime comparison

Totem’s pseudotime (on the left side) can be compared with Palantir’s pseudotime (on the right side). The Pearson’s correlation between pseudotimes was 0.88.

## Compare Totem pseudotime against ground-truth
reducedDim(sce, "umap") <- dim_red
cowplot::plot_grid(
  (VizSmoothedTraj(object = sce,
                traj.names = select.traj,
                viz.dim.red = dim_red, plot.pseudotime = TRUE) + 
     ggtitle("Totem")),
  (plotReducedDim(sce, dimred = "umap", colour_by = "palantir_pseudotime") + 
      ggtitle("Palantir's pseudotime")) + 
    theme_void() + 
    theme(legend.position = "bottom"), 
   ncol=2
   )

## Pseudotime not provided, will calculate pseudotime from root milestone

## Warning in dynwrap::calculate_pseudotime(trajectory): Trajectory is not rooted.
## Add a root to the trajectory using dynwrap::add_root(). This will result in an
## error in future releases.

## root cell or milestone not provided, trying first outgoing milestone_id

## Using '2' as root

The pseudotime and trajectory obtained with Totem agrees well with Palantir. Contrary to the inferred trajectory with the tSNE projection in quick start guide, by using a more adequate dimensional reduction projection such as PCA, Totem was able to predict correctly the trajectory related with the CLP.

Finally, the sce object with all the results stored on it is exported.

## Export sce object
saveRDS(object = sce, file = "../results/sce_guidedstart.rds")

R packages used and respective versions

## R packages and versions used in these analyses
sessionInfo()

## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] scater_1.26.1               ggplot2_3.3.6              
##  [3] scuttle_1.8.4               SingleCellExperiment_1.20.0
##  [5] SummarizedExperiment_1.28.0 Biobase_2.58.0             
##  [7] GenomicRanges_1.50.2        GenomeInfoDb_1.34.9        
##  [9] IRanges_2.32.0              S4Vectors_0.36.2           
## [11] BiocGenerics_0.44.0         MatrixGenerics_1.10.0      
## [13] matrixStats_0.62.0          Totem_0.99.2               
## [15] dplyr_1.0.10               
## 
## loaded via a namespace (and not attached):
##   [1] dyndimred_1.0.4           babelwhale_1.1.0         
##   [3] plyr_1.8.7                igraph_1.3.5             
##   [5] proxyC_0.3.3              dynwrap_1.2.2            
##   [7] BiocParallel_1.32.5       digest_0.6.30            
##   [9] foreach_1.5.2             htmltools_0.5.3          
##  [11] viridis_0.6.2             fansi_1.0.3              
##  [13] magrittr_2.0.3            ScaledMatrix_1.6.0       
##  [15] carrier_0.1.0             cluster_2.1.3            
##  [17] limma_3.54.2              tzdb_0.3.0               
##  [19] remotes_2.4.2             readr_2.1.3              
##  [21] graphlayouts_0.8.3        RcppParallel_5.1.5       
##  [23] dynutils_1.0.11           princurve_2.1.6          
##  [25] colorspace_2.0-3          ggrepel_0.9.1            
##  [27] xfun_0.34                 crayon_1.5.2             
##  [29] RCurl_1.98-1.9            jsonlite_1.8.3           
##  [31] iterators_1.0.14          ape_5.6-2                
##  [33] glue_1.6.2                polyclip_1.10-4          
##  [35] gtable_0.3.1              zlibbioc_1.44.0          
##  [37] XVector_0.38.0            DelayedArray_0.24.0      
##  [39] BiocSingular_1.14.0       kernlab_0.9-31           
##  [41] prabclus_2.3-2            DEoptimR_1.0-11          
##  [43] dynparam_1.0.2            scales_1.2.1             
##  [45] edgeR_3.40.2              DBI_1.1.3                
##  [47] Rcpp_1.0.9                TrajectoryUtils_1.6.0    
##  [49] viridisLite_0.4.1         aricode_1.0.1            
##  [51] dqrng_0.3.0               rsvd_1.0.5               
##  [53] mclust_5.4.10             metapod_1.6.0            
##  [55] RColorBrewer_1.1-3        fpc_2.2-9                
##  [57] ellipsis_0.3.2            modeltools_0.2-23        
##  [59] pkgconfig_2.0.3           flexmix_2.3-18           
##  [61] farver_2.1.1              uwot_0.1.14              
##  [63] nnet_7.3-17               sass_0.4.2               
##  [65] locfit_1.5-9.6            utf8_1.2.2               
##  [67] labeling_0.4.2            tidyselect_1.2.0         
##  [69] rlang_1.0.6               reshape2_1.4.4           
##  [71] munsell_0.5.0             tools_4.2.1              
##  [73] cachem_1.0.6              cli_3.4.1                
##  [75] generics_0.1.3            evaluate_0.17            
##  [77] stringr_1.4.1             fastmap_1.1.0            
##  [79] yaml_2.3.6                processx_3.8.0           
##  [81] knitr_1.40                tidygraph_1.2.2          
##  [83] robustbase_0.95-0         lmds_0.1.0               
##  [85] purrr_0.3.5               ggraph_2.1.0             
##  [87] nlme_3.1-157              sparseMatrixStats_1.10.0 
##  [89] scran_1.26.2              GA_3.2.3                 
##  [91] compiler_4.2.1            rstudioapi_0.14          
##  [93] beeswarm_0.4.0            statmod_1.4.37           
##  [95] tibble_3.1.8              tweenr_2.0.2             
##  [97] bslib_0.4.0               stringi_1.7.8            
##  [99] highr_0.9                 ps_1.7.2                 
## [101] desc_1.4.2                bluster_1.8.0            
## [103] lattice_0.20-45           Matrix_1.5-1             
## [105] vctrs_0.5.0               pillar_1.8.1             
## [107] lifecycle_1.0.3           jquerylib_0.1.4          
## [109] RcppAnnoy_0.0.20          BiocNeighbors_1.16.0     
## [111] cowplot_1.1.1             bitops_1.0-7             
## [113] irlba_2.3.5.1             patchwork_1.1.2          
## [115] R6_2.5.1                  gridExtra_2.3            
## [117] vipor_0.4.5               codetools_0.2-18         
## [119] MASS_7.3-57               assertthat_0.2.1         
## [121] rprojroot_2.0.3           withr_2.5.0              
## [123] GenomeInfoDbData_1.2.9    diptest_0.76-0           
## [125] parallel_4.2.1            hms_1.1.2                
## [127] grid_4.2.1                beachmat_2.14.0          
## [129] tidyr_1.2.1               class_7.3-20             
## [131] rmarkdown_2.17            DelayedMatrixStats_1.20.0
## [133] slingshot_2.6.0           ggforce_0.4.1            
## [135] ggbeeswarm_0.6.0          dynplot_1.1.2

References

Amezquita, Robert, Aaron Lun, Etienne Becht, Vince Carey, Lindsay Carpp, Ludwig Geistlinger, Federico Marini, et al. 2020. “Orchestrating Single-Cell Analysis with Bioconductor.” Nature Methods 17: 137–45. https://www.nature.com/articles/s41592-019-0654-x.

Hao, Yuhan, Stephanie Hao, Erica Andersen-Nissen, William M Mauck, Shiwei Zheng, Andrew Butler, Maddie J Lee, et al. 2021. “Integrated Analysis of Multimodal Single-Cell Data.” Cell 184 (13): 3573–87.

McCarthy, Davis J., Kieran R. Campbell, Aaron T. L. Lun, and Quin F. Willis. 2017. “Scater: Pre-Processing, Quality Control, Normalisation and Visualisation of Single-Cell RNA-Seq Data in R.” Bioinformatics 33: 1179–86. https://doi.org/10.1093/bioinformatics/btw777.

Setty, Manu, Vaidotas Kiseliovas, Jacob Levine, Adam Gayoso, Linas Mazutis, and Dana Pe’Er. 2019. “Characterization of Cell Fate Probabilities in Single-Cell Data with Palantir.” Nature Biotechnology 37 (4): 451–60.

Smolander, Johannes, Sini Junttila, and Laura L Elo. 2022. “Totem: A User-Friendly Tool for Clustering-Based Inference of Tree-Shaped Trajectories from Single-Cell Data.” bioRxiv, 2022–09.

Street, Kelly, Davide Risso, Russell B Fletcher, Diya Das, John Ngai, Nir Yosef, Elizabeth Purdom, and Sandrine Dudoit. 2018. “Slingshot: Cell Lineage and Pseudotime Inference for Single-Cell Transcriptomics.” BMC Genomics 19: 1–16.

Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.

Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022. Dplyr: A Grammar of Data Manipulation.

Wolf, F Alexander, Philipp Angerer, and Fabian J Theis. 2018. “SCANPY: Large-Scale Single-Cell Gene Expression Data Analysis.” Genome Biology 19: 1–5.