Focusing on 3 key methods (UMAP, t-SNE, scVI) by using scRNA-seq data
Clustering-algorithms
Introduction
Single-cell RNA sequencing (scRNA-seq) emerged as advances in sequencing technology and microfluidics made it possible to measure gene expression in individual cells (Eberwine et al., 2014). Previously, researchers could only collect population-level data, but current techniques can dissociate heterogeneous tissues into single cells. Each cell is sequenced individually and its reads are aligned, producing a count matrix ($x_{ng}$) whose entries record the expression count of an individual gene ($g$) in each cell ($n$).
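To make the shape of this matrix concrete, the sketch below tallies a handful of aligned reads into a cell-by-gene count matrix using only the Python standard library. The read list, the cell barcodes, the choice of rows for cells and columns for genes, and the helper name `build_count_matrix` are all illustrative assumptions, not part of the tutorial's datasets or code.

```python
from collections import Counter

# Hypothetical aligned reads: one (cell barcode, gene) pair per read.
# These values are made up purely for illustration.
aligned_reads = [
    ("cell_A", "GAPDH"), ("cell_A", "GAPDH"), ("cell_A", "CD3E"),
    ("cell_B", "GAPDH"), ("cell_B", "MS4A1"), ("cell_B", "MS4A1"),
    ("cell_B", "MS4A1"),
]


def build_count_matrix(reads):
    """Tally reads into a matrix with one row per cell and one column per gene."""
    cells = sorted({cell for cell, _ in reads})
    genes = sorted({gene for _, gene in reads})
    tally = Counter(reads)  # number of reads seen for each (cell, gene) pair
    # Counter returns 0 for pairs that never occurred, so unobserved genes get a zero.
    matrix = [[tally[(cell, gene)] for gene in genes] for cell in cells]
    return cells, genes, matrix


cells, genes, counts = build_count_matrix(aligned_reads)

print("genes:", genes)
for cell, row in zip(cells, counts):
    print(cell, row)
```

Real scRNA-seq matrices have thousands of cells and genes and are mostly zeros, so in practice they are usually stored in a sparse format rather than as nested lists.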
Many clustering algorithms can be applied to this scRNA-seq data, but here we focus on 3 key methods (UMAP, t-SNE, scVI). t-SNE, first published in 2008, is a popular method that is widely treated as a field standard. UMAP was released in early 2018 and, like t-SNE, produces high-quality visualisations. However, UMAP is argued to improve on t-SNE through its speed and its ability to preserve more of the global structure of the data. scVI, a comparatively recent method released in late 2018, differs significantly from UMAP and t-SNE: it takes a probabilistic approach based on a hierarchical Bayesian model whose conditional distributions are specified by deep neural networks.
This tutorial assumes you have followed the installation procedure for each of the 3 methods and downloaded all datasets directly; to run the methods, you will need to adjust the file references to match where the datasets are stored on your machine. After describing how each method works individually and comparing its results against a 'gold standard' set of labels provided in the original paper, we compare the techniques in terms of sensitivity to parameter choice, robustness, speed of execution and scalability.