Quantitative Biology

ISSN 2095-4689

ISSN 2095-4697(Online)

CN 10-1028/TM

Postal Subscription Code 80-971

Quant. Biol.    2019, Vol. 7 Issue (4) : 247-254    https://doi.org/10.1007/s40484-019-0189-2
MINI REVIEW
Emerging deep learning methods for single-cell RNA-seq data analysis
Jie Zheng, Ke Wang
School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
Abstract

Deep learning is making major breakthroughs in several areas of bioinformatics. Anticipating that this will soon occur in single-cell RNA-seq data analysis, we review newly published deep learning methods that help tackle its computational challenges. Autoencoders are found to be the dominant approach. However, methods based on deep generative models such as generative adversarial networks (GANs) are also emerging in this area.

Keywords: single-cell, RNA-seq, deep learning, autoencoder
Corresponding Author(s): Jie Zheng   
Just Accepted Date: 15 November 2019   Online First Date: 17 December 2019    Issue Date: 31 December 2019
 Cite this article:   
Jie Zheng,Ke Wang. Emerging deep learning methods for single-cell RNA-seq data analysis[J]. Quant. Biol., 2019, 7(4): 247-254.
 URL:  
https://academic.hep.com.cn/qb/EN/10.1007/s40484-019-0189-2
https://academic.hep.com.cn/qb/EN/Y2019/V7/I4/247
| Name | Reference | Year | Method | Goal |
|---|---|---|---|---|
| Lin’s method | Lin et al. [48] | 2017 | PCA-based dimensionality reduction with denoising autoencoders | Dimensionality reduction, cell grouping, inference of cell type or state |
| AutoImpute | Talwar et al. [46] | 2018 | Autoencoder-based sparse gene expression matrix imputation | Deal with dropout events |
| scVI | Lopez et al. [52] | 2018 | Hierarchical Bayesian model and variational autoencoder | Tackle batch correction, library-size bias, dropout, imputation, and visualization, etc. |
| VASC | Wang & Gu [50] | 2018 | Variational autoencoder | Model the dropout events and find the nonlinear hierarchical feature representations of the original data |
| scvis | Ding et al. [49] | 2018 | Variational autoencoder | Model and visualize structures in scRNA-seq data |
| scScope | Deng et al. [54] | 2019 | Autoencoder with recurrent structure | Conduct batch effect removal, dropout imputation, cell subpopulation identification |
| DCA | Eraslan et al. [47] | 2019 | Deep count autoencoder with a ZINB loss function | Remove technical variation to improve downstream analyses |
| SAVER-X | Wang et al. [53] | 2019 | Bayesian hierarchical model and deep autoencoder with transfer learning | Leverage existing data to improve the quality of new scRNA-seq datasets |
Tab.1  Different deep learning methods for scRNA-seq data analysis
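The DCA entry in Tab. 1 refers to a ZINB (zero-inflated negative binomial) loss. As a rough illustration of what such a loss evaluates for a single count, the following minimal sketch computes the ZINB negative log-likelihood using only the Python standard library. It is not the DCA implementation: the function names, the mean/dispersion parameterization (mu, theta) and the dropout probability pi are assumptions chosen for this example.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under a zero-inflated NB.

    pi is the probability that an observation is an extra (dropout) zero;
    with probability 1 - pi the count comes from the negative binomial.
    """
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero can come from dropout or from the NB itself.
        return -math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # A non-zero count can only come from the NB component.
    return -(math.log(1.0 - pi) + log_nb)
```

In an autoencoder setting, mu, theta and pi would typically be produced per gene and per cell by the decoder, and the loss would be the sum of these terms over the count matrix; that wiring is omitted here.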
1 T. Ching, D. S. Himmelstein, B. K. Beaulieu-Jones, A. A. Kalinin, B. T. Do, G. P. Way, E. Ferrero, P. M. Agapow, M. Zietz, M. M. Hoffman, et al. (2018) Opportunities and obstacles for deep learning in biology and medicine. J. R. Soc. Interface, 15, 20170387
https://doi.org/10.1098/rsif.2017.0387. pmid: 29618526
2 F. Tang, K. Lao, and M. A. Surani (2011) Development and applications of single-cell transcriptome analysis. Nat. Methods, 8, S6–S11
https://doi.org/10.1038/nmeth.1557. pmid: 21451510
3 J. Berg (2018) Exploring organisms cell by cell. Science, 362, 1333
https://doi.org/10.1126/science.aaw3633. pmid: 30573601
4 A. Regev, S. A. Teichmann, E. S. Lander, I. Amit, C. Benoist, E. Birney, B. Bodenmiller, P. Campbell, P. Carninci, M. Clatworthy, et al. (2017) The Human Cell Atlas. eLife, 6, e27041
https://doi.org/10.7554/eLife.27041. pmid: 29206104
5 O. Franzén, L.-M. Gan, and J. L. Björkegren (2019) PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database, 2019, baz046
6 L. Yan, M. Yang, H. Guo, L. Yang, J. Wu, R. Li, P. Liu, Y. Lian, X. Zheng, J. Yan, et al. (2013) Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells. Nat. Struct. Mol. Biol., 20, 1131–1139
https://doi.org/10.1038/nsmb.2660. pmid: 23934149
7 Q. Deng, D. Ramsköld, B. Reinius, and R. Sandberg (2014) Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science, 343, 193–196
https://doi.org/10.1126/science.1245316. pmid: 24408435
8 A. A. Kolodziejczyk, J. K. Kim, J. C. Tsang, T. Ilicic, J. Henriksson, K. N. Natarajan, A. C. Tuck, X. Gao, M. Bühler, P. Liu, et al. (2015) Single cell RNA-sequencing of pluripotent states unlocks modular transcriptional variation. Cell Stem Cell, 17, 471–485
https://doi.org/10.1016/j.stem.2015.09.011. pmid: 26431182
9 V. Y. Kiselev, T. S. Andrews, and M. Hemberg (2019) Challenges in unsupervised clustering of single-cell RNA-seq data. Nat. Rev. Genet., 20, 273–282
https://doi.org/10.1038/s41576-018-0088-9. pmid: 30617341
10 O. Stegle, S. A. Teichmann, and J. C. Marioni (2015) Computational and analytical challenges in single-cell transcriptomics. Nat. Rev. Genet., 16, 133–145
https://doi.org/10.1038/nrg3833. pmid: 25628217
11 O. B. Poirion, X. Zhu, T. Ching, and L. Garmire (2016) Single-cell transcriptomics bioinformatics and computational challenges. Front. Genet., 7, 163
https://doi.org/10.3389/fgene.2016.00163. pmid: 27708664
12 A. Butler, P. Hoffman, P. Smibert, E. Papalexi, and R. Satija (2018) Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol., 36, 411–420
https://doi.org/10.1038/nbt.4096. pmid: 29608179
13 L. Haghverdi, F. Buettner, and F. J. Theis (2015) Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics, 31, 2989–2998
https://doi.org/10.1093/bioinformatics/btv325. pmid: 26002886
14 P. V. Kharchenko, L. Silberstein, and D. T. Scadden (2014) Bayesian approach to single-cell differential expression analysis. Nat. Methods, 11, 740–742
https://doi.org/10.1038/nmeth.2967. pmid: 24836921
15 L. Zhang and S. Zhang (2018) Comparison of computational methods for imputing single-cell RNA-sequencing data. IEEE/ACM Trans. Comput. Biol. Bioinform.
https://doi.org/10.1109/TCBB.2018.2848633
16 Z. Miao, K. Deng, X. Wang, and X. Zhang (2018) DEsingle for detecting three types of differential expression in single-cell RNA-seq data. Bioinformatics, 34, 3223–3224
https://doi.org/10.1093/bioinformatics/bty332. pmid: 29688277
17 G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov (2012) Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580v1
18 N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov (2014) Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15, 1929–1958
19 P. Brennecke, S. Anders, J. K. Kim, A. A. Kołodziejczyk, X. Zhang, V. Proserpio, B. Baying, V. Benes, S. A. Teichmann, J. C. Marioni, et al. (2013) Accounting for technical noise in single-cell RNA-seq experiments. Nat. Methods, 10, 1093–1095
https://doi.org/10.1038/nmeth.2645. pmid: 24056876
20 L. Jiang, F. Schlesinger, C. A. Davis, Y. Zhang, R. Li, M. Salit, T. R. Gingeras, and B. Oliver (2011) Synthetic spike-in standards for RNA-seq experiments. Genome Res., 21, 1543–1551
https://doi.org/10.1101/gr.121095.111. pmid: 21816910
21 S. Islam, A. Zeisel, S. Joost, G. La Manno, P. Zajac, M. Kasper, P. Lönnerberg, and S. Linnarsson (2014) Quantitative single-cell RNA-seq with unique molecular identifiers. Nat. Methods, 11, 163–166
https://doi.org/10.1038/nmeth.2772. pmid: 24363023
22 B. Ding, L. Zheng, Y. Zhu, N. Li, H. Jia, R. Ai, A. Wildberg, and W. Wang (2015) Normalization and noise reduction for single cell RNA-seq experiments. Bioinformatics, 31, 2225–2227
https://doi.org/10.1093/bioinformatics/btv122. pmid: 25717193
23 B. Ding, L. Zheng, and W. Wang (2017) Assessment of single cell RNA-Seq normalization methods. G3 (Bethesda), 7, 2039–2045
https://doi.org/10.1534/g3.117.040683. pmid: 28468817
24 J. K. Kim, A. A. Kolodziejczyk, T. Ilicic, S. A. Teichmann, and J. C. Marioni (2015) Characterizing noise structure in single-cell RNA-seq distinguishes genuine from technical stochastic allelic expression. Nat. Commun., 6, 8687
https://doi.org/10.1038/ncomms9687. pmid: 26489834
25 R. Bellman and Rand Corporation (1957) Dynamic programming. Princeton: Princeton University Press
26 L. Van Der Maaten, E. Postma, and J. Van den Herik (2009) Dimensionality reduction: a comparative review. J. Mach. Learn. Res., 10, 13
27 A. K. Shalek, R. Satija, J. Shuga, J. J. Trombetta, D. Gennert, D. Lu, P. Chen, R. S. Gertner, J. T. Gaublomme, N. Yosef, et al. (2014) Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature, 510, 363–369
https://doi.org/10.1038/nature13437. pmid: 24919153
28 L. van der Maaten and G. Hinton (2008) Visualizing data using t-SNE. J. Mach. Learn. Res., 9, 2579–2605
29 A. D. Amir, K. L. Davis, M. D. Tadmor, E. F. Simonds, J. H. Levine, S. C. Bendall, D. K. Shenfeld, S. Krishnaswamy, G. P. Nolan, and D. Pe’er (2013) viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia. Nat. Biotechnol., 31, 545–552
https://doi.org/10.1038/nbt.2594. pmid: 23685480
30 N. D. Lawrence (2004) Gaussian process latent variable models for visualisation of high dimensional data. Adv. in Neural Inf. Proc. Sys., 16, 329–336
31 F. Buettner and F. J. Theis (2012) A novel approach for resolving differences in single-cell gene expression patterns from zygote to blastocyst. Bioinformatics, 28, i626–i632
https://doi.org/10.1093/bioinformatics/bts385. pmid: 22962491
32 B. Wang, J. Zhu, E. Pierson, D. Ramazzotti, and S. Batzoglou (2017) Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat. Methods, 14, 414–416
https://doi.org/10.1038/nmeth.4207. pmid: 28263960
33 E. Becht, L. McInnes, J. Healy, C. A. Dutertre, I. W. H. Kwok, L. G. Ng, F. Ginhoux, and E. W. Newell (2018) Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol., 37, 38–44
https://doi.org/10.1038/nbt.4314. pmid: 30531897
34 E. Pierson and C. Yau (2015) ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol., 16, 241
https://doi.org/10.1186/s13059-015-0805-z. pmid: 26527291
35 E. Z. Macosko, A. Basu, R. Satija, J. Nemesh, K. Shekhar, M. Goldman, I. Tirosh, A. R. Bialas, N. Kamitaki, E. M. Martersteck, et al. (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell, 161, 1202–1214
https://doi.org/10.1016/j.cell.2015.05.002. pmid: 26000488
36 M. S. Nobile, P. Cazzaniga, A. Tangherloni, and D. Besozzi (2017) Graphics processing units in bioinformatics, computational biology and systems biology. Brief. Bioinformatics, 18, 870–885
pmid: 27402792.
37 H. Bourlard and Y. Kamp (1988) Auto-association by multilayer perceptrons and singular value decomposition. Biol. Cybern., 59, 291–294
https://doi.org/10.1007/BF00332918. pmid: 3196773
38 G. E. Hinton and R. R. Salakhutdinov (2006) Reducing the dimensionality of data with neural networks. Science, 313, 504–507
https://doi.org/10.1126/science.1127647. pmid: 16873662
39 P. Vincent, H. Larochelle, Y. Bengio, and P. A. Manzagol (2008) Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine learning, pp. 1096–1103. ACM: Helsinki, Finland
40 D. P. Kingma and M. Welling (2013) Auto-encoding variational bayes. arXiv:1312.6114v10
41 U. Shaham, K. P. Stanton, J. Zhao, H. Li, K. Raddassi, R. Montgomery, and Y. Kluger (2017) Removal of batch effects using distribution-matching residual networks. Bioinformatics, 33, 2539–2546
https://doi.org/10.1093/bioinformatics/btx196. pmid: 28419223
42 X. Li, Y. Lyu, J. Park, J. Zhang, D. Stambolian, K. Susztak, G. Hu, and M. Li (2019) Deep learning enables accurate clustering and batch effect removal in single-cell RNA-seq analysis. bioRxiv, 530378
43 D. van Dijk, R. Sharma, J. Nainys, K. Yim, P. Kathail, A. J. Carr, C. Burdziak, K. R. Moon, C. L. Chaffer, et al. (2018) Recovering gene interactions from single-cell data using data diffusion. Cell, 174, 716–729.e27
44 W. V. Li and J. J. Li (2018) An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nat. Commun., 9, 997
https://doi.org/10.1038/s41467-018-03405-7. pmid: 29520097
45 W. Gong, I. Y. Kwak, P. Pota, N. Koyano-Nakagawa, and D. J. Garry (2018) DrImpute: imputing dropout events in single cell RNA sequencing data. BMC Bioinformatics, 19, 220
https://doi.org/10.1186/s12859-018-2226-y. pmid: 29884114
46 D. Talwar, A. Mongia, D. Sengupta, and A. Majumdar (2018) AutoImpute: Autoencoder based imputation of single-cell RNA-seq data. Sci. Rep., 8, 16329
https://doi.org/10.1038/s41598-018-34688-x. pmid: 30397240
47 G. Eraslan, L. M. Simon, M. Mircea, N. S. Mueller, and F. J. Theis (2019) Single-cell RNA-seq denoising using a deep count autoencoder. Nat. Commun., 10, 390
https://doi.org/10.1038/s41467-018-07931-2. pmid: 30674886
48 C. Lin, S. Jain, H. Kim, and Z. Bar-Joseph (2017) Using neural networks for reducing the dimensions of single-cell RNA-Seq data. Nucleic Acids Res., 45, e156
https://doi.org/10.1093/nar/gkx681. pmid: 28973464
49 J. Ding, A. Condon, and S. P. Shah (2018) Interpretable dimensionality reduction of single cell transcriptome data with deep generative models. Nat. Commun., 9, 2002
https://doi.org/10.1038/s41467-018-04368-5. pmid: 29784946
50 D. Wang and J. Gu (2018) VASC: Dimension reduction and visualization of single-cell RNA-seq data by deep variational autoencoder. Genom. Proteom. Bioinf., 16, 320–331
https://doi.org/10.1016/j.gpb.2018.08.003. pmid: 30576740
51 J. Peng, X. Wang, and X. Shang (2019) Combining gene ontology with deep neural networks to enhance the clustering of single cell RNA-Seq data. BMC Bioinformatics, 20, 284
https://doi.org/10.1186/s12859-019-2769-6. pmid: 31182005
52 R. Lopez, J. Regier, M. B. Cole, M. I. Jordan, and N. Yosef (2018) Deep generative modeling for single-cell transcriptomics. Nat. Methods, 15, 1053–1058
https://doi.org/10.1038/s41592-018-0229-2. pmid: 30504886
53 J. Wang, D. Agarwal, M. Huang, G. Hu, Z. Zhou, C. Ye, and N. R. Zhang (2019) Data denoising with transfer learning in single-cell transcriptomics. Nat. Methods, 16, 875–878
https://doi.org/10.1038/s41592-019-0537-1. pmid: 31471617
54 Y. Deng, F. Bao, Q. Dai, L. F. Wu, and S. J. Altschuler (2019) Scalable analysis of cell-type composition from single-cell transcriptomics using deep recurrent learning. Nat. Methods, 16, 311–314
https://doi.org/10.1038/s41592-019-0353-7. pmid: 30886411
55 Q. Hu and C. S. Greene (2019) Parameter tuning is a key part of dimensionality reduction via deep variational autoencoders for single cell RNA transcriptomics. Pac. Symp. Biocomput., 24, 362–373
pmid: 30963075.
56 T. Stuart, A. Butler, P. Hoffman, C. Hafemeister, E. Papalexi, W. M. Mauck III, Y. Hao, M. Stoeckius, P. Smibert, and R. Satija (2019) Comprehensive integration of single-cell data. Cell, 177, 1888–1902.e21
57 V. Bhardwaj, S. Heyne, K. Sikora, L. Rabbani, M. Rauer, F. Kilpert, A. S. Richter, D. P. Ryan, and T. Manke (2019) snakePipes: facilitating flexible, scalable and integrative epigenomic analysis. Bioinformatics, btz436
https://doi.org/10.1093/bioinformatics/btz436. pmid: 31134269
58 M. Stoeckius, C. Hafemeister, W. Stephenson, B. Houck-Loomis, P. K. Chattopadhyay, H. Swerdlow, R. Satija, and P. Smibert (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods, 14, 865–868
https://doi.org/10.1038/nmeth.4380. pmid: 28759029
59 M. Marouf, P. Machart, V. Bansal, C. Kilian, D. S. Magruder, C. F. Krebs, and S. Bonn (2018) Realistic in silico generation and augmentation of single cell RNA-seq data using Generative Adversarial Neural Networks. bioRxiv, 390153
60 G. Eraslan, Ž. Avsec, J. Gagneur, and F. J. Theis (2019) Deep learning: new computational modelling techniques for genomics. Nat. Rev. Genet., 20, 389–403
https://doi.org/10.1038/s41576-019-0122-6. pmid: 30971806
61 A. Ghahramani, F. M. Watt, and N. M. Luscombe (2018) Generative adversarial networks uncover epidermal regulators and predict single cell perturbations. bioRxiv, 262501
62 M. Amodio and S. Krishnaswamy (2018) MAGAN: Aligning biological manifolds. arXiv:1803.00385