Emerging deep learning methods for single-cell RNA-seq data analysis
Jie Zheng, Ke Wang
School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China

Abstract Deep learning is making major breakthroughs in several areas of bioinformatics. Anticipating that the same will soon happen in single-cell RNA-seq data analysis, we review newly published deep learning methods that help tackle its computational challenges. Autoencoders are found to be the dominant approach. However, methods based on deep generative models such as generative adversarial networks (GANs) are also emerging in this area.
Keywords: single-cell, RNA-seq, deep learning, autoencoder
Corresponding Author(s): Jie Zheng
Just Accepted Date: 15 November 2019
Online First Date: 17 December 2019
Issue Date: 31 December 2019