|
|
Detecting differential expression from RNA-seq data with expression measurement uncertainty |
Li ZHANG, Songcan CHEN, Xuejun LIU |
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China |
|
|
Abstract High-throughput RNA sequencing (RNA-seq) has emerged as a revolutionary and powerful technology for expression profiling. Most methods proposed for detecting differentially expressed (DE) genes from RNA-seq data are based on statistics that compare normalized read counts between conditions. However, few methods incorporate expression measurement uncertainty into DE detection. Moreover, most methods can detect only DE genes, and few are available for detecting DE isoforms. In this paper, a Bayesian framework (BDSeq) is proposed to detect DE genes and isoforms while accounting for expression measurement uncertainty. This uncertainty provides useful information that can help improve the performance of DE detection. Three real RNA-seq data sets are used to evaluate BDSeq, and the results show that including expression measurement uncertainty improves accuracy in detecting DE genes and isoforms. Finally, we develop a GamSeq-BDSeq RNA-seq analysis pipeline to facilitate use of the method.
|
Keywords
RNA-seq
Bayesian method
differentially expressed genes/isoforms
expression measurement uncertainty
analysis pipeline
|
Corresponding Author(s):
Xuejun LIU
|
Just Accepted Date: 31 December 2014
Issue Date: 07 September 2015
|
|