Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci. 2015, Vol. 9, Issue 4: 652-663. https://doi.org/10.1007/s11704-015-4308-6
RESEARCH ARTICLE
Detecting differential expression from RNA-seq data with expression measurement uncertainty
Li ZHANG, Songcan CHEN, Xuejun LIU
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Abstract

High-throughput RNA sequencing (RNA-seq) has emerged as a revolutionary and powerful technology for expression profiling. Most methods proposed for detecting differentially expressed (DE) genes from RNA-seq data rely on statistics that compare normalized read counts between conditions. However, few methods incorporate expression measurement uncertainty into DE detection. Moreover, most methods can detect only DE genes, and few are available for detecting DE isoforms. In this paper, we propose a Bayesian framework, BDSeq, that detects DE genes and isoforms while accounting for expression measurement uncertainty. This uncertainty carries useful information that can help improve the performance of DE detection. We evaluate BDSeq on three real RNA-seq data sets, and the results show that including expression measurement uncertainty improves accuracy in detecting DE genes and isoforms. Finally, we develop a GamSeq-BDSeq RNA-seq analysis pipeline to make the method easier for users to apply.

Keywords: RNA-seq, Bayesian method, differentially expressed genes/isoforms, expression measurement uncertainty, analysis pipeline
Corresponding Author(s): Xuejun LIU   
Just Accepted Date: 31 December 2014   Issue Date: 07 September 2015
 Cite this article:   
Li ZHANG, Songcan CHEN, Xuejun LIU. Detecting differential expression from RNA-seq data with expression measurement uncertainty[J]. Front. Comput. Sci., 2015, 9(4): 652-663.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-015-4308-6
https://academic.hep.com.cn/fcs/EN/Y2015/V9/I4/652