Quantitative Biology

ISSN 2095-4689

ISSN 2095-4697(Online)

CN 10-1028/TM

Postal distribution code 80-971

Quantitative Biology  2019, Vol. 7 Issue (4): 327-334   https://doi.org/10.1007/s40484-019-0183-8
Differential methylation analysis for bisulfite sequencing using DSS
Hao Feng1, Hao Wu2
1. Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
2. Department of Biostatistics and Bioinformatics, Emory University Rollins School of Public Health, Atlanta, GA 30322, USA
Abstract

Bisulfite sequencing (BS-seq) technology measures DNA methylation at single-nucleotide resolution. A key task in BS-seq data analysis is to identify differential methylation (DM) between conditions. Here we provide a tutorial for BS-seq DM analysis using the Bioconductor package DSS. DSS uses a beta-binomial model to characterize the sequence counts from BS-seq and implements rigorous statistical methods for hypothesis testing. It provides flexible functionality for a variety of DM analyses.
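
As a rough illustration of the beta-binomial model mentioned above, the following base R sketch simulates read counts for a single CpG site across replicates. The coverage values, the mean methylation `mu`, the dispersion `phi`, and the mean/dispersion parameterization of the beta distribution are all assumptions chosen for illustration; they are not taken from the DSS source code.

```r
# Minimal sketch: simulate BS-seq counts at one CpG site under a
# beta-binomial model. All parameter values are hypothetical.
set.seed(1)
n.rep <- 4                     # number of replicates
N     <- c(48, 52, 40, 45)     # total read coverage per replicate
mu    <- 0.8                   # assumed mean methylation level
phi   <- 0.05                  # assumed dispersion, 0 < phi < 1

# Convert (mu, phi) to the shape parameters of a beta distribution
# with mean mu and variance mu * (1 - mu) * phi.
shape1 <- mu * (1 - phi) / phi
shape2 <- (1 - mu) * (1 - phi) / phi

# Each replicate gets its own true methylation proportion from the beta
# distribution; the methylated count is binomial given that proportion.
p <- rbeta(n.rep, shape1, shape2)
X <- rbinom(n.rep, size = N, prob = p)

sim <- data.frame(N = N, X = X, prop = X / N)
print(sim)
```

Larger values of `phi` spread the replicate-level proportions further from `mu`, which is the biological variation that a plain binomial model would ignore.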

Key words: epigenetics; DNA methylation; bisulfite sequencing; differential methylation
Received: 2019-05-21      Published: 2019-12-31
Corresponding Author(s): Hao Wu   
Cite this article:
Hao Feng, Hao Wu. Differential methylation analysis for bisulfite sequencing using DSS. Quant. Biol., 2019, 7(4): 327-334.
Link to this article:
https://academic.hep.com.cn/qb/CN/10.1007/s40484-019-0183-8
https://academic.hep.com.cn/qb/CN/Y2019/V7/I4/327
chr pos N X
chr1 10497 48 45
chr1 10525 48 48
chr1 10542 48 47
chr1 10589 34 1
Tab.1  
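
The table above appears to show the per-sample input layout: chromosome (`chr`), genomic position (`pos`), total read count (`N`), and methylated read count (`X`). That reading of `N` and `X` is inferred from the parsing code in this article, where `N` is the sum of methylated and unmethylated reads and `X` is the methylated count. A small sketch, assuming that interpretation, rebuilds the four rows as a data frame and computes the naive methylation proportion; the object name `dat.example` is hypothetical.

```r
# Rebuild the four example rows of Tab.1 as a data frame.
dat.example <- data.frame(
  chr = rep("chr1", 4),
  pos = c(10497, 10525, 10542, 10589),
  N   = c(48, 48, 48, 34),   # total reads covering the CpG site
  X   = c(45, 48, 47, 1),    # reads showing methylation
  stringsAsFactors = FALSE
)

# Sanity check: methylated reads can never exceed total reads.
stopifnot(all(dat.example$X <= dat.example$N))

# Naive per-site methylation proportion.
dat.example$prop <- dat.example$X / dat.example$N
print(dat.example)
```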
Index  Accession number  Cell line  Condition   Sample title  Download file name               R object
1      GSM1084238        A549       normal      A0R_d0_rep1   GSM1084238_A0R_d0_rep1.cpgs.txt  dat1
2      GSM1084239        A549       metastatic  A3R_d0_rep1   GSM1084239_A3R_d0_rep1.cpgs.txt  dat2
3      GSM1084244        HTB56      normal      H0R_d0_rep1   GSM1084244_H0R_d0_rep1.cpgs.txt  dat3
4      GSM1084245        HTB56      metastatic  H3R_d0_rep1   GSM1084245_H3R_d0_rep1.cpgs.txt  dat4
5      GSM1251236        A549       normal      A0R_d0_rep2   GSM1251236_A0R_d0_rep2.cpgs.txt  dat5
6      GSM1251237        A549       metastatic  A3R_d0_rep2   GSM1251237_A3R_d0_rep2.cpgs.txt  dat6
7      GSM1251238        HTB56      normal      H0R_d0_rep2   GSM1251238_H0R_d0_rep2.cpgs.txt  dat7
8      GSM1251239        HTB56      metastatic  H3R_d0_rep2   GSM1251239_H3R_d0_rep2.cpgs.txt  dat8
Tab.2  
# list the downloaded files whose names end in ".cpgs.txt"
fn = dir(pattern = "\\.cpgs\\.txt$")
for (i in seq_along(fn)) {
  cat("working on sample ", i, "\n")
  # read in file
  dat.tmp = read.table(fn[i], header = TRUE)
  # split the third column by the '/' sign
  m.tmp = as.numeric(unlist(strsplit(as.character(dat.tmp[, 3]),
                                     split = "/")))
  # odd indexes hold the methylated read counts,
  # even indexes hold the unmethylated read counts
  idx.even = seq_len(nrow(dat.tmp)) * 2
  idx.odd = idx.even - 1
  chr = dat.tmp$CHR
  pos = dat.tmp$POS
  N = m.tmp[idx.even] + m.tmp[idx.odd]   # total read counts
  X = m.tmp[idx.odd]                     # methylated read counts
  # dat.s is a temporary object for one sample
  dat.s = data.frame(chr = chr, pos = pos, N = N, X = X)
  # save and name the object from one sample (dat1, dat2, ...)
  nam = paste("dat", i, sep = "")
  assign(nam, dat.s)
}
Tab.3  
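
The odd/even indexing in the loop above can be checked on a toy input. The sketch below assumes, as the comments in the loop indicate, that each entry of the third column has the form "methylated/unmethylated"; the count strings themselves are made up for illustration.

```r
# Hypothetical count strings in "methylated/unmethylated" form.
counts <- c("45/3", "48/0", "47/1", "1/33")

# Split every string on "/" and arrange the pieces in a two-column matrix:
# column 1 = methylated reads, column 2 = unmethylated reads.
m <- matrix(as.numeric(unlist(strsplit(counts, split = "/", fixed = TRUE))),
            ncol = 2, byrow = TRUE)

X <- m[, 1]            # methylated read counts
N <- m[, 1] + m[, 2]   # total read counts

print(data.frame(N = N, X = X))
```

Filling the matrix by row is equivalent to picking the odd and even positions of the unlisted vector, and it avoids building the index vectors by hand.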
library(DSS)
# build a BSseq object from four samples and keep the first 5000 CpG sites
BSobj = makeBSseqData(list(dat1, dat5, dat3, dat7),
                      c("A0R1", "A0R2", "H0R1", "H0R2"))[1:5000]
Tab.4  
dmlTest = DMLtest(BSobj, group1 = c("A0R1", "A0R2"),
                  group2 = c("H0R1", "H0R2"), smoothing = FALSE)
Tab.5  
Fig.1  
Fig.2  
Fig.3  