MicroRNA target prediction based on second-order Hidden Markov Model

Song GAO¹, Diangang QIN¹, Tienan FENG¹, Yifei WANG¹, Liangsheng ZHANG²

1. Department of Mathematics, School of Sciences, Shanghai University, Shanghai 200444, China
2. Department of Mathematics, School of Sciences, Shanghai University, Shanghai 200444, China; School of Life Sciences, Institute of Plant Biology, Fudan University, Shanghai 200433, China

Abstract MicroRNAs (miRNAs) are a class of small single-stranded RNAs of about 22 nt that act as important negative gene regulators. In animals, miRNAs mainly repress protein translation by binding to the 3′ UTRs of mRNAs with imperfect complementary pairing. Although bioinformatics investigations have produced a number of target prediction tools, all of them share a common shortcoming: a high false positive rate. It is therefore important to further filter the predicted targets. In this paper, based on the miRNA:target duplex, we construct a second-order Hidden Markov Model, implement the Baum-Welch training algorithm, and apply the model to further process predicted targets. The classifier is trained on 244 positive and 49 negative miRNA:target interaction pairs and achieves a sensitivity of 72.54%, a specificity of 55.10%, and an accuracy of 69.62% in 10-fold cross-validation experiments. To further verify the applicability of the algorithm, we test it on previously collected datasets comprising 195 positive and 38 negative pairs, with consistent results. We believe that our method will provide some guidance for experimental biologists, especially in choosing miRNA targets for validation.
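
As an illustration of the forward algorithm named in the keywords, the following is a minimal sketch of likelihood computation under a second-order Hidden Markov Model. It is not the authors' implementation: the function name `forward_hmm2`, the two-state and two-symbol setup, and all parameter values are hypothetical, and emissions are assumed to depend only on the current state.

```python
from itertools import product


def forward_hmm2(obs, pi, a1, a2, b):
    """Return P(obs) under a second-order HMM (illustrative sketch).

    pi[i]       : P(q_1 = i)
    a1[i][j]    : P(q_2 = j | q_1 = i)
    a2[i][j][k] : P(q_t = k | q_{t-2} = i, q_{t-1} = j)
    b[k][o]     : P(o_t = o | q_t = k)   (assumed first-order emission)
    """
    n = len(pi)
    if not obs:
        return 1.0
    if len(obs) == 1:
        return sum(pi[i] * b[i][obs[0]] for i in range(n))
    # alpha[i][j] = P(o_1..o_t, q_{t-1} = i, q_t = j), initialised at t = 2
    alpha = [[pi[i] * b[i][obs[0]] * a1[i][j] * b[j][obs[1]]
              for j in range(n)] for i in range(n)]
    for o in obs[2:]:
        # Sum out the oldest state i, then extend the pair (j, k) by one step
        alpha = [[sum(alpha[i][j] * a2[i][j][k] for i in range(n)) * b[k][o]
                  for k in range(n)] for j in range(n)]
    return sum(map(sum, alpha))


# Hypothetical toy parameters: 2 hidden states, 2 observation symbols
pi = [0.6, 0.4]
a1 = [[0.7, 0.3], [0.4, 0.6]]
a2 = [[[0.8, 0.2], [0.5, 0.5]],
      [[0.6, 0.4], [0.1, 0.9]]]
b = [[0.9, 0.1], [0.2, 0.8]]

# Sanity check: likelihoods of all length-4 sequences should sum to 1
total = sum(forward_hmm2(list(seq), pi, a1, a2, b)
            for seq in product(range(2), repeat=4))
print(round(total, 9))
```

In a real setting the observation symbols would encode features of the miRNA:target duplex, and the parameters would come from Baum-Welch training rather than being set by hand. For long sequences, the recursion would normally be carried out with scaling or in log space to avoid numerical underflow.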

Keywords: microRNA, target gene, experimentally supported targets, second-order Hidden Markov Model, forward algorithm

Issue Date: 01 April 2010