Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2023, Vol. 17 Issue (5) : 175904    https://doi.org/10.1007/s11704-022-2151-0
RESEARCH ARTICLE
circ2CBA: prediction of circRNA-RBP binding sites combining deep learning and attention mechanism
Yajing GUO1, Xiujuan LEI1, Lian LIU1, Yi PAN2
1. School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
2. Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
Abstract

Circular RNAs (circRNAs) are RNAs with a closed circular structure that take part in many biological processes through key interactions with RNA binding proteins (RBPs). Existing methods for predicting these interactions are limited in feature learning. We therefore propose circ2CBA, a method that uses only the sequence information of circRNAs to predict circRNA-RBP binding sites. We constructed a data set that comprises eight sub-datasets. circ2CBA first encodes circRNA sequences with the one-hot method. A two-layer convolutional neural network (CNN) then extracts initial features. After the CNN, circ2CBA uses one bidirectional long short-term memory (BiLSTM) layer and a self-attention mechanism to learn the features. circ2CBA reaches an average AUC of 0.8987 over the eight sub-datasets. A comparison with three other methods on our data set and an ablation experiment confirm that circ2CBA is effective at predicting the binding sites between circRNAs and RBPs.
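The one-hot encoding step named in the abstract can be illustrated with a minimal sketch. The nucleotide order (A, C, G, U), the mapping of T to U, and the all-zero row for unknown symbols such as N are assumptions made here for illustration; the abstract does not specify them.

```python
# Assumed nucleotide order; the abstract does not state one.
ALPHABET = "ACGU"

def one_hot_encode(sequence):
    """Return one 4-element row per nucleotide of `sequence`."""
    rows = []
    # Treat DNA-style T as U (assumption) and ignore letter case.
    for base in sequence.upper().replace("T", "U"):
        row = [0] * len(ALPHABET)
        index = ALPHABET.find(base)
        if index >= 0:
            row[index] = 1
        # Unknown symbols such as N keep an all-zero row (assumption).
        rows.append(row)
    return rows

example = "AUGCN"
encoded = one_hot_encode(example)
for base, row in zip(example, encoded):
    print(base, row)
```

Each sequence of length n becomes an n x 4 matrix, which is the kind of input a convolutional layer can scan along the sequence axis.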

Keywords circRNAs      RBPs      CNN      BiLSTM      self-attention mechanism     
Corresponding Author(s): Xiujuan LEI   
Just Accepted Date: 12 July 2022   Issue Date: 12 December 2022
 Cite this article:   
Yajing GUO,Xiujuan LEI,Lian LIU, et al. circ2CBA: prediction of circRNA-RBP binding sites combining deep learning and attention mechanism[J]. Front. Comput. Sci., 2023, 17(5): 175904.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-022-2151-0
https://academic.hep.com.cn/fcs/EN/Y2023/V17/I5/175904
Fig.1  Flowchart of circ2CBA. The top part shows the input data of circ2CBA and the encoding process. The middle part extracts latent features from circRNA sequences. The bottom part learns deep features and makes the final prediction
Dataset  #Positive Samples  #Negative Samples  circ2CBA  CRIP    DeepSite  icircRBP-DHN
AGO1     17318              17318              0.9009    0.9052  0.8674    0.8992
AGO2     20000              20000              0.8029    0.8030  0.7513    0.7954
AGO3     3124               3124               0.9105    0.8911  0.8991    0.9135
ALKBH5   770                770                0.7696    0.6815  0.5611    0.9416
AUF1     2896               2896               0.9888    0.9820  0.9781    0.9803
FXR1     927                927                0.9579    0.9466  0.9678    0.9065
HUR      20000              20000              0.8741    0.8705  0.8238    0.8698
TAF15    1467               1467               0.9851    0.9804  0.9729    0.8899
Tab.1  The sample numbers of eight sub-datasets and the AUC values of circ2CBA, CRIP, DeepSite, icircRBP-DHN on each of them
Fig.2  The process of feature extraction. The structure includes a two-layer CNN and a max-pooling layer. Finally a splice is performed
Fig.3  The calculation of attention layer. Attention weight is calculated and then it is multiplied by the original feature
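The reweighting described in the Fig.3 caption can be sketched in a few lines. The scoring function below (a dot product with a hypothetical learned vector, followed by a softmax over positions) is an assumption for illustration; the caption only states that an attention weight is calculated and multiplied by the original feature.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_reweight(features, scoring_vector):
    """Weight each per-position feature vector by its attention weight.

    features: list of feature vectors, one per sequence position.
    scoring_vector: hypothetical learned parameter used to score positions.
    """
    scores = [sum(f * w for f, w in zip(vec, scoring_vector)) for vec in features]
    weights = softmax(scores)
    reweighted = [[weight * f for f in vec] for weight, vec in zip(weights, features)]
    return weights, reweighted

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scoring_vector = [0.5, 0.5]  # illustrative values, not taken from the paper
weights, reweighted = attention_reweight(features, scoring_vector)
print(weights)
print(reweighted)
```

The weights sum to one, so positions with higher scores keep more of their original feature magnitude than positions with lower scores.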
Fig.4  Prediction of binding sites. Two fully connected (FC) layers and the softmax function are used to make the final prediction
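A prediction head of the kind the Fig.4 caption describes can be sketched as two dense layers followed by a softmax over two classes (binding, non-binding). The layer sizes, the weight values, and the ReLU activation between the layers are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(features, layer1, layer2):
    """Two FC layers, then softmax; ReLU in between is an assumption."""
    hidden = relu(dense(features, *layer1))
    return softmax(dense(hidden, *layer2))

# Hypothetical parameters: 3 input features -> 2 hidden units -> 2 classes.
layer1 = ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1])
layer2 = ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])

probabilities = predict([1.0, 2.0, 3.0], layer1, layer2)
print(probabilities)
```

The two outputs are class probabilities that sum to one; the larger one would be taken as the predicted label.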
Fig.5  The ROC curve of circ2CBA
Fig.6  The PR curve of circ2CBA
Dataset  AUC     AP      ACC     F1-score
AGO1     0.9009  0.9024  0.8180  0.8244
AGO2     0.8029  0.8061  0.7231  0.7158
AGO3     0.9105  0.9142  0.8304  0.8357
ALKBH5   0.7696  0.7434  0.6482  0.6824
AUF1     0.9888  0.9876  0.9586  0.9595
FXR1     0.9579  0.9641  0.8949  0.8966
HUR      0.8741  0.8795  0.7828  0.7632
TAF15    0.9851  0.9772  0.9659  0.9628
Average  0.8987  0.8968  0.8277  0.8300
Tab.2  Performance of circ2CBA under four metrics
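As a quick consistency check, the Average row of Tab.2 can be recomputed as the unweighted mean of the eight per-dataset values. The sketch below assumes a plain arithmetic mean, which is not stated on the page, and allows a tolerance of 1e-4 for rounding of the published four-decimal figures.

```python
# Per-dataset values copied from Tab.2, in the order
# AGO1, AGO2, AGO3, ALKBH5, AUF1, FXR1, HUR, TAF15.
per_dataset = {
    "AUC": [0.9009, 0.8029, 0.9105, 0.7696, 0.9888, 0.9579, 0.8741, 0.9851],
    "AP": [0.9024, 0.8061, 0.9142, 0.7434, 0.9876, 0.9641, 0.8795, 0.9772],
    "ACC": [0.8180, 0.7231, 0.8304, 0.6482, 0.9586, 0.8949, 0.7828, 0.9659],
    "F1-score": [0.8244, 0.7158, 0.8357, 0.6824, 0.9595, 0.8966, 0.7632, 0.9628],
}
reported_average = {"AUC": 0.8987, "AP": 0.8968, "ACC": 0.8277, "F1-score": 0.8300}

def mean(values):
    return sum(values) / len(values)

computed_average = {name: mean(values) for name, values in per_dataset.items()}

for name, value in computed_average.items():
    # Tolerance covers rounding of the published figures.
    agrees = abs(value - reported_average[name]) < 1e-4
    print(f"{name}: computed {value:.5f}, reported {reported_average[name]:.4f}, agrees: {agrees}")
```

Under that assumption all four reported averages agree with the per-dataset rows, including the AUC of 0.8987 quoted in the abstract.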
Fig.7  Performance comparison of circ2CBA with other methods on eight sub-datasets. (a) Comparison between circ2CBA and CRIP; (b) comparison between circ2CBA and DeepSite; (c) comparison between circ2CBA and icircRBP-DHN
Dataset  circ2CBA  CRIP     icircRBP-DHN  DeepSite
AGO1     8078.32   5701.12  7654.85       ?
AGO2     9179.87   5996.18  10844.54      ?
AGO3     1528.28   1029.97  1902.83       ?
ALKBH5   703.14    250.15   440.64        18141.11
AUF1     1581.64   876.04   1419.43       ?
FXR1     474.76    287.39   429.03        ?
HUR      9253.56   5917.81  11729.75      ?
TAF15    737.22    447.25   1554.09       ?
Average  3942.10   2563.24  4496.52       ?
Tab.3  The running time of circ2CBA, CRIP, DeepSite, icircRBP-DHN on eight sub-datasets
Fig.8  The average AUC comparison between circ2CBA, CRIP, DeepSite and icircRBP-DHN
circ2CBA  1CNN    BiLSTM2CNN  No-attention
0.8987    0.8663  0.8448      0.8920
Tab.4  The average AUC value of different model structures
Fig.9  Comparison between different module structures. (a) Comparison between circ2CBA and 1CNN; (b) comparison between circ2CBA and BiLSTM2CNN; (c) comparison between circ2CBA and “no-attention”
Dataset  circ2CBA  CRIP   icircRBP-DHN  DeepSite
AGO1     0.898     0.890  0.895         0.796
AGO2     0.796     0.788  0.793         0.731
AGO3     0.910     0.909  0.904         0.893
ALKBH5   0.799     0.773  0.975         0.596
AUF1     0.984     0.984  0.983         0.974
FXR1     0.970     0.985  0.993         0.954
TAF15    0.990     0.981  0.990         0.982
HUR      0.869     0.863  0.862         0.826
Average  0.902     0.897  0.924         0.844
Tab.5  The AUC values on different datasets of different methods
Dataset Protein sequences circRNA sequences
ALKBH5
FXR1
TAF15
Tab.6  The motif logo of three datasets
  
  
  
  