Quantitative Biology

ISSN 2095-4689

ISSN 2095-4697(Online)

CN 10-1028/TM

Postal Subscription Code 80-971

Quant. Biol.    2023, Vol. 11 Issue (1) : 31-43    https://doi.org/10.15302/J-QB-022-0309
RESEARCH ARTICLE
Modeling the relationship between gene expression and mutational signature
Limin Jiang, Hui Yu, Yan Guo
Department of Internal Medicine, Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87109, USA
Abstract

Background: Mutational signatures, computed from somatic mutations, allow an in-depth understanding of tumorigenesis and may illuminate early prevention strategies. Many studies have shown regulatory relationships between somatic mutation and gene expression dysregulation.

Methods: We hypothesized that mutational signatures are associated with gene expression. We used RNA-seq data to model 49 established mutational signatures in 33 cancer types. Both accuracy and area under the receiver operating characteristic curve were used as performance measures in five-fold cross-validation.

Results: A total of 475 models using unconstrained genes and 112 models using protein-coding genes were selected for future inference purposes. An independent gene expression dataset on lung cancer smoking status was used for validation, in which both accuracy and area under the curve exceeded 80%.

Conclusion: These results demonstrate that the associations between gene expression and somatic mutations can translate into associations between gene expression and mutational signatures.
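The evaluation scheme named in the Methods (accuracy and area under the curve, each measured in five-fold cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the single "gene expression" feature is hypothetical, and a simple midpoint-threshold classifier stands in for the support vector machine, random forest, and extreme gradient boosting models listed in the keywords. It is not the authors' pipeline.

```python
import random
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k=5, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds,
    so every fold contains both classes."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cross_validate(x, y, k=5, seed=0):
    """Return one (accuracy, AUC) pair per held-out fold."""
    results = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # Toy stand-in classifier (assumption): threshold at the midpoint
        # between the two class means estimated on the training folds.
        mean_pos = mean(x[i] for i in train_idx if y[i] == 1)
        mean_neg = mean(x[i] for i in train_idx if y[i] == 0)
        threshold = (mean_pos + mean_neg) / 2
        sign = 1.0 if mean_pos >= mean_neg else -1.0
        scores = [sign * (x[i] - threshold) for i in test_idx]
        preds = [1 if s > 0 else 0 for s in scores]
        truth = [y[i] for i in test_idx]
        results.append((accuracy(truth, preds), roc_auc(truth, scores)))
    return results


# Synthetic data (assumption): 50 samples carrying a signature (label 1)
# and 50 without it (label 0), described by one hypothetical gene.
rng = random.Random(42)
y = [1] * 50 + [0] * 50
x = [rng.gauss(2.0, 1.0) if t == 1 else rng.gauss(0.0, 1.0) for t in y]

results = cross_validate(x, y)
mean_acc = mean(a for a, _ in results)
mean_auc = mean(u for _, u in results)
print(f"mean accuracy = {mean_acc:.3f}, mean AUC = {mean_auc:.3f}")
```

Accuracy depends on the chosen decision threshold, whereas the area under the curve depends only on how the scores rank positive against negative samples, which is one reason to report both.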

Keywords: mutational signature; gene expression; support vector machine; random forest; extreme gradient boost
Corresponding Author(s): Yan Guo   
Online First Date: 13 January 2023    Issue Date: 13 March 2023
 Cite this article:   
Limin Jiang, Hui Yu, Yan Guo. Modeling the relationship between gene expression and mutational signature[J]. Quant. Biol., 2023, 11(1): 31-43.
 URL:  
https://academic.hep.com.cn/qb/EN/10.15302/J-QB-022-0309
https://academic.hep.com.cn/qb/EN/Y2023/V11/I1/31
Fig.1  The flowchart of the study.
Fig.2  Mutational signatures resolved from TCGA somatic mutation data.
Fig.3  Results of classification models constructed using the all-gene pool.
Fig.4  Gene type composition of the features recruited by SVM models where the pool of all genes was used.
Fig.5  Results of classification models constructed using the pool of protein-coding-genes only.
Fig.6  Validation results on two validation strategies.
Fig.7  Heterogeneous tumors and tumor purity analysis against inferred results of 587 models.
AESA Advanced expression survival analysis
ACC Adrenocortical carcinoma
BLCA Bladder urothelial carcinoma
BRCA Breast invasive carcinoma
CESC Cervical squamous cell carcinoma and endocervical adenocarcinoma
CHOL Cholangiocarcinoma
COAD Colon adenocarcinoma
DLBC Lymphoid neoplasm diffuse large B-cell lymphoma
eQTL gene expression quantitative trait loci
ESCA Esophageal carcinoma
GBM Glioblastoma multiforme
HNSC Head and neck squamous cell carcinoma
KICH Kidney chromophobe
KIRC Kidney renal clear cell carcinoma
KIRP Kidney renal papillary cell carcinoma
LAML Acute myeloid leukemia
LGG Brain lower grade glioma
LIHC Liver hepatocellular carcinoma
LUAD Lung adenocarcinoma
LUSC Lung squamous cell carcinoma
MESO Mesothelioma
OV Ovarian serous cystadenocarcinoma
PAAD Pancreatic adenocarcinoma
PCPG Pheochromocytoma and paraganglioma
PRAD Prostate adenocarcinoma
READ Rectum adenocarcinoma
RF Random forest
ROC Receiver operating characteristics
SARC Sarcoma
SBS Single base substitutions
SKCM Skin cutaneous melanoma
STAD Stomach adenocarcinoma
SVM Support vector machine
TCGA The Cancer Genome Atlas
TGCT Testicular germ cell tumors
THCA Thyroid carcinoma
THYM Thymoma
UCEC Uterine corpus endometrial carcinoma
UCS Uterine carcinosarcoma
UV Ultraviolet
UVM Uveal melanoma
XGBoost EXtreme Gradient Boosting
  