Quantitative Biology

ISSN 2095-4689

ISSN 2095-4697(Online)

CN 10-1028/TM

Postal Subscription Code 80-971

Quant. Biol.    2021, Vol. 9 Issue (4) : 426-439    https://doi.org/10.15302/J-QB-021-0259
RESEARCH ARTICLE
Interpretable prediction of drug-cell line response by triple matrix factorization
Xiao-Ying Yan1,2, Shao-Wu Zhang1, Siu-Ming Yiu3, Jian-Yu Shi4
1. Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
2. College of Computer Science, Xi’an Shiyou University, Xi’an 710065, China
3. Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
4. School of Life Sciences, Northwestern Polytechnical University, Xi’an 710072, China
Abstract

Background: One of the challenges in personalized medicine is to determine the specific drugs and dosages suited to individual patients who share a common disease. Cell lines provide a safe way to capture how an individual patient's cells respond to specific drugs at varied dosages. However, measuring dose-dependent drug responses in cells by biological assays remains costly. Computational methods offer a promising way to screen for likely drug responses in patient cells on a large scale. Nevertheless, existing computational approaches do little to explain the underlying reasons for drug responses.

Methods: In this work, we propose an interpretable model for analyzing and predicting drug responses across cell lines. The proposed model bridges drug features (e.g., chemical structure fingerprints), cell features (e.g., gene expression profiles), and drug responses across cells (measured by IC50) through a triple matrix factorization (TMF), so that the underlying reasons for drug responses in specific cells can potentially be interpreted.

Results: Comparison with state-of-the-art computational approaches shows that TMF achieves better predictive performance. More importantly, a case study of drug responses in lung-related cell lines shows that TMF can identify highly occurring drug substructures, crucial mutated genes, and significant pairs of substructures and mutated genes associated with drug sensitivity and resistance.

Conclusion: TMF is an effective and interpretable approach for predicting cell line responses to drugs. It can uncover crucial pairs of chemical substructures and genes, which helps explain the underlying reasons for drug responses in specific cells.
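As a rough illustration of the idea in the Methods paragraph, the following minimal sketch fits a single weight matrix Theta so that Y ≈ Fd · Theta · Fc^T, where Fd holds drug features, Fc holds cell line features and Y holds responses. It is not the authors' algorithm: the ridge penalty, the closed-form solver, the fully observed Y, and all names (fit_theta, predict, lam, the random toy data) are assumptions made here for illustration only.

```python
import numpy as np

def fit_theta(Y, Fd, Fc, lam=1.0):
    """Minimise ||Y - Fd @ Theta @ Fc.T||_F^2 + lam * ||Theta||_F^2.

    The stationarity condition is
        (Fd.T @ Fd) @ Theta @ (Fc.T @ Fc) + lam * Theta = Fd.T @ Y @ Fc,
    which is solved exactly by diagonalising both Gram matrices.
    """
    a, P = np.linalg.eigh(Fd.T @ Fd)   # eigenpairs of the drug-feature Gram matrix
    b, Q = np.linalg.eigh(Fc.T @ Fc)   # eigenpairs of the cell-feature Gram matrix
    C = P.T @ (Fd.T @ Y @ Fc) @ Q      # right-hand side in the eigenbasis
    T = C / (np.outer(a, b) + lam)     # element-wise solve in the eigenbasis
    return P @ T @ Q.T                 # rotate back to feature coordinates

def predict(Fd, Theta, Fc):
    """Predicted response matrix (drugs x cell lines)."""
    return Fd @ Theta @ Fc.T

# Toy data (hypothetical sizes): 30 drugs x 8 features, 40 cell lines x 6 features.
rng = np.random.default_rng(0)
Fd = rng.random((30, 8))
Fc = rng.random((40, 6))
Theta_true = rng.normal(size=(8, 6))
Y = Fd @ Theta_true @ Fc.T             # noise-free responses

# With a negligible penalty the generating weights are recovered.
Theta_hat = fit_theta(Y, Fd, Fc, lam=1e-8)
Y_hat = predict(Fd, Theta_hat, Fc)
```

In a model of this shape, each entry Theta[i, j] weights the interaction between drug feature i and cell feature j, which is what leaves the fitted model open to inspection.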

Keywords: drug response; drug sensitivity; drug resistance; triple matrix factorization
Corresponding Author(s): Shao-Wu Zhang, Jian-Yu Shi
Just Accepted Date: 11 June 2021   Online First Date: 15 July 2021    Issue Date: 01 December 2021
 Cite this article:   
Xiao-Ying Yan, Shao-Wu Zhang, Siu-Ming Yiu, et al. Interpretable prediction of drug-cell line response by triple matrix factorization[J]. Quant. Biol., 2021, 9(4): 426-439.
 URL:  
https://academic.hep.com.cn/qb/EN/10.15302/J-QB-021-0259
https://academic.hep.com.cn/qb/EN/Y2021/V9/I4/426
Fig.1  Box plot of normalized IC50 values for a subset of drugs in GDSC, using Formula 1.
Fig.2  Box plot of normalized IC50 values for a subset of drugs in GDSC, using the strategy in Ref. [17].
Normalization | PCC_S/R       | PCC
Y1            | 0.52 (±0.01)  | 0.43 (±0.01)
Y2            | 0.49 (±0.005) | 0.65 (±0.009)
Y             | 0.80 (±0.006) | 0.72 (±0.005)
Tab.1  Results of TMF_y1, TMF_y2 and TMF on the GDSC dataset with 10-CV test
Fig.3  The relationship between the parameter k and ρ1, ρ2 for the TMF algorithm on the GDSC dataset.
λu, λv | λc, λd
       | 1      | 0.5    | 0.05   | 0.005
1      | 0.9986 | 1.0402 | 1.0326 | 1.0207
0.5    | 0.9967 | 1.0387 | 1.0241 | 1.0070
0.05   | 0.9921 | 1.0346 | 1.0127 | 0.9832
0.005  | 0.9912 | 1.0341 | 1.0111 | 0.9793
Tab.2  The performance of TMF when tuning λu, λv, λd, λc on the GDSC dataset
Fig.4  The performance of TMF against the noise level.
Fig.5  The performance of TMF against the incomplete rate.
Methods | PCC_S/R       | E_S/R         | PCC           | E
KBMF    | 0.59 (±0.14)  | 2.00 (±0.51)  | 0.49 (±0.14)  | 1.59 (±0.42)
SRMF    | 0.73 (±0.008) | 1.33 (±0.01)  | 0.60 (±0.009) | 1.49 (±0.01)
TMF     | 0.80 (±0.006) | 0.93 (±0.003) | 0.72 (±0.005) | 0.72 (±0.002)
Tab.3  Results of KBMF, SRMF and TMF on the GDSC dataset with 10-CV test
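For readers who want to reproduce this kind of comparison on their own predictions, here is a minimal sketch of the two metric types used on this page: the Pearson correlation coefficient (PCC) and a root-mean-square error. Treating the E columns as RMSE is an assumption based on the caption of Fig. 7; the function names and the toy vectors are hypothetical.

```python
import numpy as np

def pcc(observed, predicted):
    """Pearson correlation coefficient between two 1-D response vectors."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1])

def rmse(observed, predicted):
    """Root-mean-square error between two 1-D response vectors."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

# Toy example: predictions shifted by a constant 0.5 from the observations.
observed = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.5, 2.5, 3.5, 4.5])

pcc_value = pcc(observed, predicted)    # perfectly correlated despite the offset
rmse_value = rmse(observed, predicted)  # the constant offset shows up here
```

The toy example shows why both metrics are worth reporting together: a constant offset leaves the correlation untouched but is fully visible in the error.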
Fig.6  PCC result comparisons of SRMF and TMF for drugs targeting genes in the PI3K/MTOR pathway.
Fig.7  RMSE result comparisons of SRMF and TMF for drugs targeting genes in the PI3K/MTOR pathway.
Fig.8  Scatter plots of observed and predicted drug responses for four drugs AR-42, CUDC-101, Belinostat and CAY10603.
Rank | Substructure | Group of PubChem fingerprint
1    | >= 32 C      | G1: Hierarchic element counts
2    | C(~N)(:C)(:C) | G4: Simple atom nearest neighbors
3    | N(~C)(:C)(:C) | G4: Simple atom nearest neighbors
4    | >= 8 N       | G1: Hierarchic element counts
5    | N-S-C:C      | G6: Simple SMARTS patterns
6    | O=C-N-C=O    | G6: Simple SMARTS patterns
7    | N=C-C-[#1]   | G6: Simple SMARTS patterns
8    | N-C=N-[#1]   | G6: Simple SMARTS patterns
9    | C(-N)(=C)    | G5: Detailed atom neighborhoods
10   | N#C-C=C      | G6: Simple SMARTS patterns
Tab.4  Highly occurring substructures of drugs to which lung tissue-related cell lines are sensitive
Rank | Substructure | Group of PubChem fingerprint
1    | O=C-C-C-C-O  | G6: Simple SMARTS patterns
2    | O-C-C-C-C    | G6: Simple SMARTS patterns
3    | >= 16 O      | G1: Hierarchic element counts
4    | >= 5 any ring size 6 | G2: Rings in a canonic ESSSR ring set
5    | C-C-N-C-C    | G6: Simple SMARTS patterns
6    | >= 2 Cl      | G1: Hierarchic element counts
7    | N-C:C:C-N    | G6: Simple SMARTS patterns
8    | >= 4 N       | G1: Hierarchic element counts
9    | O-C-C-C-C-C(C)-C | G6: Simple SMARTS patterns
10   | O=C-C=C      | G6: Simple SMARTS patterns
Tab.5  Highly occurring substructures of drugs to which lung tissue-related cell lines are resistant
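One plausible way to obtain rankings of this kind from a feature-interaction model is to sort the entries of a substructure-by-gene weight matrix. The sketch below is hypothetical and is not taken from the paper: the matrix Theta, the placeholder names (sub_1, gene_A, and so on), the helper top_pairs, and the reading of negative weights as lowering predicted IC50 (sensitivity) and positive weights as raising it (resistance) are all assumptions.

```python
import numpy as np

def top_pairs(Theta, row_names, col_names, n=3, largest=True):
    """Return the n (row, column, weight) triples with the largest or smallest weights."""
    order = np.argsort(Theta.ravel())          # ascending order of flattened weights
    if largest:
        order = order[::-1]                    # switch to descending order
    pairs = []
    for flat_index in order[:n]:
        i, j = np.unravel_index(flat_index, Theta.shape)
        pairs.append((row_names[i], col_names[j], float(Theta[i, j])))
    return pairs

# Hypothetical weights: rows are drug substructures, columns are mutated genes.
Theta = np.array([[0.1, -0.9, 0.3],
                  [0.8, 0.0, -0.2]])
substructures = ["sub_1", "sub_2"]
genes = ["gene_A", "gene_B", "gene_C"]

# Assumed convention: negative weight -> lower IC50 -> sensitivity.
most_negative = top_pairs(Theta, substructures, genes, n=2, largest=False)
# Assumed convention: positive weight -> higher IC50 -> resistance.
most_positive = top_pairs(Theta, substructures, genes, n=2, largest=True)
```

Counting how often each substructure appears among the top-ranked pairs would then give an occurrence ranking similar in spirit to the two tables above, under the same assumptions.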
Fig.9  Illustration of triple matrix factorization (TMF).
Fig.10  Description of the TMF algorithm.
1 R. Mirnezami, J. Nicholson, and A. Darzi (2012) Preparing for precision medicine. N. Engl. J. Med., 366, 489–491
https://doi.org/10.1056/NEJMp1114866 pmid: 22256780
2 U. McDermott, S. V. Sharma, L. Dowell, P. Greninger, C. Montagut, J. Lamb, H. Archibald, R. Raudales, A. Tam, D. Lee, et al. (2007) Identification of genotype-correlated sensitivity to selective kinase inhibitors by using high-throughput tumor cell line profiling. Proc. Natl. Acad. Sci. USA, 104, 19936–19941
3 R. H. Shoemaker (2006) The NCI60 human tumour cell line anticancer drug screen. Nat. Rev. Cancer, 6, 813–823
https://doi.org/10.1038/nrc1951 pmid: 16990858
4 J. Barretina, G. Caponigro, N. Stransky, K. Venkatesan, A. A. Margolin, S. Kim, C. J. Wilson, J. Lehár, G. V. Kryukov, D. Sonkin, et al. (2012) The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature, 483, 603–607
https://doi.org/10.1038/nature11003 pmid: 22460905
5 W. Yang, J. Soares, P. Greninger, E. J. Edelman, H. Lightfoot, S. Forbes, N. Bindal, D. Beare, J. A. Smith, I. R. Thompson, et al. (2013) Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res., 41, D955–D961
https://doi.org/10.1093/nar/gks1111 pmid: 23180760
6 F. Iorio, T. A. Knijnenburg, D. J. Vis, G. R. Bignell, M. P. Menden, M. Schubert, N. Aben, E. Gonçalves, S. Barthorpe, H. Lightfoot, et al. (2016) A landscape of pharmacogenomic interactions in cancer. Cell, 166, 740–754
https://doi.org/10.1016/j.cell.2016.06.017 pmid: 27397505
7 M. Ammad-Ud-Din, S. A. Khan, K. Wennerberg, and T. Aittokallio (2017) Systematic identification of feature combinations for predicting drug response with Bayesian multi-view multi-task linear regression. Bioinformatics, 33, i359–i368
https://doi.org/10.1093/bioinformatics/btx266 pmid: 28881998
8 P. Geeleher, N. J. Cox, and R. S. Huang (2014) Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. Genome Biol., 15, R47
9 S. Kim, V. Sundaresan, L. Zhou, and T. Kahveci (2016) Integrating domain specific knowledge and network analysis to predict drug sensitivity of cancer cell lines. PLoS One, 11, e0162173
https://doi.org/10.1371/journal.pone.0162173 pmid: 27607242
10 N. Zhang, H. Wang, Y. Fang, J. Wang, X. Zheng, and X. S. Liu (2015) Predicting anticancer drug responses using a dual-layer integrated cell line-drug network model. PLOS Comput. Biol., 11, e1004498
https://doi.org/10.1371/journal.pcbi.1004498 pmid: 26418249
11 F. Zhang, M. Wang, J. Xi (2018) A novel heterogeneous network-based method for drug response prediction in cancer cell lines. Sci. Rep., 8, 3355
12 C. S. Greene, A. Krishnan, A. K. Wong, E. Ricciotti, R. A. Zelaya, D. S. Himmelstein, R. Zhang, B. M. Hartmann, E. Zaslavsky, S. C. Sealfon, et al. (2015) Understanding multicellular function and disease with human tissue-specific networks. Nat. Genet., 47, 569–576
https://doi.org/10.1038/ng.3259 pmid: 25915600
13 J. Yang, A. Li, Y. Li, X. Guo, and M. Wang (2019) A novel approach for drug response prediction in cancer cell lines via network representation learning. Bioinformatics, 35, 1527–1535
https://doi.org/10.1093/bioinformatics/bty848 pmid: 30304378
14 M. Ammad-ud-din, E. Georgii, M. Gönen, T. Laitinen, O. Kallioniemi, K. Wennerberg, A. Poso, and S. Kaski (2014) Integrative and personalized QSAR analysis in cancer by kernelized Bayesian matrix factorization. J. Chem. Inf. Model., 54, 2347–2359
https://doi.org/10.1021/ci500152b pmid: 25046554
15 L. Wang, X. Li, L. Zhang, and Q. Gao (2017) Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization. BMC Cancer, 17, 513
https://doi.org/10.1186/s12885-017-3500-5 pmid: 28768489
16 L. Zhang, X. Chen, N. N. Guan, H. Liu, and J. Q. Li (2018) A hybrid interpolation weighted collaborative filtering method for anti-cancer drug response prediction. Front. Pharmacol., 9, 1017
https://doi.org/10.3389/fphar.2018.01017 pmid: 30258362
17 Y. Wang, S. H. Bryant, T. Cheng, J. Wang, A. Gindulyte, B. A. Shoemaker, P. A. Thiessen, S. He, and J. Zhang (2017) PubChem BioAssay: 2017 update. Nucleic Acids Res., 45, D955–D963
https://doi.org/10.1093/nar/gkw1118 pmid: 27899599
18 S. A. Forbes, G. Tang, N. Bindal, S. Bamford, E. Dawson, C. Cole, C. Y. Kok, M. Jia, R. Ewing, A. Menzies, et al. (2010) COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res., 38, D652–D657
https://doi.org/10.1093/nar/gkp995 pmid: 19906727
19 J. Chen and S. Zhang (2016) Integrative analysis for identifying joint modular patterns of gene-expression and drug-response data. Bioinformatics, 32, 1724–1732
https://doi.org/10.1093/bioinformatics/btw059 pmid: 26833341
20 J. L. Sebaugh (2011) Guidelines for accurate EC50/IC50 estimation. Pharm. Stat., 10, 128–134
https://doi.org/10.1002/pst.426 pmid: 22328315
21 C. Porta, C. Paglino, and A. Mosca (2014) Targeting PI3K/Akt/mTOR Signaling in Cancer. Front. Oncol., 4, 64
https://doi.org/10.3389/fonc.2014.00064 pmid: 24782981
22 J. Y. Shi, A. Q. Zhang, S. W. Zhang, K. T. Mao, and S. M. Yiu (2018) A unified solution for different scenarios of predicting drug-target interactions via triple matrix factorization. BMC Syst. Biol., 12, 136
https://doi.org/10.1186/s12918-018-0663-x pmid: 30598094
23 J. Y. Shi, H. Huang, J. X. Li, P. Lei, Y. N. Zhang, K. Dong, and S. M. Yiu (2018) TMFUF: a triple matrix factorization-based unified framework for predicting comprehensive drug-drug interactions of new drugs. BMC Bioinformatics, 19, 411
https://doi.org/10.1186/s12859-018-2379-8 pmid: 30453924
24 N. Guan, D. Tao, Z. Luo, and B. Yuan (2011) Manifold regularized discriminative nonnegative matrix factorization with fast gradient descent. IEEE Trans. Image Process., 20, 2030–2048
25 D. D. Lee and H. S. Seung (1999) Learning the parts of objects by non-negative matrix factorization. Nature, 401, 788–791
https://doi.org/10.1038/44565 pmid: 10548103
26 R. Marcotte, A. Sayad, K. R. Brown, F. Sanchez-Garcia, J. Reimand, M. Haider, C. Virtanen, J. E. Bradner, G. D. Bader, G. B. Mills, et al. (2016) Functional genomic landscape of human breast cancer drivers, vulnerabilities, and resistance. Cell, 164, 293–309
https://doi.org/10.1016/j.cell.2015.11.062 pmid: 26771497