Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2020, Vol. 14 Issue (5) : 145312    https://doi.org/10.1007/s11704-019-8374-z
RESEARCH ARTICLE
Multi-task regression learning for survival analysis via prior information guided transductive matrix completion
Lei CHEN1,2, Kai SHAO1, Xianzhong LONG1, Lingsheng WANG1
1. Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
2. MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Abstract

Survival analysis aims to predict the occurrence time of a particular event of interest, which is crucial for the prognosis analysis of diseases. Because study periods are limited and subjects may be lost to follow-up, the observed data inevitably include censored instances, which poses a unique challenge that distinguishes survival analysis from general regression problems. In addition, survival analysis suffers from other inherent challenges, such as high dimensionality and small sample sizes. To address these challenges, we propose a novel multi-task regression learning model, the prior information guided transductive matrix completion (PigTMC) model, to predict the survival status of new instances. Specifically, we use the multi-label transductive matrix completion framework to leverage the censored instances together with the uncensored instances as training samples, and simultaneously employ a multi-task transductive feature selection scheme to alleviate the overfitting caused by high-dimensional, small-sample-size data. In addition, we use prior knowledge of the temporal stability of survival statuses at adjacent time intervals to guide survival analysis. Furthermore, we design an optimization algorithm with guaranteed convergence to solve the proposed PigTMC model. Finally, extensive experiments on real microarray gene expression datasets demonstrate that the proposed model outperforms widely used competing methods.
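
The abstract does not spell out how censored instances enter the matrix completion framework. The following is a minimal sketch, assuming the status-matrix encoding often used in multi-task survival formulations: each time interval is one task, an entry is 1 while the instance is event-free and 0 afterwards, and entries after a censoring time are treated as unobserved. The function names `encode_survival` and `temporal_penalty`, the unit-length intervals, and the squared-difference penalty are illustrative assumptions, not the PigTMC model itself.

```python
import numpy as np


def encode_survival(times, events, n_intervals):
    """Encode (time, event) pairs as a status matrix Y and an observation mask M.

    Assumed encoding (not taken from the paper):
      Y[i, j] = 1.0 if instance i is still event-free in interval j, else 0.0.
      M[i, j] = True where Y[i, j] is actually known.
    Interval j (1-based) is assumed to cover the time span (j - 1, j].
    """
    times = np.asarray(times)
    events = np.asarray(events, dtype=bool)   # True = event observed (uncensored)
    intervals = np.arange(1, n_intervals + 1)

    # alive[i, j] is True while interval j ends no later than the recorded time.
    alive = intervals[None, :] <= times[:, None]
    Y = alive.astype(float)

    # Uncensored rows are fully observed; censored rows are observed only up to
    # the censoring time, so their later entries are left for completion.
    M = events[:, None] | alive
    return Y, M


def temporal_penalty(Z):
    """Sum of squared differences between adjacent time-interval columns.

    A simple stand-in for a temporal stability prior: it is small when the
    predicted statuses change little between neighbouring intervals.
    """
    diffs = Z[:, 1:] - Z[:, :-1]
    return float(np.sum(diffs ** 2))


if __name__ == "__main__":
    # Instance 0: event observed at time 3. Instance 1: censored at time 2.
    Y, M = encode_survival(times=[3, 2], events=[1, 0], n_intervals=5)
    print(Y)
    print(M)
    print(temporal_penalty(Y))
```

In this sketch the censored instance still contributes its known entries as training signal, and a completion method would fill in the masked entries, which is the sense in which censored and uncensored instances can be used together.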

Keywords: survival analysis, matrix completion, multi-task regression, transductive learning, multi-task feature selection
Corresponding Author(s): Lei CHEN   
Issue Date: 10 March 2020
 Cite this article:   
Lei CHEN,Kai SHAO,Xianzhong LONG, et al. Multi-task regression learning for survival analysis via prior information guided transductive matrix completion[J]. Front. Comput. Sci., 2020, 14(5): 145312.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-019-8374-z
https://academic.hep.com.cn/fcs/EN/Y2020/V14/I5/145312