Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2017, Vol. 11 Issue (2) : 243-252    https://doi.org/10.1007/s11704-017-6538-2
RESEARCH ARTICLE
Deep model-based feature extraction for predicting protein subcellular localizations from bio-images
Wei SHAO1, Yi DING1, Hong-Bin SHEN2, Daoqiang ZHANG1
1. School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
2. Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
Abstract

Protein subcellular localization prediction is important for studying the function of proteins. With the recent progress in microscopic imaging, automatically determining the subcellular localization of proteins from bio-images is becoming a new research hotspot. A central question in this field is which features are suitable for describing protein images. Existing feature extraction methods are usually hand-crafted and extract only a single layer of features, which may not be sufficient to represent complex protein images. To this end, we propose a deep model-based descriptor (DMD) to extract high-level features from protein images. Specifically, to make the extracted features more generic, we first trained a convolutional neural network (CNN), namely AlexNet, on a natural image set with millions of labels, and then used a partial parameter transfer strategy to fine-tune the parameters from natural images to protein images. We then applied the Lasso model to select the most discriminative features from the last fully connected layer of the CNN and used the selected features for the final classification. Experimental results on a protein image dataset validate the efficacy of our method.
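
The Lasso-based selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic feature matrix stands in for activations of the last fully connected layer, the regression target stands in for a class label, and the names `soft_threshold`, `lasso_coordinate_descent`, and the value `alpha=0.1` are assumptions chosen for illustration. The solver is plain cyclic coordinate descent on the standard Lasso objective, and the features with nonzero weights are treated as selected.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw, since w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # a constant-zero feature can never be selected
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Hypothetical stand-in for deep features: 200 samples, 10 features,
# of which only features 0 and 3 influence the target by construction.
rng = random.Random(0)
n_samples, n_features = 200, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.05) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, w in enumerate(weights) if abs(w) > 1e-8]
print("selected feature indices:", selected)
```

In this toy setting the L1 penalty drives the weights of uninformative features to exactly zero, which is the property that makes Lasso usable as a feature selector; a larger `alpha` selects fewer features. In a pipeline like the one the abstract outlines, the selected columns would then be passed to the final classifier.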

Keywords: partial parameter transfer, subcellular location classification, feature extraction, deep model, convolution neural network
Corresponding Author(s): Hong-Bin SHEN, Daoqiang ZHANG
Just Accepted Date: 20 January 2017   Online First Date: 23 March 2017    Issue Date: 06 April 2017
 Cite this article:   
Wei SHAO, Yi DING, Hong-Bin SHEN, et al. Deep model-based feature extraction for predicting protein subcellular localizations from bio-images[J]. Front. Comput. Sci., 2017, 11(2): 243-252.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-017-6538-2
https://academic.hep.com.cn/fcs/EN/Y2017/V11/I2/243