Frontiers of Electrical and Electronic Engineering

ISSN 2095-2732

ISSN 2095-2740(Online)

CN 10-1028/TM

Front Elect Electr Eng Chin    2011, Vol. 6, Issue (1): 6-16    https://doi.org/10.1007/s11460-011-0126-2
RESEARCH ARTICLE
When semi-supervised learning meets ensemble learning
Zhi-Hua Zhou
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
Abstract

Semi-supervised learning and ensemble learning are two important machine learning paradigms. The former attempts to achieve strong generalization by exploiting unlabeled data; the latter attempts to achieve strong generalization by combining multiple learners. Although both paradigms have achieved great success during the past decade, they have been developed almost in isolation from each other. In this paper, we advocate that semi-supervised learning and ensemble learning are in fact beneficial to each other, and that stronger learning machines can be generated by leveraging unlabeled data and classifier combination together.
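To make the abstract's thesis concrete, here is a minimal illustrative sketch (not an algorithm from this paper): a bagged ensemble of decision trees pseudo-labels the unlabeled points on which its members agree unanimously, then retrains on the augmented set, so classifier combination and unlabeled data reinforce each other. The synthetic dataset, the 30/270 labeled/unlabeled split, the unanimity rule, and the round count are all illustrative assumptions; NumPy and scikit-learn are assumed available.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
labeled, unlabeled = np.arange(30), np.arange(30, 300)  # few labels, many unlabeled

def bootstrap_ensemble(X_tr, y_tr, n=5):
    """Train n trees on bootstrap resamples to obtain a diverse ensemble."""
    models = []
    for i in range(n):
        idx = rng.randint(0, len(X_tr), len(X_tr))  # bagging-style resampling
        models.append(DecisionTreeClassifier(random_state=i).fit(X_tr[idx], y_tr[idx]))
    return models

Xl, yl = X[labeled], y[labeled]
models = bootstrap_ensemble(Xl, yl)
for _ in range(3):  # a few self-labeling rounds
    votes = np.stack([m.predict(X[unlabeled]) for m in models])
    maj = (votes.mean(axis=0) > 0.5).astype(int)    # majority vote (binary task)
    agree = (votes == maj).all(axis=0)              # keep unanimous predictions only
    X_aug = np.vstack([Xl, X[unlabeled][agree]])
    y_aug = np.concatenate([yl, maj[agree]])
    models = bootstrap_ensemble(X_aug, y_aug)       # retrain on pseudo-labeled data

votes = np.stack([m.predict(X) for m in models])
pred = (votes.mean(axis=0) > 0.5).astype(int)       # final ensemble prediction
print("train accuracy:", (pred == y).mean())
```

The unanimity filter is a crude stand-in for the confidence estimates used by disagreement-based methods such as tri-training [10]; in practice one would also hold out data to check that pseudo-labels actually help.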

Keywords: machine learning; semi-supervised learning; ensemble learning
Corresponding Author(s): Zhi-Hua Zhou, Email: zhouzh@nju.edu.cn
Issue Date: 05 March 2011
 Cite this article:   
Zhi-Hua Zhou. When semi-supervised learning meets ensemble learning[J]. Front Elect Electr Eng Chin, 2011, 6(1): 6-16.
 URL:  
https://academic.hep.com.cn/fee/EN/10.1007/s11460-011-0126-2
1 Chapelle O, Schölkopf B, Zien A. Semi-Supervised Learning. Cambridge: MIT Press, 2006
2 Zhou Z H, Li M. Semi-supervised learning by disagreement. Knowledge and Information Systems , 2010, 24(3): 415-439
doi: 10.1007/s10115-009-0209-z
3 Zhu X. Semi-supervised learning literature survey. Technical Report 1530. Madison: University of Wisconsin at Madison, Department of Computer Sciences, 2006. http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
4 Zhou Z H. Ensemble Learning. In: Li S Z, ed. Encyclopedia of Biometrics . Berlin: Springer, 2009, 270-273
5 Bennett K, Demiriz A, Maclin R. Exploiting unlabeled data in ensemble methods. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining . 2002, 289-296
6 d’Alché-Buc F, Grandvalet Y, Ambroise C. Semi-Supervised MarginBoost. In: Dietterich T G, Becker S, Ghahramani Z, eds. Advances in Neural Information Processing Systems 14 . 2002, 553-560
7 Li M, Zhou Z H. Improve computer-aided diagnosis with machine learning techniques using undiagnosed samples. IEEE Transactions on Systems, Man and Cybernetics-Part A: Systems and Humans , 2007, 37(6): 1088-1098
doi: 10.1109/TSMCA.2007.904745
8 Mallapragada P K, Jin R, Jain A K, Liu Y. SemiBoost: boosting for semi-supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence , 2009, 31(11): 2000-2014
doi: 10.1109/TPAMI.2008.235
9 Valizadegan H, Jin R, Jain A K. Semi-supervised boosting for multi-class classification. In: Proceedings of the 19th European Conference on Machine Learning . 2008, 522-537
10 Zhou Z H, Li M. Tri-training: Exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering , 2005, 17(11): 1529-1541
doi: 10.1109/TKDE.2005.186
11 Zhou Z H. When semi-supervised learning meets ensemble learning. In: Proceedings of the 8th International Workshop on Multiple Classifier Systems . 2010, 529-538
12 Krogh A, Vedelsby J. Neural network ensembles, cross validation, and active learning. In: Tesauro G, Touretzky D S, Leen T K, eds. Advances in Neural Information Processing Systems 7 . Cambridge: MIT Press, 1995, 231-238
13 Brown G. An information theoretic perspective on multiple classifier systems. In: Proceedings of the 8th International Workshop on Multiple Classifier Systems . 2009, 344-353
doi: 10.1007/978-3-642-02326-2_35
14 Zhou Z H, Li N. Multi-information ensemble diversity. In: Proceedings of the 9th International Workshop on Multiple Classifier Systems . 2010, 134-144
doi: 10.1007/978-3-642-12127-2_14
15 Breiman L. Bagging predictors. Machine Learning , 1996, 24(2): 123-140
doi: 10.1007/BF00058655
16 Freund Y, Schapire R E. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences , 1997, 55(1): 119-139
doi: 10.1006/jcss.1997.1504
17 Ho T K. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence , 1998, 20(8): 832-844
doi: 10.1109/34.709601
18 Breiman L. Random forests. Machine Learning , 2001, 45(1): 5-32
doi: 10.1023/A:1010933404324
19 Zhou Z H. Learning with unlabeled data and its application to image retrieval. In: Proceedings of the 9th Pacific Rim International Conference on Artificial Intelligence . 2006, 5-10
20 Vapnik V N. Statistical Learning Theory. New York: Wiley, 1998
21 Settles B. Active learning literature survey. Technical Report 1648. Madison: University of Wisconsin at Madison, Department of Computer Sciences, 2009. http://pages.cs.wisc.edu/~bsettles/pub/settles.activelearning.pdf
22 Miller D J, Uyar H S. A mixture of experts classifier with learning based on both labelled and unlabelled data. In: Mozer M, Jordan M I, Petsche T, eds. Advances in Neural Information Processing Systems 9 . Cambridge: MIT Press, 1997, 571-577
23 Nigam K, McCallum A K, Thrun S, Mitchell T. Text classification from labeled and unlabeled documents using EM. Machine Learning , 2000, 39(2-3): 103-134
doi: 10.1023/A:1007692713085
24 Shahshahani B, Landgrebe D. The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon. IEEE Transactions on Geoscience and Remote Sensing, 1994, 32(5): 1087-1095
doi: 10.1109/36.312897
25 Xu L. Bayesian Ying Yang System and Theory as a Unified Statistical Learning Approach: (I) Unsupervised and Semi-Unsupervised Learning. In: Amari S, Kasabov N, eds. Brain-like Computing and Intelligent Information Systems. Berlin: Springer-Verlag, 1997, 241-274
26 Chapelle O, Zien A. Semi-supervised learning by low density separation. In: Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics . 2005, 57-64
27 Grandvalet Y, Bengio Y. Semi-supervised Learning by Entropy Minimization. In: Saul L K, Weiss Y, Bottou L, eds. Advances in Neural Information Processing Systems 17 . Cambridge: MIT Press, 2005, 529-536
28 Joachims T. Transductive inference for text classification using support vector machines. In: Proceedings of the 16th International Conference on Machine Learning . 1999, 200-209
29 Lawrence N D, Jordan M I. Semi-supervised learning via Gaussian processes. In: Saul L K, Weiss Y, Bottou L, eds. Advances in Neural Information Processing Systems 17 . Cambridge: MIT Press, 2005, 753-760
30 Belkin M, Niyogi P. Semi-supervised learning on Riemannian manifolds. Machine Learning , 2004, 56(1-3): 209-239
doi: 10.1023/B:MACH.0000033120.25363.1e
31 Belkin M, Niyogi P, Sindhwani V. On manifold regularization. In: Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics . 2005, 17-24
32 Belkin M, Niyogi P, Sindhwani V. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research , 2006, 7: 2399-2434
33 Zhou D, Bousquet O, Lal T N, Weston J, Schölkopf B. Learning with local and global consistency. In: Thrun S, Saul L, Schölkopf B, eds. Advances in Neural Information Processing Systems 16. 2004, 321-328
34 Zhu X, Ghahramani Z, Lafferty J. Semi-supervised learning using Gaussian fields and harmonic functions. In: Proceedings of the 20th International Conference on Machine Learning . 2003, 912-919
35 Blum A, Mitchell T. Combining labeled and unlabeled data with co-training. In: Proceedings of the 11th Annual Conference on Computational Learning Theory. 1998, 92-100
36 Dasgupta S, Littman M, McAllester D. PAC Generalization Bounds for Co-training. In: Dietterich T G, Becker S, Ghahramani Z, eds. Advances in Neural Information Processing Systems 14. Cambridge: MIT Press, 2002, 375-382
37 Balcan M F, Blum A, Yang K. Co-Training and Expansion: Towards Bridging Theory and Practice. In: Saul L K, Weiss Y, Bottou L, eds. Advances in Neural Information Processing Systems 17 . Cambridge: MIT Press, 2005, 89-96
38 Abney S. Bootstrapping. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics . 2002, 360-367
39 Nigam K, Ghani R. Analyzing the effectiveness and applicability of co-training. In: Proceedings of the 9th ACM International Conference on Information and Knowledge Management , 2000, 86-93
40 Goldman S, Zhou Y. Enhancing supervised learning with unlabeled data. In: Proceedings of the 17th International Conference on Machine Learning . 2000, 327-334
41 Zhou Z H, Li M. Semi-supervised regression with co-training. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence. 2005, 908-913
42 Zhou Z H, Li M. Semi-supervised regression with cotraining style algorithms. IEEE Transactions on Knowledge and Data Engineering , 2007, 19(11): 1479-1493
doi: 10.1109/TKDE.2007.190644
43 Mohamed T A, El Gayar N, Atiya A F. A co-training approach for time series prediction with missing data. In: Proceedings of the 7th International Workshop on Multiple Classifier Systems. 2007, 93-102
doi: 10.1007/978-3-540-72523-7_10
44 Wang W, Zhou Z H. Analyzing co-training style algorithms. In: Proceedings of the 18th European Conference on Machine Learning , 2007, 454-465
45 Hwa R, Osborne M, Sarkar A, Steedman M. Corrected co-training for statistical parsers. In: Working Notes of the ICML’03 Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining. 2003
46 Pierce D, Cardie C. Limitations of co-training for natural language learning from large data sets. In: Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing . 2001, 1-9
47 Sarkar A. Applying co-training methods to statistical parsing. In: Proceedings of the 2nd Annual Meeting of the North American Chapter of the Association for Computational Linguistics . 2001, 95-102
48 Steedman M, Osborne M, Sarkar A, Clark S, Hwa R, Hockenmaier J, Ruhlen P, Baker S, Crim J. Bootstrapping statistical parsers from small data sets. In: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics. 2003, 331-338
49 Li M, Li H, Zhou Z H. Semi-supervised document retrieval. Information Processing and Management , 2009, 45(3): 341-355
doi: 10.1016/j.ipm.2008.11.002
50 Mavroeidis D, Chaidos K, Pirillos S, Christopoulos D, Vazirgiannis M. Using tri-training and support vector machines for addressing the ECML-PKDD 2006 Discovery Challenge. In: Proceedings of the ECML-PKDD 2006 Discovery Challenge Workshop. 2006, 39-47
51 Kockelkorn M, Lüneburg A, Scheffer T. Using transduction and multi-view learning to answer emails. In: Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases. 2003, 266-277
52 Zhou Z H, Chen K J, Dai H B. Enhancing relevance feedback in image retrieval using unlabeled data. ACM Transactions on Information Systems , 2006, 24(2): 219-244
doi: 10.1145/1148020.1148023
53 Zhou Z H, Chen K J, Jiang Y. Exploiting unlabeled data in content-based image retrieval. In: Proceedings of the 15th European Conference on Machine Learning . 2004, 525-536
54 Wang W, Zhou Z H. On multi-view active learning and the combination with semi-supervised learning. In: Proceedings of the 25th International Conference on Machine Learning . 2008, 1152-1159
doi: 10.1145/1390156.1390301
55 Muslea I, Minton S, Knoblock C A. Active learning with multiple views. Journal of Artificial Intelligence Research , 2006, 27(1): 203-233
56 Zhou Z H, Zhan D C, Yang Q. Semi-supervised learning with very few labeled training examples. In: Proceedings of the 22nd AAAI Conference on Artificial Intelligence . 2007, 675-680
57 Hotelling H. Relations between two sets of variates. Biometrika , 1936, 28(4): 321-377
58 Hardoon D R, Szedmak S, Shawe-Taylor J. Canonical correlation analysis: an overview with application to learning methods. Neural Computation , 2004, 16(12): 2639-2664
doi: 10.1162/0899766042321814
59 Zhang M L, Zhou Z H. Classifier ensemble with unlabeled data. CoRR, abs/0909.3593, 2009
60 Zhang M L, Zhou Z H. Exploiting unlabeled data to enhance ensemble diversity. In: Proceedings of the 10th IEEE International Conference on Data Mining. 2010
doi: 10.1109/ICDM.2010.12
61 Xu L, Amari S. Combining classifiers and learning mixture-of-experts. In: Dopico J R R, Dorado J, Pazos A, eds. Encyclopedia of Artificial Intelligence. 2009, 318-326