Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2016, Vol. 10 Issue (5) : 845-855    https://doi.org/10.1007/s11704-016-5421-x
RESEARCH ARTICLE
Multi-label active learning by model guided distribution matching
Nengneng GAO, Sheng-Jun HUANG, Songcan CHEN
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 211106, China
Abstract

Multi-label learning is an effective framework for learning with objects that have multiple semantic labels, and it has been successfully applied to many real-world tasks. Compared with traditional single-label learning, the cost of labeling a multi-label example is rather high, so it is important to train an effective multi-label learning model with as few labeled examples as possible. Active learning, which actively selects the most valuable data and queries their labels, is the most important approach to reducing labeling cost. In this paper, we propose a novel approach, MADM, for batch mode multi-label active learning. On the one hand, MADM exploits representativeness and diversity in both the feature and label space by matching the distributions of labeled and unlabeled data. On the other hand, it tends to query predicted positive instances, which are expected to be more informative than negative ones. Experiments on benchmark datasets demonstrate that the proposed approach can reduce the labeling cost significantly.

Keywords: multi-label learning, batch mode active learning, distribution matching
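
The distribution matching idea mentioned in the abstract can be illustrated with a small sketch. This is not the MADM algorithm from the paper: it covers only the feature space, ignores the label space and the model guidance, and assumes that the mismatch between two samples is measured by the squared maximum mean discrepancy (MMD) under a Gaussian kernel, which the abstract does not state. The function names `rbf`, `mmd2` and `greedy_batch` and the parameter `gamma` are hypothetical and chosen for illustration.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mmd2(xs, ys, gamma=1.0):
    """Biased empirical estimate of the squared MMD between samples xs and ys."""
    def mean_kernel(a, b):
        total = sum(rbf(u, v, gamma) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

def greedy_batch(labeled, pool, batch_size, gamma=1.0):
    """Greedily pick pool indices so that labeled + picked points match the pool.

    Illustrative heuristic only: at each step, add the pool point that gives the
    smallest squared MMD between the augmented labeled set and the whole pool.
    """
    chosen = []
    for _ in range(batch_size):
        best_index, best_value = None, float("inf")
        for i in range(len(pool)):
            if i in chosen:
                continue
            candidate = labeled + [pool[j] for j in chosen] + [pool[i]]
            value = mmd2(candidate, pool, gamma)
            if value < best_value:
                best_index, best_value = i, value
        chosen.append(best_index)
    return chosen

# Toy data: the labeled set covers only the cluster near the origin.
labeled = [[0.0, 0.0]]
pool = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
picked = greedy_batch(labeled, pool, batch_size=1)
```

In this toy setting the single query goes to a point in the cluster that the labeled set does not yet cover, which is the representativeness effect that distribution matching aims for.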
Corresponding Author(s): Sheng-Jun HUANG   
Just Accepted Date: 25 February 2016   Online First Date: 19 April 2016    Issue Date: 07 September 2016
 Cite this article:   
Nengneng GAO,Sheng-Jun HUANG,Songcan CHEN. Multi-label active learning by model guided distribution matching[J]. Front. Comput. Sci., 2016, 10(5): 845-855.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-016-5421-x
https://academic.hep.com.cn/fcs/EN/Y2016/V10/I5/845
[1] Yan-Ping SUN, Min-Ling ZHANG. Compositional metric learning for multi-label classification[J]. Front. Comput. Sci., 2021, 15(5): 155320-.
[2] Yuling MA, Chaoran CUI, Jun YU, Jie GUO, Gongping YANG, Yilong YIN. Multi-task MIML learning for pre-course student performance prediction[J]. Front. Comput. Sci., 2020, 14(5): 145313-.
[3] Liang SUN, Hongwei GE, Wenjing KANG. Non-negative matrix factorization based modeling and training algorithm for multi-label learning[J]. Front. Comput. Sci., 2019, 13(6): 1243-1254.
[4] Min-Ling ZHANG, Yu-Kun LI, Xu-Ying LIU, Xin GENG. Binary relevance for multi-label learning: an overview[J]. Front. Comput. Sci., 2018, 12(2): 191-202.