Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci. 2018, Vol. 12, Issue 2: 191-202. https://doi.org/10.1007/s11704-017-7031-7
REVIEW ARTICLE
Binary relevance for multi-label learning: an overview
Min-Ling ZHANG1,2,3, Yu-Kun LI1,2,3, Xu-Ying LIU1,2,3, Xin GENG1,2,3
1. School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
2. Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China
3. Collaborative Innovation Center for Wireless Communications Technology, Nanjing 211100, China
Abstract

Multi-label learning deals with problems where each example is represented by a single instance while being associated with multiple class labels simultaneously. Binary relevance is arguably the most intuitive solution for learning from multi-label examples. It works by decomposing the multi-label learning task into a number of independent binary learning tasks (one per class label). In view of its potential weakness in ignoring correlations between labels, many correlation-enabling extensions to binary relevance have been proposed in the past decade. In this paper, we aim to review the state of the art of binary relevance from three perspectives. First, basic settings for multi-label learning and binary relevance solutions are briefly summarized. Second, representative strategies to provide binary relevance with label correlation exploitation abilities are discussed. Third, some of our recent studies on binary relevance aimed at issues other than label correlation exploitation are introduced. In conclusion, we provide suggestions on future research directions.
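
The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not code from the paper: the class names `BinaryRelevance` and `NearestCentroid`, the nearest-centroid base learner, and the toy data are all assumptions chosen only to keep the example self-contained. Any binary classifier could take the base learner's place.

```python
from typing import List, Sequence

Vector = Sequence[float]


class NearestCentroid:
    """Tiny binary base learner (illustrative stand-in, not from the paper)."""

    def fit(self, X: List[Vector], y: List[int]) -> "NearestCentroid":
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:  # a class can be absent when a label is very rare
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x: Vector) -> int:
        def sq_dist(centroid: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        # Return the class whose centroid is closest to x.
        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


class BinaryRelevance:
    """Trains one independent binary learner per class label."""

    def __init__(self, make_learner=NearestCentroid):
        self.make_learner = make_learner

    def fit(self, X: List[Vector], Y: List[List[int]]) -> "BinaryRelevance":
        n_labels = len(Y[0])
        # Column j of the label matrix Y is the target of the j-th binary task.
        self.learners = [
            self.make_learner().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, x: Vector) -> List[int]:
        # Each label is decided on its own; no learner sees another's output.
        return [learner.predict(x) for learner in self.learners]


# Toy data (assumed): label 0 tracks feature 0, label 1 tracks feature 1.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]

model = BinaryRelevance().fit(X, Y)
prediction = model.predict([0.9, 0.1])
print(prediction)
```

Because every label gets its own learner trained in isolation, nothing in this sketch can exploit co-occurrence between labels, which is the weakness that the correlation-enabling extensions mentioned in the abstract set out to address.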

Keywords: machine learning, multi-label learning, binary relevance, label correlation, class-imbalance, relative labeling-importance
Corresponding Author(s): Min-Ling ZHANG   
Just Accepted Date: 18 July 2017   Online First Date: 27 November 2017    Issue Date: 22 March 2018
 Cite this article:   
Min-Ling ZHANG, Yu-Kun LI, Xu-Ying LIU, et al. Binary relevance for multi-label learning: an overview[J]. Front. Comput. Sci., 2018, 12(2): 191-202.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-017-7031-7
https://academic.hep.com.cn/fcs/EN/Y2018/V12/I2/191