Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal distribution code 80-970

2019 Impact Factor: 1.275

Frontiers of Computer Science  2023, Vol. 17 Issue (4): 174327   https://doi.org/10.1007/s11704-022-1624-5
Active label distribution learning via kernel maximum mean discrepancy
Xinyue DONG, Tingjin LUO(), Ruidong FAN, Wenzhang ZHUGE, Chenping HOU()
College of Science, National University of Defense Technology, Changsha 410073, China
Abstract

Label distribution learning (LDL) is a new learning paradigm for handling label ambiguity, and many studies on it have achieved prominent performance. Compared with traditional supervised learning scenarios, however, annotation with label distributions is more expensive. Directly applying existing active learning (AL) approaches, which aim to reduce annotation cost in traditional learning, may degrade their performance. To address the high annotation cost in LDL, we propose the active label distribution learning via kernel maximum mean discrepancy (ALDL-kMMD) method to tackle this crucial but rarely studied problem. ALDL-kMMD captures the structural information of both data and labels, and extracts the most representative instances from the unlabeled ones by incorporating a nonlinear model and marginal probability distribution matching. Besides, it can markedly decrease the number of queried unlabeled instances. Meanwhile, an effective solution to the original optimization problem of ALDL-kMMD is derived by constructing auxiliary variables. The effectiveness of our method is validated by experiments on real-world datasets.
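The marginal distribution matching the abstract refers to is built on the kernel maximum mean discrepancy. A minimal sketch of the empirical (biased) squared MMD estimate under an RBF kernel is below; the function names and the `gamma` bandwidth are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2).
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    # Biased empirical estimate of the squared MMD between samples X and Y:
    # mean k(X, X) + mean k(Y, Y) - 2 * mean k(X, Y).
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())
```

Intuitively, a queried subset whose MMD to the full data is small is representative of the whole pool, which is the selection principle ALDL-kMMD exploits.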

Keywords: label distribution learning; active learning; maximum mean discrepancy; auxiliary variable
Received: 2021-11-01      Published online: 2022-12-09
Corresponding Author(s): Tingjin LUO, Chenping HOU
Cite this article:
Xinyue DONG, Tingjin LUO, Ruidong FAN, Wenzhang ZHUGE, Chenping HOU. Active label distribution learning via kernel maximum mean discrepancy. Front. Comput. Sci., 2023, 17(4): 174327.
Link to this article:
https://academic.hep.com.cn/fcs/CN/10.1007/s11704-022-1624-5
https://academic.hep.com.cn/fcs/CN/Y2023/V17/I4/174327
Notation                          Description
L                                 The training set
U                                 The unlabeled set
Q                                 The query set
c                                 The number of labels
n_l                               The number of labeled instances in L
n_u                               The number of unlabeled instances in U
b                                 The number of instances in the query set Q
y_i = [y_i1, y_i2, ..., y_ic]     The label distribution of x_i
a                                 The weight vector
Tab.1  
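Using the notation of Tab.1, selecting the b most representative instances from U can be sketched as a greedy search that keeps the MMD between the selected set and the whole data small. This is only an illustrative sketch of MMD-based batch selection (constant terms dropped, precomputed kernel matrix assumed); it is not the paper's exact ALDL-kMMD optimization, which is solved via auxiliary variables.

```python
import numpy as np

def greedy_mmd_query(K, labeled_idx, unlabeled_idx, b):
    # K: precomputed kernel matrix over all n instances (labeled + unlabeled).
    # Greedily pick b unlabeled indices so that the selected set
    # (labeled plus queried) matches the full data distribution in MMD.
    selected = list(labeled_idx)
    pool = list(unlabeled_idx)
    for _ in range(b):
        best, best_val = None, np.inf
        for j in pool:
            cand = selected + [j]
            # Squared MMD between candidate set and all data, up to a
            # constant: mean K(S, S) - 2 * mean K(S, all).
            val = K[np.ix_(cand, cand)].mean() - 2 * K[cand, :].mean()
            if val < best_val:
                best, best_val = j, val
        selected.append(best)
        pool.remove(best)
    return selected[len(labeled_idx):]   # the query set Q
```

For example, if the labeled set covers only one cluster of the data, the first query is drawn from the uncovered cluster, since that choice lowers the MMD to the full pool the most.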
Dataset    #Instances    #Features    #Labels
Yeast-alpha 2,465 24 18
Yeast-cold 2,465 24 18
Yeast-cdc 2,465 24 15
Yeast-elu 2,465 24 14
Yeast-heat 2,465 24 7
Human-gene 30,542 36 68
Natural-scene 2,000 294 9
SBU-3DFE 2,500 243 6
Movie 7,755 1,869 5
Tab.2  
Fig.1  
Data    Method    Number of queries (1 / 3 / 5 / 7 / 9)    Win/tie/loss
Yeast-alpha RANDOM .810±.016 .836±.017 .894±.034 .908±.033 .918±.027
MMAD .886±.019 .896±.020 .903±.029 .916±.025 .926±.030
MAED .906±.019 .919±.026 .923±.056 .927±.010 .928±.013
QBC .897±.029 .906±.017 .917±.036 .924±.017 .928±.013
QUIRE .917±.021 .920±.015 .925±.025 .927±.012 .929±.013
DUAL .923±.023 .926±.017 .942±.009 .945±.018 .950±.015
ALDL-MMD .913±.029 .927±.026 .940±.020 .942±.013 .945±.015
ALDL-kMMD .923±.018 .927±.026 .943±.019 .947±.027 .955±.016 27/8/0
Yeast-cold RANDOM .918±.016 .927±.017 .931±.031 .933±.033 .934±.017
MAED .909±.040 .912±.025 .914±.032 .917±.002 .919±.039
MMAD .916±.011 .922±.023 .924±.010 .926±.022 .927±.029
QBC .922±.022 .931±.029 .934±.056 .936±.034 .937±.031
QUIRE .923±.017 .933±.020 .934±.012 .936±.019 .938±.017
DUAL .941±.015 .950±.020 .951±.032 .952±.012 .954±.023
ALDL-MMD .924±.020 .933±.015 .935±.043 .936±.006 .937±.026
ALDL-kMMD .944±.031 .951±.039 .953±.039 .954±.031 .955±.015 30/5/0
Yeast-cdc RANDOM .631±.034 .691±.028 .764±.034 .804±.015 .823±.016
MMAD .679±.016 .764±.029 .802±.024 .821±.039 .839±.006
MAED .600±.016 .766±.040 .802±.043 .812±.022 .820±.033
QBC .521±.028 .534±.029 .612±.022 .680±.019 .767±.014
QUIRE .725±.023 .820±.017 .835±.024 .863±.018 .864±.017
DUAL .893±.026 .912±.018 .922±.014 .925±.014 .937±.028
ALDL-MMD .720±.010 .825±.020 .834±.024 .865±.018 .864±.017
ALDL-kMMD .913±.029 .927±.026 .940±.020 .942±.013 .945±.015 34/1/0
Yeast-elu RANDOM .638±.032 .905±.018 .912±.023 .922±.011 .933±.017
MMAD .489±.017 .638±.012 .720±.018 .906±.049 .947±.029
MAED .674±.017 .716±.019 .889±.030 .938±.014 .955±.039
QBC .662±.032 .879±.036 .908±.012 .914±.027 .912±.012
QUIRE .683±.018 .726±.020 .915±.016 .917±.023 .918±.016
DUAL .873±.017 .906±.025 .920±.026 .925±.017 .929±.026
ALDL-MMD .712±.017 .855±.009 .859±.019 .861±.008 .876±.007
ALDL-kMMD .906±.019 .919±.026 .923±.056 .927±.010 .938±.013 33/2/0
Yeast-heat RANDOM .564±.020 .573±.016 .591±.010 .622±.026 .690±.022
MAED .697±.015 .709±.040 .711±.034 .718±.010 .751±.029
MMAD .604±.031 .613±.023 .624±.022 .644±.011 .688±.029
QBC .545±.026 .550±.035 .567±.024 .591±.036 .668±.043
QUIRE .702±.025 .725±.027 .715±.013 .721±.018 .737±.024
DUAL .818±.016 .825±.015 .830±.027 .845±.020 .894±.019
ALDL-MMD .540±.017 .548±.022 .645±.016 .556±.022 .582±.033
ALDL-kMMD .823±.029 .817±.026 .830±.020 .847±.013 .905±.015 32/2/1
Human-gene RANDOM .631±.044 .764±.034 .804±.015 .817±.028 .823±.016
MMAD .659±.017 .802±.034 .821±.039 .839±.026 .852±.028
MAED .600±.026 .691±.028 .766±.040 .802±.043 .820±.033
QBC .691±.027 .764±.029 .812±.022 .817±.029 .823±.024
QUIRE .818±.021 .824±.030 .829±.012 .906±.022 .916±.014
DUAL .909±.019 .925±.017 .935±.021 .948±.018 .958±.022
ALDL-MMD .820±.025 .925±.020 .931±.031 .933±.033 .934±.017
ALDL-kMMD .918±.016 .927±.017 .945±.018 .951±.014 .960±.017 31/4/0
Natural-scene RANDOM .821±.020 .884±.031 .908±.036 .912±.015 .927±.028
MMAD .827±.024 .858±.036 .879±.038 .926±.032 .940±.028
MAED .720±.160 .931±.029 .934±.056 .936±.034 .937±.031
QBC .855±.028 .939±.024 .943±.035 .954±.012 .957±.029
QUIRE .873±.027 .889±.019 .937±.023 .953±.017 .960±.022
DUAL .918±.021 .931±.020 .952±.014 .960±.018 .974±.013
ALDL-MMD .807±.026 .924±.014 .940±.018 .952±.025 .958±.018
ALDL-kMMD .922±.022 .944±.009 .969±.016 .972±.025 .976±.016 34/1/0
SBU-3DFE RANDOM .801±.031 .912±.028 .930±.016 .943±.019 .953±.016
MMAD .752±.014 .800±.024 .819±.036 .878±.040 .897±.031
MAED .812±.020 .924±.024 .941±.024 .944±.015 .953±.019
QBC .879±.019 .908±.016 .915±.035 .959±.017 .962±.017
QUIRE .876±.015 .921±.018 .939±.019 .945±.023 .967±.016
DUAL .912±.016 .944±.007 .973±.014 .976±.017 .979±.024
ALDL-MMD .879±.018 .933±.015 .935±.043 .936±.006 .937±.026
ALDL-kMMD .924±.020 .948±.018 .951±.016 .973±.011 .977±.007 31/3/1
Movie RANDOM .810±.016 .836±.017 .894±.034 .908±.033 .918±.027
MMAD .886±.019 .896±.020 .903±.029 .916±.025 .926±.030
MAED .906±.019 .919±.026 .923±.056 .927±.010 .928±.013
QBC .897±.029 .906±.017 .917±.036 .924±.017 .928±.013
QUIRE .895±.019 .913±.015 .925±.023 .927±.017 .929±.019
DUAL .933±.022 .946±.017 .952±.021 .955±.016 .957±.021
ALDL-MMD .913±.029 .927±.026 .940±.020 .942±.013 .945±.015
ALDL-kMMD .944±.031 .951±.039 .953±.039 .954±.031 .955±.015 31/4/0
Tab.3  
Fig.2  
Fig.3  
Method Yeast-cold Natural-scene Human-gene
RANDOM 2.563 2.674 10.254
QBC 24.563 26.436 100.545
MMAD 3.583 5.927 17.769
MAED 4.327 5.451 25.528
ALDL-MMD 4.256 4.765 24.658
ALDL-kMMD 4.654 4.836 26.342
Tab.4  