Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci., 2023, Vol. 17, Issue 5: 175331. https://doi.org/10.1007/s11704-022-2225-z
RESEARCH ARTICLE
Attribute augmentation-based label integration for crowdsourcing
Yao ZHANG1, Liangxiao JIANG1, Chaoqun LI2
1. School of Computer Science, China University of Geosciences, Wuhan 430074, China
2. School of Mathematics and Physics, China University of Geosciences, Wuhan 430074, China
Abstract

Crowdsourcing provides an effective and low-cost way to collect labels from crowd workers. Because crowd workers often lack professional knowledge, the quality of crowdsourced labels is relatively low. A common remedy is to collect multiple labels for each instance from different crowd workers and then apply a label integration method to infer its true label. However, to our knowledge, almost all existing label integration methods use only the original attribute information and pay no attention to the quality of each instance's multiple noisy label set. To address these issues, this paper proposes a novel three-stage label integration method called attribute augmentation-based label integration (AALI). In the first stage, we design an attribute augmentation method to enrich the original attribute space. In the second stage, we develop a filter to single out reliable instances with high-quality multiple noisy label sets. In the third stage, we use majority voting to initialize the integrated labels of the reliable instances and then use cross-validation to build multiple component classifiers on the reliable instances to predict all instances. Experimental results on simulated and real-world crowdsourced datasets demonstrate that AALI outperforms all the other state-of-the-art competitors.
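
The filtering and initialization steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the filter criterion, so the agreement ratio and the `min_agreement` threshold below are assumptions chosen for illustration, and the function names are hypothetical. Attribute augmentation and the cross-validated component classifiers are omitted.

```python
from collections import Counter


def majority_vote(noisy_labels):
    """Return the most frequent label in one instance's multiple noisy label set."""
    counts = Counter(noisy_labels)
    best = max(counts.values())
    # On a tie, the label that appears first in the label set wins.
    return next(label for label, c in counts.items() if c == best)


def agreement(noisy_labels):
    """Fraction of workers who voted for the majority label (assumed quality proxy)."""
    counts = Counter(noisy_labels)
    return max(counts.values()) / len(noisy_labels)


def split_reliable(label_sets, min_agreement=0.7):
    """Partition instance indices by an assumed agreement threshold."""
    reliable, unreliable = [], []
    for idx, labels in enumerate(label_sets):
        if agreement(labels) >= min_agreement:
            reliable.append(idx)
        else:
            unreliable.append(idx)
    return reliable, unreliable


# Toy data: three instances, five crowd labels each.
label_sets = [
    ["cat", "cat", "cat", "dog", "cat"],       # 4 of 5 workers agree
    ["dog", "cat", "dog", "cat", "bird"],      # at most 2 of 5 agree
    ["bird", "bird", "bird", "bird", "bird"],  # unanimous
]

reliable, unreliable = split_reliable(label_sets, min_agreement=0.7)
# Majority voting initializes integrated labels for the reliable instances only.
initial_labels = {i: majority_vote(label_sets[i]) for i in reliable}
```

In this toy run the second instance is set aside as unreliable; in the method described above, its label would instead come from classifiers trained on the reliable instances.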

Keywords: crowdsourcing, label integration, attribute augmentation, instance filtering
Corresponding Author(s): Liangxiao JIANG   
Just Accepted Date: 19 August 2022   Issue Date: 15 December 2022
 Cite this article:   
Yao ZHANG,Liangxiao JIANG,Chaoqun LI. Attribute augmentation-based label integration for crowdsourcing[J]. Front. Comput. Sci., 2023, 17(5): 175331.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-022-2225-z
https://academic.hep.com.cn/fcs/EN/Y2023/V17/I5/175331
Fig.1  Overview framework of AALI
  
Dataset #Ins #Att Att.type #Classes
Anneal 898 38 Hybrid 6
Audiology 226 69 Nominal 24
Autos 205 25 Hybrid 7
Balance-scale 625 4 Numeric 3
Biodeg 1055 41 Numeric 2
Breast-cancer 286 9 Nominal 2
Breast-w 699 9 Numeric 2
Car 1728 6 Nominal 4
Credit-a 690 15 Hybrid 2
Credit-g 1000 20 Hybrid 2
Diabetes 768 8 Numeric 2
Heart-c 303 13 Hybrid 5
Heart-h 294 13 Hybrid 5
Heart-statlog 270 13 Numeric 2
Hepatitis 155 19 Hybrid 2
Horse-colic 368 22 Hybrid 2
Hypothyroid 3772 29 Hybrid 4
Ionosphere 351 34 Numeric 2
Iris 150 4 Numeric 3
Kr-vs-kp 3196 36 Nominal 2
Labor 57 16 Hybrid 2
Letter 20000 16 Numeric 26
Lymph 148 18 Hybrid 4
Mushroom 8124 22 Nominal 2
Segment 2310 19 Numeric 7
Sick 3772 29 Hybrid 2
Sonar 208 60 Numeric 2
Spambase 4601 57 Numeric 2
Tic-tac-toe 958 9 Nominal 2
Vehicle 846 18 Numeric 4
Vote 435 16 Nominal 2
Vowel 990 13 Hybrid 11
Waveform 5000 40 Numeric 3
Zoo 101 17 Hybrid 7
Tab.1  Description of 34 simulated datasets
Dataset LQo LQr LQu Perr Peru
Anneal 88.75 88.75 ? 100 0
Audiology 83.63 83.63 ? 100 0
Autos 95.61 95.61 ? 100 0
Balance-scale 79.36 88.79 62.57 71.36 28.64
Biodeg 74.98 95.89 71.05 20.76 79.24
Breast-cancer 75.52 96.36 71 19.23 80.77
Breast-w 77.11 96.67 73.59 21.46 78.54
Car 84.9 86.33 50 99.88 0.12
Credit-a 74.49 94.21 70.12 17.54 82.46
Credit-g 73 96.97 68.14 16.5 83.5
Diabetes 76.43 96.3 69.98 17.58 82.42
Heart-c 71.62 71.62 ? 100 0
Heart-h 70.07 70.07 ? 100 0
Heart-statlog 71.85 92.11 68.97 14.07 85.93
Hepatitis 72.26 92.86 68.5 18.06 81.94
Horse-colic 78.8 98.51 75.75 18.21 81.79
Hypothyroid 85.42 88.44 100 99.97 0.03
Ionosphere 78.06 96.83 74.65 17.95 82.05
Iris 77.33 93.9 52.94 54.67 45.33
Kr-vs-kp 76.31 94.6 72.03 19.12 80.88
Labor 70.18 91.67 64.44 21.05 78.95
Letter 99.07 99.07 ? 100 0
Lymph 70.95 70.95 ? 100 0
Mushroom 64.99 81.28 62.44 13.81 86.19
Segment 97.1 97.05 28.57 99.7 0.3
Sick 68.43 91.31 65.34 16.17 83.83
Sonar 69.23 88 66.67 12.02 87.98
Spambase 79.09 95.04 72.71 23.23 76.77
Tic-tac-toe 73.7 92.63 70.96 19.83 80.17
Vehicle 89.13 91.44 37.5 95.27 4.73
Vote 72.64 91.46 67.14 18.85 81.15
Vowel 98.08 98.08 ? 100 0
Waveform 93.04 99.03 76.22 71.82 28.18
Zoo 82.18 82.18 ? 100 0
Average 79.22 90.81 66.45 53.47 46.53
Tab.2  The label quality (%) of the integrated labels inferred by MV from the original dataset, the reliable dataset and the unreliable dataset
Dataset DS ZC KOS GTIC IWMV AALI
Anneal 35.16 80.60 93.21 54.77 91.50 91.14
Audiology 86.81 59.87 65.35 78.67 81.77 83.81
Autos 86.05 36.10 78.68 83.56 92.59 92.29
Balance-scale 53.22 77.58 82.46 82.98 84.05 85.57
Biodeg 66.26 70.88 75.97 75.33 76.12 78.91
Breast-cancer 70.28 69.16 66.26 73.57 70.24 75.98
Breast-w 65.52 75.55 76.47 76.60 76.37 81.33
Car 76.45 78.64 88.20 81.83 86.24 86.55
Credit-a 55.51 74.87 77.55 73.97 77.41 82.84
Credit-g 70.00 72.63 76.36 77.64 76.27 77.99
Diabetes 65.10 72.70 74.65 74.43 74.39 76.45
Heart-c 53.56 70.63 71.35 43.50 73.40 73.99
Heart-h 63.95 69.63 68.61 36.80 73.33 75.44
Heart-statlog 55.56 74.74 69.33 75.93 75.04 84.78
Hepatitis 79.35 67.55 60.84 63.35 68.65 69.10
Horse-colic 63.04 71.03 67.66 68.83 72.01 69.38
Hypothyroid 92.31 79.07 93.26 62.11 88.10 89.68
Ionosphere 64.10 70.11 70.80 70.11 73.93 75.73
Iris 84.73 49.20 82.13 86.87 88.13 94.40
Kr-vs-kp 52.22 74.39 77.21 74.71 77.09 81.29
Labor 64.21 67.72 55.44 68.77 67.89 70.88
Letter 99.17 95.77 85.52 99.34 99.44 99.64
Lymph 62.23 70.88 77.57 77.91 77.64 78.38
Mushroom 51.80 73.25 73.57 72.00 73.38 69.62
Segment 96.94 70.73 88.43 97.05 97.37 98.32
Sick 93.88 73.33 77.22 73.84 77.11 80.79
Sonar 52.69 72.50 71.92 74.57 74.62 74.71
Spambase 60.60 72.48 75.34 74.57 75.25 80.21
Tic-tac-toe 65.34 75.92 77.78 77.80 77.32 77.23
Vehicle 92.80 63.51 88.16 92.96 93.45 93.66
Vote 61.38 69.36 65.24 72.78 72.07 77.40
Vowel 97.31 87.93 78.66 98.40 98.36 99.01
Waveform 89.66 73.86 87.30 88.54 89.74 92.62
Zoo 76.34 12.97 77.03 88.81 85.54 90.59
Average 70.69 69.86 76.34 75.67 80.47 82.64
Tab.3  Label quality (%) comparisons of six methods on 34 simulated datasets
DS ZC KOS GTIC IWMV AALI
DS ? ° ° ° °
ZC ? ° ° ° °
KOS ? ? ° °
GTIC ? ? ? ° °
IWMV ? ? ? ? ? °
AALI ? ? ? ? ? ?
Tab.4  The label quality comparisons of the Wilcoxon signed-ranks test
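
For readers unfamiliar with the test behind this table, the following Python sketch computes the positive and negative rank sums used by a Wilcoxon signed-ranks test on paired scores. It is an illustration under stated assumptions, not the authors' procedure: zero differences are simply dropped, tied absolute differences receive average ranks, and the function name `signed_rank_sums` is hypothetical. The sample values are the first five IWMV and AALI rows of Tab.3, used only to show the mechanics; a real test would use all 34 datasets.

```python
def signed_rank_sums(scores_a, scores_b, ndigits=2):
    """Return (R+, R-) for paired scores, where R+ sums the ranks with a > b."""
    # Round so that floating-point noise does not hide genuine ties.
    diffs = [round(a - b, ndigits) for a, b in zip(scores_a, scores_b)]
    # Assumption: zero differences are discarded before ranking.
    diffs = [d for d in diffs if d != 0]

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        avg_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1

    r_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return r_plus, r_minus


# First five rows of Tab.3 (Anneal, Audiology, Autos, Balance-scale, Biodeg).
iwmv = [91.50, 81.77, 92.59, 84.05, 76.12]
aali = [91.14, 83.81, 92.29, 85.57, 78.91]

r_plus, r_minus = signed_rank_sums(aali, iwmv)
statistic = min(r_plus, r_minus)  # compared against a critical value for N pairs
```

The smaller of the two rank sums is the test statistic; whether it indicates a significant difference depends on the number of pairs and the chosen significance level, neither of which is fixed by this sketch.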
Dataset DS ZC KOS GTIC IWMV AALI
Anneal 36.17 80.29 84.19 49.11 86.53 90.88
Audiology 72.12 75.66 68.58 74.78 72.12 76.32
Autos 70.73 56.10 61.46 70.24 76.59 79.04
Balance-scale 46.08 77.76 79.84 76.20 75.72 78.33
Biodeg 66.26 74.31 79.04 76.48 75.20 78.24
Breast-cancer 70.28 71.33 73.43 70.98 70.28 74.83
Breast-w 65.52 91.70 89.70 88.13 93.13 95.28
Car 73.44 88.89 88.14 89.99 89.12 92.17
Credit-a 55.51 79.57 81.59 76.52 78.70 85.51
Credit-g 70.00 67.20 67.60 68.50 68.70 70.81
Diabetes 65.10 68.10 70.70 72.79 60.42 73.35
Heart-c 54.46 62.71 52.48 21.78 66.01 75.58
Heart-h 63.95 74.49 77.55 26.87 75.51 75.81
Heart-statlog 55.56 62.59 64.44 63.70 62.59 73.15
Hepatitis 79.35 76.13 72.90 64.94 74.19 82.45
Horse-colic 63.04 85.05 83.70 80.16 69.84 93.48
Hypothyroid 92.29 94.98 98.13 96.29 96.99 97.48
Ionosphere 64.10 78.06 78.63 82.62 66.38 93.73
Iris 80.00 86.67 86.37 90.67 90.00 91.67
Kr-vs-kp 52.22 97.47 97.31 92.49 97.28 99.47
Labor 64.91 73.68 49.12 75.44 63.16 75.44
Letter 88.00 82.11 86.83 88.00 87.45 92.43
Lymph 54.73 75.00 72.97 68.24 68.24 77.70
Mushroom 51.80 97.45 98.73 79.10 99.06 100.00
Segment 96.49 82.90 95.36 96.54 96.76 96.88
Sick 93.88 94.70 97.03 94.51 73.62 99.81
Sonar 53.37 54.33 54.33 61.06 63.94 73.88
Spambase 60.60 88.63 88.20 88.61 89.15 92.31
Tic-tac-toe 65.34 73.38 77.33 75.87 80.48 85.28
Vehicle 69.27 50.95 65.60 70.21 71.63 73.46
Vote 61.38 90.33 90.71 86.13 92.87 94.13
Vowel 79.70 78.48 73.30 77.88 81.92 79.90
Waveform 71.88 71.04 70.66 70.16 71.90 75.38
Zoo 81.16 70.30 70.30 90.07 91.48 92.10
Average 67.31 77.42 77.83 75.15 78.73 84.89
Tab.5  Model quality (%) comparisons of six methods on 34 simulated datasets
DS ZC KOS GTIC IWMV AALI
DS ? ° ° ° ° °
ZC ? ? °
KOS ? ? °
GTIC ? ? °
IWMV ? ? °
AALI ? ? ? ? ? -
Tab.6  The model quality comparisons of the Wilcoxon signed-ranks test
Dataset #Ins #Att Att.type #Classes #Workers #Labels
Income 600 10 hybrid 2 67 6000
Leaves 384 64 numeric 6 83 3840
LabelMe 1000 512 numeric 8 59 2547
NER 5985 2824 nominal 2 47 27990
Tab.7  Description of four real-world datasets
Dataset DS ZC KOS GTIC IWMV AALI
Income 72.67 72.03 71.33 71.83 71.77 73.4
Leaves 63.8 64.61 63.54 62.24 64.35 66.69
LabelMe 74.7 77.2 76.5 76.7 77.2 78.7
NER 89.64 88.64 86.83 83.11 87.95 84.11
Tab.8  Label quality (%) comparisons of six methods on four real-world datasets
Dataset DS ZC KOS GTIC IWMV AALI
Income 69.83 69 71 70.33 69.17 76.67
Leaves 54.43 52.34 53.39 50.52 52.6 78.39
LabelMe 48.8 49 47.5 47.2 47 52.8
NER 85.03 84.86 84.06 82.09 84.48 87.18
Tab.9  Model quality (%) comparisons of six methods on four real-world datasets
Fig.2  The comparison results for AALI and its two variants on the real-world dataset “Leaves”. (a) Label quality comparison results; (b) Model quality comparison results
Fig.3  The label quality of AALI when θ varies from 0.05 to 0.95 on the real-world datasets “Income” and “Leaves”. (a) “Income”; (b) “Leaves”
  
  
  