Attribute augmentation-based label integration for crowdsourcing

Front. Comput. Sci., 2023, Vol. 17, Issue 5: 175331
https://doi.org/10.1007/s11704-022-2225-z

RESEARCH ARTICLE


Yao ZHANG¹, Liangxiao JIANG¹, Chaoqun LI²

¹. School of Computer Science, China University of Geosciences, Wuhan 430074, China
². School of Mathematics and Physics, China University of Geosciences, Wuhan 430074, China


Abstract

Crowdsourcing provides an effective and low-cost way to collect labels from crowd workers. Because crowd workers often lack professional knowledge, the quality of crowdsourced labels is relatively low. A common approach to addressing this issue is to collect multiple labels for each instance from different crowd workers and then use a label integration method to infer its true label. However, to our knowledge, almost all existing label integration methods merely make use of the original attribute information and pay no attention to the quality of each instance's multiple noisy label set. To address these two issues, this paper proposes a novel three-stage label integration method called attribute augmentation-based label integration (AALI). In the first stage, we design an attribute augmentation method to enrich the original attribute space. In the second stage, we develop a filter to single out reliable instances, that is, instances with high-quality multiple noisy label sets. In the third stage, we use majority voting to initialize the integrated labels of the reliable instances and then use cross-validation to build multiple component classifiers on the reliable instances to predict the labels of all instances. Experimental results on simulated and real-world crowdsourced datasets demonstrate that AALI outperforms all of the state-of-the-art competitors considered.
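The majority-voting step and the idea of separating reliable from unreliable instances can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the filter works, so `split_by_agreement` and its `threshold` parameter are hypothetical stand-ins that treat an instance as reliable when enough workers agree with the majority label. The function names and the toy data are likewise assumptions, and the attribute augmentation and cross-validation stages are not covered.

```python
from collections import Counter


def majority_vote(noisy_labels):
    """Return the most frequent label in one instance's multiple noisy label set.

    Ties go to the label that appears first (an assumption of this sketch).
    """
    return Counter(noisy_labels).most_common(1)[0][0]


def agreement_ratio(noisy_labels):
    """Fraction of workers whose label equals the majority label."""
    top_count = Counter(noisy_labels).most_common(1)[0][1]
    return top_count / len(noisy_labels)


def split_by_agreement(label_sets, threshold):
    """Hypothetical stand-in filter, NOT the paper's filter.

    An instance index is placed in `reliable` when its agreement ratio
    reaches `threshold`, otherwise in `unreliable`.
    """
    reliable, unreliable = [], []
    for idx, labels in enumerate(label_sets):
        if agreement_ratio(labels) >= threshold:
            reliable.append(idx)
        else:
            unreliable.append(idx)
    return reliable, unreliable


def label_quality(integrated, truth):
    """Percentage of integrated labels that equal the true labels."""
    correct = sum(1 for a, b in zip(integrated, truth) if a == b)
    return 100.0 * correct / len(truth)


# Toy data (invented for illustration): four instances, five workers each.
label_sets = [
    ["a", "a", "a", "b", "a"],
    ["b", "a", "b", "b", "b"],
    ["a", "b", "a", "b", "b"],
    ["b", "b", "a", "a", "a"],
]
truth = ["a", "b", "a", "a"]

integrated = [majority_vote(labels) for labels in label_sets]
reliable, unreliable = split_by_agreement(label_sets, threshold=0.8)

quality_all = label_quality(integrated, truth)
quality_reliable = label_quality(
    [integrated[i] for i in reliable], [truth[i] for i in reliable]
)
```

On this toy data the labels integrated on the subset that passes the filter are more accurate than those on the full set, which is the intuition behind building classifiers only on reliable instances.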

Keywords: crowdsourcing; label integration; attribute augmentation; instance filtering

Corresponding Author(s): Liangxiao JIANG

Just Accepted Date: 19 August 2022
Issue Date: 15 December 2022

Cite this article:

Yao ZHANG, Liangxiao JIANG, Chaoqun LI. Attribute augmentation-based label integration for crowdsourcing[J]. Front. Comput. Sci., 2023, 17(5): 175331.

URL:

https://academic.hep.com.cn/fcs/EN/10.1007/s11704-022-2225-z
https://academic.hep.com.cn/fcs/EN/Y2023/V17/I5/175331

Fig.1 Overview framework of AALI

Tab.1 Description of 34 simulated datasets

Dataset	$LQ_o$	$LQ_r$	$LQ_u$	$Per_r$	$Per_u$
Anneal	88.75	88.75	-	100	0
Audiology	83.63	83.63	-	100	0
Autos	95.61	95.61	-	100	0
Balance-scale	79.36	88.79	62.57	71.36	28.64
Biodeg	74.98	95.89	71.05	20.76	79.24
Breast-cancer	75.52	96.36	71	19.23	80.77
Breast-w	77.11	96.67	73.59	21.46	78.54
Car	84.9	86.33	50	99.88	0.12
Credit-a	74.49	94.21	70.12	17.54	82.46
Credit-g	73	96.97	68.14	16.5	83.5
Diabetes	76.43	96.3	69.98	17.58	82.42
Heart-c	71.62	71.62	-	100	0
Heart-h	70.07	70.07	-	100	0
Heart-statlog	71.85	92.11	68.97	14.07	85.93
Hepatitis	72.26	92.86	68.5	18.06	81.94
Horse-colic	78.8	98.51	75.75	18.21	81.79
Hypothyroid	85.42	88.44	100	99.97	0.03
Ionosphere	78.06	96.83	74.65	17.95	82.05
Iris	77.33	93.9	52.94	54.67	45.33
Kr-vs-kp	76.31	94.6	72.03	19.12	80.88
Labor	70.18	91.67	64.44	21.05	78.95
Letter	99.07	99.07	-	100	0
Lymph	70.95	70.95	-	100	0
Mushroom	64.99	81.28	62.44	13.81	86.19
Segment	97.1	97.05	28.57	99.7	0.3
Sick	68.43	91.31	65.34	16.17	83.83
Sonar	69.23	88	66.67	12.02	87.98
Spambase	79.09	95.04	72.71	23.23	76.77
Tic-tac-toe	73.7	92.63	70.96	19.83	80.17
Vehicle	89.13	91.44	37.5	95.27	4.73
Vote	72.64	91.46	67.14	18.85	81.15
Vowel	98.08	98.08	-	100	0
Waveform	93.04	99.03	76.22	71.82	28.18
Zoo	82.18	82.18	-	100	0
Average	79.22	90.81	66.45	53.47	46.53

Tab.2 The label quality (%) of the integrated labels inferred by MV from the original dataset, the reliable dataset and the unreliable dataset

Dataset	DS	ZC	KOS	GTIC	IWMV	AALI
Anneal	35.16	80.60	93.21	54.77	91.50	91.14
Audiology	86.81	59.87	65.35	78.67	81.77	83.81
Autos	86.05	36.10	78.68	83.56	92.59	92.29
Balance-scale	53.22	77.58	82.46	82.98	84.05	85.57
Biodeg	66.26	70.88	75.97	75.33	76.12	78.91
Breast-cancer	70.28	69.16	66.26	73.57	70.24	75.98
Breast-w	65.52	75.55	76.47	76.60	76.37	81.33
Car	76.45	78.64	88.20	81.83	86.24	86.55
Credit-a	55.51	74.87	77.55	73.97	77.41	82.84
Credit-g	70.00	72.63	76.36	77.64	76.27	77.99
Diabetes	65.10	72.70	74.65	74.43	74.39	76.45
Heart-c	53.56	70.63	71.35	43.50	73.40	73.99
Heart-h	63.95	69.63	68.61	36.80	73.33	75.44
Heart-statlog	55.56	74.74	69.33	75.93	75.04	84.78
Hepatitis	79.35	67.55	60.84	63.35	68.65	69.10
Horse-colic	63.04	71.03	67.66	68.83	72.01	69.38
Hypothyroid	92.31	79.07	93.26	62.11	88.10	89.68
Ionosphere	64.10	70.11	70.80	70.11	73.93	75.73
Iris	84.73	49.20	82.13	86.87	88.13	94.40
Kr-vs-kp	52.22	74.39	77.21	74.71	77.09	81.29
Labor	64.21	67.72	55.44	68.77	67.89	70.88
Letter	99.17	95.77	85.52	99.34	99.44	99.64
Lymph	62.23	70.88	77.57	77.91	77.64	78.38
Mushroom	51.80	73.25	73.57	72.00	73.38	69.62
Segment	96.94	70.73	88.43	97.05	97.37	98.32
Sick	93.88	73.33	77.22	73.84	77.11	80.79
Sonar	52.69	72.50	71.92	74.57	74.62	74.71
Spambase	60.60	72.48	75.34	74.57	75.25	80.21
Tic-tac-toe	65.34	75.92	77.78	77.80	77.32	77.23
Vehicle	92.80	63.51	88.16	92.96	93.45	93.66
Vote	61.38	69.36	65.24	72.78	72.07	77.40
Vowel	97.31	87.93	78.66	98.40	98.36	99.01
Waveform	89.66	73.86	87.30	88.54	89.74	92.62
Zoo	76.34	12.97	77.03	88.81	85.54	90.59
Average	70.69	69.86	76.34	75.67	80.47	82.64

Tab.3 Label quality (%) comparisons of six methods on 34 simulated datasets

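	DS	ZC	KOS	GTIC	IWMV	AALI
DS	-		$\circ$	$\circ$	$\circ$	$\circ$
ZC		-	$\circ$	$\circ$	$\circ$	$\circ$
KOS		$\bullet$	-		$\circ$	$\circ$
GTIC	$\bullet$	$\bullet$		-	$\circ$	$\circ$
IWMV	$\bullet$	$\bullet$	$\bullet$	$\bullet$	-	$\circ$
AALI	$\bullet$	$\bullet$	$\bullet$	$\bullet$	$\bullet$	-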

Tab.4 The label quality comparisons of the Wilcoxon signed-ranks test
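For readers unfamiliar with the test named in this caption, the following Python sketch computes the Wilcoxon signed-ranks statistics for two methods over paired per-dataset scores, using the standard formulation with average ranks for ties and a normal approximation for z. The function names are hypothetical, and the input is only the first six rows (Anneal to Breast-cancer) of the IWMV and AALI columns from the six-method label-quality table above, so the output illustrates the computation and does not reproduce the paper's 34-dataset result.

```python
import math


def average_ranks(values):
    """Rank values in ascending order (1-based); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def wilcoxon_signed_ranks(scores_a, scores_b):
    """Return (R+, R-, z) for paired scores, where R+ favors scores_b.

    Zero differences have their ranks split evenly between R+ and R-.
    """
    # Round to two decimals so that equal table entries compare as equal.
    diffs = [round(b - a, 2) for a, b in zip(scores_a, scores_b)]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    zero_share = sum(r for d, r in zip(diffs, ranks) if d == 0) / 2.0
    r_plus += zero_share
    r_minus += zero_share
    n = len(diffs)
    t = min(r_plus, r_minus)
    # Normal approximation; it is only reliable for a larger number of datasets.
    z = (t - n * (n + 1) / 4.0) / math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return r_plus, r_minus, z


# Excerpt: label quality (%) on Anneal, Audiology, Autos, Balance-scale,
# Biodeg and Breast-cancer, copied from the table above.
iwmv = [91.50, 81.77, 92.59, 84.05, 76.12, 70.24]
aali = [91.14, 83.81, 92.29, 85.57, 78.91, 75.98]

r_plus, r_minus, z = wilcoxon_signed_ranks(iwmv, aali)
# On this excerpt: R+ = 18.0 and R- = 3.0.
```

With a two-sided significance level of 0.05, the null hypothesis of equal performance is rejected when z is below -1.96. Six datasets are too few for the normal approximation to be trusted, which is one reason the excerpt should be read only as a worked example.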

Dataset	DS	ZC	KOS	GTIC	IWMV	AALI
Anneal	36.17	80.29	84.19	49.11	86.53	90.88
Audiology	72.12	75.66	68.58	74.78	72.12	76.32
Autos	70.73	56.10	61.46	70.24	76.59	79.04
Balance-scale	46.08	77.76	79.84	76.20	75.72	78.33
Biodeg	66.26	74.31	79.04	76.48	75.20	78.24
Breast-cancer	70.28	71.33	73.43	70.98	70.28	74.83
Breast-w	65.52	91.70	89.70	88.13	93.13	95.28
Car	73.44	88.89	88.14	89.99	89.12	92.17
Credit-a	55.51	79.57	81.59	76.52	78.70	85.51
Credit-g	70.00	67.20	67.60	68.50	68.70	70.81
Diabetes	65.10	68.10	70.70	72.79	60.42	73.35
Heart-c	54.46	62.71	52.48	21.78	66.01	75.58
Heart-h	63.95	74.49	77.55	26.87	75.51	75.81
Heart-statlog	55.56	62.59	64.44	63.70	62.59	73.15
Hepatitis	79.35	76.13	72.90	64.94	74.19	82.45
Horse-colic	63.04	85.05	83.70	80.16	69.84	93.48
Hypothyroid	92.29	94.98	98.13	96.29	96.99	97.48
Ionosphere	64.10	78.06	78.63	82.62	66.38	93.73
Iris	80.00	86.67	86.37	90.67	90.00	91.67
Kr-vs-kp	52.22	97.47	97.31	92.49	97.28	99.47
Labor	64.91	73.68	49.12	75.44	63.16	75.44
Letter	88.00	82.11	86.83	88.00	87.45	92.43
Lymph	54.73	75.00	72.97	68.24	68.24	77.70
Mushroom	51.80	97.45	98.73	79.10	99.06	100.00
Segment	96.49	82.90	95.36	96.54	96.76	96.88
Sick	93.88	94.70	97.03	94.51	73.62	99.81
Sonar	53.37	54.33	54.33	61.06	63.94	73.88
Spambase	60.60	88.63	88.20	88.61	89.15	92.31
Tic-tac-toe	65.34	73.38	77.33	75.87	80.48	85.28
Vehicle	69.27	50.95	65.60	70.21	71.63	73.46
Vote	61.38	90.33	90.71	86.13	92.87	94.13
Vowel	79.70	78.48	73.30	77.88	81.92	79.90
Waveform	71.88	71.04	70.66	70.16	71.90	75.38
Zoo	81.16	70.30	70.30	90.07	91.48	92.10
Average	67.31	77.42	77.83	75.15	78.73	84.89

Tab.5 Model quality (%) comparisons of six methods on 34 simulated datasets

	DS	ZC	KOS	GTIC	IWMV	AALI
DS	-	$\circ$	$\circ$	$\circ$	$\circ$	$\circ$
ZC	$\bullet$	-				$\circ$
KOS	$\bullet$		-			$\circ$
GTIC	$\bullet$			-		$\circ$
IWMV	$\bullet$				-	$\circ$
AALI	$\bullet$	$\bullet$	$\bullet$	$\bullet$	$\bullet$	-

Tab.6 The model quality comparisons of the Wilcoxon signed-ranks test

Tab.7 Description of four real-world datasets

Tab.8 Label quality (%) comparisons of six methods on four real-world datasets

Tab.9 Model quality (%) comparisons of six methods on four real-world datasets

Fig.2 The comparison results for AALI and its two variants on the real-world dataset “Leaves”. (a) Label quality comparison results; (b) Model quality comparison results

Fig.3 The label quality of AALI when θ varies from 0.05 to 0.95 on the real-world datasets “Income” and “Leaves”. (a) “Income”; (b) “Leaves”



