Frontiers of Computer Science

Front. Comput. Sci.    2022, Vol. 16 Issue (6) : 166338    https://doi.org/10.1007/s11704-021-1208-9
RESEARCH ARTICLE
Data fusing and joint training for learning with noisy labels
Yi WEI1, Mei XUE1, Xin LIU2, Pengxiang XU1
1. College of Electrical Engineering and Control Science, Nanjing Tech University, Nanjing 211816, China
2. Beijing Seetatech Technology Co., Ltd, Beijing 100029, China
Abstract

It is well known that deep learning depends on a large amount of clean data. Because of the high cost of annotation, various methods have been devoted to annotating data automatically. However, these methods introduce a large number of noisy labels into the datasets, which is a challenging problem. In this paper, we propose a new method for accurately selecting training data. Specifically, our approach fits a mixture model to the per-sample losses of the raw label and the predicted label, and the mixture model is used to dynamically divide the training set into a correctly labeled set, a correctly predicted set, and a wrong set. A network is then trained with these sets in a supervised manner. To counter the confirmation bias problem, we train two networks alternately, and each network establishes the data division used to teach the other network. When optimizing the network parameters, the labels of the samples are fused according to the probabilities given by the mixture model. Experiments on CIFAR-10, CIFAR-100 and Clothing1M demonstrate that this method performs on par with or better than state-of-the-art methods.
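The sample-division step described above can be illustrated with a minimal sketch: fit a two-component Gaussian mixture to each kind of per-sample loss and use the posterior probability of the low-loss component as a "clean" probability. The function name, the 0.5 threshold, the min-max loss normalization, and the exact three-way split rule below are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of GMM-based data division (assumption: per-sample losses for
# the raw labels and for the network-predicted labels are already computed).
import numpy as np
from sklearn.mixture import GaussianMixture


def divide_by_gmm(raw_label_loss, pred_label_loss, threshold=0.5):
    """Fit a 2-component GMM to each per-sample loss and split the training set.

    Returns boolean masks for the correctly labeled set, the correctly
    predicted set, and the wrong set, plus the per-sample clean probabilities.
    """

    def clean_prob(losses):
        losses = np.asarray(losses, dtype=np.float64)
        # Normalize losses to [0, 1] so the two components are easier to separate.
        losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
        gmm = GaussianMixture(n_components=2, max_iter=100, reg_covar=5e-4)
        gmm.fit(losses.reshape(-1, 1))
        # Posterior of the component with the smaller mean (small loss = likely clean).
        clean_component = gmm.means_.argmin()
        return gmm.predict_proba(losses.reshape(-1, 1))[:, clean_component]

    w_raw = clean_prob(raw_label_loss)    # probability that the raw label is correct
    w_pred = clean_prob(pred_label_loss)  # probability that the predicted label is correct

    correctly_labeled = w_raw >= threshold
    correctly_predicted = (~correctly_labeled) & (w_pred >= threshold)
    wrong = ~(correctly_labeled | correctly_predicted)
    return correctly_labeled, correctly_predicted, wrong, w_raw, w_pred
```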

Keywords: deep learning; noisy labels; data fusing
Corresponding Author(s): Mei XUE, Xin LIU
Just Accepted Date: 09 August 2021   Issue Date: 28 January 2022
 Cite this article:   
Yi WEI, Mei XUE, Xin LIU, et al. Data fusing and joint training for learning with noisy labels[J]. Front. Comput. Sci., 2022, 16(6): 166338.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-021-1208-9
https://academic.hep.com.cn/fcs/EN/Y2022/V16/I6/166338
Fig.1  Framework of our method. First, the two kinds of per-sample losses are generated by Net1. Then, the training dataset is divided into a correct set and a wrong set by the GMM, and the labels of the samples are refined by the weights from the GMM. Finally, the label-updated data are used to train Net2
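One plausible reading of the label-refinement step in Fig.1 is a convex combination of the raw one-hot label and the network prediction, weighted by the GMM clean probability. The sketch below is an assumption about the fusing rule (function name, inputs, and renormalization are illustrative); the paper's exact formula may differ.

```python
# Illustrative label fusing by GMM weights (assumed rule, not the authors' exact one).
import numpy as np


def refine_labels(one_hot_labels, net_probs, w_clean):
    """Fuse raw labels with network predictions per sample.

    one_hot_labels: (N, C) raw labels as one-hot vectors.
    net_probs:      (N, C) softmax outputs of the peer network.
    w_clean:        (N,)   GMM posterior that the raw label is correct.
    """
    w = np.asarray(w_clean).reshape(-1, 1)            # shape (N, 1) for broadcasting
    fused = w * one_hot_labels + (1.0 - w) * net_probs
    return fused / fused.sum(axis=1, keepdims=True)   # renormalize to a distribution
```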
Fig.2  Distributions of the normalized loss on CIFAR-10 with 80% noise ratio. Top: network A; bottom: network B. (a) Epoch 15: cross entropy; (b) Epoch 50: our method; (c) Epoch 150: our method; (d) Epoch 250: our method
Fig.3  Distributions of the normalized loss on CIFAR-10 with 80% noise ratio. Top: epoch 15; bottom: epoch 50. (a) The distributions of the small-loss method by GMM; (b) The distributions of DST; (c) The loss of raw labels; (d) The loss of predicted labels
Fig.4  
Method | Total | Correct | Rate/%
Small-Loss [18] at epoch 15 | 4346 | 4005 | 92.1
Our method at epoch 15 | 9441 | 8469 | 89.7
Small-Loss [18] at epoch 50 | 15489 | 11163 | 72.1
Our method at epoch 50 | 42301 | 35618 | 84.2
Tab.1  Comparison between our method and Small-Loss method
Dataset | Method | 20% noise | 50% noise | 80% noise
CIFAR-10 | Cross-Entropy | 86.8 | 79.5 | 62.7
CIFAR-10 | Co-teaching+ [16] | 89.5 | 85.7 | 67.4
CIFAR-10 | P-correction [24] | 92.4 | 89.1 | 77.5
CIFAR-10 | Meta-Learning [39] | 92.9 | 89.3 | 77.4
CIFAR-10 | M-correction [11] | 94.0 | 92.0 | 86.8
CIFAR-10 | DivideMix [18] | 96.1 | 94.6 | 93.2
CIFAR-10 | Our method | 96.1 | 95.2 | 92.8
CIFAR-100 | Cross-Entropy | 62.1 | 46.7 | 19.8
CIFAR-100 | Co-teaching+ [16] | 65.6 | 51.8 | 27.9
CIFAR-100 | P-correction [24] | 69.4 | 57.5 | 31.1
CIFAR-100 | Meta-Learning [39] | 68.5 | 59.2 | 42.4
CIFAR-100 | M-correction [11] | 73.9 | 66.1 | 48.2
CIFAR-100 | DivideMix [18] | 77.3 | 74.6 | 60.2
CIFAR-100 | Our method | 78.0 | 74.7 | 60.4
Tab.2  Comparison with baselines under Criterion 1
Dataset | Method | 20% noise | 40% noise | 60% noise | 80% noise
CIFAR-10 | MentorNet [14] | 92.0 | 89.0 | – | 49.0
CIFAR-10 | D2L [25] | 85.1 | 83.4 | 72.8 | –
CIFAR-10 | Reweight [13] | 86.9 | – | – | –
CIFAR-10 | Abstention [27] | 93.4 | 90.9 | 87.6 | 70.8
CIFAR-10 | M-correction [11] | 94.0 | 92.8 | 90.3 | 74.1
CIFAR-10 | DivideMix [18] | 96.2 | 94.9 | 94.3 | 79.8
CIFAR-10 | Our method | 96.0 | 95.3 | 94.2 | 88.5
CIFAR-100 | MentorNet [14] | 73.0 | 68.0 | – | 35.0
CIFAR-100 | D2L [25] | 62.2 | 52.0 | 42.3 | –
CIFAR-100 | Reweight [13] | 61.3 | – | – | –
CIFAR-100 | Abstention [27] | 75.8 | 68.2 | 59.4 | 34.1
CIFAR-100 | M-correction [11] | 73.7 | 70.1 | 59.5 | 45.5
CIFAR-100 | DivideMix [18] | 77.2 | 75.2 | 72.0 | 60.0
CIFAR-100 | Our method | 77.5 | 75.3 | 72.2 | 60.1
Tab.3  Comparison with baselines under Criterion 2
Method | Best accuracy/% | Last accuracy/%
Cross-Entropy | 85.3 | 71.7
P-correction [24] | 88.5 | 88.1
Meta-Learning [39] | 89.2 | 88.6
M-correction [11] | 87.4 | 86.3
DivideMix [18] | 93.4 | 92.1
Our method | 93.5 | 92.3
Tab.4  Comparison with baselines on CIFAR-10 with asymmetric noise
Method | Test accuracy/%
Cross-Entropy | 69.32
F-correction [8] | 69.84
M-correction [11] | 71.00
Joint-Optim [10] | 72.23
Meta-Learning [39] | 73.47
P-correction [24] | 73.49
DivideMix [18] | 74.76
Our method | 74.79
Tab.5  Comparison with state-of-the-art methods on Clothing1M
Method | 20% noise | 50% noise | 80% noise
Our method | 96.1 | 95.2 | 92.8
Training with one network | 95.4 | 94.3 | 91.7
Training w/o the correctly predicted set | 94.7 | 91.7 | 78.7
Training w/o the correctly labeled set | 95.5 | 94.5 | 91.6
Training with the wrong set | 94.5 | 92.1 | 79.7
Tab.6  Ablation results on CIFAR-10
Fig.5  Ablation study results on CIFAR-10. Noise ratio (a) 20%; (b) 50%; (c) 80%
1 Y Yan , R Rosales , G Fung , R Subramanian , J Dy . Learning from multiple annotators with varying expertise. Machine Learning, 2014, 95( 3): 291– 327
2 X Yu, T Liu, M Gong, D Tao. Learning with biased complementary labels. In: Proceedings of the 15th European Conference on Computer Vision. 2018, 69– 85
3 A Blum , A Kalai , H Wasserman . Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM, 2003, 50( 4): 506– 519
4 R Tanno, A Saeedi, S Sankaranarayanan, D C Alexander, N Silberman. Learning from noisy labels by regularized estimation of annotator confusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 11236−11245
5 C Zhang, S Bengio, M Hardt, B Recht, O Vinyals. Understanding deep learning requires rethinking generalization. In: Proceedings of the 5th International Conference on Learning Representations (ICLR). 2017
6 J Goldberger, E Ben-Reuven. Training deep neural-networks using a noise adaptation layer. In: Proceedings of the ICLR. 2017
7 T Liu , D Tao . Classification with noisy labels by importance reweighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38( 3): 447– 461
8 G Patrini, A Rozza, A K Menon, R Nock, L Qu. Making deep neural networks robust to label noise: a loss correction approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, 2233−2241
9 S E Reed, H Lee, D Anguelov, C Szegedy, D Erhan, A Rabinovich. Training deep neural networks on noisy labels with bootstrapping. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR). 2015
10 D Tanaka, D Ikami, T Yamasaki, K Aizawa. Joint optimization framework for learning with noisy labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2018, 5552−5560
11 E Arazo, D Ortego, P Albert, N E O’Connor, K McGuinness. Unsupervised label noise modeling and loss correction. In: Proceedings of the 36th International Conference on Machine Learning (ICML). 2019, 312– 321
12 H Zhang, M Cissé, Y N Dauphin, D Lopez-Paz. mixup: beyond empirical risk minimization. In: Proceedings of the 6th International Conference on Learning Representations (ICLR). 2018
13 M Ren, W Zeng, B Yang, R Urtasun. Learning to reweight examples for robust deep learning. In: Proceedings of the 35th International Conference on Machine Learning (ICML). 2018, 4331−4340
14 L Jiang, Z Zhou, T Leung, L Li, L Fei-Fei. MentorNet: learning data-driven curriculum for very deep neural networks on corrupted labels. In: Proceedings of the 35th International Conference on Machine Learning (ICML). 2018, 2309−2318
15 B Han, Q Yao, X Yu, G Niu, M Xu, W Hu, I W Tsang, M Sugiyama. Co-teaching: robust training of deep neural networks with extremely noisy labels. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems (NeurIPS). 2018, 8536−8546
16 X Yu, B Han, J Yao, G Niu, I W Tsang, M Sugiyama. How does disagreement help generalization against label corruption? In: Proceedings of the 36th International Conference on Machine Learning (ICML). 2019, 7164−7173
17 H Wei, L Feng, X Chen, B An. Combating noisy labels by agreement: a joint training method with co-regularization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, 13723−13732
18 J Li, R Socher, S C H Hoi. DivideMix: learning with noisy labels as semi-supervised learning. In: Proceedings of the 8th International Conference on Learning Representations. 2020
19 Y Li, J Yang, Y Song, L Cao, J Luo, L Li. Learning from noisy labels with distillation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2017, 1928−1936
20 T Xiao, T Xia, Y Yang, C Huang, X Wang. Learning from massive noisy labeled data for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015, 2691−2699
21 A Vahdat. Toward robustness against label noise in training deep discriminative neural networks. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS). 2017, 5601−5610
22 A Veit, N Alldrin, G Chechik, I Krasin, A Gupta, S J Belongie. Learning from noisy large-scale datasets with minimal supervision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, 6575−6583
23 K H Lee, X He, L Zhang, L Yang. CleanNet: transfer learning for scalable image classifier training with label noise. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2018, 5447−5456
24 K Yi, J Wu. Probabilistic end-to-end noise correction for learning with noisy labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 7010−7018
25 X Ma, Y Wang, M E Houle, S Zhou, S M Erfani, S Xia, S N R Wijewickrema, J Bailey. Dimensionality-driven learning with noisy labels. In: Proceedings of the 35th International Conference on Machine Learning (ICML). 2018, 3361−3370
26 D Hendrycks, M Mazeika, D Wilson, K Gimpel. Using trusted data to train deep networks on labels corrupted by severe noise. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems (NeurIPS). 2018, 10477−10486
27 S Thulasidasan, T Bhattacharya, J A Bilmes, G Chennupati, J Mohd-Yusof. Combating label noise in deep learning using abstention. In: Proceedings of the 36th International Conference on Machine Learning (ICML). 2019, 6234−6243
28 Y Shen, S Sanghavi. Learning with bad training data via iterative trimmed loss minimization. In: Proceedings of the 36th International Conference on Machine Learning (ICML). 2019, 5739−5748
29 A Ghosh, H Kumar, P S Sastry. Robust loss functions under label noise for deep neural networks. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI). 2017, 1919−1925
30 Y Wang, X Ma, Z Chen, Y Luo, J Yi, J Bailey. Symmetric cross entropy for robust learning with noisy labels. In: Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, 322– 330
31 Y Ding, L Wang, D Fan, B Gong. A semi-supervised two-stage approach to learning from noisy labels. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV). 2018, 1215−1224
32 K Kong , J Lee , Y Kwak , M Kang , S G Kim , W J Song . Recycling: semi-supervised learning with noisy labels in deep neural networks. IEEE Access, 2019, 7 : 66998– 67005
33 D Berthelot, N Carlini, I J Goodfellow, A Oliver, N Papernot, C Raffel. MixMatch: a holistic approach to semi-supervised learning. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems (NeurIPS). 2019, 454
34 D Arpit, S Jastrzebski, N Ballas, D Krueger, E Bengio, M S Kanwal, T Maharaj, A Fischer, A C Courville, Y Bengio, S Lacoste-Julien. A closer look at memorization in deep networks. In: Proceedings of the 34th International Conference on Machine Learning (ICML). 2017, 233−242
35 P Chen, B Liao, G Chen, S Zhang. Understanding and utilizing deep neural networks trained with noisy labels. In: Proceedings of the 36th International Conference on Machine Learning (ICML). 2019, 1062−1070
36 H Permuter , J Francos , I Jermyn . A study of Gaussian mixture models of color and texture features for image classification and segmentation. Pattern Recognition, 2006, 39( 4): 695– 706
37 A Tarvainen, H Valpola. Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS). 2017, 1195−1204
38 A Krizhevsky. Learning multiple layers of features from tiny images. University of Toronto, Dissertation, 2009
39 J Li, Y Wong, Q Zhao, M S Kankanhalli. Learning to learn from noisy labeled data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 5046−5054
40 Y Wang, W Liu, X Ma, J Bailey, H Zha, L Song, S T Xia. Iterative learning with open-set noisy labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2018, 8688−8696
41 K He, X Zhang, S Ren, J Sun. Identity mappings in deep residual networks. In: Proceedings of the 14th European Conference on Computer Vision (ECCV). 2016, 630−645