Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP


Front. Comput. Sci.    2023, Vol. 17 Issue (5) : 175333    https://doi.org/10.1007/s11704-022-2134-1
RESEARCH ARTICLE
Combating with extremely noisy samples in weakly supervised slot filling for automatic diagnosis
Xiaoming SHI, Wanxiang CHE
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract

Slot filling, which extracts entities for specific types of information (slots), is a vitally important module of dialogue systems for automatic diagnosis. Doctor responses can be regarded as weak supervision for patient queries, so a large amount of weakly labeled data can be obtained from unlabeled diagnosis dialogues, alleviating the cost and time of data annotation. However, weakly labeled data suffer from extremely noisy samples. To alleviate this problem, we propose a simple and effective Co-Weak-Teaching method. The method trains two slot filling models simultaneously. The two models learn from two different sets of weakly labeled data, ensuring that they learn from two perspectives. Each model then iteratively utilizes weakly labeled data selected by the other. The model obtained by Co-Weak-Teaching on weakly labeled data can be tested directly on the test data or first fine-tuned on a small amount of human-annotated data. Experimental results in these two settings illustrate the effectiveness of the method, with increases of 8.03% and 14.74% in micro and macro f1 scores, respectively.

Keywords dialogue system      slot filling      co-teaching     
Corresponding Author(s): Xiaoming SHI, Wanxiang CHE
Just Accepted Date: 05 August 2022   Issue Date: 03 January 2023
 Cite this article:   
Xiaoming SHI,Wanxiang CHE. Combating with extremely noisy samples in weakly supervised slot filling for automatic diagnosis[J]. Front. Comput. Sci., 2023, 17(5): 175333.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-022-2134-1
https://academic.hep.com.cn/fcs/EN/Y2023/V17/I5/175333
Fig.1  An example of weakly labeled data. In this example, the weak label EyeInflammation is a noisy label, and the ground-truth label EyePain is not recalled
Fig.2  The proposed Co-Weak-Teaching for learning from extremely noisy weakly labeled data. The weakly labeled data are divided into two independent sets, weakly labeled data 1 and weakly labeled data 2. Model 1 and model 2 are initialized with a pre-trained BERT model. The two models exchange selected weakly labeled data for model training
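The exchange loop in the caption can be sketched as follows. This is a toy illustration, not the authors' code: ToySlotModel, its confidence lookup, and select_confident are hypothetical stand-ins (a real system would fine-tune two BERT models and score weak labels with them), and tau plays the role of the selection threshold τ studied in Tab.4.

```python
class ToySlotModel:
    """Toy stand-in for a slot filling model (hypothetical, for illustration).

    A real system would fine-tune a pre-trained BERT model; here the
    confidence of each (query, weak label) pair is a fixed lookup so the
    control flow of the loop can be executed end to end.
    """

    def __init__(self, scores):
        self.scores = scores      # {(query, label): confidence in [0, 1]}
        self.train_set = []       # samples received from the peer model

    def confidence(self, query, label):
        return self.scores.get((query, label), 0.0)

    def train(self, samples):
        # Placeholder for a gradient update on the selected samples.
        self.train_set.extend(samples)


def select_confident(model, weak_data, tau):
    """Keep weakly labeled samples the model scores at or above threshold tau."""
    return [(q, y) for (q, y) in weak_data if model.confidence(q, y) >= tau]


def co_weak_teaching(model1, model2, weak1, weak2, tau=0.8, turns=3):
    """Each model selects clean-looking samples from its own weakly labeled
    split and passes them to its peer for training, for a fixed number of turns."""
    for _ in range(turns):
        for_model2 = select_confident(model1, weak1, tau)
        for_model1 = select_confident(model2, weak2, tau)
        model1.train(for_model1)
        model2.train(for_model2)
    return model1, model2
```

The cross-update (each model trains only on samples vetted by its peer) is what distinguishes this family of methods from self-training, where a model would reinforce its own noisy selections.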
  
# of labeled training samples 1,152
# of labeled validation samples 500
# of labeled test samples 1,000
# of unlabeled samples 10,000
Avg. # of tokens per training sample 12.39
Avg. # of tokens per validation sample 12.67
Avg. # of tokens per test sample 12.72
Avg. # of tokens per sample (all) 12.57
# of slot-values 29
Avg. # of slot-values per training sample 1.38
Avg. # of slot-values per validation sample 1.30
Avg. # of slot-values per test sample 1.33
Avg. # of slot-values per sample (all) 1.35
Avg. frequency 123.07
Max. frequency 566
Min. frequency 29
Tab.1  Statistics on the dataset
Setting | Method | Precision | Recall | Micro f1 | Macro f1 | Turn accuracy
w/o Fine-tuning | Naive [6] | 63.37 | 62.65 | 63.01 | 63.26 | 31.80
w/o Fine-tuning | Co-teaching [7] | 80.43 | 61.60 | 69.77 | 54.04 | 50.10
w/o Fine-tuning | Co-Weak-Teaching (Ours) | 90.70 | 80.04 | 85.04 | 82.29 | 68.20
w/ Fine-tuning | Naive [6] | 89.95 | 89.61 | 89.78 | 87.87 | 72.20
w/ Fine-tuning | Co-teaching [7] | 89.60 | 90.21 | 89.91 | 87.56 | 72.10
w/ Fine-tuning | Co-Weak-Teaching (Ours) | 90.30 | 91.11 | 90.70 | 88.78 | 71.90
Tab.2  The precision, recall, micro f1, macro f1, and turn accuracy results (%) with and without fine-tuning on the annotated dataset
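The micro and macro f1 columns in Tab.2 weight errors differently: micro f1 pools true positives, false positives, and false negatives across all slot-values before scoring, while macro f1 averages per-slot f1, so rare slots weigh as much as frequent ones. A self-contained sketch with made-up counts (not figures from the paper):

```python
def f1(tp, fp, fn):
    """f1 from raw counts; returns 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def micro_macro_f1(per_slot):
    """per_slot: {slot type: (tp, fp, fn)} aggregated over the test set."""
    tp = sum(c[0] for c in per_slot.values())
    fp = sum(c[1] for c in per_slot.values())
    fn = sum(c[2] for c in per_slot.values())
    micro = f1(tp, fp, fn)                                          # pool counts, then score
    macro = sum(f1(*c) for c in per_slot.values()) / len(per_slot)  # score, then average
    return micro, macro
```

With one well-predicted frequent slot and one poorly recalled rare slot, micro f1 stays high while macro f1 drops sharply, which is why a method that mainly fixes rare slots shows a larger macro than micro gain.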
N Precision Recall Micro f1 Macro f1 Turn accuracy
1 88.26 78.69 83.20 80.37 64.90
2 88.56 81.63 84.95 81.01 66.40
3 90.70 80.04 85.04 82.29 68.20
4 90.33 79.44 84.54 82.54 68.20
Tab.3  Results of Co-Weak-Teaching with different numbers of turns N
Fig.3  The training curves of different methods on the training data when models are fine-tuned on the expert-annotated dataset
τ Precision Recall Micro f1 Macro f1 Turn accuracy
0.4 90.39 77.94 83.70 80.93 68.20
0.6 90.38 78.54 84.05 81.99 67.50
0.8 90.70 80.04 85.04 82.29 68.20
0.9 90.52 79.82 84.83 82.40 68.00
Tab.4  Results of Co-Weak-Teaching with different selection thresholds τ
# of Data | Precision | Recall | Micro f1 | Macro f1
500 | 81.14 | 76.13 | 78.55 | 70.95
600 | 84.30 | 78.46 | 81.28 | 73.96
0 (Ours) | 90.70 | 80.04 | 85.04 | 82.29
700 | 87.78 | 84.94 | 86.34 | 82.86
Tab.5  Results of classifiers trained on different amounts of annotated data
Category | Patient query | Weak label
Selected | 晚上睡觉左胸疼痛, 难以入眠 (I feel pain in my left chest at night and can't sleep.) | 胸痛 (Chest Pain), 失眠 (Insomnia)
Selected | 关节有时候反复酸痛. (The joints are sometimes sore repeatedly.) | 关节疼痛 (Joint Pain)
Abandoned | 胸口总感觉发慌. (I always feel panic in my chest.) | 腹痛 (Stomachache), 恶心 (Nausea)
Abandoned | 张开口, 左边有点痛. (When I open my mouth, the left side hurts a little.) | 无力 (Faintness), 牙痛 (Toothache)
Tab.6  Cases selected and abandoned by the proposed method
1 Z C Lipton, X Li, J Gao, L Li, F Ahmed, L Deng. BBQ-networks: efficient exploration in deep reinforcement learning for task-oriented dialogue systems. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018, 5237−5244
2 T H Wen, D Vandyke, N Mrkšić, M Gašić, L Rojas-Barahona, P H Su, S Ultes, S Young. A network-based end-to-end trainable task-oriented dialogue system. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. 2017, 438−449
3 Z Yan, N Duan, P Chen, M Zhou, J Zhou, Z Li. Building task-oriented dialogue systems for online shopping. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017, 4618−4626
4 L Xu, Q Zhou, K Gong, X Liang, J Tang, L Lin. End-to-end knowledge-routed relational dialogue system for automatic diagnosis. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence. 2019, 7346−7353
5 L Wang, X Li, J Liu, K He, Y Yan, W Xu. Bridge to target domain by prototypical contrastive learning and label confusion: re-explore zero-shot learning for slot filling. In: Proceedings of 2021 Conference on Empirical Methods in Natural Language Processing. 2021, 9474−9480
6 X Shi, H Hu, W Che, Z Sun, T Liu, J Huang. Understanding medical conversations with scattered keyword attention and weak supervision from responses. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. 2020, 8838−8845
7 B Han, Q Yao, X Yu, G Niu, M Xu, W Hu, I W Tsang, M Sugiyama. Co-teaching: robust training of deep neural networks with extremely noisy labels. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 8536−8546
8 J Devlin, M W Chang, K Lee, K Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019, 4171−4186
9 D P Kingma, J Ba. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations. 2015
10 K Yao, G Zweig, M Y Hwang, Y Shi, D Yu. Recurrent neural networks for language understanding. In: Proceedings of the 14th Annual Conference of the International Speech Communication Association. 2013, 2524−2528
11 G Mesnil, Y Dauphin, K Yao, Y Bengio, L Deng, D Hakkani-Tur, X He, L Heck, G Tur, D Yu, G Zweig. Using recurrent neural networks for slot filling in spoken language understanding. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(3): 530−539
12 D Hakkani-Tür, G Tür, A Celikyilmaz, Y N Chen, J Gao, L Deng, Y Y Wang. Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM. In: Proceedings of the 17th Annual Meeting of the International Speech Communication Association. 2016, 715−719
13 L Zhao, Z Feng. Improving slot filling in spoken language understanding with joint pointer and attention. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 2018, 426−431
14 L M Barahona, M Gašić, N Mrkšić, P H Su, S Ultes, T H Wen, S Young. Exploiting sentence and context representations in deep neural models for spoken language understanding. In: Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers. 2016, 258−267
15 A Blum, T Mitchell. Combining labeled and unlabeled data with co-training. In: Proceedings of the 11th Annual Conference on Computational Learning Theory. 1998, 92−100
16 S Abney. Bootstrapping. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. 2002, 360−367
17 M F Balcan, A Blum, K Yang. Co-training and expansion: towards bridging theory and practice. In: Proceedings of the 17th International Conference on Neural Information Processing Systems. 2004, 89−96
18 W Wang, Z H Zhou. Theoretical foundation of co-training and disagreement-based algorithms. arXiv preprint arXiv:1708.04403, 2017
19 J Du, C X Ling, Z H Zhou. When does cotraining work in real data? IEEE Transactions on Knowledge and Data Engineering, 2011, 23(5): 788−799
20 D Angluin, P Laird. Learning from noisy examples. Machine Learning, 1988, 2(4): 343−370
21 D Arpit, S Jastrzębski, N Ballas, D Krueger, E Bengio, M S Kanwal, T Maharaj, A Fischer, A Courville, Y Bengio, S Lacoste-Julien. A closer look at memorization in deep networks. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 233−242
22 C Zhang, S Bengio, M Hardt, B Recht, O Vinyals. Understanding deep learning (still) requires rethinking generalization. Communications of the ACM, 2021, 64(3): 107−115
23 J Goldberger, E Ben-Reuven. Training deep neural-networks using a noise adaptation layer. In: Proceedings of the 5th International Conference on Learning Representations. 2017
24 G Patrini, A Rozza, A Krishna Menon, R Nock, L Qu. Making deep neural networks robust to label noise: a loss correction approach. In: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. 2017, 2233−2241