Frontiers of Computer Science


Front. Comput. Sci.    2024, Vol. 18 Issue (3) : 183312    https://doi.org/10.1007/s11704-023-2503-4
Artificial Intelligence
Incorporating contextual evidence to improve implicit discourse relation recognition in Chinese
Sheng XU, Peifeng LI(), Qiaoming ZHU
School of Computer Science and Technology, Soochow University, Suzhou 215000, China
Abstract

The discourse analysis task, which focuses on understanding the semantics of long text spans, has received increasing attention in recent years. As a critical component of discourse analysis, discourse relation recognition aims to identify the rhetorical relations between adjacent discourse units (e.g., clauses, sentences, and sentence groups), called arguments, in a document. Previous works focused on capturing the semantic interactions between arguments to recognize their discourse relations, while ignoring important textual information in the surrounding context. However, in many cases, the texts of the two arguments alone do not provide enough semantic interaction to identify their rhetorical relation, and more contextual clues need to be mined. In this paper, we propose a method to convert the RST-style discourse trees in the training set into dependency-based trees and train a contextual evidence selector on these transformed structures. In this way, the selector learns to automatically pick critical textual information from the context (i.e., evidence) for an argument pair to assist in discriminating its relation. We then encode the arguments concatenated with the corresponding evidence to obtain enhanced argument representations. Finally, we combine the original and enhanced argument representations to recognize the relation. In addition, we introduce auxiliary tasks to guide the training of the evidence selector and strengthen its selection ability. Experimental results on the Chinese CDTB dataset show that our method outperforms several state-of-the-art baselines in both micro and macro F1 scores.
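The pipeline described above (select evidence, re-encode the arguments with it, then fuse both views) can be pictured with a short sketch. The code below is a minimal illustration, not the authors' released implementation: it assumes a Chinese BERT encoder, the four-way CDTB relation set, and that the selected evidence text is simply appended to each argument before re-encoding; all names are hypothetical.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class RelationClassifier(nn.Module):
    """Scores a relation from the original and the evidence-enhanced argument pair."""
    def __init__(self, num_relations=4, model_name="bert-base-chinese"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # The two pooled sentence-pair vectors are concatenated before classification.
        self.classifier = nn.Linear(2 * hidden, num_relations)

    def forward(self, original_inputs, enhanced_inputs):
        h_orig = self.encoder(**original_inputs).pooler_output
        h_enh = self.encoder(**enhanced_inputs).pooler_output
        return self.classifier(torch.cat([h_orig, h_enh], dim=-1))

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
arg1, arg2, evidence = "...", "...", "..."            # argument pair and selected evidence text
original = tokenizer(arg1, arg2, return_tensors="pt")
enhanced = tokenizer(arg1 + evidence, arg2 + evidence, return_tensors="pt")
logits = RelationClassifier()(original, enhanced)      # one logit per CDTB relation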

Keywords: discourse parsing; discourse relation recognition; contextual evidence selection
Corresponding Author(s): Peifeng LI   
Just Accepted Date: 09 March 2023   Issue Date: 15 May 2023
 Cite this article:   
Sheng XU, Peifeng LI, Qiaoming ZHU. Incorporating contextual evidence to improve implicit discourse relation recognition in Chinese[J]. Front. Comput. Sci., 2024, 18(3): 183312.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-2503-4
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I3/183312
Fig.1  Example of an RST-Tree
Fig.2  Heads of non-terminal nodes in the RST-Tree
Fig.3  The DEP-Tree converted from the RST-Tree in Fig.1. The edges in the tree express the parent-child relations (i.e., dependencies) between the EDUs
Fig.4  The framework of the IRES model, including evidence selection and relation classification
Fig.5  Example of converting an RST-style tree to a dependency-based tree. In Fig.5(a), terminal nodes are EDUs, non-terminal nodes (spans) describe the relation between their children, and a directed edge indicates that the child is a nucleus. Fig.5(b) is the dependency-based tree converted from the discourse tree in Fig.5(a).
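The conversion in Fig.5 follows the usual head-based recipe for turning an RST-style constituency tree into an EDU-level dependency tree. The sketch below is an assumption about that recipe rather than the paper's exact procedure: every non-terminal promotes the head EDU of its nucleus child, and the heads of its other children become dependents of that EDU; all class and function names are hypothetical.

class Node:
    def __init__(self, edu_id=None, children=None, nucleus_index=0):
        self.edu_id = edu_id                # set only for terminal (EDU) nodes
        self.children = children or []      # set only for non-terminal nodes
        self.nucleus_index = nucleus_index  # which child is the nucleus

def to_dependencies(node, deps=None):
    """Return (head EDU of this subtree, {dependent EDU: its head EDU})."""
    if deps is None:
        deps = {}
    if node.edu_id is not None:             # terminal node: the EDU heads itself
        return node.edu_id, deps
    child_heads = [to_dependencies(c, deps)[0] for c in node.children]
    head = child_heads[node.nucleus_index]  # the nucleus child's head is promoted
    for h in child_heads:
        if h != head:
            deps[h] = head                  # other children's heads attach below it
    return head, deps

# A hypothetical three-EDU tree: EDU1 is the nucleus of the root,
# and EDU2 is the nucleus of the subtree covering EDU2 and EDU3.
tree = Node(children=[Node(edu_id=1),
                      Node(children=[Node(edu_id=2), Node(edu_id=3)])])
root, deps = to_dependencies(tree)          # root == 1, deps == {3: 2, 2: 1}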
Fig.6  Overview of the contextual evidence selection. We use a self-attention mask to control the flow of information between the words in the input sequence and those in the output sequence. (a) Contextual evidence selection; (b) self-attention mask
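One way to realise the mask in Fig.6(b) is sketched below. This is an assumed prefix-LM-style scheme, not necessarily the exact mask used in the paper: words of the input sequence attend to each other bidirectionally, while words of the output sequence attend to the whole input plus only the earlier output positions.

import torch

def build_selection_mask(n_src, n_tgt):
    """Boolean mask; True = attention allowed. Rows are queries, columns are keys."""
    n = n_src + n_tgt
    mask = torch.zeros(n, n, dtype=torch.bool)
    mask[:, :n_src] = True                   # every position may read the input words
    mask[n_src:, n_src:] = torch.ones(n_tgt, n_tgt).tril().bool()  # output words are causal
    return mask                              # input rows stay False on output columns

print(build_selection_mask(3, 2).int())      # 5 x 5 mask for 3 input and 2 output words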
Fig.7  Details of the contextual evidence selector. Fig.7(a) shows the selector structure during the training step, while Fig.7(b) illustrates which components in the selector are used at inference time
Relation Train Test
Causality 885 81
Coordination 4365 485
Elaboration 1319 144
Transition 147 8
Tab.1  Statistics of the discourse relations in CDTB
Model Caus. Coor. Elab. Tran. Micro Macro
Bi-LSTM 24.5 80.9 55.7 – 68.7 40.3
CNN 26.7 81.0 53.6 – 70.3 41.2
GCN 25.5 81.8 50.0 11.8 70.3 43.2
Xu 30.8 81.5 56.2 15.4 71.0 46.3
BERT 44.1 84.6 67.6 25.0 76.7 55.7
Jiang 48.6 84.9 70.9 30.0 76.9 58.6
IRES 49.3 86.6 68.7 28.6 78.3 60.1
Golden 57.5 89.6 76.8 42.0 82.5 66.3
Tab.2  Performance of the six baselines and our IRES model with F1-scores
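For reference, the micro and macro F1 scores reported in Tab.2 weight errors differently: micro F1 is dominated by the frequent Coordination class, whereas macro F1 averages the per-class F1 values equally, so a rare class such as Transition counts as much as a common one. A tiny worked example with hypothetical labels:

from sklearn.metrics import f1_score

y_true = ["Coor.", "Coor.", "Coor.", "Elab.", "Caus."]
y_pred = ["Coor.", "Coor.", "Elab.", "Elab.", "Coor."]
print(f1_score(y_true, y_pred, average="micro"))  # 0.60: pooled over all decisions
print(f1_score(y_true, y_pred, average="macro"))  # ~0.44: per-class F1 averaged equally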
Model Temp. Caus. Comp. Expa. AVG
BERT 35.1 53.3 32.1 95.2 53.9
RoBERTa 33.3 45.5 23.0 94.6 49.1
BERT+evi 39.3 54.0 32.4 95.2 55.2
RoBERTa+evi 40.7 46.8 28.2 94.7 52.6
Tab.3  Performance of the BERT and RoBERTa models on the top four classes of the HIT-CDTB
Model Original-Micro Original-Macro Auto-Micro Auto-Macro Golden-Micro Golden-Macro
BiLSTM 68.7 40.3 73.3 47.8 75.5 50.2
CNN 70.3 41.2 71.2 53.1 72.7 56.4
GCN 70.3 43.2 73.3 52.2 77.7 59.3
Xu 71.0 46.3 73.5 54.9 78.7 59.9
Tab.4  Performance of non-Transformer baselines incorporating contextual evidence
Case Caus. (BERT/Ours) Coor. (BERT/Ours) Elab. (BERT/Ours) Tran. (BERT/Ours)
Both Old 21.4 0.0 89.9 97.5 30.0 0.0 50.0 0.0
Include 8.3 8.3 89.1 96.1 35.0 15.0 0.0 0.0
One New 51.3 69.2 76.1 66.3 81.7 82.9 20.0 60.0
Both New 37.5 56.3 91.0 89.0 56.3 62.5 0.0 100.0
Tab.5  The recognition accuracies of four cases
Model Caus. Coor. Elab. Tran. Micro Macro
BERT 44.1 84.6 67.6 25.0 76.7 55.7
Repeat 44.3 85.5 65.5 22.2 76.9 54.8
Para 44.8 85.1 67.9 21.1 76.9 55.1
Context 45.6 85.5 66.9 22.2 77.0 55.3
Ours 49.3 86.6 68.7 28.6 78.3 60.1
Tab.6  Comparison of incorporating different governor texts for recognizing discourse relations
Model Caus. Coor. Elab. Tran. Micro Macro
Seq2Seq 46.5 84.4 66.0 18.2 76.5 54.9
+nuc +1.5 +2.2 −0.3 +10.4 +1.6 +2.6
+rel −6.8 +2.1 +2.6 +15.1 +1.6 +2.9
+nuc&rel +2.8 +2.2 +2.7 +10.4 +1.8 +5.2
Tab.7  Comparison of variants using different guidance conditions
Caus. Coor. Elab. Tran.
Caus. – 42.0% 9.9% 2.5%
Coor. 4.3% – 4.5% 2.5%
Elab. 7.6% 27.8% – 1.4%
Tran. 0% 50.0% 0% –
Tab.8  Percentages of misclassified samples