Frontiers of Computer Science

Front. Comput. Sci.    2025, Vol. 19 Issue (2) : 192306    https://doi.org/10.1007/s11704-023-3521-y
Artificial Intelligence
Exploring & exploiting high-order graph structure for sparse knowledge graph completion
Tao HE1, Ming LIU1,2, Yixin CAO3, Zekun WANG1, Zihao ZHENG1, Bing QIN1,2
1. Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin 150001, China
2. Peng Cheng Laboratory, Shenzhen 518000, China
3. SMU School of Computing and Information Systems, Singapore Management University, Singapore 178902, Singapore
Abstract

Sparse Knowledge Graph (KG) scenarios challenge previous Knowledge Graph Completion (KGC) methods: completion performance drops rapidly as graph sparsity increases. The problem is compounded by the prevalence of sparse KGs in practical applications. To alleviate this challenge, we present a novel framework, LR-GCN, which automatically captures valuable long-range dependencies among entities to supplement insufficient structural features and distills logical reasoning knowledge for sparse KGC. The proposed approach comprises two main components: a GNN-based predictor and a reasoning path distiller. The path distiller explores high-order graph structures such as reasoning paths and encodes them as rich-semantic edges, explicitly composing long-range dependencies into the predictor; this step also densifies the KG, effectively alleviating the sparsity issue. The path distiller further distills logical reasoning knowledge from the mined reasoning paths into the predictor. The two components are jointly optimized with a carefully designed variational EM algorithm. Extensive experiments and analyses on four sparse benchmarks demonstrate the effectiveness of the proposed method.
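The joint optimization alternates between the two components. Below is a minimal, hypothetical sketch of such a variational-EM-style loop; all class and method names are illustrative assumptions, not the authors' actual code, and the real objective and updates are more involved.

```python
# Hypothetical sketch: a reasoning-path distiller (E-step) and a GNN-based
# predictor (M-step) updated in alternation, variational-EM style.
from dataclasses import dataclass, field

@dataclass
class PathDistiller:
    """Stand-in for the RL path searcher plus Markov logic network."""
    def search_paths(self, query):
        # The real system walks the KG from the query's head entity with an
        # RL agent; here we return a canned two-hop path for illustration.
        return [("head", "r1", "mid"), ("mid", "r2", "tail")]

@dataclass
class GNNPredictor:
    """Stand-in for the GNN-based link predictor."""
    extra_edges: list = field(default_factory=list)

    def add_long_range_edge(self, path):
        # Compose a multi-hop path into a single rich-semantic edge,
        # densifying the sparse KG for the next round of message passing.
        edge = (path[0][0], "/".join(r for _, r, _ in path), path[-1][2])
        if edge not in self.extra_edges:
            self.extra_edges.append(edge)

def variational_em(queries, rounds=2):
    distiller, predictor = PathDistiller(), GNNPredictor()
    for _ in range(rounds):
        # E-step: mine reasoning paths for each training query.
        paths = [distiller.search_paths(q) for q in queries]
        # M-step: fold mined paths back into the predictor as long-range
        # edges (the paper additionally distills soft logical labels here).
        for p in paths:
            predictor.add_long_range_edge(p)
    return predictor

print(variational_em([("head", "rq", "?")]).extra_edges)
```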

Keywords  knowledge graph completion; graph neural networks; reinforcement learning
Corresponding Author(s): Ming LIU   
Just Accepted Date: 14 December 2023   Issue Date: 22 April 2024
 Cite this article:   
Tao HE, Ming LIU, Yixin CAO, et al. Exploring & exploiting high-order graph structure for sparse knowledge graph completion[J]. Front. Comput. Sci., 2025, 19(2): 192306.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-3521-y
https://academic.hep.com.cn/fcs/EN/Y2025/V19/I2/192306
Fig.1  KGC results of previous KG embedding models on FB15K-237 and its sparse subsets (60%, 40%, and 20% denote percentages of retained triples). Performance drops dramatically as triples are removed. (a) Hits@10; (b) MRR
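For reference, the Hits@N and MRR metrics plotted in Fig. 1 are computed from the rank each model assigns to the gold answer of every test query. A standard sketch (not the paper's evaluation code; the example ranks are made up):

```python
# Hits@N and MRR from a list of gold-answer ranks.

def hits_at_n(ranks, n=10):
    """Fraction of queries whose gold answer is ranked within the top n."""
    return sum(r <= n for r in ranks) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank of the gold answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)

ranks = [1, 3, 12, 2, 50]                 # hypothetical gold-answer ranks
print(hits_at_n(ranks, 10), mrr(ranks))   # 0.6 and ~0.39
```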
Fig.2  An illustration of inducing rules from reasoning paths. To shorten the rules, loops within paths are treated as pointless segments and removed
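The loop pruning in Fig. 2 can be sketched as a single pass over a path's entity sequence: when the path revisits an entity, the detour between the two visits is cut. The list-of-entities representation and the function name are assumptions for illustration:

```python
def remove_loops(path):
    """path: entities visited in order, e.g. ['a', 'b', 'c', 'b', 'd']."""
    pruned, seen = [], {}
    for node in path:
        if node in seen:
            # Revisit: drop everything after the first visit of `node`.
            pruned = pruned[: seen[node] + 1]
            seen = {n: i for i, n in enumerate(pruned)}
        else:
            seen[node] = len(pruned)
            pruned.append(node)
    return pruned

print(remove_loops(["a", "b", "c", "b", "d"]))  # ['a', 'b', 'd']
```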
Fig.3  Our framework consists of two modules: a GNN-based model with the long-range dependency convolution layer and the MLN-RL model, jointly optimized via high-order knowledge distillation. Given a query (es, rq, ?), MLN-RL first searches reasoning paths and classifies them by whether each path is correct. The positive paths and negative path segments are used to construct new edges that capture long-range dependencies explicitly, while the negative paths are fed into the MLN to distill knowledge into the GNN-based model via the variational EM algorithm
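A hypothetical sketch of the rich-semantic edge construction in Fig. 3: relation embeddings along a mined path are composed into one synthetic relation, and a direct edge is added from the path's start entity to its end entity. Element-wise product is just one CompGCN-style composition operator; the paper's actual operator may differ, and all names here are assumptions.

```python
import numpy as np

def compose_path_relation(rel_embs):
    """Compose relation embeddings along a path into a single embedding."""
    out = np.ones_like(rel_embs[0])
    for r in rel_embs:
        out = out * r  # element-wise product composition
    return out

r1, r2 = np.random.randn(8), np.random.randn(8)
long_range_rel = compose_path_relation([r1, r2])
new_edge = ("e_start", long_range_rel, "e_end")  # fed to the GNN as an edge
```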
                 WD-singer   NELL23K   FB15K-237_10   FB15K-237_20   FB15K-237_30   FB15K-237_60   FB15K-237
#Entity          10282       22925     14541          14541          14541          14541          14541
#Relation        270         400       237            237            237            237            237
#Train Set       16142       24321     27211          54423          108846         163269         272115
#Dev Set         2163        4951      17535          17535          17535          17535          17535
#Test Set        2203        4944      20466          20466          20466          20466          20466
Avg In-degree    1.570       1.170     1.876          3.752          5.628          11.256         18.760
Tab.1  Summary statistics of datasets
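A statistic like "Avg In-degree" in Tab. 1 can be derived directly from the training triples. The exact counting convention (e.g., whether entities with zero in-degree are included) is an assumption in this sketch, so the paper's numbers may follow a slightly different convention:

```python
from collections import Counter

def avg_in_degree(triples):
    indeg = Counter(tail for _, _, tail in triples)
    return sum(indeg.values()) / len(indeg)  # entities with in-degree > 0

triples = [("a", "r1", "b"), ("c", "r1", "b"), ("a", "r2", "c")]
print(avg_in_degree(triples))  # 3 incoming edges over 2 entities -> 1.5
```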
                    FB15K-237_10           FB15K-237_20           WD-singer              NELL23K
Model               @1      @10     MRR    @1      @10     MRR    @1      @10     MRR    @1      @10     MRR
TransE              4.94    24.03   11.58  8.33    28.76   15.27  22.56   49.84   32.68  4.62    30.21   13.35
RotatE              5.64    17.16   9.52   10.40   28.52   16.43  31.43   45.96   36.79  9.63    28.13   15.75
ComplEx             9.61    22.77   13.92  10.69   26.17   15.81  29.87   43.53   34.94  14.21   32.57   20.26
TuckER              6.51    16.44   9.89   9.73    25.91   15.08  32.12   44.83   36.87  13.75   30.35   19.36
ConvE               10.56   23.90   15.69  11.38   29.51   17.27  31.46   47.37   37.22  14.44   37.55   22.73
SACN                9.60    22.56   13.98  11.36   28.82   17.07  28.49   43.70   34.10  14.36   35.01   21.22
RED-GNN             8.43    19.80   12.22  12.28   30.88   18.43  32.57   49.18   38.79  16.63   39.65   24.24
NBFNet              11.13   27.78   16.64  12.45   31.87   18.89  32.59   48.10   38.81  14.40   39.04   22.74
DacKGR              10.21   21.34   13.91  10.99   26.15   15.87  26.49   44.30   32.66  13.00   31.63   18.99
HoGRN               –       –       –      –       –       –      –       48.80   39.07  –       39.98   24.56
– backbone –
CompGCN             9.52    23.11   14.11  11.27   29.51   17.30  31.75   47.62   37.65  13.79   37.57   21.59
LR-GCN (w/o KD)     10.41   26.32   15.77  11.94   30.98   18.22  32.06   49.61   38.71  15.41   39.84   23.41
LR-GCN (w/o LRC)    10.60   26.25   15.89  11.59   30.11   17.74  31.89   48.28   38.00  15.99   39.11   23.53
LR-GCN              11.07   27.37   16.49  12.57   31.72   18.97  32.80   49.98   39.27  17.31   41.90   25.27
Rel. Impr.          +16.28% +18.43% +16.87% +11.54% +7.49% +9.65% +3.31% +4.96% +5.50% +25.53% +11.53% +17.04%
Tab.2  Experimental results on FB15K-237_10, FB15K-237_20, WD-singer, and NELL23K. @1/@10 denote Hits@1/Hits@10; "–" marks results not reported. The last row records the relative improvements of LR-GCN over CompGCN. Hits@N and MRR values are in percentage. "KD" denotes high-order knowledge distillation and "LRC" denotes long-range convolution. The best score is in bold and the second best is underlined
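The "Rel. Impr." row in Tab. 2 is the plain relative gain of LR-GCN over the CompGCN backbone. For example, for MRR on FB15K-237_10:

```python
compgcn_mrr, lrgcn_mrr = 14.11, 16.49
rel_impr = (lrgcn_mrr - compgcn_mrr) / compgcn_mrr
print(f"{rel_impr:+.2%}")  # +16.87%, matching the table
```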
Fig.4  Improvements of LR-GCN over CompGCN on FB15K-237 and four sparse datasets (60%, 30%, 20%, and 10% denote percentages of retained triples)
Fig.5  MRR results and entity frequency grouped by entity in-degree on NELL23K and FB15K-237_10. (a) NELL23K; (b) FB15K-237_10
                 WD-singer          NELL23K
                 Hits@10    MRR     Hits@10    MRR
CompGCN (K=1)    47.62      37.65   37.57      21.60
CompGCN (K=2)    47.98      36.55   37.34      21.13
CompGCN (K=3)    47.80      34.22   37.55      21.01
LR-GCN           49.98      39.27   41.90      25.27
Tab.3  Performance comparison of LR-GCN with CompGCN stacked with K = 1, 2, 3 graph convolutional layers on WD-singer and NELL23K
                 DacKGR             MLN-RL
                 Hits@1     MRR     Hits@1     MRR
FB15K-237_10     10.21      13.91   10.33      13.97
WD-singer        26.49      32.66   28.05      33.65
NELL23K          13.00      18.99   13.02      19.05
Tab.4  Performance of MLN-RL compared with its base model DacKGR on FB15K-237_10, WD-singer, and NELL23K. Values are in percentage
                 Path length
                 3        4        5        6
WD-singer        3.133    4.300    5.030    6.067
NELL23K          1.683    2.467    3.017    3.500
Tab.5  Training time per epoch (in minutes) on WD-singer and NELL23K
Fig.6  Hits@1 and MRR results for different reasoning path lengths on WD-singer. (a) Hits@1; (b) MRR
Fig.7  Blue edges denote the reasoning paths searched for corresponding queries. Orange and green nodes denote the predicted answers by the pre-trained GNN-based predictor before joint learning and the golden answers, respectively. (a) Case 1: (Intelligent dance music, parent_genre, ?); (b) Case 2: (David Nutter, nominated_for, ?)
                 β      γ      λ         TT
FB15K-237_10     1.0    5.0    0.0       4 h
FB15K-237_20     2.0    5.0    0.0001    13.5 h
FB15K-237_30     1.0    1.0    0.001     12.5 h
FB15K-237_60     2.0    0.1    0.0       15.5 h
WD-singer        1.0    1.0    0.001     9 h
NELL23K          0.5    2.0    0.0       2 h
Table A1  Best hyperparameter settings (in part) and training time of LR-GCN for each dataset. "TT" means "training time"