Frontiers of Computer Science


Front. Comput. Sci., 2025, Vol. 19, Issue 1: 191305. https://doi.org/10.1007/s11704-023-3339-7
Artificial Intelligence
D2-GCN: a graph convolutional network with dynamic disentanglement for node classification
Shangwei WU, Yingtong XIONG, Hui LIANG, Chuliang WENG
School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
Abstract

Classic Graph Convolutional Networks (GCNs) learn node representations holistically, ignoring the distinct impacts that different neighbors have when their features are aggregated to update a node's representation. Disentangled GCNs have been proposed to divide each node's representation into several feature units. However, existing disentangling methods do not determine how many latent factors the model should assign in order to extract the best representation of each node. This paper proposes D2-GCN, which provides dynamic disentanglement in GCNs and yields the most appropriate factorization of each node's mixed features. We prove the convergence of the proposed method theoretically and verify it experimentally. Experiments on real-world datasets show that D2-GCN outperforms the baseline models on node classification in both single- and multi-label tasks.
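For a concrete picture of what "dividing a node's representation into feature units" means, the sketch below implements one disentangled convolution layer in the style of DisenGCN's neighborhood routing [18], on which D2-GCN builds. It is a minimal illustration under stated assumptions, not the authors' code: the channel count K, the number of routing iterations, and the temperature tau are all placeholders.

```python
import torch
import torch.nn.functional as F


class DisentangledConv(torch.nn.Module):
    """One graph-conv layer whose output is split into K latent channels."""

    def __init__(self, in_dim, out_dim, K, routing_iters=3, tau=1.0):
        super().__init__()
        assert out_dim % K == 0, "each channel gets out_dim // K features"
        self.K, self.iters, self.tau = K, routing_iters, tau
        self.proj = torch.nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index):
        # Project and split each node's features into K unit-norm channels.
        z = F.normalize(self.proj(x).view(x.size(0), self.K, -1), dim=-1)
        src, dst = edge_index           # edges run src (neighbor) -> dst (center)
        c = z                           # channel-wise cluster centers per node
        for _ in range(self.iters):
            # Score each neighbor against each channel of its center node,
            # softmax over channels: which latent factor does this neighbor address?
            logits = (z[src] * c[dst]).sum(-1) / self.tau    # [E, K]
            p = torch.softmax(logits, dim=-1).unsqueeze(-1)  # [E, K, 1]
            agg = torch.zeros_like(z)
            agg.index_add_(0, dst, p * z[src])               # weighted routing
            c = F.normalize(z + agg, dim=-1)
        return c.reshape(x.size(0), -1)                      # concatenate K channels
```

With `edge_index` in the usual COO format ([2, E]), this drops into a standard training loop; D2-GCN additionally changes K during training, which the sketch after Fig. 12 gestures at.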

Keywords: graph convolutional networks; dynamic disentanglement; label entropy; node classification
Corresponding Author(s): Chuliang WENG   
Just Accepted Date: 01 December 2023   Issue Date: 03 April 2024
 Cite this article:   
Shangwei WU, Yingtong XIONG, Hui LIANG, et al. D2-GCN: a graph convolutional network with dynamic disentanglement for node classification[J]. Front. Comput. Sci., 2025, 19(1): 191305.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-3339-7
https://academic.hep.com.cn/fcs/EN/Y2025/V19/I1/191305
Fig.1  Effects of different K in DisenGCN on Cora. In each subplot, the x-axis shows the number of training epochs and the y-axis shows the training/test accuracy or loss. (a) Training accuracies; (b) test accuracies; (c) training losses; (d) test losses
Fig.2  Changes in the kinds of predicted labels among three typical nodes' neighbors across training epochs with different K in DisenGCN on the Cora dataset. (a) The test with $v_\alpha$; (b) the test with $v_\beta$; (c) the test with $v_\gamma$
Fig.3  Counts of nodes with different numbers of predicted neighbor classes (i.e., $n_{v_i}$) at different training stages on the Cora dataset
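Fig. 3's statistic can be reproduced once per-node predictions are available. The helper below counts the distinct predicted classes among each node's neighbors; reading $n_{v_i}$ this way is our assumption, and the paper gives the formal definition.

```python
import torch

def neighbor_class_counts(edge_index, pred):
    """pred[i] = predicted label of node i; returns one count per node."""
    src, dst = edge_index
    seen = torch.zeros(pred.size(0), int(pred.max()) + 1, dtype=torch.bool)
    seen[dst, pred[src]] = True     # mark each class seen among a node's neighbors
    return seen.sum(dim=1)          # distinct predicted neighbor classes per node
```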
Fig.4  Changes of $\hat{K}_{val}$, $\bar{K}_{val}$, and $\tilde{K}_{val}$ during training with different K in DisenGCN on the Cora dataset. (a) Changes of $\hat{K}_{val}$; (b) changes of $\bar{K}_{val}$; (c) changes of $\tilde{K}_{val}$
Fig.5  Changes of $n_{v_i}$ w.r.t. two typical nodes across layers with different K in DisenGCN on the Cora dataset. (a) The test with $v_\alpha$; (b) the test with $v_\beta$
Fig.6  Architecture of D2-GCN with two-level disentanglement during training, i.e., the epoch level and the layer level
Fig.7  Change of $\mathrm{KL}_{val}$ during training with different K in DisenGCN. The dataset used here is Cora
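Figs. 7 and 10 track a KL-divergence statistic on the validation set. Which two distributions enter the divergence is defined in the paper; the snippet below only shows the underlying computation for categorical distributions, per Kullback and Leibler [26], as a hedged building block.

```python
import torch

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for batched categorical distributions along the last dim."""
    p, q = p.clamp_min(eps), q.clamp_min(eps)   # avoid log(0) and division by 0
    return (p * (p / q).log()).sum(dim=-1)
```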
Dataset | Type | #Nodes | #Edges | #Classes | #Node features | Task
Cora | Citation | 2,708 | 5,429 | 7 | 1,433 | Single
Citeseer | Citation | 3,327 | 4,732 | 6 | 3,703 | Single
PubMed | Citation | 19,717 | 44,338 | 3 | 500 | Single
PPI | Biological | 3,890 | 76,584 | 50 | ? | Multi
POS | Word co-occurrence | 4,777 | 184,812 | 40 | ? | Multi
BlogCatalog | Social | 10,312 | 333,983 | 39 | ? | Multi
Flickr | Social | 89,250 | 899,756 | 7 | 500 | Multi
Tab.1  Dataset information. The last column indicates whether the nodes in each task carry a single label or multiple labels
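The three citation graphs in Tab. 1 are available through standard loaders; below is a minimal sketch using PyTorch Geometric's Planetoid wrapper. This is an assumption: the authors' splits and preprocessing may differ, and PyG counts each undirected edge twice, so the raw numbers will not match the table exactly.

```python
from torch_geometric.datasets import Planetoid

for name in ("Cora", "CiteSeer", "PubMed"):
    data = Planetoid(root="data", name=name)[0]
    # num_edges counts both directions of every undirected edge in PyG
    print(name, data.num_nodes, data.num_edges, data.x.size(1), int(data.y.max()) + 1)
```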
Dataset | S | Weight decay | Learning rate | Dropout rate | L | λ | ΔK
Cora | 6 | 0.001 | 0.061 | 0.41 | 6 | 5 | 2
Citeseer | 6 | 0.086 | 0.090 | 0.26 | 5 | 5 | 2
PubMed | 6 | 0.045 | 0.014 | 0.22 | 4 | 4 | 1
PPI | 6 | 0.00059 | 0.0057 | 0.45 | 2 | 6 | 3
POS | 6 | 0.00048 | 0.0780 | 0.45 | 2 | 4 | 3
BlogCatalog | 6 | 0.00036 | 0.0079 | 0.40 | 4 | 4 | 4
Flickr | 6 | 0.00025 | 0.0530 | 0.36 | 3 | 5 | 2
Tab.2  Hyperparameter settings
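For reproducibility, Tab. 2 transcribes directly into a configuration table; the tuple layout and field order below are our own labels for the table's columns, not identifiers from the authors' code.

```python
# Columns: S, weight decay, learning rate, dropout rate, L, lambda, delta_K
CONFIGS = {
    "Cora":        (6, 0.001,   0.061,  0.41, 6, 5, 2),
    "Citeseer":    (6, 0.086,   0.090,  0.26, 5, 5, 2),
    "PubMed":      (6, 0.045,   0.014,  0.22, 4, 4, 1),
    "PPI":         (6, 0.00059, 0.0057, 0.45, 2, 6, 3),
    "POS":         (6, 0.00048, 0.0780, 0.45, 2, 4, 3),
    "BlogCatalog": (6, 0.00036, 0.0079, 0.40, 4, 4, 4),
    "Flickr":      (6, 0.00025, 0.0530, 0.36, 3, 5, 2),
}
S, wd, lr, dropout, L, lam, delta_K = CONFIGS["Cora"]
```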
Model | Cora | Citeseer | PubMed
MLP [38] | 55.1±0.25 | 46.5±0.33 | 71.4±0.46
ManiReg [39] | 59.5±0.18 | 60.1±0.24 | 70.7±0.31
SemiEmb [40] | 59.0±0.23 | 59.6±0.21 | 71.1±0.28
LP [41] | 68.0±0.13 | 45.3±0.22 | 63.0±0.26
DeepWalk [35] | 67.2±0.11 | 43.2±0.32 | 65.3±0.36
ICA [42] | 75.1±0.23 | 69.1±0.26 | 73.9±0.31
Planetoid [43] | 75.7±0.22 | 64.7±0.19 | 77.2±0.30
ChebNet [44] | 81.2±0.12 | 69.8±0.23 | 74.4±0.25
GCN [1] | 81.5±0.17 | 70.3±0.23 | 79.0±0.42
MoNet [45] | 81.7±0.24 | 71.4±0.27 | 78.8±0.35
GAT [34] | 83.0±0.15 | 72.5±0.18 | 79.0±0.22
MixHop [46] | 82.3±0.23 | 72.1±0.18 | 81.0±0.26
DisenGCN [18] | 83.2±0.20 | 71.5±0.25 | 79.9±0.19
IPGDN [19] | 83.9±0.15 | 72.8±0.19 | 80.7±0.21
D2-GCN | 85.4±0.11 | 73.5±0.12 | 81.8±0.14
Tab.3  Test accuracy results in single-label node classification on Cora, Citeseer, and PubMed (unit: %)
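Entries in Tab. 3 (and the tables that follow) are mean±std over repeated runs; below is a small helper for reporting results in the same format. The run count and numbers are placeholders.

```python
import statistics

def report(accs):
    """Format a list of per-run accuracies as mean±std, as in Tab. 3."""
    return f"{statistics.mean(accs):.1f}±{statistics.stdev(accs):.2f}"

print(report([85.3, 85.5, 85.4, 85.6, 85.2]))  # -> 85.4±0.16
```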
Metric | Model | PPI | POS | BlogCatalog | Flickr
Macro-F1 | DeepWalk | 17.8±0.19 | 11.9±0.21 | 23.1±0.24 | 15.3±0.26
Macro-F1 | LINE | 17.3±0.25 | 11.8±0.18 | 22.8±0.27 | 19.7±0.15
Macro-F1 | node2vec | 17.9±0.19 | 12.9±0.25 | 23.6±0.31 | 22.4±0.27
Macro-F1 | GCN | 17.7±0.16 | 20.1±0.20 | 24.4±0.28 | 27.2±0.18
Macro-F1 | GAT | 19.2±0.13 | 16.1±0.22 | 26.8±0.25 | 30.6±0.23
Macro-F1 | DisenGCN | 21.4±0.17 | 26.5±0.21 | 28.9±0.15 | 34.2±0.14
Macro-F1 | D2-GCN | 22.3±0.11 | 27.8±0.14 | 29.6±0.12 | 35.7±0.17
Micro-F1 | DeepWalk | 20.7±0.20 | 49.3±0.22 | 38.8±0.26 | 30.6±0.17
Micro-F1 | LINE | 21.1±0.23 | 49.0±0.16 | 38.5±0.24 | 34.2±0.19
Micro-F1 | node2vec | 20.6±0.17 | 49.9±0.22 | 39.0±0.28 | 38.4±0.31
Micro-F1 | GCN | 20.7±0.18 | 50.9±0.18 | 35.9±0.25 | 44.3±0.12
Micro-F1 | GAT | 22.3±0.14 | 50.7±0.16 | 39.7±0.22 | 46.5±0.23
Micro-F1 | DisenGCN | 25.7±0.18 | 54.1±0.21 | 41.8±0.16 | 49.2±0.16
Micro-F1 | D2-GCN | 26.6±0.15 | 55.4±0.11 | 42.7±0.14 | 50.9±0.15
Tab.4  Test Macro-F1 and Micro-F1 scores in multi-label node classification on PPI, POS, BlogCatalog, and Flickr (unit: %)
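The Macro- and Micro-F1 scores in Tab. 4 can be computed with scikit-learn once multi-label scores are binarized; thresholding at 0.5 below is an assumption, as the paper may select labels differently, and the arrays are random placeholders.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 40))          # placeholder multi-hot labels
y_pred = (rng.random((100, 40)) >= 0.5).astype(int)  # assumed 0.5 threshold
print("Macro-F1:", f1_score(y_true, y_pred, average="macro"))
print("Micro-F1:", f1_score(y_true, y_pred, average="micro"))
```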
Fig.8  Loss curves in DisenGCN and D2-GCN on Cora, Citeseer, and PubMed. Here 'trn' and 'tst' denote the loss on the training and test sets, respectively. (a) On Cora; (b) on Citeseer; (c) on PubMed
Fig.9  $\tilde{K}_{val}$ and $\tilde{K}_{dyna}$ in DisenGCN and D2-GCN on Cora, Citeseer, and PubMed. Here $\tilde{K}_{dyna}$ denotes the value of K adjusted by our model during training. (a) On Cora; (b) on Citeseer; (c) on PubMed
Fig.10  $\mathrm{KL}_{val}$ in DisenGCN and D2-GCN on Cora, Citeseer, and PubMed. (a) On Cora; (b) on Citeseer; (c) on PubMed
Model | Cora | Citeseer | PubMed | PPI | POS | BlogCatalog | Flickr
DisenGCN-HPO | 81.42 | 79.07 | 445.29 | 2547.29 | 2735.17 | 8724.36 | 35672.13
D2-GCN | 23.04 | 18.21 | 237.24 | 470.03 | 626.39 | 1337.25 | 3917.56
Tab.5  Comparison of total time cost between DisenGCN-HPO and D2-GCN (unit: second)
Model | Cora | Citeseer | PubMed | PPI | POS | BlogCatalog | Flickr
DisenGCN | 2066 | 2186 | 7852 | 2166 | 2316 | 4488 | 34928
D2-GCN | 1938 | 2104 | 2884 | 1614 | 1534 | 1706 | 14772
Tab.6  Comparison of memory consumption between DisenGCN and D2-GCN (unit: MB)
Fig.11  Effects of λ on Cora. (a) Test accuracy; (b) $\tilde{K}_{val}$; (c) $\mathrm{KL}_{val}$
Fig.12  Effects of ΔK on Cora. (a) Test accuracy; (b) $\tilde{K}_{val}$; (c) $\mathrm{KL}_{val}$
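λ and ΔK govern how aggressively D2-GCN re-factorizes during training. The controller below is only a hedged stand-in for the paper's label-entropy-based rule: it widens or narrows the channel count K by ΔK in response to a monitored statistic, illustrating the shape of the mechanism rather than its actual criterion; every name and bound here is hypothetical.

```python
def adjust_K(K, stat, prev_stat, delta_K, K_min=1, K_max=16):
    """Hypothetical controller: grow or shrink the channel count K by delta_K
    depending on whether a monitored statistic rose or fell between epochs."""
    if stat > prev_stat:               # factors look under-separated
        return min(K + delta_K, K_max)
    if stat < prev_stat:               # factors look over-fragmented
        return max(K - delta_K, K_min)
    return K
```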
Model | Cora | Citeseer | PubMed
D2-GCN (ED, F_d) | 85.4±0.11 | 73.5±0.12 | 81.8±0.14
D2-GCN (ED, F_Δd) | 84.8±0.23 | 72.7±0.18 | 81.3±0.16
D2-GCN (LD, F_d) | 82.7±0.15 | 72.2±0.22 | 81.0±0.13
D2-GCN (LD, F_Δd) | 82.6±0.17 | 71.8±0.12 | 80.8±0.21
D2-GCN (ED+LD, F_d) | 83.3±0.11 | 73.2±0.24 | 81.2±0.17
D2-GCN (ED+LD, F_Δd) | 83.1±0.14 | 73.0±0.16 | 80.9±0.22
Tab.7  Test accuracy results of D2-GCN with different disentangling strategies in single-label tasks (unit: %)
Metric | Model | PPI | POS | BlogCatalog | Flickr
Macro-F1 | D2-GCN (ED, F_d) | 21.3±0.16 | 26.5±0.12 | 28.4±0.15 | 34.0±0.17
Macro-F1 | D2-GCN (ED, F_Δd) | 21.5±0.12 | 26.8±0.14 | 28.5±0.11 | 34.2±0.16
Macro-F1 | D2-GCN (LD, F_d) | 21.6±0.14 | 27.1±0.17 | 28.7±0.13 | 34.7±0.11
Macro-F1 | D2-GCN (LD, F_Δd) | 21.8±0.11 | 27.2±0.15 | 28.8±0.12 | 34.9±0.13
Macro-F1 | D2-GCN (ED+LD, F_d) | 21.9±0.13 | 27.4±0.16 | 29.1±0.12 | 35.4±0.15
Macro-F1 | D2-GCN (ED+LD, F_Δd) | 22.3±0.11 | 27.8±0.14 | 29.6±0.12 | 35.7±0.17
Micro-F1 | D2-GCN (ED, F_d) | 25.5±0.13 | 53.8±0.11 | 41.5±0.10 | 49.6±0.14
Micro-F1 | D2-GCN (ED, F_Δd) | 25.6±0.14 | 54.0±0.12 | 41.6±0.11 | 49.8±0.17
Micro-F1 | D2-GCN (LD, F_d) | 25.8±0.12 | 54.3±0.13 | 41.8±0.15 | 49.9±0.10
Micro-F1 | D2-GCN (LD, F_Δd) | 25.9±0.09 | 54.5±0.11 | 42.1±0.14 | 50.2±0.16
Micro-F1 | D2-GCN (ED+LD, F_d) | 26.1±0.11 | 54.7±0.15 | 42.3±0.13 | 50.6±0.12
Micro-F1 | D2-GCN (ED+LD, F_Δd) | 26.6±0.15 | 55.4±0.11 | 42.7±0.14 | 50.9±0.15
Tab.8  Test Macro-F1 and Micro-F1 scores of D2-GCN with different disentangling strategies in multi-label tasks (unit: %)
Fig.13  Visualizations of DisenGCN's node classification performance on the test sets of Cora, Citeseer, and PubMed. Red-bordered rectangles highlight the ambiguous boundaries among different node classes. (a) DisenGCN on Cora; (b) DisenGCN on Citeseer; (c) DisenGCN on PubMed
Fig.14  Visualizations of D2-GCN's node classification performance on the test sets of Cora, Citeseer, and PubMed. (a) D2-GCN on Cora; (b) D2-GCN on Citeseer; (c) D2-GCN on PubMed
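Figs. 13 and 14 are t-SNE projections [47] of the learned node embeddings; below is a minimal sketch with scikit-learn, assuming `emb` holds the final-layer representations and `labels` the ground-truth classes (both random placeholders here).

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE

emb = np.random.rand(500, 64)           # placeholder node embeddings
labels = np.random.randint(0, 7, 500)   # placeholder classes (Cora has 7)
xy = TSNE(n_components=2, init="pca", random_state=0).fit_transform(emb)
plt.scatter(xy[:, 0], xy[:, 1], c=labels, s=8, cmap="tab10")
plt.title("t-SNE of learned node embeddings")
plt.show()
```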
1 T. N. Kipf, M. Welling. Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations. 2017
2 Y. Rong, W. Huang, T. Xu, J. Huang. DropEdge: towards deep graph convolutional networks on node classification. In: Proceedings of the 8th International Conference on Learning Representations. 2020
3 M. Zhang, Y. Chen. Link prediction based on graph neural networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 5171−5181
4 S. Yun, S. Kim, J. Lee, J. Kang, H. J. Kim. Neo-GNNs: neighborhood overlap-aware graph neural networks for link prediction. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. 2021, 13683−13694
5 M. Zhang, Z. Cui, M. Neumann, Y. Chen. An end-to-end deep learning architecture for graph classification. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018, 4438−4445
6 Y. Yang, Z. Feng, M. Song, X. Wang. Factorizable graph convolutional networks. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020
7 S. Wu, Y. Xiong, C. Weng. Dynamic depth-width optimization for capsule graph convolutional network. Frontiers of Computer Science, 2023, 17(6): 176346
8 K. Liu, X. Sun, L. Jia, J. Ma, H. Xing, J. Wu, H. Gao, Y. Sun, F. Boulnois, J. Fan. Chemi-Net: a molecular graph convolutional network for accurate drug property prediction. International Journal of Molecular Sciences, 2019, 20(14): 3389
9 M. Sun, S. Zhao, C. Gilvary, O. Elemento, J. Zhou, F. Wang. Graph convolutional networks for computational drug development and discovery. Briefings in Bioinformatics, 2020, 21(3): 919–935
10 W. Jin, J. M. Stokes, R. T. Eastman, Z. Itkin, A. V. Zakharov, J. J. Collins, T. S. Jaakkola, R. Barzilay. Deep learning identifies synergistic drug combinations for treating COVID-19. Proceedings of the National Academy of Sciences of the United States of America, 2021, 118(39): e2105070118
11 R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, J. Leskovec. Graph convolutional neural networks for web-scale recommender systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018, 974−983
12 W. Fan, Y. Ma, Q. Li, Y. He, E. Zhao, J. Tang, D. Yin. Graph neural networks for social recommendation. In: Proceedings of the World Wide Web Conference. 2019, 417−426
13 X. He, K. Deng, X. Wang, Y. Li, Y. Zhang, M. Wang. LightGCN: simplifying and powering graph convolution network for recommendation. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2020, 639−648
14 I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, A. Lerchner. beta-VAE: learning basic visual concepts with a constrained variational framework. In: Proceedings of the 5th International Conference on Learning Representations. 2017
15 J. Song, Y. Chen, J. Ye, X. Wang, C. Shen, F. Mao, M. Song. DEPARA: deep attribution graph for deep knowledge transferability. In: Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, 3922−3930
16 A. A. Alemi, I. Fischer, J. V. Dillon, K. Murphy. Deep variational information bottleneck. In: Proceedings of the 5th International Conference on Learning Representations. 2017
17 Z. C. Lipton. The mythos of model interpretability: in machine learning, the concept of interpretability is both important and slippery. Queue, 2018, 16(3): 31–57
18 J. Ma, P. Cui, K. Kuang, X. Wang, W. Zhu. Disentangled graph convolutional networks. In: Proceedings of the 36th International Conference on Machine Learning. 2019, 4212−4221
19 Y. Liu, X. Wang, S. Wu, Z. Xiao. Independence promoted graph disentangled networks. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. 2020, 4916−4923
20 P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Gallagher, T. Eliassi-Rad. Collective classification in network data. AI Magazine, 2008, 29(3): 93–106
21 S. Wasserman, K. Faust. Centrality and prestige. In: S. Wasserman, K. Faust. Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press, 1994, 169−219
22 P. K. Chan, S. J. Stolfo. Learning with non-uniform class and cost distributions: effects and a distributed multi-classifier approach. In: Proceedings of the KDD-98 Workshop on Distributed Data Mining. 1998, 1−9
23 K. H. Brodersen, C. S. Ong, K. E. Stephan, J. M. Buhmann. The balanced accuracy and its posterior distribution. In: Proceedings of the 20th International Conference on Pattern Recognition. 2010, 3121−3124
24 W. Luo, Y. Li, R. Urtasun, R. S. Zemel. Understanding the effective receptive field in deep convolutional neural networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, 4898−4906
25 I. Sutskever, O. Vinyals, Q. V. Le. Sequence to sequence learning with neural networks. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. 2014, 3104−3112
26 S. Kullback, R. A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 1951, 22(1): 79–86
27 C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 1948, 27(3): 379–423
28 B. J. Breitkreutz, C. Stark, T. Reguly, L. Boucher, A. Breitkreutz, M. Livstone, R. Oughtred, D. H. Lackner, J. Bähler, V. Wood, K. Dolinski, M. Tyers. The BioGRID interaction database: 2008 update. Nucleic Acids Research, 2008, 36: D637–D640
29 A. Liberzon, A. Subramanian, R. Pinchback, H. Thorvaldsdóttir, P. Tamayo, J. P. Mesirov. Molecular signatures database (MSigDB) 3.0. Bioinformatics, 2011, 27(12): 1739–1740
30 M. Mahoney. Large text compression benchmark. 2023
31 K. Toutanova, D. Klein, C. D. Manning, Y. Singer. Feature-rich part-of-speech tagging with a cyclic dependency network. In: Proceedings of the North American Chapter of the Association for Computational Linguistics. 2003, 252−259
32 L. Tang, H. Liu. Relational learning via latent social dimensions. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2009, 817−826
33 J. McAuley, J. Leskovec. Image labeling on a network: using social-network metadata for image classification. In: Proceedings of the 12th European Conference on Computer Vision. 2012, 828−841
34 P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio. Graph attention networks. In: Proceedings of the 6th International Conference on Learning Representations. 2018
35 B. Perozzi, R. Al-Rfou, S. Skiena. DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2014, 701−710
36 J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, Q. Mei. LINE: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web. 2015, 1067−1077
37 A. Grover, J. Leskovec. node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016, 855−864
38 F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 1958, 65(6): 386–408
39 M. Belkin, P. Niyogi, V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 2006, 7: 2399–2434
40 J. Weston, F. Ratle, H. Mobahi, R. Collobert. Deep learning via semi-supervised embedding. In: G. Montavon, G. B. Orr, K. R. Müller, eds. Neural Networks: Tricks of the Trade. Berlin, Heidelberg: Springer, 2012
41 X. Zhu, Z. Ghahramani, J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In: Proceedings of the 20th International Conference on Machine Learning. 2003, 912−919
42 Q. Lu, L. Getoor. Link-based classification. In: Proceedings of the 20th International Conference on Machine Learning. 2003, 496−503
43 Z. Yang, W. Cohen, R. Salakhutdinov. Revisiting semi-supervised learning with graph embeddings. In: Proceedings of the 33rd International Conference on Machine Learning. 2016, 40−48
44 M. Defferrard, X. Bresson, P. Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, 3844−3852
45 F. Monti, D. Boscaini, J. Masci, E. Rodolà, J. Svoboda, M. M. Bronstein. Geometric deep learning on graphs and manifolds using mixture model CNNs. In: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. 2017, 5115−5124
46 S. Abu-El-Haija, B. Perozzi, A. Kapoor, N. Alipourfard, K. Lerman, H. Harutyunyan, G. Ver Steeg, A. Galstyan. MixHop: higher-order graph convolutional architectures via sparsified neighborhood mixing. In: Proceedings of the 36th International Conference on Machine Learning. 2019, 21−29
47 L. van der Maaten, G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 2008, 9(86): 2579–2605
48 J. Bruna, W. Zaremba, A. Szlam, Y. LeCun. Spectral networks and locally connected networks on graphs. In: Proceedings of the 2nd International Conference on Learning Representations. 2014
49 R. Levie, F. Monti, X. Bresson, M. M. Bronstein. CayleyNets: graph convolutional neural networks with complex rational spectral filters. IEEE Transactions on Signal Processing, 2019, 67(1): 97–109
50 W. L. Hamilton, R. Ying, J. Leskovec. Inductive representation learning on large graphs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 1025−1035
51 H. Gao, Z. Wang, S. Ji. Large-scale learnable graph convolutional networks. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018, 1416−1424
52 G. E. Hinton, A. Krizhevsky, S. D. Wang. Transforming auto-encoders. In: Proceedings of the 21st International Conference on Artificial Neural Networks. 2011, 44−51
53 Z. Liu, H. Zhang, Z. Chen, Z. Wang, W. Ouyang. Disentangling and unifying graph convolutions for skeleton-based action recognition. In: Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, 143−152
54 Y. Wang, S. Tang, Y. Lei, W. Song, S. Wang, M. Zhang. DisenHAN: disentangled heterogeneous graph attention network for recommendation. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 2020, 1605−1614
55 Y. Qin, Y. Wang, F. Sun, W. Ju, X. Hou, Z. Wang, J. Cheng, J. Lei, M. Zhang. DisenPOI: disentangling sequential and geographical influence for point-of-interest recommendation. In: Proceedings of the 16th ACM International Conference on Web Search and Data Mining. 2023, 508−516
56 Y. Wang, Y. Qin, F. Sun, B. Zhang, X. Hou, K. Hu, J. Cheng, J. Lei, M. Zhang. DisenCTR: dynamic graph-based disentangled representation for click-through rate prediction. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2022
57 Y. Wang, Y. Song, S. Li, C. Cheng, W. Ju, M. Zhang, S. Wang. DisenCite: graph-based disentangled representation learning for context-specific citation generation. In: Proceedings of the 36th AAAI Conference on Artificial Intelligence. 2022, 11449−11458
58 J. Wu, W. Shi, X. Cao, J. Chen, W. Lei, F. Zhang, W. Wu, X. He. DisenKGAT: knowledge graph embedding with disentangled graph attention network. In: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. 2021, 2140−2149
59 I. Bae, H. G. Jeon. Disentangled multi-relational graph convolutional network for pedestrian trajectory prediction. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. 2021, 911−919
60 Z. Mu, S. Tang, J. Tan, Q. Yu, Y. Zhuang. Disentangled motif-aware graph learning for phrase grounding. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. 2021, 13587−13594
Supplementary file: FCS-23339-OF-SW_suppl_1