Frontiers of Computer Science

Front. Comput. Sci.    2024, Vol. 18 Issue (1) : 181301    https://doi.org/10.1007/s11704-022-2457-y
Artificial Intelligence
Towards kernelizing the classifier for hyperbolic data
Meimei YANG 1,2, Qiao LIU 1,2, Xinkai SUN 1,2, Na SHI 1,2, Hui XUE 1,2
1. School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
2. MOE Key Laboratory of Computer Science and Information Integration (Southeast University), Nanjing 210096, China
Abstract

Data hierarchy, as a hidden property of data structure, exists in a wide range of machine learning applications. A common practice for classifying such hierarchical data is to first encode the data in Euclidean space and then train a Euclidean classifier. However, this paradigm suffers a performance drop because embedding hierarchical data in Euclidean space distorts it. To alleviate this issue, hyperbolic geometry has been investigated as an alternative space for encoding hierarchical data, owing to its greater capacity to capture hierarchical structures. Yet existing hyperbolic methods cannot exploit the full potential of hyperbolic geometry, because they define the hyperbolic operations in the tangent plane, which again distorts the data embeddings. In this paper, we develop two novel kernel formulations in hyperbolic space, one positive definite (PD) and the other indefinite, to solve classification tasks in hyperbolic space. The PD kernel is defined by mapping the hyperbolic data to the Drury-Arveson (DA) space, a special reproducing kernel Hilbert space (RKHS). To further increase the discrimination of the classifier, an indefinite kernel is then defined in the Kreĭn spaces. Specifically, we design a 2-layer nested indefinite kernel that first maps hyperbolic data into the DA spaces, followed by a mapping from the DA spaces to the Kreĭn spaces. Extensive experiments on real-world datasets demonstrate the superiority of the proposed kernels.
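As a concrete illustration of the PD construction, the sketch below computes the DA kernel k_da(x, z) = (1 − ⟨x, z⟩)^{−1} (defined in Tab.1) for real-valued Poincaré-ball embeddings and feeds the precomputed Gram matrix to an SVM. The scikit-learn usage and the variable names X_train, X_test, y_train are illustrative assumptions, not the authors' exact pipeline.

    import numpy as np
    from sklearn.svm import SVC

    def da_kernel(X, Z):
        # DA kernel k_da(x, z) = 1 / (1 - <x, z>) on the open unit ball.
        # Rows of X and Z have norm < 1, so Cauchy-Schwarz gives <x, z> < 1
        # and the denominator stays strictly positive.
        return 1.0 / (1.0 - X @ Z.T)

    # Hypothetical usage with Poincare-ball embeddings and labels:
    # clf = SVC(kernel="precomputed").fit(da_kernel(X_train, X_train), y_train)
    # preds = clf.predict(da_kernel(X_test, X_train))

Because the kernel is PD, the Gram matrix can be handed to any standard kernel machine without modification; the indefinite case below requires extra machinery.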

Keywords  data hierarchy   hyperbolic geometry   Drury-Arveson space   Kreĭn space
Corresponding Author(s): Hui XUE   

Just Accepted Date: 26 October 2022   Issue Date: 27 February 2023
 Cite this article:   
Meimei YANG, Qiao LIU, Xinkai SUN, et al. Towards kernelizing the classifier for hyperbolic data[J]. Front. Comput. Sci., 2024, 18(1): 181301.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-022-2457-y
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I1/181301
Fig.1  The 7-layer tree encoded in hyperbolic space
Fig.2  The 4-layer tree encoded in hyperbolic space
Fig.3  The 4-layer tree encoded in Euclidean space
Notation        Mathematical meaning
H_c^n           n-dimensional hyperbolic space with curvature c
R^n             n-dimensional Euclidean space
R               1-dimensional Euclidean space
L^n             n-dimensional Lorentz model with curvature −1
P_c^n           n-dimensional Poincaré ball model with curvature c
P^n             n-dimensional Poincaré ball model with curvature −1
B^n             n-dimensional open unit ball
K               Kreĭn space with indefinite inner product
H_+, H_−        Hilbert spaces, where H_+ and H_− span the Kreĭn space K
DA^n            n-dimensional Drury-Arveson space
(x_i, y_i)      data set, where x_i ∈ P^n and y_i is the label of x_i (1 ≤ i ≤ m)
K               kernel matrix with elements K(i, j) = k_hsig(x_i, x_j) (1 ≤ i, j ≤ m)
K_+, K_−        kernel matrices, where K = K_+ − K_−
x, z            feature vectors
w               normal vector determining the direction of the classification hyperplane in the SVM
b               bias determining the distance between the classification hyperplane and the origin
ξ_i             slack variables of the SVM (1 ≤ i ≤ m)
f, f_+, f_−     functions in the Kreĭn space K and the Hilbert spaces H_+ and H_−
ϕ(·)            mapping function, where ϕ(x) maps x from the original space P^n into a new feature space
k(·, ·)         kernel function k: P^n × P^n → R, where k(x, z) = ⟨ϕ(x), ϕ(z)⟩
k_da(·, ·)      DA kernel function k_da: P^n × P^n → R, where k_da(x, z) = (1 − ⟨x, z⟩)^{−1}
δ_da(·, ·)      function δ_da: DA^n × DA^n → R, where δ_da(x, z) returns the metric value between x and z in DA^n
ρ(·, ·)         function ρ: P^n × P^n → R, where ρ(x, z) returns the pseudohyperbolic metric between x and z in P^n
k_das(·, ·)     DA-Sigmoid kernel function k_das: P^n × P^n → R, where k_das(x, z) = tanh(γ/(1 − ⟨x, z⟩) + θ), γ > 0 and θ < 0
φ(·)            mapping function φ: P^n → K, where x ∈ P^n and φ(x) ∈ K
φ_1(·)          mapping function φ_1: P^n → DA^n, where x ∈ P^n and φ_1(x) ∈ DA^n
φ_2(·)          mapping function φ_2: DA^n → K, where x ∈ DA^n and φ_2(x) ∈ K
Tab.1  Notation table
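The last few rows of Tab.1 translate directly into code. The following sketch, assuming real-valued Poincaré-ball embeddings, evaluates the indefinite DA-Sigmoid kernel k_das(x, z) = tanh(γ/(1 − ⟨x, z⟩) + θ) and splits its Gram matrix into the difference K = K_+ − K_− of two positive semidefinite parts via an eigendecomposition, the standard construction for Kreĭn-space kernel machines; the authors' actual training procedure may differ.

    import numpy as np

    def da_sigmoid_kernel(X, Z, gamma=1.0, theta=-1.0):
        # k_das(x, z) = tanh(gamma / (1 - <x, z>) + theta), gamma > 0, theta < 0.
        return np.tanh(gamma / (1.0 - X @ Z.T) + theta)

    def krein_split(K):
        # Decompose a symmetric indefinite kernel matrix as K = K_plus - K_minus,
        # where both parts are positive semidefinite, by separating the positive
        # and negative eigenvalues of K.
        w, V = np.linalg.eigh(K)
        K_plus = (V * np.clip(w, 0.0, None)) @ V.T
        K_minus = (V * np.clip(-w, 0.0, None)) @ V.T
        return K_plus, K_minus

The split mirrors the K = K_+ − K_− row of Tab.1: K_+ and K_− generate the Hilbert spaces H_+ and H_− whose difference of inner products gives the indefinite Kreĭn inner product.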
Fig.4  2-dimensional Lorentz model located in the 3-dimensional Kreĭn space
Fig.5  The mapping process from the Poincaré ball to the Kreĭn space induced by the DA-Sigmoid kernel
Fig.6  Linear classification of the 2-dimensional dataset Amazon Electronics Computers in the Poincaré ball model
Fig.7  Classification of the dataset Amazon Electronics Computers by the kernel SVM with the DA kernel
Dataset    Dim   Hyperbolic kernels                    Euclidean kernel
                 DA          H-Tangent    H-Linear     Linear
Facebook   #2    81.1±1.3    62.1±1.0•    61.5±0.3•    62.3±0.1•
           #5    85.5±0.5    64.4±0.8•    62.6±0.6•    66.1±0.3•
           #10   85.7±0.6    73.5±0.6•    66.5±1.6•    77.1±0.2•
           #25   85.3±0.3    87.5±0.4◦    74.1±2.2•    84.5±0.2•
Terrorist  #2    58.7±4.3    50.0±1.2•    53.1±1.7•    50.7±0.3•
           #5    68.1±2.0    49.0±2.1•    51.1±0.9•    52.8±1.3•
           #10   69.3±2.0    51.2±2.3•    53.4±1.5•    60.4±1.1•
           #25   68.2±1.4    52.5±1.9•    51.1±1.1•    61.8±1.6•
Wiki       #2    53.5±1.8    17.4±1.4•    21.9±1.1•    19.3±0.2•
           #5    66.3±1.1    41.1±2.2•    31.4±1.7•    37.8±1.0•
           #10   65.7±0.5    47.8±2.3•    32.5±0.8•    53.4±0.8•
           #25   66.7±1.1    62.4±1.0•    41.0±1.0•    62.0±0.9•
AC         #2    69.4±1.7    68.6±3.2     56.1±0.3•    56.7±2.9•
           #5    83.8±1.1    78.4±0.9•    64.4±2.1•    70.3±1.0•
           #10   86.6±0.7    82.0±0.6•    71.9±1.6•    79.7±0.7•
           #25   86.1±1.2    85.8±0.4     77.0±1.3•    81.7±0.6•
Cora ML    #2    68.7±0.6    47.5±0.7•    44.8±1.4•    36.6±1.7•
           #5    82.8±0.6    66.9±0.9•    61.7±1.6•    50.0±0.3•
           #10   85.2±0.4    75.0±0.8•    68.2±1.6•    65.8±0.4•
           #25   85.6±0.3    79.7±0.3•    74.3±1.3•    73.2±0.3•
Avg. ACC.        75.1        62.1         55.9         60.1
Top-1 times      19          1            0            0
Tab.2  Average ACC of node classification on the Facebook, Terrorist, Wiki, AC and Cora ML datasets. We use bold to indicate the best result. We place •/◦ behind a result when our method is significantly superior/inferior to the compared algorithm in that case (pairwise t-test at the 0.05 significance level)
Dataset    Dim   Hyperbolic kernels                    Euclidean kernel
                 DA          H-Tangent    H-Linear     Linear
Facebook   #2    85.0±1.0    70.8±3.5•    64.5±1.4•    73.8±0.1•
           #5    95.6±0.4    77.2±2.8•    73.5±1.1•    83.7±0.3•
           #10   96.3±0.1    88.0±0.9•    84.9±1.0•    87.8±0.2•
           #25   96.3±0.2    91.5±0.3•    90.2±0.4•    93.2±0.3•
Terrorist  #2    72.6±1.4    50.5±2.1•    63.4±0.9•    54.8±2.2•
           #5    74.8±3.2    50.2±2.2•    62.2±1.6•    59.7±2.9•
           #10   76.1±2.1    53.6±3.1•    65.2±1.8•    66.4±2.0•
           #25   74.8±2.2    56.2±3.1•    65.9±3.4•    61.8±1.6•
Wiki       #2    79.2±0.4    50.7±3.1•    57.9±0.7•    70.3±1.0•
           #5    90.2±1.7    74.3±2.1•    78.6±0.9•    83.5±0.1•
           #10   91.9±0.3    79.0±2.0•    80.1±0.5•    87.9±0.4•
           #25   92.3±0.3    87.5±0.9•    84.4±0.4•    89.1±0.2•
AC         #2    80.1±0.7    79.6±1.6     69.5±0.8•    78.5±0.9•
           #5    93.1±0.6    83.7±4.1•    84.0±0.7•    81.5±0.7•
           #10   95.0±0.2    92.6±0.5•    89.9±0.6•    90.5±0.6•
           #25   95.0±0.5    94.1±0.3•    91.0±0.2•    92.2±0.4•
Cora ML    #2    88.2±0.3    64.1±3.0•    70.9±0.2•    69.8±0.8•
           #5    92.9±0.2    87.0±0.1•    86.1±0.1•    82.7±0.1•
           #10   94.4±0.1    89.2±0.1•    88.4±0.1•    86.3±0.1•
           #25   94.6±0.4    93.2±0.1•    91.7±0.3•    89.6±0.2•
Avg. AUC.        87.9        75.7         77.1         79.2
Top-1 times      20          0            0            0
Tab.3  Average AUC of node classification on the Facebook, Terrorist, Wiki, AC and Cora ML datasets. We use bold to indicate the best result. We place •/◦ behind a result when our method is significantly superior/inferior to the compared algorithm in that case (pairwise t-test at the 0.05 significance level)
Dataset    Dim   Hyperbolic kernels                    Euclidean kernel
                 DA          H-Tangent    H-Linear     Linear
Facebook   #2    70.3±1.0    42.5±1.5•    37.5±0.4•    41.9±0.3•
           #5    86.3±1.2    50.6±2.2•    56.4±1.2•    57.1±1.1•
           #10   87.8±0.3    71.3±1.9•    70.4±2.3•    70.6±0.4•
           #25   87.9±0.6    82.1±0.7•    80.3±0.7•    82.9±0.3•
Terrorist  #2    54.0±1.7    34.3±1.4•    46.3±0.6•    36.1±0.8•
           #5    57.2±1.7    34.1±1.6•    45.5±2.1•    39.8±2.0•
           #10   58.3±1.3    38.1±2.4•    47.9±2.0•    46.1±1.1•
           #25   57.1±1.4    39.0±2.7•    47.6±2.4•    51.9±1.5•
Wiki       #2    50.4±0.8    13.5±1.8•    12.3±0.4•    20.2±0.2•
           #5    73.2±1.8    48.9±1.6•    53.1±1.1•    48.3±0.4•
           #10   75.1±0.8    57.7±1.2•    57.1±0.9•    58.4±0.4•
           #25   75.8±0.5    66.1±0.7•    61.6±1.1•    66.5±0.7•
AC         #2    66.6±1.4    65.1±1.8•    55.6±0.6•    55.1±1.2•
           #5    86.4±1.2    71.3±3.1•    70.5±0.6•    64.4±1.4•
           #10   90.2±0.6    86.8±0.6•    82.5±0.9•    82.2±1.3•
           #25   90.0±1.1    90.5±0.4     86.5±0.6•    84.9±0.9•
Cora ML    #2    72.2±0.7    38.9±1.6•    44.2±0.7•    37.4±0.5•
           #5    84.4±0.5    73.4±0.2•    72.2±0.4•    56.1±0.2•
           #10   86.8±0.4    79.4±0.2•    78.4±0.3•    67.7±0.2•
           #25   87.2±0.7    85.7±0.3•    83.2±0.6•    76.3±0.4•
Avg. AUPR.       74.9        62.1         59.5         57.2
Top-1 times      19          1            0            0
Tab.4  Average AUPR of node classification on the Facebook, Terrorist, Wiki, AC and Cora ML datasets. We use bold to indicate the best result. We place •/◦ behind a result when our method is significantly superior/inferior to the compared algorithm in that case (pairwise t-test at the 0.05 significance level)
Dataset    Dim   Hyperbolic kernels                                                     Euclidean kernels
                 DA-Sigmoid  H-Poly      H-RBF       H-Laplace   H-Binomial  E-Sigmoid   E-RBF       E-Laplace   E-Poly
Facebook   #2    85.5±0.5    69.1±0.4•   73.9±1.3•   74.6±1.8•   66.0±1.3•   71.5±1.0•   67.2±0.4•   67.2±0.3•   66.0±0.9•
           #5    90.5±0.3    82.3±0.6•   87.4±0.5•   87.6±0.5•   81.8±3.6•   86.4±0.9•   85.1±0.5•   85.3±0.6•   82.3±0.7•
           #10   90.7±0.5    89.3±0.6•   87.7±1.3•   88.3±0.5•   87.3±3.5•   88.8±0.7•   86.6±1.2•   86.9±0.6•   86.2±0.6•
           #25   91.0±0.5    91.3±0.4◦   88.0±1.6•   88.2±0.4•   86.5±1.9•   90.2±0.1    88.0±0.8•   88.0±0.5•   88.8±0.7•
Terrorist  #2    59.6±2.1    52.6±1.3•   59.0±4.7    60.5±3.9    51.3±3.0•   59.0±2.1    60.1±2.3    60.6±1.5    51.5±1.0•
           #5    68.1±1.6    53.2±2.2•   64.6±3.3•   66.6±2.0•   57.6±7.7•   63.2±1.5•   62.9±1.4•   64.4±1.5•   56.4±1.2•
           #10   68.2±2.3    59.6±1.5•   65.9±1.5    66.0±1.9•   61.4±8.0•   67.9±1.3    65.5±2.3•   66.3±2.1•   64.9±1.5•
           #25   67.8±1.5    61.6±1.2•   64.8±1.9•   66.0±2.3•   60.9±5.4•   67.6±1.6    63.2±1.5•   63.0±2.2•   66.0±1.0•
Wiki       #2    58.4±0.8    40.2±0.5•   38.6±1.9•   41.0±2.4•   27.5±3.1•   31.3±2.0•   23.6±1.3•   23.9±0.9•   19.3±0.7•
           #5    71.6±0.9    63.7±0.7•   63.0±2.1•   64.8±1.8•   47.7±4.3•   62.6±0.8•   58.8±1.6•   59.3±1.4•   45.1±3.1•
           #10   73.2±0.7    68.9±0.7•   66.5±1.5•   67.1±1.8•   49.8±5.1•   67.9±1.4•   66.3±1.0•   65.4±1.7•   54.1±4.1•
           #25   74.0±1.3    72.2±0.9    67.9±1.6•   68.9±1.6•   55.5±4.2•   72.0±0.8•   69.0±1.6•   68.1±1.3•   65.1±1.4•
AC         #2    79.1±0.6    73.6±0.9•   74.5±2.2•   75.5±1.1•   69.6±3.2•   73.4±1.0•   71.9±1.4•   72.5±1.0•   66.9±4.0•
           #5    87.2±0.9    79.1±0.7•   81.8±9.9•   86.1±1.1•   82.9±2.1•   78.7±0.5•   77.6±0.8•   79.2±0.5•   73.6±1.9•
           #10   89.8±0.4    88.0±0.8•   86.8±0.6•   87.6±0.6•   86.8±1.1•   85.3±1.0•   84.4±1.1•   84.9±0.8•   79.7±2.6•
           #25   89.6±0.6    89.7±0.6    85.4±5.0•   87.3±0.5•   87.3±1.1•   87.3±0.3•   85.9±0.8•   86.7±0.4•   84.2±0.6•
Cora ML    #2    71.5±0.6    42.2±3.9•   67.2±1.3•   68.8±0.9•   51.7±3.5•   61.3±0.8•   56.5±0.6•   57.0±0.7•   45.9±2.2•
           #5    84.9±0.6    75.5±0.2•   83.3±0.7•   83.6±1.0•   79.5±1.7•   73.8±1.3•   71.0±0.6•   71.3±0.5•   57.2±1.6•
           #10   86.1±0.3    82.7±0.4•   84.7±0.6•   85.0±0.4•   81.7±2.9•   77.1±1.5•   76.1±0.8•   76.7±0.9•   71.6±1.0•
           #25   86.4±0.5    85.7±0.6•   84.8±0.5•   85.1±0.5•   83.6±1.3•   79.3±0.8•   69.5±1.3•   77.7±1.4•   76.4±1.2•
Avg. ACC.        78.7        71.1        73.8        74.9        67.8        72.2        69.5        70.2        65.0
Top-1 times      17          2           0           0           0           0           0           1           0
Tab.5  Average ACC of node classification on the Facebook, Terrorist, Wiki, AC and Cora ML datasets. We use bold to indicate the best result. We place •/◦ behind a result when our method is significantly superior/inferior to the compared algorithm in that case (pairwise t-test at the 0.05 significance level)
Dataset    Dim   Hyperbolic kernels                                                     Euclidean kernels
                 DA-Sigmoid  H-Poly      H-RBF       H-Laplace   H-Binomial  E-Sigmoid   E-RBF       E-Laplace   E-Poly
Facebook   #2    91.5±0.6    82.0±0.3•   74.7±3.7•   83.0±3.0•   70.9±3.5•   82.2±4.1•   75.9±1.8•   80.9±2.1•   68.8±3.9•
           #5    96.6±0.3    90.9±0.3•   94.1±0.6•   94.9±1.6•   90.5±0.6•   93.2±0.5•   91.8±0.7•   93.7±0.4•   91.2±0.2•
           #10   96.7±0.4    94.2±0.5•   94.6±2.1•   95.9±0.4•   93.0±1.1•   95.0±0.4•   94.3±0.4•   95.3±0.3•   93.7±0.3•
           #25   96.8±0.4    95.4±0.4•   95.4±0.3•   96.2±0.2•   93.3±0.3•   96.0±0.4•   96.1±0.4    96.4±0.3    94.9±0.7•
Terrorist  #2    70.6±2.4    54.4±1.5•   58.0±4.2•   65.1±3.9•   56.7±5.1•   66.5±2.1•   65.0±2.9•   68.2±2.0•   55.5±1.7•
           #5    77.3±2.2    62.9±1.7•   63.3±4.3•   70.4±2.1•   70.0±4.4•   71.5±2.3•   72.4±2.5•   75.6±1.5•   64.2±2.9•
           #10   77.8±2.0    67.1±1.9•   65.9±4.8•   69.9±1.7•   74.2±4.9•   91.3±0.5•   90.6±0.4•   92.3±0.3•   89.6±0.3•
           #25   78.2±1.2    70.7±1.6•   67.5±6.5•   70.5±3.8•   68.4±4.7•   74.9±1.5•   72.8±2.4•   75.0±2.0•   72.2±2.0•
Wiki       #2    87.1±0.4    74.3±0.4•   70.4±1.6•   74.1±3.3•   69.0±3.7•   78.5±1.4•   64.9±1.0•   72.2±0.5•   61.9±2.5•
           #5    93.6±0.4    89.0±0.2•   87.3±2.1•   89.9±1.9•   80.2±2.3•   89.4±0.2•   84.9±0.7•   88.6±0.4•   82.2±4.0•
           #10   94.1±0.3    91.8±0.4•   90.2±1.4•   90.8±1.8•   79.7±5.0•   91.3±0.5•   90.6±0.4•   92.3±0.3•   89.6±0.3•
           #25   94.1±1.0    93.2±0.3•   90.2±1.2•   91.5±1.2•   82.5±3.1•   91.7±0.5•   91.8±0.3•   92.5±0.2•   89.4±0.4•
AC         #2    89.1±0.3    84.6±0.6•   83.0±1.9•   85.8±0.8•   81.4±1.7•   91.7±0.5•   91.8±0.3•   92.5±0.2•   89.4±0.4•
           #5    95.8±0.4    91.4±0.4•   93.0±2.5•   94.6±0.7•   91.0±1.4•   90.4±0.3•   88.7±0.5•   90.5±0.4•   83.7±6.6•
           #10   96.9±0.3    96.0±0.4•   94.8±2.0•   96.5±0.3•   94.5±0.6•   94.5±0.5•   94.6±0.8•   95.3±0.4•   92.1±0.6•
           #25   96.7±0.4    96.3±0.3•   92.3±2.0•   96.4±0.2•   94.3±1.2•   95.5±0.3•   96.0±0.4•   96.1±0.3•   93.1±0.8•
Cora ML    #2    90.7±0.2    76.4±2.2•   84.9±1.0•   88.2±0.3•   80.2±4.1•   84.4±0.5•   75.8±1.1•   79.5±0.8•   61.0±4.0•
           #5    95.8±0.2    90.6±0.1•   94.0±0.3•   95.1±0.3•   91.8±0.8•   90.8±1.6•   88.1±0.2•   90.3±0.3•   86.2±0.3•
           #10   96.7±0.2    95.1±0.2•   95.3±0.3•   96.3±0.2•   93.3±1.0•   92.1±1.4•   92.2±0.3•   93.1±0.3•   90.3±0.3•
           #25   96.7±0.2    96.8±0.2    96.7±0.2    96.5±0.2    93.6±2.7•   94.0±0.3•   90.4±0.9•   93.4±0.5•   91.8±0.8•
Avg. AUC.        90.6        84.7        84.3        87.1        82.4        86.6        84.1        86.6        80.6
Top-1 times      19          1           0           0           0           0           0           0           0
Tab.6  Average AUC of node classification on the Facebook, Terrorist, Wiki, AC and Cora ML datasets. We use bold to indicate the best result. We place •/◦ behind a result when our method is significantly superior/inferior to the compared algorithm in that case (pairwise t-test at the 0.05 significance level)
Dataset    Dim   Hyperbolic kernels                                                     Euclidean kernels
                 DA-Sigmoid  H-Poly      H-RBF       H-Laplace   H-Binomial  E-Sigmoid   E-RBF       E-Laplace   E-Poly
Facebook   #2    76.1±0.9    51.4±2.2•   55.8±5.9    64.1±4.1•   46.7±2.8•   61.1±3.0•   55.8±1.2•   60.0±2.2•   45.8±2.1•
           #5    90.3±0.6    73.8±0.9•   87.0±0.8•   88.4±2.7•   79.4±1.9•   80.4±1.4•   81.9±0.8•   83.7±0.8•   75.8±0.3•
           #10   90.0±0.8    86.2±1.0•   87.8±3.6•   89.9±0.4    83.1±2.6•   86.4±1.4•   87.4±0.7•   88.9±0.6•   83.7±1.4•
           #25   90.5±0.8    89.0±0.6•   89.6±0.6•   90.7±0.5◦   84.7±1.5•   88.3±1.0•   89.5±0.6•   89.9±0.5•   85.8±1.8•
Terrorist  #2    51.5±3.3    37.5±1.0•   43.3±4.0•   47.5±2.5•   41.5±3.6•   48.1±1.7•   48.2±2.3•   50.0±1.7    37.4±1.4•
           #5    60.9±3.4    44.2±2.3•   50.1±4.2•   54.7±1.0•   52.8±3.5•   53.5±2.0•   55.8±2.1•   58.5±1.7•   46.7±2.3•
           #10   61.2±2.7    49.2±2.6•   53.1±4.5•   53.4±1.9•   55.7±3.7•   56.7±2.2•   57.3±2.3•   59.5±2.6    54.0±1.4•
           #25   61.0±2.2    52.9±2.2•   53.6±5.1•   55.4±3.6•   52.3±2.9•   58.7±1.7•   56.9±2.2•   59.1±2.2    55.4±2.8•
Wiki       #2    60.9±1.0    35.6±0.7•   41.2±2.2•   45.3±3.0•   33.4±3.4•   34.1±1.1•   28.2±1.0•   32.9±0.5•   22.8±0.7•
           #5    77.6±1.0    64.5±0.2•   69.0±2.8•   73.0±2.7•   52.9±1.9•   64.2±1.1•   65.9±0.7•   69.1±0.8•   54.9±2.2•
           #10   78.1±0.8    71.7±1.4•   74.5±1.7•   75.7±2.2•   56.1±4.6•   69.0±1.8•   74.9±0.9•   76.4±0.7•   63.3±1.2•
           #25   77.9±1.3    75.3±0.8•   74.2±2.1•   76.3±1.7•   60.4±3.0•   73.7±1.0•   77.5±1.0•   78.7±0.5◦   65.4±1.1•
AC         #2    80.2±0.7    68.9±1.3•   72.9±2.4•   75.6±2.0•   66.1±2.2•   72.1±0.8•   67.4±1.4•   68.3±1.3•   58.9±3.2•
           #5    92.1±0.8    81.9±0.9•   88.6±3.7•   89.5±2.4•   83.7±2.4•   81.3±0.7•   80.0±0.5•   82.6±0.6•   72.6±5.4•
           #10   94.2±0.4    92.9±0.7•   91.8±1.7•   93.7±0.6•   89.7±1.6•   89.6±1.4•   90.8±1.1•   91.6±0.7•   84.2±1.6•
           #25   93.7±0.7    93.4±0.4    92.3±2.0•   93.5±0.6    89.7±2.2•   91.1±0.9•   92.5±0.6•   93.1±0.4    86.0±1.6•
Cora ML    #2    73.8±0.5    45.9±2.7•   69.0±1.7•   72.5±0.5•   59.8±4.0•   60.9±1.0•   54.9±1.2•   57.8±0.9•   36.4±3.2•
           #5    88.4±0.6    77.8±0.4•   87.2±0.7•   88.6±0.4    80.4±1.8•   77.7±2.0•   76.2±0.7•   77.6±0.4•   66.4±0.6•
           #10   90.4±0.4    87.2±0.7•   89.0±0.6•   90.5±0.4    84.2±2.1•   80.9±2.1•   82.8±0.6•   83.7±0.6•   75.6±0.7•
           #25   90.6±0.4    91.0±0.4    89.9±0.4•   91.0±0.2◦   85.5±2.5•   85.1±0.8•   79.1±1.3•   84.4±1.1•   79.1±2.0•
Avg. AUPR.       79.0        68.5        73.0        75.5        66.9        67.7        70.2        72.3        62.5
Top-1 times      16          1           0           3           0           0           0           1           0
Tab.7  Average AUPR of node classification on the Facebook, Terrorist, Wiki, AC and Cora ML datasets. We use bold to indicate the best result. We place •/◦ behind a result when our method is significantly superior/inferior to the compared algorithm in that case (pairwise t-test at the 0.05 significance level)
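As a reading aid for the •/◦ markers in Tab.2-Tab.7, the sketch below shows one way the per-case paired t-test at the 0.05 level could be computed from per-run scores; the exact pairing of runs is an assumption, since the tables only state that a pairwise t-test was used.

    import numpy as np
    from scipy.stats import ttest_rel

    def significance_marker(ours, baseline, alpha=0.05):
        # ours, baseline: equal-length arrays of per-run scores (e.g., ACC over
        # repeated splits). Returns '•' if our method is significantly better,
        # '◦' if significantly worse, and '' if the difference is not significant.
        _, p = ttest_rel(ours, baseline)
        if p < alpha:
            return "•" if np.mean(ours) > np.mean(baseline) else "◦"
        return ""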
Fig.8  The study of the parameter γ of the DA-Sigmoid kernel on the Facebook, Terrorist, Wiki, AC and Cora ML datasets. We report the average ACC, AUC and AUPR for the node classification task, with the data embedded in 2-, 5-, 10- and 25-dimensional (#2, #5, #10, #25) hyperbolic space. (a) Facebook (#2); (b) Facebook (#5); (c) Facebook (#10); (d) Facebook (#25); (e) Terrorist (#2); (f) Terrorist (#5); (g) Terrorist (#10); (h) Terrorist (#25); (i) Wiki (#2); (j) Wiki (#5); (k) Wiki (#10); (l) Wiki (#25); (m) AC (#2); (n) AC (#5); (o) AC (#10); (p) AC (#25); (q) Cora ML (#2); (r) Cora ML (#5); (s) Cora ML (#10); (t) Cora ML (#25)