A primal perspective for indefinite kernel SVM problem

doi:10.1007/s11704-018-8148-z

Front. Comput. Sci.

2020, Vol. 14

Issue (2) : 349-363 https://doi.org/10.1007/s11704-018-8148-z

RESEARCH ARTICLE

A primal perspective for indefinite kernel SVM problem

Hui XUE^1,²(

), Haiming XU^1,², Xiaohong CHEN³, Yunyun WANG⁴

¹. School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
². Key Laboratory of Computer Network and Information Integration (Southeast University), MOE, Nanjing 210096, China
³. College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
⁴. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210046, China

Download: PDF(533 KB)
Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks

Abstract

Indefinite kernel support vector machine (IKSVM) has recently attracted increasing attentions in machine learning. Since IKSVM essentially is a non-convex problem, existing algorithms either change the spectrum of indefinite kernel directly but risking losing some valuable information or solve the dual form of IKSVM whereas suffering from a dual gap problem. In this paper, we propose a primal perspective for solving the problem. That is, we directly focus on the primal form of IKSVM and present a novel algorithm termed as IKSVM-DC for binary and multi-class classification. Concretely, according to the characteristics of the spectrum for the indefinite kernel matrix, IKSVM-DC decomposes the primal function into the subtraction of two convex functions as a difference of convex functions (DC) programming. To accelerate convergence rate, IKSVM-DC combines the classical DC algorithm with a line search step along the descent direction at each iteration. Furthermore, we construct a multi-class IKSVM model which can classify multiple classes in a unified form. A theoretical analysis is then presented to validate that IKSVM-DC can converge to a local minimum. Finally, we conduct experiments on both binary and multi-class datasets and the experimental results show that IKSVM-DC is superior to other state-of-the-art IKSVM algorithms.

Keywords indefinite kernel support vector machine multi-class classification non-convex optimization

Corresponding Author(s): Hui XUE

Just Accepted Date: 30 November 2018 Online First Date: 17 September 2019 Issue Date: 16 October 2019

Cite this article:

Hui XUE,Haiming XU,Xiaohong CHEN, et al. A primal perspective for indefinite kernel SVM problem[J]. Front. Comput. Sci., 2020, 14(2): 349-363.

URL:

https://academic.hep.com.cn/fcs/EN/10.1007/s11704-018-8148-z
https://academic.hep.com.cn/fcs/EN/Y2020/V14/I2/349

1	C Cortes. Support-vector network. Machine Learning Journal, 1995, 20(3): 273–297 https://doi.org/10.1007/BF00994018
2	N Cristianini, J Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge: Cambridge University Press, 2000 https://doi.org/10.1017/CBO9780511801389
3	Y Y Li, R Q Lu. Locality preserving projection on SPD matrix Lie group: algorithm and analysis. Science China Information Sciences, 2018, 61(9): 092104 https://doi.org/10.1007/s11432-017-9233-4
4	L R Ma, D D Song, L J Liao, J Wang. PSVM: a preference-enhanced SVM model using preference data for classification. Science China Information Sciences, 2017, 60(12): 122103 https://doi.org/10.1007/s11432-016-9020-4
5	H Saigo, J-P Vert, N Ueda, T Akutsu. Protein homology detection using string alignment kernels. Bioinformatics, 2004, 20(11): 1682–1689 https://doi.org/10.1093/bioinformatics/bth141
6	C Wang, Y Song, H Li, M Zhang, J Han. Text classification with heterogeneous information network kernels. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2016, 2130–2136
7	V Vapnik. The Nature of Statistical Learning Theory. Germany: Springer Science & Business Media, 2013
8	F Suard, A Rakotomamonjy, A Bensrhair. Kernel on bag of paths for measuring similarity of shapes. In: Proceedings of European Symposium on Artificial Neural Networks. 2007, 355–360
9	Y Chen, E Garcia, M Gupta, A Rahimi, L Cazzanti. Similarity-based classification: concepts and algorithms. Journal of Machine Learning Research, 2009, 10(3): 747–776
10	Y Chen, M Gupta. Fusing similarities and kernels for classification, In: Proceedings of the 12th International Conference on IEEE Information Fusion. 2009, 474–481
11	E Pkalska, B Haasdonk. Kernel discriminant analysis for positive definite and indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(6): 1017–1032 https://doi.org/10.1109/TPAMI.2008.290
12	H Sun, Q Wu. Least square regression with indefinite kernels and coefficient regularization. Applied and Computational Harmonic Analysis, 2011, 30(1): 96–109 https://doi.org/10.1016/j.acha.2010.04.001
13	Y Ying, C Campbell, M Girolami. Analysis of SVMwith indefinite kernels. In: Proceedings of the 22nd International Conference on Neural Information Processing Systems. 2009, 2205–2213
14	G Loosli, S Canu. Non positive SVM. Technical Report, 2010
15	B Haasdonk. Feature space interpretation of SVMs with indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(4): 482–492 https://doi.org/10.1109/TPAMI.2005.78
16	E Pekalska, P Paclik, R P Duin. A generalized kernel approach to dissimilarity-based classification. Journal of Machine Learning Research, 2001, 2(12): 175–211
17	T Graepel, R Herbrich, P Bollmann-Sdorra, K Obermayer. Classification on pairwise proximity data. In: Proceedings of the 13th Conference on Neural Information Processing Systems. 1999, 438–444
18	V Roth, J Laub, M Kawanabe, J M Buhmann. Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(12): 1540–1551 https://doi.org/10.1109/TPAMI.2003.1251147
19	R Luss, A d’Aspremont. Support vector machine classification with indefinite kernels. In: Proceedings of the 21st Conference on Neural Information Processing Systems. 2008, 953–960
20	J Chen, J Ye. Training SVM with indefinite kernels. In: Proceedings of the 25th International Conference on Machine Learning. 2008, 136–143 https://doi.org/10.1145/1390156.1390174
21	Y Chen, M R Gupta, B Recht. Learning kernels from indefinite similarities. In: Proceedings of the 26th International Conference on Machine Learning. 2009, 145–152 https://doi.org/10.1145/1553374.1553393
22	S Gu, Y Guo. Learning SVM classifiers with indefinite kernels. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence. 2012, 942–948
23	H T Lin, C J Lin. A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods. Neural Computation, 2003, 3: 1–32
24	F B Akoa. Combining DC algorithms (DCAs) and decomposition techniques for the training of nonpositive-semidefinite kernels. IEEE Transactions on Neural Networks, 2008, 19(11): 1854–1872 https://doi.org/10.1109/TNN.2008.2003299
25	C S Ong, X Mary, S Canu, A J Smola. Learning with non-positive kernels. In: Proceedings of the 21st International Conference on Machine Learning. 2004, 81 https://doi.org/10.1145/1015330.1015443
26	G Loosli, S Canu, C S Ong. Learning SVM in Kre˘ın spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(6): 1204–1216 https://doi.org/10.1109/TPAMI.2015.2477830
27	I M Alabdulmohsin, X Gao, X Zhang. Support vector machines with indefinite kernels. In: Proceedings of the 6th Asian Conference on Machine Learning. 2014, 32–47
28	S B Kotsiantis, I Zaharakis, P Pintelas. Supervised machine learning: a review of classification techniques. Emerging Artificial Intelligence Applications in Computer Engineering, 2007, 160: 3–24 https://doi.org/10.1007/s10462-007-9052-3
29	J Friedman. Another approach to polychotomous classification. Technical Report, Department of Statistics, Stanford University, 1996
30	U Krebel. Pairwise classification and support vector machines. Advances in Kernel Methods: Support Vector Learning, 1999, 255–268
31	L Bottou, C Cortes, J S Denker, H Drucker, Z Guyon, L D Jackel, Y Le Cun, U A Muller, E Sackinger, P Simard, U Vapnik. Comparison of classifier methods: a case study in handwritten digit recognition. In: Proceedings of the 12th IAPR International Conference on Pattern Recognition. 1994, 77–82
32	T G Dietterich, G Bakiri. Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 1995, 2: 263–286 https://doi.org/10.1613/jair.105
33	E L Allwein, R E Schapire, Y Singer. Reducing multiclass to binary: a unifying approach for margin classifiers. Journal of Machine Learning Research, 2000, 1(12): 113–141
34	J C Platt, N Cristianini, J Shawe-Taylor. Large margin dags for multiclass classification. In: Proceedings of the 12th International Conference on Neural Information Processing Systems. 1999, 547–553
35	B Scholkopf, A J Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Massackusetts: MIT Press, 2001
36	P D Tao, L T H An. Convex analysis approach to DC programming: theory, algorithms and applications. Acta Mathematica Vietnamica, 1997, 22(1): 289–355
37	T P Dinh, H A Le Thi. Recent advances in DC programming and DCA. In: Nguyen N T, Le-Thi H A, eds. Transactions on Computational Intelligence XIII. Springer, Berlin, Heidelbery, 2014, 1–37 https://doi.org/10.1007/978-3-642-54455-2_1
38	B Piot, M Geist, O Pietquin. Difference of convex functions programming for reinforcement learning. In: Proceedings of the 27th Conference on Neural Information Processing Systems. 2014, 2519–2527
39	H M Xu, H Xue, X H Chen, Y Y Wang. Solving indefinite kernel support vector machine with difference of convex functions programming. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017, 2782–2788
40	F J A Artacho, R M Fleming, P T Vuong. Accelerating the DC algorithm for smooth functions. Mathematical Programming, 2018, 169(1): 95–118 https://doi.org/10.1007/s10107-017-1180-1
41	Y Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Germany: Springer Science & Business Media, 2013
42	Z Wang, X Xue. Multi-class support vector machine. In: Ma Y, Guo G, eds. Support Vector Machines Applications. Springer, Cham, 2014, 23–48 https://doi.org/10.1007/978-3-319-02300-7_2
43	V N Vapnik. Statistical Learning Theory. Wiley New York: JohnWiley & Sons, Inc., 1998
44	J Weston, C Watkins. Multi-class support vector machines. Technical Report, Department of Computer Science, Royal Holloway, University of London, 1998
45	K Crammer, Y Singer. On the learnability and design of output codes for multi-class problems. Machine Learning, 2002, 47(2): 201–233 https://doi.org/10.1023/A:1013637720281
46	C Blake, C J Merz. UCI repository of machine learning databases, 1998
47	G Ratsch, T Onoda, K R Muller. Soft margins for AdaBoost. Machine Learning, 2001, 42(3): 287–320 https://doi.org/10.1023/A:1007618119488
48	R P Duin, E Pekalska. The dissimilarity representation for pattern recognition: a tutorial. Technical Report, 2009
49	A Mosek. The MOSEK optimization software. Google Website, 2010
50	G Wu, E Y Chang, Z Zhang. An analysis of transformation on nonpositive semidefinite similarity matrix for kernel machines. In: Proceedings of the 22nd International Conference on Machine Learning. 2005, 1245–1256
51	C C Chang, C J Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2011, 2(3): 27 https://doi.org/10.1145/1961189.1961199
52	M L Zhang, Z H Zhou. Exploiting unlabeled data to enhance ensemble diversity. DataMining and Knowledge Discovery, 2013, 26(1): 98–129 https://doi.org/10.1007/s10618-011-0243-9

[1]

Article highlights

Download

[1]	Hui XUE, Sen LI, Xiaohong CHEN, Yunyun WANG. A maximum margin clustering algorithm based on indefinite kernels[J]. Front. Comput. Sci., 2019, 13(4): 813-827.
[2]	Xu YU,Jing YANG,Zhiqiang XIE. Training SVMs on a bound vectors set based on Fisher projection[J]. Front. Comput. Sci., 2014, 8(5): 793-806.
[3]	Shangfei WANG,Shan HE,Yue WU,Menghua HE,Qiang JI. Fusion of visible and thermal images for facial expression recognition[J]. Front. Comput. Sci., 2014, 8(2): 232-242.
[4]	Lean YU, Shouyang WANG, Kin Keung LAI. Developing an SVM-based ensemble learning system for customer risk identification collaborating with customer relationship management[J]. Front Comput Sci Chin, 2010, 4(2): 196-203.
[5]	Baoliang LU, Xiaolin WANG, Masao UTIYAMA. Incorporating prior knowledge into learning by dividing training data[J]. Front Comput Sci Chin, 2009, 3(1): 109-122.

Viewed

Full text

Abstract

Cited

Shared

Discussed