Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2020, Vol. 14 Issue (2) : 349-363    https://doi.org/10.1007/s11704-018-8148-z
RESEARCH ARTICLE
A primal perspective for indefinite kernel SVM problem
Hui XUE1,2(), Haiming XU1,2, Xiaohong CHEN3, Yunyun WANG4
1. School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
2. Key Laboratory of Computer Network and Information Integration (Southeast University), MOE, Nanjing 210096, China
3. College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
4. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210046, China
 Download: PDF(533 KB)  
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

Indefinite kernel support vector machine (IKSVM) has recently attracted increasing attentions in machine learning. Since IKSVM essentially is a non-convex problem, existing algorithms either change the spectrum of indefinite kernel directly but risking losing some valuable information or solve the dual form of IKSVM whereas suffering from a dual gap problem. In this paper, we propose a primal perspective for solving the problem. That is, we directly focus on the primal form of IKSVM and present a novel algorithm termed as IKSVM-DC for binary and multi-class classification. Concretely, according to the characteristics of the spectrum for the indefinite kernel matrix, IKSVM-DC decomposes the primal function into the subtraction of two convex functions as a difference of convex functions (DC) programming. To accelerate convergence rate, IKSVM-DC combines the classical DC algorithm with a line search step along the descent direction at each iteration. Furthermore, we construct a multi-class IKSVM model which can classify multiple classes in a unified form. A theoretical analysis is then presented to validate that IKSVM-DC can converge to a local minimum. Finally, we conduct experiments on both binary and multi-class datasets and the experimental results show that IKSVM-DC is superior to other state-of-the-art IKSVM algorithms.

Keywords indefinite kernel      support vector machine      multi-class classification      non-convex optimization     
Corresponding Author(s): Hui XUE   
Just Accepted Date: 30 November 2018   Online First Date: 17 September 2019    Issue Date: 16 October 2019
 Cite this article:   
Hui XUE,Haiming XU,Xiaohong CHEN, et al. A primal perspective for indefinite kernel SVM problem[J]. Front. Comput. Sci., 2020, 14(2): 349-363.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-018-8148-z
https://academic.hep.com.cn/fcs/EN/Y2020/V14/I2/349
1 C Cortes. Support-vector network. Machine Learning Journal, 1995, 20(3): 273–297
https://doi.org/10.1007/BF00994018
2 N Cristianini, J Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge: Cambridge University Press, 2000
https://doi.org/10.1017/CBO9780511801389
3 Y Y Li, R Q Lu. Locality preserving projection on SPD matrix Lie group: algorithm and analysis. Science China Information Sciences, 2018, 61(9): 092104
https://doi.org/10.1007/s11432-017-9233-4
4 L R Ma, D D Song, L J Liao, J Wang. PSVM: a preference-enhanced SVM model using preference data for classification. Science China Information Sciences, 2017, 60(12): 122103
https://doi.org/10.1007/s11432-016-9020-4
5 H Saigo, J-P Vert, N Ueda, T Akutsu. Protein homology detection using string alignment kernels. Bioinformatics, 2004, 20(11): 1682–1689
https://doi.org/10.1093/bioinformatics/bth141
6 C Wang, Y Song, H Li, M Zhang, J Han. Text classification with heterogeneous information network kernels. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2016, 2130–2136
7 V Vapnik. The Nature of Statistical Learning Theory. Germany: Springer Science & Business Media, 2013
8 F Suard, A Rakotomamonjy, A Bensrhair. Kernel on bag of paths for measuring similarity of shapes. In: Proceedings of European Symposium on Artificial Neural Networks. 2007, 355–360
9 Y Chen, E Garcia, M Gupta, A Rahimi, L Cazzanti. Similarity-based classification: concepts and algorithms. Journal of Machine Learning Research, 2009, 10(3): 747–776
10 Y Chen, M Gupta. Fusing similarities and kernels for classification, In: Proceedings of the 12th International Conference on IEEE Information Fusion. 2009, 474–481
11 E Pkalska, B Haasdonk. Kernel discriminant analysis for positive definite and indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(6): 1017–1032
https://doi.org/10.1109/TPAMI.2008.290
12 H Sun, Q Wu. Least square regression with indefinite kernels and coefficient regularization. Applied and Computational Harmonic Analysis, 2011, 30(1): 96–109
https://doi.org/10.1016/j.acha.2010.04.001
13 Y Ying, C Campbell, M Girolami. Analysis of SVMwith indefinite kernels. In: Proceedings of the 22nd International Conference on Neural Information Processing Systems. 2009, 2205–2213
14 G Loosli, S Canu. Non positive SVM. Technical Report, 2010
15 B Haasdonk. Feature space interpretation of SVMs with indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(4): 482–492
https://doi.org/10.1109/TPAMI.2005.78
16 E Pekalska, P Paclik, R P Duin. A generalized kernel approach to dissimilarity-based classification. Journal of Machine Learning Research, 2001, 2(12): 175–211
17 T Graepel, R Herbrich, P Bollmann-Sdorra, K Obermayer. Classification on pairwise proximity data. In: Proceedings of the 13th Conference on Neural Information Processing Systems. 1999, 438–444
18 V Roth, J Laub, M Kawanabe, J M Buhmann. Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(12): 1540–1551
https://doi.org/10.1109/TPAMI.2003.1251147
19 R Luss, A d’Aspremont. Support vector machine classification with indefinite kernels. In: Proceedings of the 21st Conference on Neural Information Processing Systems. 2008, 953–960
20 J Chen, J Ye. Training SVM with indefinite kernels. In: Proceedings of the 25th International Conference on Machine Learning. 2008, 136–143
https://doi.org/10.1145/1390156.1390174
21 Y Chen, M R Gupta, B Recht. Learning kernels from indefinite similarities. In: Proceedings of the 26th International Conference on Machine Learning. 2009, 145–152
https://doi.org/10.1145/1553374.1553393
22 S Gu, Y Guo. Learning SVM classifiers with indefinite kernels. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence. 2012, 942–948
23 H T Lin, C J Lin. A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods. Neural Computation, 2003, 3: 1–32
24 F B Akoa. Combining DC algorithms (DCAs) and decomposition techniques for the training of nonpositive-semidefinite kernels. IEEE Transactions on Neural Networks, 2008, 19(11): 1854–1872
https://doi.org/10.1109/TNN.2008.2003299
25 C S Ong, X Mary, S Canu, A J Smola. Learning with non-positive kernels. In: Proceedings of the 21st International Conference on Machine Learning. 2004, 81
https://doi.org/10.1145/1015330.1015443
26 G Loosli, S Canu, C S Ong. Learning SVM in Kre˘ın spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(6): 1204–1216
https://doi.org/10.1109/TPAMI.2015.2477830
27 I M Alabdulmohsin, X Gao, X Zhang. Support vector machines with indefinite kernels. In: Proceedings of the 6th Asian Conference on Machine Learning. 2014, 32–47
28 S B Kotsiantis, I Zaharakis, P Pintelas. Supervised machine learning: a review of classification techniques. Emerging Artificial Intelligence Applications in Computer Engineering, 2007, 160: 3–24
https://doi.org/10.1007/s10462-007-9052-3
29 J Friedman. Another approach to polychotomous classification. Technical Report, Department of Statistics, Stanford University, 1996
30 U Krebel. Pairwise classification and support vector machines. Advances in Kernel Methods: Support Vector Learning, 1999, 255–268
31 L Bottou, C Cortes, J S Denker, H Drucker, Z Guyon, L D Jackel, Y Le Cun, U A Muller, E Sackinger, P Simard, U Vapnik. Comparison of classifier methods: a case study in handwritten digit recognition. In: Proceedings of the 12th IAPR International Conference on Pattern Recognition. 1994, 77–82
32 T G Dietterich, G Bakiri. Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 1995, 2: 263–286
https://doi.org/10.1613/jair.105
33 E L Allwein, R E Schapire, Y Singer. Reducing multiclass to binary: a unifying approach for margin classifiers. Journal of Machine Learning Research, 2000, 1(12): 113–141
34 J C Platt, N Cristianini, J Shawe-Taylor. Large margin dags for multiclass classification. In: Proceedings of the 12th International Conference on Neural Information Processing Systems. 1999, 547–553
35 B Scholkopf, A J Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Massackusetts: MIT Press, 2001
36 P D Tao, L T H An. Convex analysis approach to DC programming: theory, algorithms and applications. Acta Mathematica Vietnamica, 1997, 22(1): 289–355
37 T P Dinh, H A Le Thi. Recent advances in DC programming and DCA. In: Nguyen N T, Le-Thi H A, eds. Transactions on Computational Intelligence XIII. Springer, Berlin, Heidelbery, 2014, 1–37
https://doi.org/10.1007/978-3-642-54455-2_1
38 B Piot, M Geist, O Pietquin. Difference of convex functions programming for reinforcement learning. In: Proceedings of the 27th Conference on Neural Information Processing Systems. 2014, 2519–2527
39 H M Xu, H Xue, X H Chen, Y Y Wang. Solving indefinite kernel support vector machine with difference of convex functions programming. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017, 2782–2788
40 F J A Artacho, R M Fleming, P T Vuong. Accelerating the DC algorithm for smooth functions. Mathematical Programming, 2018, 169(1): 95–118
https://doi.org/10.1007/s10107-017-1180-1
41 Y Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Germany: Springer Science & Business Media, 2013
42 Z Wang, X Xue. Multi-class support vector machine. In: Ma Y, Guo G, eds. Support Vector Machines Applications. Springer, Cham, 2014, 23–48
https://doi.org/10.1007/978-3-319-02300-7_2
43 V N Vapnik. Statistical Learning Theory. Wiley New York: JohnWiley & Sons, Inc., 1998
44 J Weston, C Watkins. Multi-class support vector machines. Technical Report, Department of Computer Science, Royal Holloway, University of London, 1998
45 K Crammer, Y Singer. On the learnability and design of output codes for multi-class problems. Machine Learning, 2002, 47(2): 201–233
https://doi.org/10.1023/A:1013637720281
46 C Blake, C J Merz. UCI repository of machine learning databases, 1998
47 G Ratsch, T Onoda, K R Muller. Soft margins for AdaBoost. Machine Learning, 2001, 42(3): 287–320
https://doi.org/10.1023/A:1007618119488
48 R P Duin, E Pekalska. The dissimilarity representation for pattern recognition: a tutorial. Technical Report, 2009
49 A Mosek. The MOSEK optimization software. Google Website, 2010
50 G Wu, E Y Chang, Z Zhang. An analysis of transformation on nonpositive semidefinite similarity matrix for kernel machines. In: Proceedings of the 22nd International Conference on Machine Learning. 2005, 1245–1256
51 C C Chang, C J Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2011, 2(3): 27
https://doi.org/10.1145/1961189.1961199
52 M L Zhang, Z H Zhou. Exploiting unlabeled data to enhance ensemble diversity. DataMining and Knowledge Discovery, 2013, 26(1): 98–129
https://doi.org/10.1007/s10618-011-0243-9
[1] Article highlights Download
[1] Hui XUE, Sen LI, Xiaohong CHEN, Yunyun WANG. A maximum margin clustering algorithm based on indefinite kernels[J]. Front. Comput. Sci., 2019, 13(4): 813-827.
[2] Xu YU,Jing YANG,Zhiqiang XIE. Training SVMs on a bound vectors set based on Fisher projection[J]. Front. Comput. Sci., 2014, 8(5): 793-806.
[3] Shangfei WANG,Shan HE,Yue WU,Menghua HE,Qiang JI. Fusion of visible and thermal images for facial expression recognition[J]. Front. Comput. Sci., 2014, 8(2): 232-242.
[4] Lean YU, Shouyang WANG, Kin Keung LAI. Developing an SVM-based ensemble learning system for customer risk identification collaborating with customer relationship management[J]. Front Comput Sci Chin, 2010, 4(2): 196-203.
[5] Baoliang LU, Xiaolin WANG, Masao UTIYAMA. Incorporating prior knowledge into learning by dividing training data[J]. Front Comput Sci Chin, 2009, 3(1): 109-122.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed