|
|
A maximum margin clustering algorithm based on indefinite kernels |
Hui XUE1,2( ), Sen LI1,2, Xiaohong CHEN3, Yunyun WANG4 |
1. School of Computer Science and Engineering, Southeast University, Nanjing 210096, China 2. Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 210096, China 3. College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China 4. Department of Computer Science and Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210046, China |
|
|
Abstract Indefinite kernels have attracted more and more attentions in machine learning due to its wider application scope than usual positive definite kernels. However, the research about indefinite kernel clustering is relatively scarce. Furthermore, existing clustering methods are mainly designed based on positive definite kernels which are incapable in indefinite kernel scenarios. In this paper, we propose a novel indefinite kernel clustering algorithm termed as indefinite kernel maximum margin clustering (IKMMC) based on the state-of-the-art maximum margin clustering (MMC) model. IKMMC tries to find a proxy positive definite kernel to approximate the original indefinite one and thus embeds a new F-norm regularizer in the objective function to measure the diversity of the two kernels, which can be further optimized by an iterative approach. Concretely, at each iteration, given a set of initial class labels, IKMMC firstly transforms the clustering problem into a classification one solved by indefinite kernel support vector machine (IKSVM) with an extra class balance constraint and then the obtained prediction labels will be used as the new input class labels at next iteration until the error rate of prediction is smaller than a prespecified tolerance. Finally, IKMMC utilizes the prediction labels at the last iteration as the expected indices of clusters. Moreover, we further extend IKMMC from binary clustering problems to more complexmulti-class scenarios. Experimental results have shown the superiority of our algorithms.
|
Keywords
indefinite kernel
maximum margin clustering
support vector machine
kernel method
|
Corresponding Author(s):
Hui XUE
|
Just Accepted Date: 10 May 2018
Online First Date: 13 May 2019
Issue Date: 29 May 2019
|
|
1 |
A MAndrew. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge: Cambridge University Press, 2000
|
2 |
NAronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 1950, 68(3): 337–404
https://doi.org/10.2307/1990404
|
3 |
HXue, SChen, QYang. Discriminatively regularized least-squares classification. Pattern Recognition, 2009, 42(1): 93–104
https://doi.org/10.1016/j.patcog.2008.07.010
|
4 |
HXue, SChen, JHuang. Discriminative indefinite kernel classifier from pairwise constraints and unlabeled data. In: Proceedings of International Conference on Pattern Recognition. 2012, 497–500
|
5 |
JHuang, HXue, YZhai. Semi-supervised discriminatively regularized classifier with pairwise constraints. In: Proceedings of Pacific Rim International Conference on Artificial Intelligence. 2012, 112–123
https://doi.org/10.1007/978-3-642-32695-0_12
|
6 |
ZWang, SChen, HXue, Z Pan. A novel regularization learning for single-view patterns: multi-view discriminative regularization. Neural Processing Letters, 2010, 31(3): 159–175
https://doi.org/10.1007/s11063-010-9132-2
|
7 |
BHaasdonk, E Pekalska. Indefinite kernel fisher discriminant. In: Proceedings of International Conference on Pattern Recognition. 2008, 1–4
https://doi.org/10.1109/ICPR.2008.4761718
|
8 |
S SHo, PDai, FRudzicz. Manifold learning for multivariate variablelength sequences with an application to similarity search. IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(6): 1333–1344
https://doi.org/10.1109/TNNLS.2015.2399102
|
9 |
CLi, LLin, WZuo, S Yan, JTang. Sold: sub-optimal low-rank decomposition for efficient video segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 5519–5527
|
10 |
D WJacobs, D Weinshall, YGdalyahu. Classification with nonmetric distances: image retrieval and class representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(6): 583–600
https://doi.org/10.1109/34.862197
|
11 |
F MSchleif, PTino. Indefinite proximity learning: a review. Neural Computation, 2015, 27(10): 2039–2096
https://doi.org/10.1162/NECO_a_00770
|
12 |
SLiwicki, S Zafeiriou, GTzimiropoulos, MPantic. Efficient online subspace learning with an indefinite kernel for visual tracking and recognition. IEEE Transactions on Neural Networks and Learning Systems, 2012, 23(10): 1624–1636
https://doi.org/10.1109/TNNLS.2012.2208654
|
13 |
CLiu. Gabor-based kernel PCA with fractional power polynomial models for face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(5): 572–581
https://doi.org/10.1109/TPAMI.2004.1273927
|
14 |
GWu, E YChang, ZZhang. An analysis of transformation on nonpositive semidefinite similarity matrix for kernel machines. In: Proceedings of the 22nd International Conference on Machine Learning. 2005, 8
|
15 |
IAlabdulmohsin, XGao, X ZZhang. Support vector machines with indefinite kernels. In: Proceedings of the 6th Asian Conference on Machine Learning. 2015, 32–47
|
16 |
TGraepel, R Herbrich, PBollmann-Sdorra, KObermayer. Classification on pairwise proximity data. In: Proceedings of the 11th Conference on Neural Information Processing Systems. 1998, 438–444
|
17 |
VRoth, JLaub, MKawanabe, J M Buhmann. Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(12): 1540–1551
https://doi.org/10.1109/TPAMI.2003.1251147
|
18 |
RLuss, A d’Aspremont. Support vector machine classification with indefinite kernels. In: Proceedings of the 20th International Conference on Neural Information Processing Systems. 2007, 953–960
|
19 |
IWaldspurger, A d’Aspremont, SMallat. Phase recovery, maxcut and complex semidefinite programming. Mathematical Programming, 2015, 149(1–2): 47–81
https://doi.org/10.1007/s10107-013-0738-9
|
20 |
JChen, JYe. Training SVM with indefinite kernels. In: Proceedings of the 25th International Conference on Machine Learning. 2008, 136–143
https://doi.org/10.1145/1390156.1390174
|
21 |
AAuslender. An exact penalty method for nonconvex problems covering, in particular, nonlinear programming, semidefinite programming, and second-order cone programming. SIAM Journal on Optimization, 2015, 25(3): 1732–1759
https://doi.org/10.1137/130912190
|
22 |
YChen, M RGupta, BRecht. Learning kernels from indefinite similarities. In: Proceedings of the 26th Annual International Conference on Machine Learning. 2009, 145–152
https://doi.org/10.1145/1553374.1553393
|
23 |
SGu, YGuo. Learning SVM classifiers with indefinite kernels. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence. 2012, 942–948
|
24 |
H TLin, C JLin. A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods. Neural Computation, 2003, 3: 1–32
|
25 |
BHaasdonk. Feature space interpretation of SVMs with indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(4): 482–492
https://doi.org/10.1109/TPAMI.2005.78
|
26 |
GLoosli, C SOng, SCanu. Technical report: SVM in Krein spaces. Machine Learning, 2013
|
27 |
C SOng. Kernels: regularization and optimization. Doctoral Thesis, The Australian National University, 2011
|
28 |
GLoosli, SCanu, C SOng. Learning SVM in Krein spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(6): 1204–1216
https://doi.org/10.1109/TPAMI.2015.2477830
|
29 |
H MXu, HXue, XChen, Y Y Wang. Solving indefinite kernel support vector machine with difference of convex functions programming. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017, 2782–2788
|
30 |
HXue, YSong, H MXu. Multiple indefinite kernel learning for feature selection. In: Proceedings of International Joint Conferences on Artificial Intelligence. 2017, 3210–3216
https://doi.org/10.24963/ijcai.2017/448
|
31 |
LXu, J Neufeld, BLarson, DSchuurmans. Maximum margin clustering. Advances in Neural Information Processing Systems, 2005, 17: 1537–1544
|
32 |
KZhang, I WTsang, J TKwok. Maximum margin clustering made practical. IEEE Transactions on Neural Networks, 2009, 20(4): 583–596
https://doi.org/10.1109/TNN.2008.2010620
|
33 |
BZhao, J TKwok, CZhang. Multiple kernel clustering. In: Proceedings of the 2009 SIAM International Conference on Data Mining. 2009, 638–649
https://doi.org/10.1137/1.9781611972795.55
|
34 |
FWang, BZhao, CZhang. Linear time maximum margin clustering. IEEE Transactions on Neural Networks, 2010, 21(2): 319–332
https://doi.org/10.1109/TNN.2009.2036998
|
35 |
X LZhang, JWu. Linearithmic time sparse and convex maximum margin clustering. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2012, 42(6): 1669–1692
https://doi.org/10.1109/TSMCB.2012.2197824
|
36 |
Y FLi, I WTsang, JKwok, Z H Zhou. Tighter and convex maximum margin clustering. In: Proceedings of International Conference on Artificial Intelligence and Statistics. 2009, 344–351
|
37 |
JWu, X LZhang. Sparse kernel maximum margin clustering. Neural Network World, 2011, 21(6): 551–574
https://doi.org/10.14311/NNW.2011.21.033
|
38 |
RHettich, K O Kortanek. Semi-infinite programming: theory, methods, and applications. SIAM Review, 1993, 35(3): 380–429
https://doi.org/10.1137/1035089
|
39 |
A JSmola, S V N Vishwanathan, THofmann. Kernel methods for missing variables. In: Proceedings of the 10th International Workshop on Artificial Intelligence & Statistics. 2005, 325–334
|
40 |
TJoachims, TFinley, C N JYu. Cutting-plane training of structural SVMs. Machine Learning, 2009, 77(1): 27–59
https://doi.org/10.1007/s10994-009-5108-8
|
41 |
GGan, CMa, JWu. Data Clustering: Theory, Algorithms, and Applications. Philadelphia: SIAM, Society for Industrial and Applied Mathematics, 2007
https://doi.org/10.1137/1.9780898718348
|
42 |
K BDuan, S S Keerthi. Which is the best multiclass SVMmethod? An empirical study. In: Proceedings of International Workshop on Multiple Classifier Systems. 2005, 278–285
https://doi.org/10.1007/11494683_28
|
43 |
MFilippone, F Camastra, FMasulli, SRovetta. A survey of kernel and spectral methods for clustering. Pattern Recognition, 2008, 41(1): 176–190
https://doi.org/10.1016/j.patcog.2007.05.018
|
|
Viewed |
|
|
|
Full text
|
|
|
|
|
Abstract
|
|
|
|
|
Cited |
|
|
|
|
|
Shared |
|
|
|
|
|
Discussed |
|
|
|
|