Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2019, Vol. 13 Issue (4) : 802-812    https://doi.org/10.1007/s11704-018-7128-7
RESEARCH ARTICLE
Enhancing subspace clustering based on dynamic prediction
Ratha PECH1,2, Dong HAO1,2, Hong CHENG3, Tao ZHOU1,2()
1. Complex Lab, University of Electronic Science and Technology of China, Chengdu 611731, China
2. Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China
3. Center for Robotics, University of Electronic Science and Technology of China, Chengdu 611731, China
 Download: PDF(854 KB)  
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

In high dimensional data, many dimensions are irrelevant to each other and clusters are usually hidden under noise. As an important extension of the traditional clustering, subspace clustering can be utilized to simultaneously cluster the high dimensional data into several subspaces and associate the low-dimensional subspaces with the corresponding points. In subspace clustering, it is a crucial step to construct an affinity matrix with block-diagonal form, in which the blocks correspond to different clusters. The distance-based methods and the representation-based methods are two major types of approaches for building an informative affinity matrix. In general, it is the difference between the density inside and outside the blocks that determines the efficiency and accuracy of the clustering. In this work, we introduce a well-known approach in statistic physics method, namely link prediction, to enhance subspace clustering by reinforcing the affinity matrix.More importantly,we introduce the idea to combine complex network theory with machine learning. By revealing the hidden links inside each block, we maximize the density of each block along the diagonal, while restrain the remaining non-blocks in the affinity matrix as sparse as possible. Our method has been shown to have a remarkably improved clustering accuracy comparing with the existing methods on well-known datasets.

Keywords subspace clustering      link prediction      blockdiagonal matrix      low-rank representation      sparse representation     
Corresponding Author(s): Tao ZHOU   
Just Accepted Date: 25 September 2017   Online First Date: 04 September 2018    Issue Date: 29 May 2019
 Cite this article:   
Ratha PECH,Dong HAO,Hong CHENG, et al. Enhancing subspace clustering based on dynamic prediction[J]. Front. Comput. Sci., 2019, 13(4): 802-812.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-018-7128-7
https://academic.hep.com.cn/fcs/EN/Y2019/V13/I4/802
1 RVidal. Subspace clustering. IEEE Signal Processing Magazine, 2010, 28(2): 52–68
https://doi.org/10.1109/MSP.2010.939739
2 A YNg, M IJordan, YWeiss. On spectral clustering: analysis and an algorithm. Advances in Neural Information Processing Systems, 2002, 2: 849–856
3 L UVon. A tutorial on spectral clustering. Statistics and Computing, 2007, 17(4): 395–416
https://doi.org/10.1007/s11222-007-9033-z
4 JShi, JMalik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(8): 888–905
https://doi.org/10.1109/34.868688
5 JCosteira, TKanade. A multi-body factorization method for motion analysis. In: Proceedings of the 5th International Conference on Computer Vision. 1995, 1071–1076
https://doi.org/10.1109/ICCV.1995.466815
6 AClauset, CMoore, M E JNewman. Hierarchical structure and the prediction of missing links in networks. Nature, 2008, 453(7191): 98–101
https://doi.org/10.1038/nature06830
7 LLü, MMedo, C HYeung, Y C Zhang, Z KZhang, TZhou. Recommender systems. Physics Reports, 2012, 519(1): 1–49
https://doi.org/10.1016/j.physrep.2012.02.006
8 DLiben-Nowell, J Kleinberg. The link-prediction problem for social networks. Journal of the Association for Information Science and Technology, 2007, 58(7): 1019–1031
https://doi.org/10.1002/asi.20591
9 EElhamifar, RVidal. Sparse subspace clustering. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2009, 2790–2797
https://doi.org/10.1109/CVPR.2009.5206547
10 EElhamifar, RVidal. Sparse subspace clustering: algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(11): 2765–2781
https://doi.org/10.1109/TPAMI.2013.57
11 GLiu, ZLin, YYu. Robust subspace segmentation by low-rank representation. In: Proceedings of the 27th International Conference on Machine Learnin. 2010, 663–670
12 GLiu, ZLin, SYan, J Sun, YYu, YMa. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1): 171–184
https://doi.org/10.1109/TPAMI.2012.88
13 SWei, YYu. Subspace segmentation with a minimal squared frobenius norm representation. In: Proceedings of International Conference on Pattern Recognition. 2012, 3509–3512
14 HZhang, ZYi, XPeng. fLRR: fast low-rank representation using Frobenius-norm. Electronics Letters, 2014, 5013: 936–938
https://doi.org/10.1049/el.2014.1396
15 GMichael, B Stephen. CVX: Matlab software for disciplined convex programming, version 2.1, Recent Advances in Learning and Control, 2008
16 GMichael, B Stephen. Graph Implementations for Nonsmooth Convex Programs. Recent Advances in Learning and Control, London: Springer-Verlag Limited, 2008, 95–110
17 JLiu, SJi, JYe. SLEP: sparse learning with efficient projections. Arizona State University, 2009, 6(491): 7
18 J FCai, E J Candès, ZShen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 2010, 20(4): 1956–1982
https://doi.org/10.1137/080738970
19 JWright, AGanesh, SRao, Y Peng, YMa. Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. In: Proceedings of the 22nd International Conference on Neural Information Processing Systems. 2009, 2080–2088
20 ZLin, AGanesh, JWright, L Wu, MChen, YMa. Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. Computational Advances in Multi-Sensor Adaptive Processing, 2009, 61(6): 1–18
21 ZLin, MChen, YMa. The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices. UIUC Technical Report UILU-ENG-09-2215, 2010
22 F DCarlson, ESobel , G SWatson. Linear relationships between variables affected by errors. Biometrics, 1966, 22(2): 252–267
https://doi.org/10.2307/2528517
23 ATikhonov. Solution of incorrectly formulated problems and the regularization method. Soviet Math., 1963, 4: 1035–1038
24 YChen, LZhang, ZYi. A Novel low rank representation algorithm for subspace clustering. International Journal of Pattern Recognition and Artificial Intelligence, 2016, 30(4): 1650007
https://doi.org/10.1142/S0218001416500075
25 JFeng, ZLin, HXu, SYan. Robust subspace segmentation with blockdiagonal prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014, 3818–3825
26 GLiu, SYan. Latent low-rank representation for subspace segmentation and feature extraction. In: Proceedings of the IEEE International Conference on Computer Vision. 2011, 1615–1622
https://doi.org/10.1109/ICCV.2011.6126422
27 RLiu, ZLin, FDe la Torre , ZSu. Fixed-rank representation for unsupervised visual learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2012, 598–605
28 LLü, TZhou. Link prediction in complex networks: a survey. Physica A: Statistical Mechanics and its Applications, 2011, 390(6): 1150–1170
https://doi.org/10.1016/j.physa.2010.11.027
29 GCasella, R LBerger. Statistical Inference. Pacific Grove, CA: Duxbury, 2002.
30 SRedner . Networks: teasing out the missing links. Nature, 2008, 453(7191): 47–48
https://doi.org/10.1038/453047a
31 MSales-Pardo, R Guimera, A AMoreira, L A NAmaral. Extracting the hierarchical organization of complex systems. Proceedings of the National Academy of Sciences, 2007, 104(39): 15224–15229
https://doi.org/10.1073/pnas.0703740104
32 LGetoor, N Friedman, DKoller, APfeffer. Learning Probabilistic Relational Models. Relational Data Mining, Springer, Berlin, Hedelberg, 2001, 307–335
33 DHeckerman, D M Chickering, CMeek, RRounthwaite , CKadie. Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research, 2000, 1(Oct): 49–75
34 BTaskar, PAbbeel, DKoller. Discriminative probabilistic models for relational data. In: Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence. 2002, 485–492
35 E ALeicht, PHolme, M E JNewman. Vertex similarity in networks. Physical Review E, 2006, 73(2): 026120
https://doi.org/10.1103/PhysRevE.73.026120
36 ERavasz, A LSomera, D AMongru, Z N Oltvai, A LBarabási. Hierarchical organization of modularity in metabolic networks. Science, 2002, 297(5586): 1551–1555
https://doi.org/10.1126/science.1073374
37 TSørensen. A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Biologiske Skrifter, 1948, 5: 1–34
38 TZhou, L Lü, Y CZhang. Predicting missing links via local information. The European Physical Journal B-Condensed Matter and Complex Systems, 2009, 71(4): 623–630
https://doi.org/10.1140/epjb/e2009-00335-8
39 RPech, DHao, LPan, H Cheng, TZhou. Link prediction via matrix completion. EPL (Europhysics Letters), 2017, 117(3): 38002
https://doi.org/10.1209/0295-5075/117/38002
40 GJeh, JWidom. SimRank: a measure of structural-context similarity. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2002, 538–543
https://doi.org/10.1145/775047.775126
41 LKatz. A new status index derived from sociometric analysis. Psychometrika, 1953, 18(1): 39–43
https://doi.org/10.1007/BF02289026
42 WLiu, L Lü. Link prediction based on local random walk. EPL (Europhysics Letters), 2010, 89(5): 58007
https://doi.org/10.1209/0295-5075/89/58007
43 LLü, C HJin, TZhou. Similarity index based on local paths for link prediction of complex networks. Physical Review E. 2009, 80(4): 046122
https://doi.org/10.1103/PhysRevE.80.046122
44 M E JNewman. Clustering and preferential attachment in growing networks. Physical Review E, 2001, 64(2): 025102
https://doi.org/10.1103/PhysRevE.64.025102
45 TMurata, S Moriyasu. Link prediction of social networks based on weighted proximity measures. In: Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence. 2007, 85–88
https://doi.org/10.1109/WI.2007.52
46 XPeng, LZhang, ZYi. Constructing l2-graph for subspace learning and segmentation. 2012, arXiv preprint arXiv:1209.0841
47 XZheng, DCai, XHe, W YMa, XLin. Locality preserving clustering for image database. In: Proceedings of the 12th Annual ACM International Conference on Multimedia. 2004, 885–891
https://doi.org/10.1145/1027527.1027731
48 K CLee, JHo, D JKriegman. Acquiring linear subspaces for face recognition under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(5): 684–698
https://doi.org/10.1109/TPAMI.2005.92
49 J JHull. A database for handwritten text recognition research. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1994, 16(5): 550–554
https://doi.org/10.1109/34.291440
50 W NStreet, W H Wolberg, O LMangasarian. Nuclear feature extraction for breast tumor diagnosis. In: Proceedings of International Society for Optics and Photonics on Biomedical Image Processing and Biomedical Visualization. 1993, 861–870
51 J PSiebert. Vehicle recognition using rule based methods. Project Report, 1987
52 R C BMadeo, C A MLima, S MPeres. Gesture unit segmentation using support vector machines: segmenting gestures from rest positions. In: Proceedings of the 28th Annual ACM Symposium on Applied Computing. 2013, 46–52
https://doi.org/10.1145/2480362.2480373
53 YZhao, G Karypis. Criterion functions for document clustering: experiments and analysis. Citeseer: Technical Report, 2001
54 DCai, XHe, JHan. Document clustering using locality preserving indexing. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(12): 1624–1637
https://doi.org/10.1109/TKDE.2005.198
[1] Ruidong YAN, Yi LI, Deying LI, Weili WU, Yongcai WANG. SSDBA: the stretch shrink distance based algorithm for link prediction in social networks[J]. Front. Comput. Sci., 2021, 15(1): 151301-.
[2] Zhiyong YU, Xiangping ZHENG, Fangwan HUANG, Wenzhong GUO, Lin SUN, Zhiwen YU. A framework based on sparse representation model for time series prediction in smart city[J]. Front. Comput. Sci., 2021, 15(1): 151305-.
[3] Xuejun WANG, Feilong CAO, Wenjian WANG. Adaptive sparse and dense hybrid representation with nonconvex optimization[J]. Front. Comput. Sci., 2020, 14(4): 144306-.
[4] Hongpeng YIN,Jinxing LI,Yi CHAI,Simon X. YANG. A survey on distributed compressed sensing: theory and applications[J]. Front. Comput. Sci., 2014, 8(6): 893-904.
[5] Zhisong PAN,Zhantao DENG,Yibing WANG,Yanyan ZHANG. Dimensionality reduction via kernel sparse representation[J]. Front. Comput. Sci., 2014, 8(5): 807-815.
[6] CHEN Lifei, JIANG Qingshan. An extended EM algorithm for subspace clustering[J]. Front. Comput. Sci., 2008, 2(1): 81-86.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed