Frontiers of Electrical and Electronic Engineering

ISSN 2095-2732

ISSN 2095-2740(Online)

CN 10-1028/TM

Front Elect Electr Eng Chin    2009, Vol. 4 Issue (1) : 15-19    https://doi.org/10.1007/s11460-009-0019-9
RESEARCH ARTICLE
Text clustering based on fusion of ant colony and genetic algorithms
Yun ZHANG, Boqin FENG, Shouqiang MA, Lianmeng LIU
School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
Abstract

To address the tendency of the ant colony algorithm to stagnate and to search the solution space incompletely, a text clustering approach based on the fusion of the ant colony and genetic algorithms is proposed. The four parameters that influence the performance of the ant colony algorithm are encoded as chromosomes, and a fitness function together with selection, crossover, and mutation operators is designed to find the optimal parameter combination over a number of iterations; the resulting parameters are then applied to text clustering. Simulation results show that, compared with classical k-means clustering and the basic ant colony clustering algorithm, the proposed algorithm performs better, improving the F-Measure by 5.69%, 48.60%, and 69.60% on the three test datasets, respectively. It is therefore more suitable for processing larger datasets.
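The parameter search described in the abstract can be illustrated with a minimal genetic algorithm over a four-gene chromosome. This is only a sketch under stated assumptions: the abstract does not name the four parameters, their ranges, or the operators used, so `BOUNDS`, `toy_fitness`, tournament selection, one-point crossover, and uniform-reset mutation are all hypothetical stand-ins. In the paper's setting, the fitness would presumably come from running ant colony clustering with the decoded parameters and scoring the result.

```python
import random

# Hypothetical search ranges for the four parameters (not given in the abstract).
BOUNDS = [(0.0, 5.0), (0.0, 5.0), (0.0, 1.0), (1.0, 100.0)]


def toy_fitness(chrom):
    """Stand-in fitness: negative normalized squared distance to an arbitrary target.

    A real fitness would evaluate clustering quality for these parameters.
    """
    target = [1.0, 2.0, 0.5, 50.0]  # arbitrary illustrative optimum
    return -sum(((g - t) / (hi - lo)) ** 2
                for g, t, (lo, hi) in zip(chrom, target, BOUNDS))


def random_chromosome(rng):
    """Draw each gene uniformly from its range."""
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def tournament(pop, fitness, rng, k=3):
    """Selection: pick the fittest of k randomly sampled individuals."""
    return max(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    """One-point crossover between two parent chromosomes."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(chrom, rng, rate=0.1):
    """With probability `rate`, reset a gene to a fresh value in its range."""
    return [rng.uniform(lo, hi) if rng.random() < rate else g
            for g, (lo, hi) in zip(chrom, BOUNDS)]


def evolve(fitness, pop_size=20, generations=30, seed=0):
    """Run the GA and return the best chromosome found."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, fitness, rng),
                              tournament(pop, fitness, rng), rng)
            children.append(mutate(child, rng))
        pop = children
        best = max(pop, key=fitness)
    return best


best_params = evolve(toy_fitness)
```

Elitism is included here so that the best fitness never decreases between generations; whether the paper's algorithm does the same is not stated in the abstract.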

Keywords: ant colony clustering, genetic algorithm, fusion, text clustering
Corresponding Author(s): ZHANG Yun, Email: xjtu.cloud@gmail.com
Issue Date: 05 March 2009
 Cite this article:   
Boqin FENG, Shouqiang MA, Lianmeng LIU, et al. Text clustering based on fusion of ant colony and genetic algorithms[J]. Front Elect Electr Eng Chin, 2009, 4(1): 15-19.
 URL:  
https://academic.hep.com.cn/fee/EN/10.1007/s11460-009-0019-9
https://academic.hep.com.cn/fee/EN/Y2009/V4/I1/15
| dataset | number of documents | number of categories | topics |
|---|---|---|---|
| dataset 1 | 300 | 4 | gnp, gold, jobs, ship |
| dataset 2 | 560 | 6 | coffee, crude, grain, interest, money-supply, trade |
| dataset 3 | 1240 | 8 | coffee, crude, gold, interest, money-supply, ship, sugar, gnp |

Tab.1  Description of text datasets
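F-Measure:

| algorithm | dataset 1, initc=1 | dataset 1, initc=2 | dataset 2, initc=1 | dataset 2, initc=2 | dataset 3, initc=1 | dataset 3, initc=2 |
|---|---|---|---|---|---|---|
| k-means | 0.772438 | 0.963841 | 0.433219 | 0.87998 | 0.42797 | 0.857516 |
| ACA | 0.772438 | 0.963841 | 0.457711 | 0.880174 | 0.671905 | 0.857479 |
| ACGA | 0.816356 | 0.963841 | 0.643776 | 0.906593 | 0.725849 | 0.865642 |

Tab.2  F-Measure value of 3 clustering algorithms on 3 datasets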
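The improvement percentages quoted in the abstract appear to correspond to the relative F-Measure gain of ACGA over k-means at initc=1. A quick check, using only the values from Tab.2 (the helper name `relative_gain_percent` is introduced here for illustration):

```python
# F-Measure values at initc=1, copied from Tab.2.
f_kmeans = {"dataset 1": 0.772438, "dataset 2": 0.433219, "dataset 3": 0.42797}
f_acga = {"dataset 1": 0.816356, "dataset 2": 0.643776, "dataset 3": 0.725849}


def relative_gain_percent(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


gains = {name: relative_gain_percent(f_kmeans[name], f_acga[name])
         for name in f_kmeans}

for name, gain in gains.items():
    print(f"{name}: {gain:.2f}%")
```

Rounded to two decimals, the three gains agree with the 5.69%, 48.60%, and 69.60% reported in the abstract.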
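SSE:

| algorithm | dataset 1, initc=1 | dataset 1, initc=2 | dataset 2, initc=1 | dataset 2, initc=2 | dataset 3, initc=1 | dataset 3, initc=2 |
|---|---|---|---|---|---|---|
| k-means | 118.632 | 105.434 | 239.56 | 184.583 | 436.445 | 321.072 |
| ACA | 118.632 | 105.434 | 290.777 | 184.621 | 370.97 | 321.07 |
| ACGA | 117.44 | 105.434 | 215.216 | 199.757 | 389.805 | 321.104 |

Tab.3  SSE value of 3 clustering algorithms on 3 datasets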
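For readers unfamiliar with the metric, SSE is commonly defined as the sum of squared Euclidean distances from each point to the centroid of its cluster, with lower values indicating tighter clusters. The sketch below assumes that standard definition; this page does not state the exact formula or the document representation used in the paper, and the function name `sse` and the toy data are illustrative only.

```python
def sse(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    # Group points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid is the per-dimension mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


# Toy 2-D example with two clusters.
example_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
example_labels = [0, 0, 1, 1]
example_sse = sse(example_points, example_labels)
```

In the toy example the centroids are (0, 1) and (10, 2), so the clusters contribute 2 and 8 respectively.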
Fig.1  F-Measure of different clustering algorithms
Fig.2  SSE of different clustering algorithms
F-measure:

| parameter scope | dataset 1 | dataset 2 | dataset 3 |
|---|---|---|---|
| the scope of parameters in Ref. [9] | 0.507291 | 0.331047 | 0.521478 |
| the scope of parameters in this paper | 0.816356 | 0.643776 | 0.725849 |

Tab.4  Optimal F-measure values obtained by ACGA (initc=1)
1 Liu Y C, Wang X L, Xu Z M, Guan Y. A survey of document clustering. Journal of Chinese Information Processing, 2006, 20(3): 55–62 (in Chinese)
2 Sasaki M, Shinnou H. Spam detection using text clustering. In: Proceedings of the 2005 International Conference on Cyberworlds (CW’05), Singapore. 2005, 316–319
3 He F, Ding X Q. Combining text clustering and retrieval for corpus adaptation. Proceedings of SPIE, 2007, 6500: 65000P1–7
4 Dorigo M, Blum C. Ant colony optimization theory: a survey. Theoretical Computer Science, 2005, 344(2-3): 243–278. doi: 10.1016/j.tcs.2005.05.020
5 Zhu X L, Li J Z. An ant colony system-based optimization scheme of data mining. In: Proceedings of the 6th International Conference on Intelligent Systems Design and Applications (ISDA’06), Jinan, Shandong, China. 2006, 400–403
6 van Rijsbergen C J. Information Retrieval. 2nd ed. London: Butterworths, 1979
7 Wu C M, Chen Z, Jiang M. The research on initialization of ants system and configuration of parameters for different TSP problems in ant algorithm. Acta Electronica Sinica, 2006, 34(8): 1530–1533 (in Chinese)
8 Huang Y Q, Liang C Y, Zhang X D. Parameter establishment of an ant system based on uniform design. Control and Decision, 2006, 21(1): 93–96 (in Chinese)
9 Duan H B. Ant Algorithm–Theory and Its Applications. Beijing: Science Press, 2005 (in Chinese)