|
|
Polygene-based evolutionary algorithms with frequent pattern mining |
Shuaiqiang WANG1,2( ), Yilong YIN3( ) |
1. Research Center of Big Data Applications, Qilu University of Technology, Jinan 250100, China 2. Department of Computer Science and Information Systems, University of Jyvaskyla, Jyvaskyla 40014, Finland 3. School of Computer Science and Technology, Shandong University, Jinan 250101, China |
|
|
Abstract In this paper, we introduce polygene-based evolution, a novel framework for evolutionary algorithms (EAs) that features distinctive operations in the evolutionary process. In traditional EAs, the primitive evolution unit is a gene, wherein genes are independent components during evolution. In polygene-based evolutionary algorithms (PGEAs), the evolution unit is a polygene, i.e., a set of co-regulated genes. Discovering and maintaining quality polygenes can play an effective role in evolving quality individuals. Polygenes generalize genes, and PGEAs generalize EAs. Implementing the PGEA framework involves three phases: (I) polygene discovery, (II) polygene planting, and (III) polygene-compatible evolution. For Phase I, we adopt an associative classificationbased approach to discover quality polygenes. For Phase II, we perform probabilistic planting to maintain the diversity of individuals. For Phase III, we incorporate polygenecompatible crossover and mutation in producing the next generation of individuals. Extensive experiments on function optimization benchmarks in comparison with the conventional and state-of-the-art EAs demonstrate the potential of the approach in terms of accuracy and efficiency improvement.
|
Keywords
polygenes
evolutionary algorithms
function optimization
associative classification
data mining
|
Corresponding Author(s):
Shuaiqiang WANG
|
Just Accepted Date: 19 October 2016
Online First Date: 06 March 2018
Issue Date: 21 September 2018
|
|
1 |
Wang S Q, Gao B J, Wang S L, Cao G B, and Yin Y L. Polygenebased evolution: a novel framework for evolutionary algorithms. In: Proceedings of the 21st ACM Conference on Information and Knowledge Management. 2012, 2263–2266
|
2 |
Boughanem M, Tamine L. Query optimization using an improved genetic algorithm. In: Proceedings of the 9th International Conference on Information and Knowledge Management. 2000, 368–373
|
3 |
Venkatraman S, Yen G G. A generic framework for constrained optimization using genetic algorithms. IEEE Transactions on Evolutionary Computation, 2005, 9(4): 424–435
https://doi.org/10.1109/TEVC.2005.846817
|
4 |
Malek H, Ebadzadeh M M, Rahmati M. Three new fuzzy neural networks learning algorithms based on clustering, training error and genetic algorithm. Applied Intelligence, 2012, 37(2): 280–289
https://doi.org/10.1007/s10489-011-0327-7
|
5 |
Zafra A, Ventura S. Multi-objective genetic programming for multiple instance learning. In: Proceedings of the 18th European Conference on Machine Learning. 2007, 790–797
https://doi.org/10.1007/978-3-540-74958-5_81
|
6 |
Chang D X, Zhang X D, Zheng C W. A genetic algorithm with gene rearrangement for k-means clustering. Pattern Recognition, 2009, 42(7): 1210–1222
https://doi.org/10.1016/j.patcog.2008.11.006
|
7 |
Özyer T, Alhajj R. Parallel clustering of high dimensional data by integrating multi-objective genetic algorithm with divide and conquer. Applied Intelligence, 2009, 31(3): 318–331
https://doi.org/10.1007/s10489-008-0129-8
|
8 |
Wang S Q, Ma J, and Liu J M. Learning to rank using evolutionary computation: Immune programming or genetic programming? In: Proceedings of the 18th ACM Conference on Information and Knowledge Management. 2009, 1879–1882
https://doi.org/10.1145/1645953.1646254
|
9 |
Kaya MAlhajj R. Utilizing genetic algorithms to optimize membership functions for fuzzy weighted association rules mining. Applied Intelligence, 2006, 24(1): 7–15
https://doi.org/10.1007/s10489-006-6925-0
|
10 |
Weale T, Seitzer J. EVOC: a music generating system using genetic algorithms. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence. 2003, 1383–1384
|
11 |
Bryden K M, Ashlock D A, Corns S M, Willson S J. Graph-based evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 2006, 10(5): 550–567
https://doi.org/10.1109/TEVC.2005.863128
|
12 |
Ishibuchi H, Tsukamoto N, Nojima Y. Diversity improvement by nongeometric binary crossover in evolutionary multiobjective optimization. IEEE Transactions on Evolutionary Computation, 2010, 14(6): 985–998
https://doi.org/10.1109/TEVC.2010.2043365
|
13 |
Qu B Y, Suganthan P N, and Liang J J. Differential evolution with neighborhood mutation for multimodal optimization. IEEE Transactions on Evolutionary Computation, 2012, 16(5): 601–614
https://doi.org/10.1109/TEVC.2011.2161873
|
14 |
Mabu S, Hirasawa K, Hu J L. A graph-based evolutionary algorithm: genetic network programming (GNP) and its extension using reinforcement learning. Evolutionary Computation, 2007, 15(3): 369–398
https://doi.org/10.1162/evco.2007.15.3.369
|
15 |
Hu T, Chen Y Z P, Banzhaf W. WiMAX network planning using adaptive-population-size genetic algorithm. In: Proceedings of the International Conference on Applications of Evolutionary Computation. 2010, 31–40
https://doi.org/10.1007/978-3-642-12242-2_4
|
16 |
Zhang J, Chung H S H, Lo W L. Clustering-based adaptive crossover and mutation probabilities for genetic algorithms. IEEE Transactions on Evolutionary Computation, 2007, 11(3): 326–335
https://doi.org/10.1109/TEVC.2006.880727
|
17 |
Cross A D J, Myers R, Hancock E R. Convergence of a hill-climbing genetic algorithm for graph matching. Pattern Recognition, 2000, 33(11): 1863–1880
https://doi.org/10.1016/S0031-3203(99)00171-5
|
18 |
Tantar A A, Melab N, Talbi E G. A grid-based genetic algorithm combined with an adaptive simulated annealing for protein structure prediction. Soft Computing, 2008, 12(12): 1185–1198
https://doi.org/10.1007/s00500-008-0298-8
|
19 |
Chen Y P, Goldberg D E. Introducing start expression genes to the linkage learning genetic algorithm. In: Proceedings of the 7th International Conference on Parallel Problem Solving from Nature. 2002, 351–360
https://doi.org/10.1007/3-540-45712-7_34
|
20 |
Chen Y P, Peng W C, Jian M C. Particle swarm optimization with recombination and dynamic linkage discovery. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2007, 37(6): 1460–1470
https://doi.org/10.1109/TSMCB.2007.904019
|
21 |
Goldman B W, Tauritz D R. Linkage tree genetic algorithms: Variants and analysis. In: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation Conference. 2012, 625–632
https://doi.org/10.1145/2330163.2330252
|
22 |
Shao K Y, Li F, Jiang B Y, Wang N, Zhang H Y, Li W C. Neural network optimization based on improved diploidic genetic algorithm. In: Proceedings of the International Conference on Machine Learning and Cybernetics. 2010, 1470–1475
https://doi.org/10.1109/ICMLC.2010.5580839
|
23 |
Manjari K M, Gallagher M. Variable screening for reduced dependency modelling in gaussian-based continuous estimation of distribution algorithms. In: Proceedings of IEEE Congress on Evolutionary Computation. 2012, 1–8
|
24 |
Rastegar R. On the optimal convergence probability of univariate estimation of distribution algorithms. Evolutionary Computation, 2011, 19(2): 225–248
https://doi.org/10.1162/EVCO_a_00022
|
25 |
Lawrence E. Henderson’s Dictionary of Biology. New York: Pearson/ Prentice Hal, 2005
|
26 |
Lewis R. Human Genetics: Concepts and Applications. New York: McGraw Hill, 2002
|
27 |
Mather K M, Jinks J L. Biometrical Genetics. 3rd ed. London: Chapman and Hall, 1982
https://doi.org/10.1007/978-1-4899-3406-2
|
28 |
Beurton P J, Falk R, and Rheinberger H J. The Concept of the Gene in Development and Evolution. Cambridge: Cambridge University Press, 2000
https://doi.org/10.1017/CBO9780511527296
|
29 |
Gilbert S F. Developmental Biology. 6th ed. Sunderland, MA: Sinauer Associates, 2000
|
30 |
Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 1993, 207–216
https://doi.org/10.1145/170035.170072
|
31 |
Liu B, Hsu W, Ma Y M. Integrating classification and association rule mining. In: Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery in Databases. 1998, 443–447
|
32 |
Holland J H. Adaptation in Natural and Artificial Systems. Cambridge, MA: The MIT Press, 1975
|
33 |
Han J W, Pei J, Yin Y W. Mining frequent patterns without candidate generation. In: Proceeding of ACM SIGMOD International Conference on Management of Data. 2000, 1–12
https://doi.org/10.1145/342009.335372
|
34 |
Agarwal R, Aggarwal C C, Prasad V V V. A tree projection algorithm for generation of frequent itemsets. Journal of Parallel and Distributed Computing, 2001, 61(3): 350–371
https://doi.org/10.1006/jpdc.2000.1693
|
35 |
Hämäläinen W. Statapriori: an efficient algorithm for searching statistically significant association rules. Knowledge and Information Systems, 2010, 23(3): 373–399
https://doi.org/10.1007/s10115-009-0229-8
|
36 |
Beil F, Ester M, Xu X W. Frequent term-based text clustering. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery in Databases. 2002, 436–442
https://doi.org/10.1145/775047.775110
|
37 |
Leung C W, Chan S C, Chung F L. A collaborative filtering framework based on fuzzy association rules and multiple-level similarity. Knowledge and Information Systems, 2006, 10(3): 357–381
https://doi.org/10.1007/s10115-006-0002-1
|
38 |
Yan X F, Yu P S, Han J W. Graph indexing: a frequent structure-based approach. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 2004, 335–346
https://doi.org/10.1145/1007568.1007607
|
39 |
Teredesai A, Ahmad M A, Kanodia J, Gaborski R S. Comma: a framework for integrated multimedia mining using multi-relational associations. Knowledge and Information Systems, 2006, 10(2): 135–162
https://doi.org/10.1007/s10115-005-0221-x
|
40 |
Punin J, Krishnamoorthy M, Zaki M. Web Usage Mining: Languages and Algorithms. Berlin: Springer-Verlag, 2001
|
41 |
Liu C, Fei L, Yan X F, Han J W, Midkiff S P. Statistical debugging: a hypothesis testing-based approach. IEEE Transactions on Software Engineering, 2006, 32(10): 831–848
https://doi.org/10.1109/TSE.2006.105
|
42 |
Han J W, Cheng H, Xin D, Yan X F. Frequent pattern mining: Current status and future directions. Data Mining and Knowledge Discovery, 2007, 15(1): 55–86
https://doi.org/10.1007/s10618-006-0059-1
|
43 |
Dong G Z, Li J Y. Efficient mining of emerging patterns: discovering trends and differences. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery in Databases. 1999, 43–52
https://doi.org/10.1145/312129.312191
|
44 |
Li J Y, Dong G Z, Ramamohanarao K. Making use of the most expressive jumping emerging patterns for classification. In: Proceeding of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2000, 131–145
https://doi.org/10.1007/3-540-45571-X_29
|
45 |
Li W M, Han J W, Pei J. CMAR: accurate and efficient classification based on multiple class-association rules. In: Proceeding of the International Conference on Data Mining. 2001, 369–376
|
46 |
Yin X X, Han J W. CPAR: classification based on predictive association rules. In: Proceeding of SIAM International Conference on Data Mining. 2003, 331–335
https://doi.org/10.1137/1.9781611972733.40
|
47 |
Cong G. Mining top-k covering rule groups for gene expression data. In: Proceedings of the 24th ACM SIGMOD International Conference on Management of Data. 2005, 670–681
https://doi.org/10.1145/1066157.1066234
|
48 |
Ting C K, Zeng W M, Lin T C. Linkage discovery through data mining. IEEE Computational Intelligence Magazine, 2010, 5(1): 10–13
https://doi.org/10.1109/MCI.2009.935310
|
49 |
Chen Y P, Goldberg D E. Convergence time for the linkage learning genetic algorithm. Evolutionary Computation, 2005, 13(3): 279–302
https://doi.org/10.1162/1063656054794806
|
50 |
Ng K P, Wong K C. A new diploid scheme and dominance change mechanism for non-stationary function optimization. In: Proceedings of the 6th International Conference on Genetic Algorithms. 1995, 159–166
|
51 |
Baluja S. Population-based incremental learning: a method for integrating genetic search based function optimization and competitive learning. Technical Report CMU-CS-94-163, 1994
|
52 |
Tang K, Yao X, Suganthan P N, MacNish C, Chen Y P, Chen C M, Yang Z. Benchmark functions for the CEC’2008 special session and competition on large scale global optimization. Technical Report, 2007
|
|
Viewed |
|
|
|
Full text
|
|
|
|
|
Abstract
|
|
|
|
|
Cited |
|
|
|
|
|
Shared |
|
|
|
|
|
Discussed |
|
|
|
|