Influence of clustering pre-processing on genetically generated fuzzy knowledge bases
Abstract
Automatic knowledge base generation using techniques such as genetic algorithms tends to be highly dependent on the quality and size of the learning data. First, large data sets can waste computation time when smaller data sets would describe the problem equally well. Second, the presence of noise and outliers can cause the learning algorithm to degenerate. Clustering techniques allow the data to be compressed and filtered, making the generation of fuzzy knowledge bases (FKBs) faster and more accurate. Different clustering algorithms are compared, and validation of the results on a theoretical 3D surface shows that, when the data are compressed to 5% of their original size, clustering algorithms accelerate the learning process by up to 94%. Moreover, when the learning data contain noise and/or a large number of outliers, clustering algorithms can make the results more stable and improve the fitness of the obtained FKBs.
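To make the idea of compressing learning data to 5% of its size concrete, the following is a minimal sketch that replaces a sampled 3D surface with cluster centroids. It assumes plain k-means (Lloyd's algorithm) as the clustering method and z = x^2 + y^2 as the surface. Both are illustrative choices, not necessarily the algorithms or the theoretical surface used in the paper, and the names `compress_by_kmeans` and `sample_surface` are hypothetical.

```python
import math
import random


def compress_by_kmeans(points, ratio=0.05, iterations=20, seed=0):
    """Replace `points` with about ratio * len(points) k-means centroids."""
    rng = random.Random(seed)
    k = max(1, round(len(points) * ratio))
    # Initialise the centroids with k distinct points drawn from the data.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


def sample_surface(n, seed=1):
    """Sample n points (x, y, z) from the assumed surface z = x^2 + y^2."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        pts.append((x, y, x * x + y * y))
    return pts


data = sample_surface(400)
reduced = compress_by_kmeans(data)
print(len(data), "->", len(reduced))  # prints "400 -> 20"
```

The reduced set would then stand in for the full data set when the genetic algorithm evaluates candidate FKBs. Because each centroid averages its neighbours, isolated noisy samples carry less weight than they would in the raw data.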