**AUTHORS:** P. Perumal, M. Dilip

**ABSTRACT:**
Data clustering helps reveal the structure of massive quantities of data and reduce their
complexity. It is a common technique for statistical data analysis and is used in many fields,
including machine learning, data mining, pattern recognition, image analysis, and bioinformatics. The
well-known K-means algorithm has been applied successfully to many practical clustering problems,
but it suffers from drawbacks that stem from its initialization: its performance depends on the
initial state of the centroids, and it may become trapped in local optima. The gravitational search
algorithm (GSA) is an effective method for finding optimal solutions. The GSA-KM algorithm helps the
K-means algorithm escape from local optima and also increases the convergence speed of the GSA.
A hybrid technique that combines the K-means algorithm, the gravitational search algorithm,
Nelder-Mead simplex search, and particle swarm optimization, called KM-GSA-NM-PSO, is proposed.
Like the K-means algorithm, KM-GSA-NM-PSO searches for the cluster centers of an arbitrary data set,
but it can effectively and efficiently find the global optimum. The new KM-GSA-NM-PSO algorithm is
tested on UCI repository data sets, and its performance is compared with that of the K-means and
GSA-KM clustering algorithms. Future work could extend the algorithm to applications such as image
segmentation and university timetabling.
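
The sensitivity of K-means to its initial centroids can be illustrated with a minimal sketch. The code below is plain Lloyd-style K-means only, not the proposed KM-GSA-NM-PSO hybrid, and the four-point data set, the function names (`sq_dist`, `kmeans`), and both starting configurations are assumptions made up for illustration rather than material from the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Plain K-means from the given initial centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(x) / len(members) for x in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(c)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Hypothetical toy data: the corners of a wide rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Two hypothetical initializations of k = 2 centroids.
good_centroids, good_sse = kmeans(points, [(0, 0), (4, 0)])
bad_centroids, bad_sse = kmeans(points, [(2, 0), (2, 1)])

print("left/right start:", good_centroids, good_sse)
print("top/bottom start:", bad_centroids, bad_sse)
```

On this toy data the first start converges to the left/right split with an SSE of 1.0, while the second stays at the top/bottom split with an SSE of 16.0 because no point changes cluster. This is the kind of local optimum that the abstract says a global search step is meant to escape.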

**KEYWORDS:**
Clustering, K-Means, Gravitational Search Algorithm, Nelder-Mead Simplex Search,
Particle Swarm Optimization, Ant Colony Optimization
