Research Article

Kohonen Self Organizing Map with Modified K-means clustering For High Dimensional Data Set

by Madhusmita Mishra, H.s. Behera
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 2 - Number 3
Year of Publication: 2012
Authors: Madhusmita Mishra, H.s. Behera
10.5120/ijais12-450310

Madhusmita Mishra, H.s. Behera. Kohonen Self Organizing Map with Modified K-means clustering For High Dimensional Data Set. International Journal of Applied Information Systems. 2, 3 (May 2012), 34-39. DOI=10.5120/ijais12-450310

@article{ 10.5120/ijais12-450310,
author = { Madhusmita Mishra and H.s. Behera },
title = { Kohonen Self Organizing Map with Modified K-means clustering For High Dimensional Data Set },
journal = { International Journal of Applied Information Systems },
issue_date = { May 2012 },
volume = { 2 },
number = { 3 },
month = { May },
year = { 2012 },
issn = { 2249-0868 },
pages = { 34-39 },
numpages = {9},
url = { https://www.ijais.org/archives/volume2/number3/143-0310/ },
doi = { 10.5120/ijais12-450310 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T10:43:23.705926+05:30
%A Madhusmita Mishra
%A H.s. Behera
%T Kohonen Self Organizing Map with Modified K-means clustering For High Dimensional Data Set
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 2
%N 3
%P 34-39
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Since it was first proposed, the K-Means algorithm has shown remarkable staying power, and it remains one of the best-known algorithms for data clustering in the field of data mining. New clustering algorithms appear continually, but none has proved as fast and accurate as K-Means. Despite its speed, accuracy, and simplicity, K-Means has several problems of its own. First, the exact number of clusters is not known before clustering. Second, the algorithm is quite sensitive to the choice of initial centroids. Third, K-Means fails to give optimal results on high-dimensional data sets, because the problem becomes harder as more dimensions are added; in data mining this is known as the "Curse of Dimensionality". In this paper we propose a Modified K-Means algorithm that overcomes these problems of the standard K-Means algorithm. We use a Kohonen Self Organizing Map (KSOM) to visualize the exact number of clusters before clustering, and we apply a genetic algorithm for initialization. The KSOM with Modified K-Means algorithm is tested on the iris data set and compared with other clustering algorithms; it is found to be more accurate, with fewer classification and quantization errors, and it can be applied even to high-dimensional data sets.
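
The sensitivity to initial centroids named in the abstract can be illustrated with a minimal sketch of the standard K-Means procedure. This is not the authors' Modified K-Means, KSOM, or genetic initialization. The four toy points, the function names, and the definition of quantization error as the mean distance from each point to its nearest centroid are assumptions made here for illustration only.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans(points, centroids, max_iter=100):
    """Standard K-Means, started from caller-supplied initial centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
        if updated == centroids:  # no centroid moved, so the run has converged
            break
        centroids = updated
    return centroids


def quantization_error(points, centroids):
    """Mean distance from each point to its nearest centroid (assumed definition)."""
    total = sum(math.dist(p, centroids[nearest(p, centroids)]) for p in points)
    return total / len(points)


# Hypothetical toy data: the corners of a 4-by-1 rectangle, clustered with k = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

good = kmeans(points, [(0, 0), (4, 0)])  # one initial centroid on each short side
bad = kmeans(points, [(2, 0), (2, 1)])   # both initial centroids in the middle

print(good, quantization_error(points, good))  # centroids split left/right, error 0.5
print(bad, quantization_error(points, bad))    # centroids split top/bottom, error 2.0
```

Both runs converge, but the second stops at a poorer local optimum purely because of where it started. This is the kind of behavior that an initialization strategy, such as the genetic algorithm proposed in the paper, is meant to avoid.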

Index Terms

Computer Science
Information Sciences

Keywords

K-means, Kohonen Self Organizing Map, Genetic Algorithm, Curse Of Dimensionality, Classification Error