Research Article

A Fast Deterministic Kmeans Initialization

by Omar Kettani, Faical Ramdani
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 12 - Number 2
Year of Publication: 2017
DOI: 10.5120/ijais2017451683

Omar Kettani, Faical Ramdani. A Fast Deterministic Kmeans Initialization. International Journal of Applied Information Systems 12, 2 (May 2017), 6-11. DOI=10.5120/ijais2017451683

@article{ 10.5120/ijais2017451683,
author = { Omar Kettani, Faical Ramdani },
title = { A Fast Deterministic Kmeans Initialization },
journal = { International Journal of Applied Information Systems },
issue_date = { May 2017 },
volume = { 12 },
number = { 2 },
month = { May },
year = { 2017 },
issn = { 2249-0868 },
pages = { 6-11 },
numpages = { 6 },
url = { https://www.ijais.org/archives/volume12/number2/984-2017451683/ },
doi = { 10.5120/ijais2017451683 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%A Omar Kettani
%A Faical Ramdani
%T A Fast Deterministic Kmeans Initialization
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 12
%N 2
%P 6-11
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The k-means algorithm remains one of the most widely used clustering methods, despite its sensitivity to the initial settings. This paper explores a simple, computationally inexpensive, deterministic method that provides k-means with initial seeds for clustering a given data set. The method computes the means of k equal-sized parts taken from the given data set and uses them as the initial centroids. We test and compare this method against the related, well-known KKZ initialization algorithm for k-means, using both simulated and real data, and find it to be more efficient in many cases.
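The abstract's description — seed k-means with the means of k equal parts of the data — can be sketched as follows. This is a minimal illustration based only on that description, not the paper's exact procedure; details such as how the data set is ordered before splitting are assumptions here.

```python
import numpy as np

def deterministic_init(X, k):
    """Return k initial centroids for k-means.

    Sketch of the scheme described in the abstract: split the data set
    into k contiguous, near-equal-sized parts and use each part's mean
    as an initial centroid. Deterministic: no random sampling involved.
    """
    X = np.asarray(X, dtype=float)
    parts = np.array_split(X, k)  # k near-equal contiguous chunks
    return np.vstack([p.mean(axis=0) for p in parts])

# Usage: seeds for k=2 on a toy 1-D data set.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
centroids = deterministic_init(X, 2)  # means of [[0],[1]] and [[2],[3]]
```

Because the seeds depend only on how the data is partitioned, repeated runs on the same data yield identical initializations, unlike random seeding schemes such as Forgy or k-means++.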

References
  1. Aloise D., Deshpande A., Hansen P., Popat P.: NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75, 245-249 (2009).
  2. Lloyd S.P.: Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2), 129-137 (1982). doi:10.1109/TIT.1982.1056489.
  3. Peña J.M., Lozano J.A., Larrañaga P.: An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recognition Letters, 20(10), 1027-1040 (1999).
  4. Forgy E.: Cluster analysis of multivariate data: Efficiency vs. interpretability of classifications. Biometrics, 21, 768-769 (1965).
  5. Arthur D., Vassilvitskii S.: k-means++: the advantages of careful seeding. In: Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027-1035 (2007).
  6. Bahmani B., Moseley B., Vattani A., Kumar R., Vassilvitskii S.: Scalable k-means++. In: Proceedings of the VLDB Endowment (2012).
  7. Katsavounidis I., Kuo C.-C.J., Zhang Z.: A new initialization technique for generalized Lloyd iteration. IEEE Signal Processing Letters, 1(10), 144-146 (1994).
  8. Asuncion A., Newman D.J.: UCI Machine Learning Repository [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, School of Information and Computer Science (2007).
  9. Kaufman L., Rousseeuw P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley (1990).
Index Terms

Computer Science
Information Sciences

Keywords

k-means, initialization, KKZ