An Empirical Study on Initializing Centroid in K-Means Clustering for Feature Selection

Author(s): Amit Saxena (Guru Ghasidas Vishwavidyalaya, India), John Wang (Montclair State University, USA), and Wutiphol Sintunavarat (Thammasat University, Thailand)
Copyright: 2021
Volume: 13
Issue: 1
Pages: 16
Source title: International Journal of Software Science and Computational Intelligence (IJSSCI)
Editor(s)-in-Chief: Brij Gupta (Asia University, Taichung City, Taiwan) and Andrew W.H. Ip (University of Saskatchewan, Canada)
DOI: 10.4018/IJSSCI.2021010101

Abstract

One of the main problems in K-means clustering is the setting of initial centroids: a poor choice can cause patterns to be misclustered, which lowers clustering accuracy. Recently, a density- and distance-based technique for determining initial centroids has been claimed to give faster convergence of clusters. Motivated by this key idea, the authors study the impact of initial centroids on clustering accuracy for unsupervised feature selection. Three metrics are used to rank the features of a data set. The centroids of the clusters, to be applied in K-means clustering, are initialized both randomly and by density- and distance-based approaches. Extensive experiments are performed on 15 data sets. The main significance of the paper is that K-means clustering yields higher accuracies in the majority of these data sets using the proposed density- and distance-based approach. As an impact of the paper, good clustering accuracy can be achieved with fewer features, which can be useful in data mining of data sets with thousands of features.
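
The abstract does not spell out how the density- and distance-based initialization works, so the following Python sketch is only an illustration under stated assumptions, not the authors' method. It assumes that density is the inverse of a point's mean distance to all other points, and that each new centroid maximizes density times distance to the nearest centroid already chosen. The function names (`density_distance_init`, `random_init`, `kmeans`) are hypothetical and exist only for this example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def density_distance_init(points, k):
    """Pick k initial centroids that are dense and far from one another.

    Assumption: density = inverse of the mean distance to all other points.
    """
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]
    density = [(n - 1) / (sum(row) + 1e-12) for row in dist]

    # First centroid: the densest point.
    chosen = [max(range(n), key=lambda i: density[i])]
    while len(chosen) < k:
        # Assumed score: density times distance to the nearest chosen centroid.
        def score(i):
            return density[i] * min(dist[i][c] for c in chosen)

        chosen.append(max(range(n), key=score))
    return [list(points[i]) for i in chosen]


def random_init(points, k, seed=0):
    """Baseline: k distinct points drawn uniformly at random."""
    rng = random.Random(seed)
    return [list(p) for p in rng.sample(points, k)]


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd iterations starting from the given centroids.

    Returns (labels, centroids, iterations used).
    """
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    labels, iterations = None, 0
    for iterations in range(1, max_iter + 1):
        new_labels = [
            min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids, iterations


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    for name, init in [
        ("density/distance", density_distance_init(data, 2)),
        ("random", random_init(data, 2)),
    ]:
        labels, centroids, iterations = kmeans(data, init)
        print(name, labels, centroids, iterations)
```

Running both initializers on the same data and comparing the returned labels and iteration counts mirrors, in miniature, the kind of comparison the abstract describes. The pairwise distance matrix makes this sketch quadratic in the number of points, so it suits small examples only.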
