Abstract:
Clustering is an unsupervised machine learning task that identifies
similarities within a dataset and groups its data points into distinct
clusters or subgroups. Various clustering methods are
available, including K-means, hierarchical clustering, and density-based
clustering. Among these, K-means is widely used for efficiently solving
clustering problems. This work aims to enhance the performance of the K-means
algorithm by introducing a novel method for selecting the initial centroids,
thereby reducing the randomness of initialization and the number of iterations
needed to converge. The proposed method, named Eye-means, emulates the way the
human eye estimates where cluster centers lie. To achieve this goal,
supervised machine learning was employed: a model was trained on labeled
graphs, each consisting of a set of points and a label giving the centroids
that K-means determined for those points. Hundreds of such labeled graphs were
used to train the model to predict the location of centroids. The objective is
to produce a model capable of predicting centroids with greater accuracy than the
traditional random initialization used in K-means. Experimental results
indicate that the proposed method requires fewer iterations to converge than
random initialization.
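
To make the role of initialization concrete, the following is a minimal sketch of the standard Lloyd iteration for K-means that accepts its initial centroids as an argument and reports how many rounds it ran. It is not the Eye-means model: the names lloyd_kmeans, make_blobs and predicted_init are hypothetical, the data are synthetic, and predicted_init is a hand-written stand-in for centroids that a trained model might supply.

```python
import random


def lloyd_kmeans(points, init_centroids, max_iter=100):
    """Run Lloyd's K-means from the given initial centroids.

    Returns (centroids, iterations), where iterations counts the
    assignment rounds performed until the assignments stop changing.
    """
    centroids = [tuple(c) for c in init_centroids]
    assignment = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_assignment == assignment:
            return centroids, iteration
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, max_iter


def make_blobs(centers, n_per_center, rng, spread=1.0):
    """Sample n_per_center Gaussian points around each 2-D center."""
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_center)]


rng = random.Random(0)
true_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = make_blobs(true_centers, 50, rng)

# Baseline: random initialization, here k data points drawn uniformly.
random_init = rng.sample(points, 3)
_, random_iters = lloyd_kmeans(points, random_init)

# Assumption: a hypothetical stand-in for centroids predicted by a trained model.
predicted_init = [(0.5, 0.5), (9.5, 0.5), (5.0, 8.5)]
informed_centroids, informed_iters = lloyd_kmeans(points, predicted_init)

print("random init iterations:  ", random_iters)
print("informed init iterations:", informed_iters)
```

In this toy setting the informed start is placed close to the true centers by construction, so it illustrates only the mechanism by which a better initial guess can shorten the iteration. It is not evidence for the experimental results reported in this work.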