DG-Means – A Superior Greedy Algorithm for Clustering Distributed Data

LAUR Repository

dc.contributor.author Assaf, Ali
dc.date.accessioned 2022-10-31T10:37:29Z
dc.date.available 2022-10-31T10:37:29Z
dc.date.copyright 2022 en_US
dc.date.issued 2022-07-26
dc.identifier.uri http://hdl.handle.net/10725/14179
dc.description.abstract Clustering is the process of dividing a set of objects into classes such that each class is composed of similar objects. Traditional centralized clustering algorithms target objects located at the same site and cannot operate on distributed objects. Distributed clustering algorithms fill this gap: they extract a classification model from distributed objects even when these reside at different sites and locations. With the growing trend of storing data across different locations and sites, distributed data is rapidly gaining popularity. It is likely to be one of the most prominent fields in the coming decades, especially given the huge amount of data propagating throughout the web. Even though a lot of research has been done on this topic, it is still considered to be in its infancy because of challenges that continue to arise, such as bandwidth limitations and the need to transfer data to a single site, among others. In this work, we present DG-means, a greedy algorithm that operates on distributed datasets. Three datasets (Wholesale, Banknotes, and Iris) are used to compare multiple distributed clustering algorithms on different metrics: runtime, stability, and accuracy. DG-means exhibited superior performance compared to the other algorithms. en_US
dc.language.iso en en_US
dc.subject Data mining en_US
dc.subject Cluster analysis -- Data processing en_US
dc.subject Computer algorithms en_US
dc.subject Lebanese American University -- Dissertations en_US
dc.subject Dissertations, Academic en_US
dc.title DG-Means – A Superior Greedy Algorithm for Clustering Distributed Data en_US
dc.type Thesis en_US
dc.term.submitted Summer en_US
dc.author.degree MS in Computer Science en_US
dc.author.school SAS en_US
dc.author.idnumber 202000822 en_US
dc.author.commembers Habre, Samer
dc.author.commembers Kaddoura, Sanaa
dc.author.department Computer Science And Mathematics en_US
dc.description.physdesc 1 online resource (x, 58 leaves): col. ill. en_US
dc.author.advisor Haraty, Ramzi
dc.keywords Clustering en_US
dc.keywords K-means en_US
dc.keywords Distributed clustering en_US
dc.keywords G-means en_US
dc.keywords Data mining en_US
dc.description.bibliographiccitations Includes bibliographical references (leaves 53-58). en_US
dc.identifier.doi https://doi.org/10.26756/th.2022.492
dc.author.email ali.assaf04@lau.edu.lb en_US
dc.identifier.tou http://libraries.lau.edu.lb/research/laur/terms-of-use/thesis.php en_US
dc.publisher.institution Lebanese American University en_US
dc.author.affiliation Lebanese American University en_US

