A Survey of Data Clustering Techniques

LAUR Repository

dc.contributor.author Sobeh, Salma
dc.date.accessioned 2023-10-19T09:37:46Z
dc.date.available 2023-10-19T09:37:46Z
dc.date.copyright 2023 en_US
dc.date.issued 2023-05-19
dc.identifier.uri http://hdl.handle.net/10725/15079
dc.description.abstract In today's era of the fourth industrial revolution, individuals encounter an immense volume of information daily. The digital world is rich in data from sources such as IoT, social media, healthcare, business, cryptocurrencies, and cybersecurity. These vast amounts of data require significant storage capacity, which makes analytical, processing, and retrieval operations time-consuming and arduous. AI, particularly machine learning and deep learning, can provide a practical way to analyze and utilize this data effectively. Clustering, an unsupervised learning technique, aims to group data into a specific number of clusters so that it can be categorized effectively. Hence, clustering is relevant to many fields and is used in various applications that deal with large datasets. This survey examines seven widely recognized clustering techniques, namely k-means, G-means, DBSCAN, Agglomerative hierarchical clustering, the Two-stage density (DBSCAN and k-means) algorithm, the Two-levels (DBSCAN and hierarchical) clustering algorithm, and the Two-stage MeanShift and K-means clustering algorithm. It compares them on a real dataset, the Blockchain dataset, which includes prominent cryptocurrencies such as Binance, Bitcoin, Doge, and Ethereum, using several metrics such as the silhouette coefficient, the Calinski-Harabasz index, the Davies-Bouldin index, time complexity, and entropy. en_US
dc.language.iso en en_US
dc.subject Cluster analysis -- Data processing en_US
dc.subject Computational intelligence en_US
dc.subject Mathematical optimization en_US
dc.subject Algorithms en_US
dc.subject Lebanese American University -- Dissertations en_US
dc.subject Dissertations, Academic en_US
dc.title A Survey of Data Clustering Techniques en_US
dc.type Thesis en_US
dc.term.submitted Spring en_US
dc.author.degree MS in Computer Science en_US
dc.author.school SAS en_US
dc.author.idnumber 201505469 en_US
dc.author.commembers Habre, Samer
dc.author.department Computer Science And Mathematics en_US
dc.description.physdesc 1 online resource (x, 78 leaves):ill. (some col.) en_US
dc.author.advisor Haraty, Ramzi
dc.keywords Clustering en_US
dc.keywords K-means en_US
dc.keywords G-means en_US
dc.keywords DBSCAN en_US
dc.keywords Agglomerative clustering en_US
dc.keywords Two-stage density clustering en_US
dc.keywords Two-stage (MeanShift and K-means) clustering algorithm en_US
dc.description.bibliographiccitations Includes bibliographical references (leaves 71-78) en_US
dc.identifier.doi https://doi.org/10.26756/th.2023.580
dc.author.email salma.sobeh@lau.edu.lb en_US
dc.identifier.tou http://libraries.lau.edu.lb/research/laur/terms-of-use/thesis.php en_US
dc.publisher.institution Lebanese American University en_US
dc.author.affiliation Lebanese American University en_US

