dc.contributor.author |
Sobeh, Salma |
|
dc.date.accessioned |
2023-10-19T09:37:46Z |
|
dc.date.available |
2023-10-19T09:37:46Z |
|
dc.date.copyright |
2023 |
en_US |
dc.date.issued |
2023-05-19 |
|
dc.identifier.uri |
http://hdl.handle.net/10725/15079 |
|
dc.description.abstract |
In today's era of the fourth industrial revolution, individuals encounter an immense volume of information daily. The digital world is rich in data from domains such as IoT, social media, healthcare, business, cryptocurrencies, and cybersecurity. These vast amounts of data require significant storage capacity, which makes analysis, processing, and retrieval time-consuming and arduous. AI, particularly machine learning and deep learning, can provide a practical way to analyze and use this data effectively. Clustering, an unsupervised learning technique, groups data into a number of clusters so that it can be categorized effectively. Hence, clustering is relevant to many fields and is used in various applications that deal with large datasets. This survey examines seven widely recognized clustering techniques, namely k-means, G-means, DBSCAN, Agglomerative hierarchical clustering, Two-stage density (DBSCAN and k-means) algorithm, Two-levels (DBSCAN and hierarchical) clustering algorithm, and Two-stage MeanShift and K-means clustering algorithm, and compares them on a real dataset, the Blockchain dataset, which includes prominent cryptocurrencies such as Binance, Bitcoin, Doge, and Ethereum, using several metrics such as the silhouette coefficient, the Calinski-Harabasz index, the Davies-Bouldin index, time complexity, and entropy. |
en_US |
dc.language.iso |
en |
en_US |
dc.subject |
Cluster analysis -- Data processing |
en_US |
dc.subject |
Computational intelligence |
en_US |
dc.subject |
Mathematical optimization |
en_US |
dc.subject |
Algorithms |
en_US |
dc.subject |
Lebanese American University -- Dissertations |
en_US |
dc.subject |
Dissertations, Academic |
en_US |
dc.title |
A Survey of Data Clustering Techniques |
en_US |
dc.type |
Thesis |
en_US |
dc.term.submitted |
Spring |
en_US |
dc.author.degree |
MS in Computer Science |
en_US |
dc.author.school |
SAS |
en_US |
dc.author.idnumber |
201505469 |
en_US |
dc.author.commembers |
Habre, Samer |
|
dc.author.department |
Computer Science and Mathematics |
en_US |
dc.description.physdesc |
1 online resource (x, 78 leaves) : ill. (some col.) |
en_US |
dc.author.advisor |
Haraty, Ramzi |
|
dc.keywords |
Clustering |
en_US |
dc.keywords |
K-means |
en_US |
dc.keywords |
G-means |
en_US |
dc.keywords |
DBSCAN |
en_US |
dc.keywords |
Agglomerative clustering |
en_US |
dc.keywords |
Two-stage density clustering |
en_US |
dc.keywords |
Two-stage (MeanShift and K-means) clustering algorithm |
en_US |
dc.description.bibliographiccitations |
Includes bibliographical references (leaves 71-78) |
en_US |
dc.identifier.doi |
https://doi.org/10.26756/th.2023.580 |
|
dc.author.email |
salma.sobeh@lau.edu.lb |
en_US |
dc.identifier.tou |
http://libraries.lau.edu.lb/research/laur/terms-of-use/thesis.php |
en_US |
dc.publisher.institution |
Lebanese American University |
en_US |
dc.author.affiliation |
Lebanese American University |
en_US |