Text-based framework for spam detection in Twitter. (c2017)

LAUR Repository

dc.contributor.author Halawi, Bahia M.
dc.date.accessioned 2017-11-08T10:20:12Z
dc.date.available 2017-11-08T10:20:12Z
dc.date.copyright 2017 en_US
dc.date.issued 2017-11-08
dc.date.submitted 2017-05-11
dc.identifier.uri http://hdl.handle.net/10725/6553
dc.description.abstract Because of Twitter's great popularity and its ability to carry messages to scattered communities, spammers tend to exploit Twitter to spread their commercial messages. Moreover, different spammers behave in different ways: some adopt behavioral approaches, others make use of content entropy, and many others explore bait behaviors. Previous related work approaches this problem by studying a tweet along with its metadata and performing various statistical and profiling activities in order to infer whether it is spam. However, these approaches do not take into account the limitations placed on Twitter's streaming API, which restrict a user's ability to extract follower and followee data. Also, many of these approaches violate user privacy by examining users' personal data without prior consent. This thesis is dedicated to studying the relationship between tweets shared by different users, particularly content considered spam vs. legitimate. Moreover, we overcome the above-mentioned limitations by developing a set of message-to-message analysis approaches. First, we deploy cosine vector similarity, and later the Natural Language Toolkit and a co-occurrence model, to improve detection correctness. However, because spammers are creative in building organic messages that hardly resemble earlier messages, these models suffer from limitations. We therefore develop the use of ontologies for detecting spam on Twitter during events. Our experimental results demonstrate the efficiency of analyzing spam content and semantic relationships on Twitter through ontologies. en_US
dc.language.iso en en_US
dc.subject Lebanese American University -- Dissertations en_US
dc.subject Dissertations, Academic en_US
dc.subject Spam filtering (Electronic mail) en_US
dc.subject Twitter en_US
dc.subject Ontologies (Information retrieval) en_US
dc.subject Spam (Electronic mail) en_US
dc.title Text-based framework for spam detection in Twitter. (c2017) en_US
dc.type Thesis en_US
dc.term.submitted Spring en_US
dc.author.degree MS in Computer Science en_US
dc.author.school SAS en_US
dc.author.idnumber 200903423 en_US
dc.author.commembers Otork, Hadi
dc.author.commembers Hamdan, May
dc.author.department Computer Science and Mathematics en_US
dc.description.embargo N/A en_US
dc.description.physdesc 1 hard copy: xii, 78 leaves; 30 cm. available at RNL. en_US
dc.author.advisor Mourad, Azzam
dc.keywords Event spammers en_US
dc.keywords Honey pots en_US
dc.keywords Hashtags en_US
dc.keywords Entropy en_US
dc.keywords Semantic en_US
dc.keywords Ontology en_US
dc.description.bibliographiccitations Bibliography : leaves 75-78. en_US
dc.identifier.doi https://doi.org/10.26756/th.2017.21 en_US
dc.author.email bahia.halawi@lau.edu.lb en_US
dc.identifier.tou http://libraries.lau.edu.lb/research/laur/terms-of-use/thesis.php en_US
dc.publisher.institution Lebanese American University en_US
dc.author.affiliation Lebanese American University en_US
