
Indexing Arabic texts using association rule data mining

LAUR Repository


dc.contributor.author Haraty, Ramzi A.
dc.contributor.author Nasrallah, Rouba
dc.date.accessioned 2019-03-15T13:22:42Z
dc.date.available 2019-03-15T13:22:42Z
dc.date.copyright 2019 en_US
dc.date.issued 2019-03-15
dc.identifier.issn 0737-8831 en_US
dc.identifier.uri http://hdl.handle.net/10725/10223
dc.description.abstract Purpose: The purpose of this paper is to propose a new model to enhance the auto-indexing of Arabic texts. The model extracts new relevant words by relating the words chosen by previous classical methods to new words using data mining rules.
Design/methodology/approach: The proposed model uses an association rule algorithm to extract frequent sets of related items, relating words in the texts to be indexed to words from texts that belong to the same category. The extracted word associations are represented as sets of words that frequently appear together.
Findings: The proposed methodology shows significant enhancement in terms of accuracy, efficiency and reliability when compared to previous works.
Research limitations/implications: The stemming algorithm can be further enhanced. Arabic has many grammatical rules, and the more of these rules are integrated into the stemming algorithm, the better the stemming will be. The stop-list can also be enhanced by adding more words that should not be taken into consideration in the indexing mechanism. Numbers should be added to the list as well, and a thesaurus system should be used, because it links different phrases or words with the same meaning to each other, which improves the indexing mechanism. The authors also invite researchers to add more pre-requisite texts to obtain better results.
Originality/value: In this paper, the authors present a full text-based auto-indexing method for Arabic text documents. The auto-indexing method extracts new relevant words by using data mining rules, which has not been investigated before. The method uses an association rule mining algorithm to extract frequent sets of related items, relating words in the texts to be indexed to words from texts that belong to the same category. The benefits of the method are demonstrated using empirical work involving several Arabic texts. en_US
dc.language.iso en en_US
dc.title Indexing Arabic texts using association rule data mining en_US
dc.type Article en_US
dc.description.version Published en_US
dc.author.school SAS en_US
dc.author.idnumber 199729410 en_US
dc.author.department Computer Science And Mathematics en_US
dc.description.embargo N/A en_US
dc.relation.journal Library Hi Tech en_US
dc.journal.volume 37 en_US
dc.journal.issue 1 en_US
dc.article.pages 101-117 en_US
dc.keywords Precision en_US
dc.keywords Recall en_US
dc.keywords Arabic text en_US
dc.keywords Auto-indexing en_US
dc.keywords Frequent sets en_US
dc.keywords Rule-based data mining en_US
dc.identifier.doi https://doi.org/10.1108/LHT-07-2017-0147 en_US
dc.identifier.citation Haraty, R. A., & Nasrallah, R. (2019). Indexing Arabic texts using association rule data mining. Library Hi Tech, 37(1), 101-117. en_US
dc.author.email rharaty@lau.edu.lb en_US
dc.identifier.tou http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php en_US
dc.identifier.url https://www.emeraldinsight.com/doi/full/10.1108/LHT-07-2017-0147 en_US
dc.orcid.id https://orcid.org/0000-0002-6978-3627 en_US
dc.author.affiliation Lebanese American University en_US
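
The abstract above describes extracting frequent sets of words that appear together in texts of the same category, then using those sets to add related words to an index. The following is a minimal sketch of that general idea only, not the authors' implementation: the function names `frequent_word_sets` and `expand_index_terms`, the `min_support` threshold, and the toy English tokens (standing in for stemmed Arabic words with stop-words removed) are all assumptions made for illustration. It counts co-occurrences by brute-force enumeration rather than a pruned association rule algorithm.

```python
from collections import Counter
from itertools import combinations


def frequent_word_sets(docs, min_support=2, max_size=2):
    """Count word sets (up to max_size words) per document and keep
    those that occur in at least min_support documents.

    Hypothetical helper: brute-force enumeration, no candidate pruning.
    """
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))  # each document counts a word set once
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


def expand_index_terms(seed_terms, frequent_sets):
    """Add every word that shares a frequent multi-word set with a seed term.

    Hypothetical helper: seed_terms stands for the words chosen by a
    classical indexing method.
    """
    seeds = set(seed_terms)
    related = set()
    for combo in frequent_sets:
        if len(combo) > 1 and seeds & set(combo):
            related.update(combo)
    return seeds | related


# Toy category corpus (assumed data, already tokenized).
category_docs = [
    ["bank", "loan", "interest"],
    ["bank", "loan", "market"],
    ["market", "stock", "bank"],
]

frequent = frequent_word_sets(category_docs, min_support=2)
expanded = expand_index_terms(["loan"], frequent)
```

Here "bank" and "loan" co-occur in two of the three documents, so a text indexed under "loan" would also receive "bank" as a candidate index term. A real system would additionally need the stemming and stop-list steps that the abstract mentions.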

