
An auto-indexing method for Arabic text

LAUR Repository

dc.contributor.author Mansour, Nashat
dc.contributor.author Haraty, Ramzi A.
dc.contributor.author Daher, Walid
dc.contributor.author Houri, Manal
dc.date.accessioned 2015-10-23T08:01:12Z
dc.date.available 2015-10-23T08:01:12Z
dc.date.copyright 2008
dc.date.issued 2016-05-09
dc.identifier.issn 0306-4573 en_US
dc.identifier.uri http://hdl.handle.net/10725/2317
dc.description.abstract This work addresses the information retrieval problem of auto-indexing Arabic documents. Auto-indexing a text document refers to automatically extracting words that are suitable for building an index for the document. In this paper, we propose an auto-indexing method for Arabic text documents. This method is based mainly on morphological analysis and on a technique for assigning weights to words. The morphological analysis uses a number of grammatical rules to extract stem words that become candidate index words. The weight assignment technique computes weights for these words relative to the containing document. The weight is based on how widely the word is spread across the document and not only on its rate of occurrence. The candidate index words are then sorted in descending order by weight so that information retrievers can select the more important index words. We empirically verify the usefulness of our method using several examples. For these examples, we obtained an average recall of 46% and an average precision of 64%. en_US
dc.language.iso en en_US
dc.title An auto-indexing method for Arabic text en_US
dc.type Article en_US
dc.description.version Published en_US
dc.author.school SAS en_US
dc.author.idnumber 199729410 en_US
dc.author.idnumber 198629170 en_US
dc.author.woa N/A en_US
dc.author.department Computer Science and Mathematics en_US
dc.description.embargo N/A en_US
dc.relation.journal Information Processing & Management en_US
dc.journal.volume 44 en_US
dc.journal.issue 4 en_US
dc.article.pages 1538-1545 en_US
dc.keywords Arabic text en_US
dc.keywords Document auto-indexing en_US
dc.keywords Information retrieval en_US
dc.keywords Stem words en_US
dc.keywords Word spread en_US
dc.identifier.doi http://dx.doi.org/10.1016/j.ipm.2007.12.007 en_US
dc.identifier.ctation Mansour, N., Haraty, R. A., Daher, W., & Houri, M. (2008). An auto-indexing method for Arabic text. Information Processing & Management, 44(4), 1538-1545. en_US
dc.author.email nmansour@lau.edu.lb
dc.author.email rharaty@lau.edu.lb
dc.identifier.url http://www.sciencedirect.com/science/article/pii/S0306457308000058
dc.orcid.id https://orcid.org/0000-0002-6978-3627
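
The abstract describes weighting candidate stems by how widely they are spread across a document as well as how often they occur, then sorting by weight and evaluating with recall and precision. The record does not give the paper's actual formula, so the following is only a minimal sketch under stated assumptions: the document is already reduced to a list of stems, it is cut into equal-sized segments, and the weight is relative frequency multiplied by the fraction of segments containing the stem. The function names, the segment count, and the transliterated-style placeholder stems are all hypothetical.

```python
from collections import Counter


def spread_weights(stems, num_segments=4):
    """Weight each stem by relative frequency times segment coverage.

    Assumed formula (not taken from the paper):
        weight = (count / total stems) * (segments containing stem / segments)
    """
    if not stems:
        return {}
    seg_len = -(-len(stems) // num_segments)  # ceiling division
    segments = [stems[i:i + seg_len] for i in range(0, len(stems), seg_len)]
    freq = Counter(stems)
    coverage = Counter()
    for seg in segments:
        coverage.update(set(seg))  # count each stem once per segment
    total = len(stems)
    return {s: (freq[s] / total) * (coverage[s] / len(segments)) for s in freq}


def rank_index_words(stems, num_segments=4):
    """Return stems sorted by descending weight, ties broken alphabetically."""
    weights = spread_weights(stems, num_segments)
    return sorted(weights, key=lambda s: (-weights[s], s))


def precision_recall(selected, relevant):
    """Precision and recall of selected index words against a reference set."""
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Placeholder stems: "x" and "y" both occur twice, but "y" spans two segments.
doc = ["x", "x", "y", "z", "y", "w"]
ranking = rank_index_words(doc, num_segments=3)
```

In this toy input, "x" and "y" have the same rate of occurrence, yet "y" ranks higher because it appears in more segments, which is the behaviour the abstract attributes to a spread-based weight.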