An Arabic auto-indexing system for information retrieval system. (c2002)

LAUR Repository

dc.contributor.author Daher, Walid K.
dc.date.accessioned 2011-08-19T09:15:58Z
dc.date.available 2011-08-19T09:15:58Z
dc.date.issued 2011-08-19
dc.date.submitted 2002-01
dc.identifier.uri http://hdl.handle.net/10725/531
dc.description Includes bibliographical references. en_US
dc.description.abstract In this report, a model is proposed for auto-indexing text documents. The model consists of four interdependent layers. This work tackles the problem of auto-indexing Arabic documents. However, because of the model's layered design, the algorithm in this report may be applied to documents in any language. The only thing to change is the layer (or, programmatically speaking, the module) that reduces words to their original stems; for Arabic, that requires taking Arabic grammar into consideration. In addition, this report introduces a new concept for calculating the weight of a term relative to its containing document. Traditionally, the weight of a term relied entirely on the rate of repetition (or the count) of that term. The innovation is to also take into consideration the rate of "spreading" within the document. In other words, if a certain word is concentrated in a specific part of a document, then it is less likely to reflect its document than if it were spread more widely through the document. This assumption is mathematically proven and is illustrated by real examples. en_US
dc.language.iso en en_US
dc.subject Automatic indexing en_US
dc.subject Indexing en_US
dc.subject Information storage and retrieval systems en_US
dc.title An Arabic auto-indexing system for information retrieval system. (c2002) en_US
dc.type Project en_US
dc.term.submitted Fall en_US
dc.author.degree MS in Computer Science en_US
dc.author.school Arts and Sciences en_US
dc.author.idnumber 199506480 en_US
dc.author.commembers Dr. Nash'at Mansour
dc.author.woa RA en_US
dc.description.physdesc 1 bound copy: vi, 72 leaves; ill.; 30 cm. available at RNL. en_US
dc.author.division Computer Science en_US
dc.author.advisor Dr. Ramzi Haraty
dc.publisher.institution Lebanese American University en_US
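
The abstract describes weighting a term by both its count and its "spreading" within the document, but this record does not give the report's actual formula. The following is only a minimal sketch of that idea under stated assumptions: the document is already tokenized into stems, it is split into equal-sized segments, and a term's count is scaled by the fraction of segments it appears in. The function name `spread_weights`, the `num_segments` parameter, and the scaling rule are all hypothetical and are not taken from the report.

```python
from collections import Counter, defaultdict


def spread_weights(tokens, num_segments=4):
    """Hypothetical spread-aware weight: count * fraction of segments hit.

    `tokens` is a list of already-stemmed terms in document order.
    This is an illustrative rule, not the formula from the report.
    """
    if not tokens:
        return {}
    # Never use more segments than there are tokens, and at least one.
    num_segments = max(1, min(num_segments, len(tokens)))
    counts = Counter(tokens)
    segments_seen = defaultdict(set)
    for position, term in enumerate(tokens):
        # Map each token position to the index of its segment.
        segment = position * num_segments // len(tokens)
        segments_seen[term].add(segment)
    return {
        term: counts[term] * len(segments_seen[term]) / num_segments
        for term in counts
    }


# "c" and "s" both occur twice, but "c" is clustered at the start
# while "s" occurs in two different parts of the document.
weights = spread_weights(["c", "c", "s", "x", "y", "z", "s", "w"])
print(weights["c"], weights["s"])
```

With a purely count-based weight the two terms would tie; under this assumed rule the more widely spread term receives the higher weight, which matches the intuition stated in the abstract.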

