dc.contributor.author |
Daher, Walid K. |
|
dc.date.accessioned |
2011-08-19T09:15:58Z |
|
dc.date.available |
2011-08-19T09:15:58Z |
|
dc.date.issued |
2011-08-19 |
|
dc.date.submitted |
2002-01 |
|
dc.identifier.uri |
http://hdl.handle.net/10725/531 |
|
dc.description |
Includes bibliographical references. |
en_US |
dc.description.abstract |
This report proposes a model for auto-indexing text documents. The model consists of four interdependent layers. The work tackles the problem of auto-indexing Arabic documents; however, owing to the model's layered design, the algorithm may be applied to documents in any language. The only thing to change is the layer (programmatically speaking, the module) that reduces words to their original stems. For Arabic, that module must take Arabic grammar into consideration. In addition, this report introduces a new concept for calculating the weight of a term relative to its containing document. Traditionally, the weight of a term has relied entirely on the rate of repetition (that is, the count) of that term. The innovation is to also take into consideration the rate of "spreading" within the document. In other words, if a word is concentrated in one part of a document, it is less likely to reflect that document than if it were spread more widely through it. This assumption is mathematically proven and is illustrated by real examples. |
en_US |
dc.language.iso |
en |
en_US |
dc.subject |
Automatic indexing |
en_US |
dc.subject |
Indexing |
en_US |
dc.subject |
Information storage and retrieval systems |
en_US |
dc.title |
An Arabic auto-indexing system for information retrieval system. (c2002) |
en_US |
dc.type |
Project |
en_US |
dc.term.submitted |
Fall |
en_US |
dc.author.degree |
MS in Computer Science |
en_US |
dc.author.school |
Arts and Sciences |
en_US |
dc.author.idnumber |
199506480 |
en_US |
dc.author.commembers |
Dr. Nash'at Mansour |
|
dc.author.woa |
RA |
en_US |
dc.description.physdesc |
1 bound copy: vi, 72 leaves; ill.; 30 cm. available at RNL. |
en_US |
dc.author.division |
Computer Science |
en_US |
dc.author.advisor |
Dr. Ramzi Haraty |
|
dc.identifier.doi |
https://doi.org/10.26756/th.2023.560 |
|
dc.publisher.institution |
Lebanese American University |
en_US |
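
The abstract's idea of weighting a term by both its count and its "spreading" can be illustrated with a minimal sketch. This is not the report's formula, which the record does not give. The function name `spread_weight`, the `segments` parameter, and the choice of "count multiplied by the fraction of equal-sized segments that contain the term" are all assumptions made purely for illustration.

```python
from typing import List


def spread_weight(tokens: List[str], term: str, segments: int = 4) -> float:
    """Hypothetical spread-aware weight (an assumption, not the report's method).

    The document is cut into `segments` equal-sized parts by token position.
    The weight is the term's count scaled by the fraction of parts in which
    the term appears at least once.
    """
    if not tokens or segments <= 0:
        return 0.0

    count = tokens.count(term)
    if count == 0:
        return 0.0

    n = len(tokens)
    # Map each occurrence's position to the index of the segment it falls in.
    segments_hit = {i * segments // n for i, tok in enumerate(tokens) if tok == term}

    return count * len(segments_hit) / segments


# Same count (4), different spread across the document.
concentrated = ["a", "a", "a", "a", "b", "b", "b", "b"]
spread_out = ["a", "b", "a", "b", "a", "b", "a", "b"]

print(spread_weight(concentrated, "a"))
print(spread_weight(spread_out, "a"))
```

Under this assumed scheme, a term that occurs the same number of times scores higher when its occurrences are distributed across the document than when they cluster in one part, which is the behaviour the abstract describes.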