Automated annotation of keywords for proteins of the Newcastle Virus Disease. (c2007)

LAUR Repository

dc.contributor.author Rababy, Antoine
dc.date.accessioned 2011-10-21T12:37:36Z
dc.date.available 2011-10-21T12:37:36Z
dc.date.copyright 2007 en_US
dc.date.issued 2011-10-21
dc.date.submitted 2007-06-12
dc.identifier.uri http://hdl.handle.net/10725/853
dc.description Includes bibliographical references (l. 60-61). en_US
dc.description.abstract The number of newly discovered proteins has increased drastically during the last two decades, and curators are no longer able to annotate them manually. There is therefore a great need to automate this process. Rule generation for protein annotation in databases such as UniProt, PROSITE, and InterPro has been tackled by many researchers and has proven to be a reliable method for accurately annotating proteins with respect to certain fields (for example, the keywords field). Our study of the organism "Newcastle Virus Disease" showed that data from Swiss-Prot was accurate (checked by human experts), while data from TrEMBL was unreliable and incomplete. We propose to automate the annotation of the keywords field for proteins related to the Newcastle virus disease in both the Swiss-Prot and TrEMBL databases. The generated rules were applied to most of the proteins from the Swiss-Prot database, and the results were promising: 95% of the proteins were accurately annotated with the exact keyword(s). For the TrEMBL database, our rules annotated proteins that were originally unannotated and improved or completed the annotation of proteins whose annotation was incomplete. These results were in turn tested against the data in Swiss-Prot and were found to be between 90% and 100% valid. en_US
dc.language.iso en en_US
dc.subject Newcastle disease virus en_US
dc.subject Proteins en_US
dc.subject Virus diseases en_US
dc.title Automated annotation of keywords for proteins of the Newcastle Virus Disease. (c2007) en_US
dc.type Thesis en_US
dc.term.submitted Spring en_US
dc.author.degree MS in Computer Science en_US
dc.author.school Arts and Sciences en_US
dc.author.commembers Dr. Haidar Harmanani
dc.author.commembers Dr. Chadi Nour
dc.author.woa OA en_US
dc.description.physdesc 1 bound copy: 63 leaves; ill.; 31 cm. available at RNL. en_US
dc.author.division Computer Science en_US
dc.author.advisor Dr. Danielle Azar
dc.identifier.doi https://doi.org/10.26756/th.2007.33 en_US
dc.publisher.institution Lebanese American University en_US
