Improving the Accuracy of English-Arabic Statistical Sentence Alignment

LAUR Repository

dc.contributor.author Mansour, Nashat
dc.contributor.author Salameh, Mohammad
dc.contributor.author Zantout, Rached
dc.date.accessioned 2016-01-26T13:56:28Z
dc.date.available 2016-01-26T13:56:28Z
dc.date.copyright 2011
dc.date.issued 2016-01-26
dc.identifier.issn 1683-3198 en_US
dc.identifier.uri http://hdl.handle.net/10725/2963
dc.description.abstract Multilingual natural language processing systems increasingly rely on parallel corpora to improve their output. Parallel corpora constitute the basic building block for training a statistical natural language processing system and for creating translation and language models. Several systems have been devised that automatically align the words of a pair of sentences, one in each language. Such systems have been used successfully with European languages. In this paper, one such system is used to align sentences in an English-Arabic corpus. The system performs poorly when given raw, unaligned English-Arabic sentence pairs. This prompted the development of a preprocessing step that is applied to the Arabic sentences. The same corpus was then preprocessed, and a significant improvement is reported when alignment is attempted using the preprocessed, unaligned sentences. en_US
dc.language.iso en en_US
dc.title Improving the Accuracy of English-Arabic Statistical Sentence Alignment en_US
dc.type Article en_US
dc.description.version Published en_US
dc.author.school SAS en_US
dc.author.idnumber 198629170 en_US
dc.author.woa N/A en_US
dc.author.department Computer Science and Mathematics en_US
dc.description.embargo N/A en_US
dc.relation.journal The International Arab Journal of Information Technology en_US
dc.journal.volume 8 en_US
dc.journal.issue 2 en_US
dc.article.pages 171-177 en_US
dc.keywords Word alignment en_US
dc.keywords Sentence alignment en_US
dc.keywords Parallel corpora en_US
dc.keywords Statistical natural language processing en_US
dc.identifier.ctation Salameh, M., Zantout, R., & Mansour, N. (2011). Improving the accuracy of English-Arabic statistical sentence alignment. Int. Arab J. Inf. Technol., 8(2), 171-177. en_US
dc.author.email nmansour@lau.edu.lb
dc.identifier.url http://iajit.org/PDF/vol.8,no.2/9-999.pdf
