dc.contributor.author |
Mansour, Nashat |
|
dc.contributor.author |
Salameh, Mohammad |
|
dc.contributor.author |
Zantout, Rached |
|
dc.date.accessioned |
2016-01-26T13:56:28Z |
|
dc.date.available |
2016-01-26T13:56:28Z |
|
dc.date.copyright |
2011 |
|
dc.date.issued |
2016-01-26 |
|
dc.identifier.issn |
1683-3198 |
en_US |
dc.identifier.uri |
http://hdl.handle.net/10725/2963 |
|
dc.description.abstract |
Multilingual natural language processing systems increasingly rely on parallel corpora to improve their
output. Parallel corpora constitute the basic building block for training a statistical natural language processing system and for creating
translation and language models. Several systems have been devised that automatically align the words of a pair of sentences,
each in a different language. Such systems have been used successfully with European languages. In this paper, one such system is used
to align sentences in an English-Arabic corpus. The system works poorly when given raw, unaligned English-Arabic
sentence pairs. This prompted the development of a preprocessing step to be applied to the Arabic sentences. The same corpus
was then preprocessed, and a significant improvement is reported when alignment is attempted using the preprocessed
unaligned sentences. |
en_US |
dc.language.iso |
en |
en_US |
dc.title |
Improving the Accuracy of English-Arabic Statistical Sentence Alignment |
en_US |
dc.type |
Article |
en_US |
dc.description.version |
Published |
en_US |
dc.author.school |
SAS |
en_US |
dc.author.idnumber |
198629170 |
en_US |
dc.author.woa |
N/A |
en_US |
dc.author.department |
Computer Science and Mathematics |
en_US |
dc.description.embargo |
N/A |
en_US |
dc.relation.journal |
The International Arab Journal of Information Technology |
en_US |
dc.journal.volume |
8 |
en_US |
dc.journal.issue |
2 |
en_US |
dc.article.pages |
171-177 |
en_US |
dc.keywords |
Word alignment |
en_US |
dc.keywords |
Sentence alignment |
en_US |
dc.keywords |
Parallel corpora |
en_US |
dc.keywords |
Statistical natural language processing |
en_US |
dc.identifier.ctation |
Salameh, M., Zantout, R., & Mansour, N. (2011). Improving the accuracy of English-Arabic statistical sentence alignment. Int. Arab J. Inf. Technol., 8(2), 171-177. |
en_US |
dc.author.email |
nmansour@lau.edu.lb |
|
dc.identifier.url |
http://iajit.org/PDF/vol.8,no.2/9-999.pdf |
|