An Overview on XML Semantic Disambiguation from Unstructured Text to Semi-Structured Data: Background, Applications, and Ongoing Challenges

Tekli, Joe

dc.contributor.author	Tekli, Joe
dc.date.accessioned	2017-01-27T07:54:11Z
dc.date.available	2017-01-27T07:54:11Z
dc.date.copyright	2016	en_US
dc.date.issued	2016-06-01
dc.identifier.issn	1041-4347	en_US
dc.identifier.uri	http://hdl.handle.net/10725/5080
dc.description.abstract	Over the last two decades, XML has gained momentum as the standard for web information management and complex data representation. Also, collaboratively built semi-structured information resources, such as Wikipedia, have become prevalent on the Web and can be inherently encoded in XML. Yet most methods for processing XML and semi-structured information handle mainly the syntactic properties of the data, while ignoring the semantics involved. To devise more intelligent applications, one needs to augment syntactic features with machine-readable semantic meaning. This can be achieved through the computational identification of the meaning of data in context, also known as automated semantic analysis and disambiguation, which is nowadays one of the main challenges at the core of the Semantic Web. This survey paper provides a concise and comprehensive review of the methods related to XML-based semi-structured semantic analysis and disambiguation. It consists of four logical parts. First, we briefly cover traditional word sense disambiguation methods for processing flat textual data. Second, we describe and categorize disambiguation techniques developed and extended to handle semi-structured and XML data. Third, we describe current and potential application scenarios that can benefit from XML semantic analysis, including: data clustering and semantic-aware indexing, data integration and selective dissemination, semantic-aware and temporal querying, web and mobile services matching and composition, blog and social semantic network analysis, and ontology learning. Fourth, we describe and discuss ongoing challenges and future directions, including: the quantification of semantic ambiguity, expanding XML disambiguation context, combining structure and content, using collaborative/social information sources, integrating explicit and implicit semantic analysis, emphasizing user involvement, and reducing computational complexity.	en_US
dc.language.iso	en	en_US
dc.title	An Overview on XML Semantic Disambiguation from Unstructured Text to Semi-Structured Data: Background, Applications, and Ongoing Challenges	en_US
dc.type	Article	en_US
dc.description.version	Published	en_US
dc.author.school	SOE	en_US
dc.author.idnumber	201306321	en_US
dc.author.department	Electrical and Computer Engineering	en_US
dc.description.embargo	N/A	en_US
dc.relation.journal	IEEE Transactions on Knowledge and Data Engineering	en_US
dc.journal.volume	28	en_US
dc.journal.issue	6	en_US
dc.article.pages	1383-1407	en_US
dc.keywords	Document Preparation	en_US
dc.keywords	Semantic Networks	en_US
dc.keywords	Document management	en_US
dc.identifier.doi	http://dx.doi.org/10.1109/TKDE.2016.2525768	en_US
dc.identifier.ctation	Tekli, J. (2016). An overview on XML semantic disambiguation from unstructured text to semi-structured data: Background, applications, and ongoing challenges. IEEE Transactions on Knowledge and Data Engineering, 28(6), 1383-1407.	en_US
dc.author.email	joe.tekli@lau.edu.lb	en_US
dc.identifier.tou	http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php	en_US
dc.identifier.url	http://ieeexplore.ieee.org/abstract/document/7398037/	en_US
dc.orcid.id	https://orcid.org/0000-0003-3441-7974	en_US
dc.author.affiliation	Lebanese American University	en_US