Full-fledged semantic indexing and querying model designed for seamless integration in legacy RDBMS

Tekli, Joe; Chbeir, Richard; Traina, Agma J.M.; Traina Jr., Caetano; Yetongnon, Kokou; Ibanez, Carlos Raymundo; Al Assad, Marc; Kallas, Christian

dc.contributor.author	Tekli, Joe
dc.contributor.author	Chbeir, Richard
dc.contributor.author	Traina, Agma J.M.
dc.contributor.author	Traina Jr., Caetano
dc.contributor.author	Yetongnon, Kokou
dc.contributor.author	Ibanez, Carlos Raymundo
dc.contributor.author	Al Assad, Marc
dc.contributor.author	Kallas, Christian
dc.date.accessioned	2024-08-13T10:12:30Z
dc.date.available	2024-08-13T10:12:30Z
dc.date.copyright	2018	en_US
dc.date.issued	2018-10-13
dc.identifier.issn	0169-023X	en_US
dc.identifier.uri	http://hdl.handle.net/10725/15974
dc.description.abstract	In the past decade, there has been an increasing need for semantic-aware data search and indexing in textual (structured and NoSQL) databases, as full-text search systems have become available to non-expert users. Such users often have no knowledge of the data being searched and formulate query keywords that differ from those used by the authors in indexing relevant documents, thus producing noisy and sometimes irrelevant results. In this paper, we address the problem of semantic-aware querying and provide a general framework for modeling and processing semantic-based keyword queries in textual databases, i.e., considering the lexical and semantic similarities/disparities when matching user query terms and data index terms. To do so, we design and construct a semantic-aware inverted index structure called SemIndex, which extends the standard inverted index by constructing a tightly coupled inverted index graph that combines two main resources: a semantic network and a standard inverted index on a collection of textual data. We then provide a general keyword query model with specially tailored query processing algorithms built on top of SemIndex, in order to produce semantic-aware results, allowing the user to choose the results' semantic coverage and expressiveness based on her needs. To investigate the practicality and effectiveness of SemIndex, we discuss its physical design within a standard commercial RDBMS, which makes it possible to create, store, and query its graph structure, thus enabling the system to easily scale up and handle large volumes of data. We have conducted a battery of experiments to test the performance of SemIndex, evaluating its construction time, storage size, query processing time, and result quality, in comparison with a legacy inverted index. Results highlight both the effectiveness and scalability of our approach.	en_US
dc.language.iso	en	en_US
dc.title	Full-fledged semantic indexing and querying model designed for seamless integration in legacy RDBMS	en_US
dc.type	Article	en_US
dc.description.version	Published	en_US
dc.author.school	SOE	en_US
dc.author.idnumber	201306321	en_US
dc.author.department	Electrical And Computer Engineering	en_US
dc.relation.journal	Data & Knowledge Engineering	en_US
dc.journal.volume	117	en_US
dc.article.pages	133-173	en_US
dc.keywords	Semantic queries	en_US
dc.keywords	Inverted index	en_US
dc.keywords	NoSQL indexing	en_US
dc.keywords	Semantic network	en_US
dc.keywords	Semantic-aware data processing	en_US
dc.keywords	Textual databases	en_US
dc.identifier.doi	https://doi.org/10.1016/j.datak.2018.07.007	en_US
dc.identifier.ctation	Tekli, J., Chbeir, R., Traina, A. J., Traina Jr, C., Yetongnon, K., Ibañez, C. R., ... & Kallas, C. (2018). Full-fledged semantic indexing and querying model designed for seamless integration in legacy RDBMS. Data & Knowledge Engineering, 117, 133-173.	en_US
dc.author.email	joe.tekli@lau.edu.lb	en_US
dc.identifier.tou	http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php	en_US
dc.identifier.url	https://www.sciencedirect.com/science/article/pii/S0169023X16301835	en_US
dc.orcid.id	https://orcid.org/0000-0003-3441-7974	en_US
dc.author.affiliation	Lebanese American University	en_US
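
Illustrative note: the abstract describes coupling a standard inverted index with a semantic network so that a keyword query can also match semantically related index terms, with the user choosing how much semantic coverage to allow. The Python sketch below illustrates only that general idea under stated assumptions; it is not the authors' SemIndex structure or their query processing algorithms. The function names, the `max_hops` parameter, the toy documents, and the hand-written `semantic_links` dictionary are all hypothetical and chosen for illustration.

```python
from collections import defaultdict, deque


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def expand_term(term, semantic_links, max_hops):
    """Breadth-first walk over the toy semantic network, up to max_hops edges."""
    seen = {term}
    frontier = deque([(term, 0)])
    while frontier:
        current, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not follow links beyond the requested coverage
        for neighbor in semantic_links.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return seen


def semantic_search(query, index, semantic_links, max_hops=0):
    """Return ids of documents matching every keyword or one of its related terms."""
    result = None
    for keyword in query.lower().split():
        matches = set()
        for term in expand_term(keyword, semantic_links, max_hops):
            matches |= index.get(term, set())
        result = matches if result is None else result & matches
    return result or set()


# Hypothetical toy data (assumption: not taken from the paper).
docs = {
    1: "car rental prices",
    2: "automobile insurance rates",
    3: "bicycle repair guide",
}
semantic_links = {
    "car": ["automobile", "vehicle"],
    "automobile": ["car", "vehicle"],
    "vehicle": ["car", "automobile", "bicycle"],
    "bicycle": ["vehicle"],
}
index = build_inverted_index(docs)

print(semantic_search("car", index, semantic_links, max_hops=0))
print(semantic_search("car", index, semantic_links, max_hops=1))
```

With `max_hops=0` the search behaves like a plain inverted-index lookup; raising `max_hops` widens the match to terms reachable through more links in the toy network, which is one simple way to picture a user-selected level of semantic coverage.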