An XML Document Comparison Framework

Tekli, Joe; Chbeir, Richard; Yetongnon, Kokou

URL: https://www.researchgate.net/publication/228963576_An_XML_Document_Comparison_Framework

Date: 2001

Terms of Use: This item is made available under the terms and conditions applicable to "Article", as set forth at: http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php

Abstract:

As the Web continues to grow and evolve, more and more information is being placed in structurally rich documents, XML documents in particular, so as to improve the efficiency of similarity clustering, information retrieval and data management applications. Various algorithms for comparing hierarchically structured data, e.g., XML documents, have been proposed in the literature. Most of them rely on techniques for finding the edit distance between tree structures, with XML documents modeled as Ordered Labeled Trees. Nevertheless, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed when comparing XML documents. In this paper, we provide an integrated and fine-grained comparison method that deals with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees) and allows the end-user to tune the comparison process according to her requirements. Our approach consists of four main modules for i) discovering the structural commonalities between sub-trees, ii) identifying sub-tree semantic resemblances, iii) computing tree-based edit operation costs, and iv) computing tree edit distance. A prototype has been developed to evaluate the optimality and performance of our method. Results demonstrate higher comparison accuracy with respect to alternative XML comparison methods, while timing experiments reflect the significant impact of semantic similarity assessment on overall system performance.
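To make the notion of edit distance between Ordered Labeled Trees concrete, the following is a minimal sketch of a classic top-down (Selkow-style) tree edit distance over parsed XML. It is not the method proposed in the paper: it ignores the sub-tree structural and semantic similarity modules entirely, and the function names `tree_size` and `edit_distance` as well as the unit costs (1 per relabel, node count per inserted or deleted sub-tree) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def tree_size(node):
    """Number of nodes in the sub-tree rooted at `node`."""
    return 1 + sum(tree_size(child) for child in node)


def edit_distance(a, b):
    """Selkow-style top-down edit distance between two ordered labeled trees.

    Assumed costs (illustrative, not taken from the paper):
    relabeling a node costs 1; inserting or deleting a whole
    sub-tree costs its node count.
    """
    relabel = 0 if a.tag == b.tag else 1
    kids_a, kids_b = list(a), list(b)
    m, n = len(kids_a), len(kids_b)

    # dist[i][j]: cost of turning the first i children of `a`
    # into the first j children of `b`, preserving sibling order.
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + tree_size(kids_a[i - 1])
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + tree_size(kids_b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + tree_size(kids_a[i - 1]),  # delete sub-tree
                dist[i][j - 1] + tree_size(kids_b[j - 1]),  # insert sub-tree
                dist[i - 1][j - 1]
                + edit_distance(kids_a[i - 1], kids_b[j - 1]),  # match children
            )
    return relabel + dist[m][n]


# Two toy documents that differ in a single element label.
doc_a = ET.fromstring("<paper><title/><author/><author/></paper>")
doc_b = ET.fromstring("<paper><title/><author/><abstract/></paper>")
distance = edit_distance(doc_a, doc_b)
print(distance)  # 1: the second <author> is relabeled to <abstract>
```

Under these assumed costs, a repeated or semantically related sub-tree is charged its full size when inserted, which is the kind of limitation the abstract attributes to existing approaches.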

Citation:

Tekli, J., Chbeir, R., & Yetongnon, K. (2001). An XML Document Comparison Framework.

Access Status:

N/A

This item appears in the following Collection(s)

SOE - Scholarly Publications [876]
