Efficient XML Structural Similarity Detection using Sub-tree Commonalities

Tekli, Joe; Chbeir, Richard; Yetongnon, Kokou

Date: 2007-01

Terms of Use: This item is made available under the terms and conditions applicable to "Conference Paper / Proceeding", as set forth at: http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php

Abstract:

Developing efficient techniques for comparing XML-based documents has become essential in the database and information retrieval communities. Various algorithms for comparing hierarchically structured data, e.g. XML documents, have been proposed in the literature. Most of them rely on techniques for finding the edit distance between tree structures, with XML documents modeled as ordered labeled trees. Nevertheless, a thorough investigation of current approaches led us to identify several structural similarities, namely sub-tree related similarities, that are left unaddressed when comparing XML documents. In this paper, we provide an improved comparison method to deal with such resemblances. Our approach is based on the concept of tree edit distance and introduces the notion of commonality between sub-trees. Experiments demonstrate that our approach yields better similarity results than alternative methods, while maintaining quadratic time complexity.
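
For readers unfamiliar with tree edit distance, the general idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the method proposed in the paper: it does not model sub-tree commonalities and makes no claim about time complexity. It assumes unit costs, a simple recursive distance that relabels roots and inserts or deletes whole child sub-trees, and a hypothetical normalization into a similarity score. The names `Node`, `size`, `tree_distance`, and `similarity` are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """An ordered labeled tree node, e.g. one XML element."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the tree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def tree_distance(a: Node, b: Node) -> int:
    """Simplified edit distance between two ordered labeled trees.

    Assumed cost model (illustrative, not taken from the paper):
    relabeling a node costs 1, and inserting or deleting a whole
    sub-tree costs its number of nodes.
    """
    root_cost = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)

    # d[i][j]: cost of turning the first i children of a
    # into the first j children of b.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),   # delete sub-tree
                d[i][j - 1] + size(b.children[j - 1]),   # insert sub-tree
                d[i - 1][j - 1]                          # transform sub-tree
                + tree_distance(a.children[i - 1], b.children[j - 1]),
            )
    return root_cost + d[m][n]


def similarity(a: Node, b: Node) -> float:
    """Hypothetical normalization: 1.0 for identical trees, lower otherwise."""
    return 1.0 - tree_distance(a, b) / (size(a) + size(b))


doc1 = Node("a", [Node("b"), Node("c")])
doc2 = Node("a", [Node("b"), Node("d")])
doc3 = Node("a", [Node("b", [Node("x"), Node("y")])])
doc4 = Node("a")

print(tree_distance(doc1, doc2))
print(tree_distance(doc3, doc4))
print(similarity(doc1, doc1))
```

In this sketch, `doc1` and `doc2` differ by a single relabeling, while `doc3` and `doc4` differ by the deletion of one three-node sub-tree. A distance of this kind treats each sub-tree independently, which is the sort of limitation that motivates taking sub-tree resemblances into account.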

Citation:

Tekli, J., Chbeir, R., & Yetongnon, K. (2007). Efficient XML Structural Similarity Detection using Sub-tree Commonalities. In SBBD 2007 (pp. 116-130).
