Using fuzzy reasoning to improve redundancy elimination for data deduplication in connected environments

LAUR Repository

dc.contributor.author Yakhni, Sylvana
dc.contributor.author Tekli, Joe
dc.contributor.author Mansour, Elio
dc.contributor.author Chbeir, Richard
dc.date.accessioned 2024-08-20T11:04:29Z
dc.date.available 2024-08-20T11:04:29Z
dc.date.copyright 2023 en_US
dc.date.issued 2023-03-16
dc.identifier.issn 1432-7643 en_US
dc.identifier.uri http://hdl.handle.net/10725/16000
dc.description.abstract The Internet of Things is ushering in the era of connected environments, where the growing number and diversity of data sources (devices and sensors) are inevitably increasing the size of the data that need to be stored locally (at the edge device level) and transmitted to base storage (at the sink level) of the network. This huge amount of data raises several challenges, including network bandwidth, network energy consumption, cloud storage, and I/O throughput. These call for data pre-processing and filtering solutions to reduce the amount of data being handled and transmitted over the network. In this study, we investigate data deduplication as a prominent pre-processing method that can be used and adapted to address such challenges. Data deduplication techniques have traditionally been developed for data storage and data warehousing applications, and aim at identifying and eliminating redundant data items. A few recent approaches have been designed for connected environments, yet they share various limitations, including: (i) detecting duplicates at only one level of the network (either edge or sink exclusively), (ii) overlooking the context and dynamicity of the network (disregarding device mobility and overlooking boundary separations and sensor coverage areas), and (iii) relying on crisp thresholds and providing minimal-to-no expert control over the deduplication process (disregarding the domain expert’s needs in defining redundancy). In this study, we propose FREDD, a new approach for Fuzzy Redundancy Elimination for Data Deduplication in a connected environment. FREDD uses simple natural language rules to represent domain knowledge and expert preferences regarding data duplication boundaries. It then applies pattern codes and fuzzy reasoning to detect duplicates at both the edge level and the sink level of the network. This reduces the time required to hard-code the deduplication process, while adapting to the domain expert’s needs for different data sources and applications. Moreover, FREDD is adapted for multiple scenarios, considering both static and mobile devices, with different configurations of hard-separated and soft-separated zones, and different sensor coverage areas in the connected environment. Experiments on a real-world dataset highlight FREDD’s potential and improvement compared with existing solutions. en_US
dc.language.iso en en_US
dc.title Using fuzzy reasoning to improve redundancy elimination for data deduplication in connected environments en_US
dc.type Article en_US
dc.description.version Published en_US
dc.author.school SOE en_US
dc.author.idnumber 201306321 en_US
dc.author.department Electrical And Computer Engineering en_US
dc.relation.journal Soft Computing en_US
dc.journal.volume 27 en_US
dc.journal.issue 17 en_US
dc.article.pages 12387–12418 en_US
dc.keywords Connected environments en_US
dc.keywords Fuzzy reasoning en_US
dc.keywords Data redundancy en_US
dc.keywords Data deduplication en_US
dc.keywords Internet of Things (IoT) en_US
dc.keywords Cyber-physical systems en_US
dc.keywords Wireless sensor networks en_US
dc.identifier.doi https://doi.org/10.1007/s00500-023-07880-z en_US
dc.identifier.citation Yakhni, S., Tekli, J., Mansour, E., & Chbeir, R. (2023). Using fuzzy reasoning to improve redundancy elimination for data deduplication in connected environments. Soft Computing, 27(17), 12387–12418. en_US
dc.author.email joe.tekli@lau.edu.lb en_US
dc.identifier.tou http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php en_US
dc.identifier.url https://link.springer.com/article/10.1007/s00500-023-07880-z en_US
dc.orcid.id https://orcid.org/0000-0003-3441-7974 en_US
dc.author.affiliation Lebanese American University en_US

