Dynamic multiple node failure recovery in distributed storage systems

LAUR Repository

Show simple item record

dc.contributor.author Itani, May
dc.contributor.author Sharafeddine, Sanaa
dc.contributor.author ElKabani, Islam
dc.date.accessioned 2018-06-07T10:03:14Z
dc.date.available 2018-06-07T10:03:14Z
dc.date.copyright 2018 en_US
dc.date.issued 2018-06-07
dc.identifier.issn 1570-8705 en_US
dc.identifier.uri http://hdl.handle.net/10725/8022
dc.description.abstract Our daily lives are getting more and more dependent on data centers and distributed storage systems in general, whether at the business or at the personal level. With the advent of fog computing, personal mobile devices in a given geographical area may also comprise a very dynamic distributed storage system. These paradigm changes call for the urgent need of devising efficient and reliable failure recovery mechanisms in dynamic scenarios where failures become more likely and nodes join and leave the network more frequently. Redundancy schemes in distributed storage systems have become essential for providing reliability given the fact of frequent node failures. In this work, we address the problem of multiple failure recovery with dynamic scenarios using the fractional repetition code as a redundancy scheme. The fractional repetition (FR) code is a class of regenerating codes that concatenates a maximum distance separable code (MDS) with an inner fractional repetition code where data is split into several blocks then replicated and multiple replicas of each block are stored on various system nodes. We formulate the problem as an integer linear programming problem and extend it to account for three dynamic scenarios of newly arriving blocks, nodes, and variable priority blocks allocation. The contribution of this paper is four-fold: i. we generate an optimized block distribution scheme that minimizes the total system repair cost of all dependent and independent multiple node failure scenarios; ii. we address the practical scenario of having newly arriving blocks and allocate those blocks to existing nodes without any modification to the original on-node block distribution; iii. we consider new-comer nodes and generate an updated optimized block distribution; iv. we consider optimized storage and recovery of blocks with varying priority using variable fractional repetition codes. The four problems are modeled using incidence matrices and solved heuristically. We present a range of results for our proposed algorithms in several scenarios to assess the effectiveness of the solution approaches that are shown to generate results close to optimal. en_US
dc.language.iso en en_US
dc.title Dynamic multiple node failure recovery in distributed storage systems en_US
dc.type Article en_US
dc.description.version Published en_US
dc.author.school SAS en_US
dc.author.idnumber 200502746 en_US
dc.author.department Computer Science and Mathematics en_US
dc.description.embargo N/A en_US
dc.relation.journal Ad Hoc Networks en_US
dc.journal.volume 72 en_US
dc.article.pages 1-13 en_US
dc.keywords Distributed storage systems en_US
dc.keywords Fractional repetition codes en_US
dc.keywords Failure recovery en_US
dc.keywords Genetic algorithms en_US
dc.keywords Multiple failures en_US
dc.keywords Heuristic optimization en_US
dc.identifier.doi https://doi.org/10.1016/j.adhoc.2017.12.007 en_US
dc.identifier.ctation Itani, M., Sharafeddine, S., & ElKabani, I. (2018). Dynamic multiple node failure recovery in distributed storage systems. Ad Hoc Networks, 72, 1-13. en_US
dc.author.email sanaa.sharafeddine@lau.edu.lb en_US
dc.identifier.tou http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php en_US
dc.identifier.url https://www.sciencedirect.com/science/article/pii/S1570870517302263 en_US
dc.orcid.id https://orcid.org/0000-0001-6548-1624 en_US
dc.author.affiliation Lebanese American University en_US

Files in this item

This item appears in the following Collection(s)

Show simple item record

Search LAUR

Advanced Search


My Account