Practical Multiple Node Failure Recovery in Distributed Storage Systems

LAUR Repository

dc.contributor.author Itani, M.
dc.contributor.author Sharafeddine, S.
dc.contributor.author ElKabbani, I.
dc.date.accessioned 2018-07-02T11:34:37Z
dc.date.available 2018-07-02T11:34:37Z
dc.date.copyright 2016 en_US
dc.date.issued 2018-07-02
dc.identifier.uri http://hdl.handle.net/10725/8152
dc.description.abstract As multiple node failures become increasingly frequent in distributed storage systems, many erasure coding techniques are emerging to handle them. In this paper, we apply the fractional repetition (FR) code as a redundancy scheme for multiple failure recovery with optimized system cost. The FR code is a class of regenerating codes that consists of a concatenation of an outer maximum distance separable (MDS) code and an inner fractional repetition code that splits the data into several blocks and stores multiple replicas of each block on different nodes in the system. We model the problem as an integer linear program that uses modified versions of the FR code allowing different block sizes, and that minimizes the recovery cost over all dependent and independent multiple node failure scenarios. First, we generate an optimized block distribution scheme that minimizes the total system repair cost, together with a full recovery plan that specifies a node repair order for the system. Second, we account for the common scenario of newcomer blocks: we allocate newcomers to nodes with minimal computation and without changing the original optimized plan. The problem is solved using genetic algorithms that search within the feasible solution space. Fast convergence validates the efficacy of our algorithms for different system parameters. Simulation results are shown to be close to optimal for the case of newly arriving blocks. en_US
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.subject Data transmission systems -- Congresses en_US
dc.subject Telecommunication -- Data processing -- Congresses en_US
dc.subject Wireless sensor networks -- Congresses en_US
dc.subject Cloud computing -- Congresses en_US
dc.subject Internet of things -- Congresses en_US
dc.subject Smart power grids -- Congresses en_US
dc.title Practical Multiple Node Failure Recovery in Distributed Storage Systems en_US
dc.type Conference Paper / Proceeding en_US
dc.author.school SAS en_US
dc.author.idnumber 200502746 en_US
dc.author.department Computer Science and Mathematics en_US
dc.description.embargo N/A en_US
dc.publication.place Piscataway, N.J. en_US
dc.description.bibliographiccitations Includes bibliographical references en_US
dc.identifier.doi http://dx.doi.org/10.1109/ISCC.2016.7543851 en_US
dc.identifier.ctation Itani, M., Sharafeddine, S., & Elkabbani, I. (2016, June). Practical multiple node failure recovery in distributed storage systems. In Computers and Communication (ISCC), 2016 IEEE Symposium on (pp. 901-907). IEEE. en_US
dc.author.email sanaa.sharafeddine@lau.edu.lb en_US
dc.conference.date 27-30 June 2016 en_US
dc.conference.pages 901-907 en_US
dc.conference.place Messina, Italy en_US
dc.conference.title 2016 IEEE Symposium on Computers and Communication (ISCC) en_US
dc.identifier.tou http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php en_US
dc.identifier.url https://ieeexplore.ieee.org/abstract/document/7543851/ en_US
dc.orcid.id https://orcid.org/0000-0001-6548-1624 en_US
dc.publication.date 2016 en_US
dc.author.affiliation Lebanese American University en_US

