The explosive growth of data being generated and collected has rekindled interest in efficient means of storing it. Large data centers and distributed storage systems have become widespread and play an ever-increasing role in everyday computational tasks. While a data center should never lose data, disk failures occur daily, as industry statistics confirm. Methods and ideas from error-correcting codes developed in this project enable a storage system to provide better guarantees against data loss and to reduce the amount of data that must be moved to recover information lost to disk failures. A related goal of the project is to reduce the storage overhead needed to support the recovery procedures. These goals are accomplished by relying on algebraic methods of constructing the data encoding procedures as well as on novel algorithms for data exchange and recovery. Overall, the research performed in the course of this project contributes to the development of more efficient data management procedures in large-scale distributed storage systems.
This project puts forward new algebraic procedures for data encoding and recovery that enable one to achieve a tradeoff between storage overhead and repair bandwidth based on the concept of local recovery. The project studies both recovery from a single disk loss, which is the most frequent problem in such systems, and recovery from the failure of multiple disks; in coding terms, these correspond to correcting one erasure and multiple erasures in the encoded data. New bounds on the distance of codes with the locality requirement derived in this research are attained by new constructions of optimal locally recoverable codes equipped with simple recovery procedures. The project also addresses the problem of simultaneous recovery of data from multiple locations, enhancing data availability in large-scale distributed storage systems, which are a key backbone component of the 21st-century economy.
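The idea of local recovery can be illustrated with a small toy sketch. It is not the algebraic construction developed in the project: it assumes the simplest possible scheme, in which data blocks are split into groups of r blocks and each group stores one XOR parity block. The function names (xor_blocks, encode, repair) and the group size r are hypothetical choices made for this illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks, r):
    """Split the data blocks into groups of r and append one XOR parity to each group."""
    groups = []
    for start in range(0, len(data_blocks), r):
        group = list(data_blocks[start:start + r])
        groups.append(group + [xor_blocks(group)])
    return groups


def repair(group, lost_index):
    """Rebuild one lost block of a group from the other blocks of the same group.

    Returns the rebuilt block and the number of blocks that had to be read.
    """
    survivors = [block for j, block in enumerate(group) if j != lost_index]
    return xor_blocks(survivors), len(survivors)


# Four data blocks, local groups of size r = 2 (illustrative values).
data = [b"ab", b"cd", b"ef", b"gh"]
groups = encode(data, r=2)
rebuilt, blocks_read = repair(groups[0], lost_index=1)
```

In this toy setting, four data blocks are stored as six blocks, and a single lost block is rebuilt by reading two blocks from its own group instead of all four data blocks. That is the tradeoff between storage overhead and repair traffic in miniature. The sketch tolerates only one erasure per group; handling multiple erasures and attaining optimal distance requires stronger constructions of the kind the project studies.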