RAID, EC, Replication: Data Protection in Storage Systems

Data protection is the process of safeguarding important information from corruption, compromise or loss. Its importance increases as the amount of data created and stored continues to grow at unprecedented rates. Several techniques are commonly used to protect stored data, including RAID, erasure coding, and mirroring or replication, and each comes with its own tradeoffs.

What is RAID?

RAID stands for Redundant Array of Independent Disks. It combines multiple hard drives into one logical drive. Depending on how your RAID is configured (the RAID level), it can increase your computer's speed, give you a single drive with a much larger capacity, increase reliability, or provide some combination of these.

RAID works by placing data on multiple disks and allowing input/output (I/O) operations to overlap in a balanced way, improving performance. Using more disks also makes it more likely that one of them fails at any given time, which lowers the mean time between failures (MTBF) of the array as a whole. Storing data redundantly compensates for this and increases fault tolerance.
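
The placement idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "disks" are in-memory lists, the 4-byte chunk size is arbitrary, and the function names `stripe`, `unstripe`, and `mirror` are made up for this example. It shows round-robin striping (RAID 0 style), which spreads chunks so that I/O can overlap, and mirroring (RAID 1 style), which keeps full redundant copies.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Distribute fixed-size chunks across disks round-robin (RAID 0 style)."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading chunks back in round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


def mirror(data: bytes, copies: int = 2) -> list[bytes]:
    """Keep an identical full copy on each disk (RAID 1 style)."""
    return [bytes(data) for _ in range(copies)]


payload = b"ABCDEFGHIJKLMNOPQRST"
striped = stripe(payload, num_disks=3)   # no redundancy: losing one disk loses data
mirrored = mirror(payload)               # any single copy is enough to read the data
```

Striping alone adds no redundancy, which is why RAID levels aimed at fault tolerance combine it with mirroring or parity.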

What is Erasure Coding?

Erasure coding is a form of encoding that works by splitting a unit of data, such as a file or object, into multiple fragments (data blocks) and then creating additional fragments (parity blocks) that can be used for data recovery. In the event of a failure, the parity fragments can be used to rebuild the data unit without data loss, as long as the number of lost fragments does not exceed the number of parity fragments. Read more about Erasure Coding in this blog post by the Quobyte CTO.
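
As a rough sketch of the idea, the simplest possible erasure code uses a single XOR parity fragment, which can rebuild any one lost fragment. Production systems typically use codes such as Reed-Solomon with several parity fragments; the scheme below is a deliberately reduced stand-in, and the names `encode` and `rebuild` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size data fragments plus one XOR parity fragment."""
    frag_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(frag_len * k, b"\0")      # zero-pad the last fragment
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def rebuild(fragments: list, missing: int) -> bytes:
    """Recover the single fragment at index `missing` by XOR-ing the survivors."""
    survivors = [f for i, f in enumerate(fragments) if i != missing]
    return reduce(xor_bytes, survivors)


original = b"erasure coding demo!"
stored = encode(original, k=4)        # 4 data fragments + 1 parity fragment

lost_index = 2
damaged = stored.copy()
damaged[lost_index] = None            # simulate a failed disk or node
recovered = rebuild(damaged, lost_index)
```

With four data fragments and one parity fragment, the storage overhead here is 25 percent, compared with 100 percent for keeping a second full copy.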

What is Data Mirroring and Replication?

Data mirroring refers to keeping a backup database server that holds a copy of the data on a master database server. If the master database goes down for any reason, the mirror database can be used in its place.

Data replication, by contrast, is the process by which data residing on one or more physical or virtual servers or cloud instances (the primary instance) is continuously copied to one or more secondary servers or cloud instances (the standby instance). Organizations replicate data to support high availability, backup, and/or disaster recovery. Depending on the location of the secondary instance, data is replicated either synchronously, where a write is acknowledged only after the secondary has received it, or asynchronously, where the write is acknowledged first and copied to the secondary afterwards. Data can also be spread across multiple geographic locations (referred to as geo-replication) so that users can fetch a file from the nearest location and avoid network delays and slow responses.
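
The difference between the two replication modes can be illustrated with a toy model. Everything here is an assumption made for the sketch: the `Primary` and `Replica` classes are hypothetical, the stores are plain dictionaries, and no real network or failure handling is involved.

```python
from collections import deque


class Replica:
    """Standby instance holding a copy of the primary's data."""

    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value


class Primary:
    """Primary instance that forwards writes to one replica."""

    def __init__(self, replica, synchronous):
        self.store = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()          # writes not yet shipped (async mode only)

    def write(self, key, value):
        self.store[key] = value
        if self.synchronous:
            # Synchronous: the replica has the write before we acknowledge.
            self.replica.apply(key, value)
        else:
            # Asynchronous: acknowledge now, ship the write later.
            self.pending.append((key, value))
        return "ack"

    def drain(self):
        """Ship all queued writes to the replica (async mode)."""
        while self.pending:
            self.replica.apply(*self.pending.popleft())


sync_replica = Replica()
sync_primary = Primary(sync_replica, synchronous=True)
sync_primary.write("a", 1)

async_replica = Replica()
async_primary = Primary(async_replica, synchronous=False)
async_primary.write("a", 1)
lagging = "a" not in async_replica.store   # replica is behind until drain() runs
async_primary.drain()
```

In the asynchronous case, any writes still sitting in `pending` when the primary fails would be missing from the replica, which is the price paid for lower write latency.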

You can read more about the tradeoffs of synchronous and asynchronous replication and mirroring in this blog post.

With all this to consider, you need a storage solution that keeps your data safe and a storage provider that understands the challenges that come with that. Quobyte has data protection built in from the ground up: it protects data with checksums as it flows across your systems and adds layers of redundancy to data at rest with replication and erasure coding. So your data is safe, and you can rest easy.
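
To make the role of checksums concrete, here is a generic illustration of checksum-based corruption detection. It is not Quobyte's actual mechanism: the use of CRC32 and the helper names `store_block` and `verify_block` are assumptions chosen for the example.

```python
import zlib


def store_block(data: bytes) -> tuple[bytes, int]:
    """Return the block together with its CRC32 checksum."""
    return data, zlib.crc32(data)


def verify_block(data: bytes, checksum: int) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    return zlib.crc32(data) == checksum


block, crc = store_block(b"important payload")

# Flip one bit in the first byte to simulate corruption in transit or at rest.
corrupted = bytes([block[0] ^ 0x01]) + block[1:]

intact_ok = verify_block(block, crc)
corrupted_ok = verify_block(corrupted, crc)
```

A checksum only detects the damage. Repairing it relies on the redundancy described earlier, such as a replica or parity fragments.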

