When considering how to build out your storage architecture, you need to consider both cost and performance, but most importantly you need to know what type of enterprise storage to use to house your data. Enterprise storage systems are divided into three categories today: SAN, NAS and Object. Like most storage systems, each one has both advantages and disadvantages. So how do you know which one is right for you?
A storage area network, otherwise known as SAN (or block storage), provides block-based storage that is accessible over a network. SAN uses the same abstraction as a hard drive, where blocks of data can be read or written at a specific location, which is why it's called block-based storage. Most applications require a file system on top to organize the data stored on the block storage. The exceptions are a few databases and virtual machines that consume block storage directly. Unlike a direct-attached hard drive (DAS), a SAN is accessed over the network. The protocols used are Fibre Channel, iSCSI over Ethernet or NVMe over Fabrics.
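To make the block abstraction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: a regular file stands in for a SAN volume, and the names `BlockDevice`, `read_block`, `write_block` and the 4096-byte block size are hypothetical choices, not part of any SAN protocol. The point is that the interface knows only block addresses, not files or directories.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size in bytes


class BlockDevice:
    """Toy block device backed by a regular file (stand-in for a SAN volume)."""

    def __init__(self, path, num_blocks):
        self.num_blocks = num_blocks
        # Pre-allocate the backing file so every block address is valid.
        with open(path, "wb") as f:
            f.truncate(num_blocks * BLOCK_SIZE)
        self._f = open(path, "r+b")

    def _check(self, lba):
        if not 0 <= lba < self.num_blocks:
            raise IndexError("block address out of range")

    def write_block(self, lba, data):
        # Writes always cover exactly one block at a given block address.
        self._check(lba)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self._f.seek(lba * BLOCK_SIZE)
        self._f.write(data)
        self._f.flush()

    def read_block(self, lba):
        self._check(lba)
        self._f.seek(lba * BLOCK_SIZE)
        return self._f.read(BLOCK_SIZE)

    def close(self):
        self._f.close()


def demo():
    with tempfile.TemporaryDirectory() as d:
        dev = BlockDevice(os.path.join(d, "volume.img"), num_blocks=8)
        payload = b"hello".ljust(BLOCK_SIZE, b"\0")
        dev.write_block(3, payload)
        first_bytes = dev.read_block(3)[:5]
        dev.close()
        return first_bytes
```

Everything above the block level (names, directories, permissions) is missing here, which is exactly what a file system on top has to provide.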
The major downside of SAN is that you need a file system on top that then needs to be exported to storage consumers. One or multiple head nodes export the file system and often introduce a significant performance bottleneck, e.g. when exporting the file system via NFS.
NAS stands for network attached storage, but in practice it means file system storage that is accessed over a network. The big advantage of NAS storage is that the file system can be used directly by applications and users. There’s no extra layer needed. However, the term NAS doesn't say anything about the storage architecture behind the file system that you see.
For example, a simple Linux server exporting local storage via NFS is considered NAS. This kind of NAS storage, however, is monolithic and can only be scaled up (i.e. add more drives into a single box). SOHO (small office home office) NAS boxes or so-called filers (monolithic enterprise NAS appliances) are other examples of this. In contrast, a scale-out NAS system has an architecture where you can add more servers or boxes to increase capacity, and ideally also performance.
The second aspect of NAS storage is the primary access protocol. Many NAS systems use NFS - a protocol that is approaching its 30th birthday - that was designed for clients accessing a single server. NFS is dated and has severe limitations in terms of performance, security and fault tolerance. Many true scale-out systems rely on a native protocol to enable parallel I/O and avoid NFS bottlenecks. That is why a primary protocol other than NFS is a good indicator of a true scale-out NAS.
Now enter the new kid on the block - object storage. Object storage, also known as object-based storage, is the third category of enterprise storage. It is a data storage strategy that divides data into distinct units, or objects, each of which is stored in a single repository along with all relevant metadata and a unique identifier. Since Amazon popularized it as cost-effective storage for large amounts of data, the Amazon product name S3 is often used synonymously with the term object storage.
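The idea of an object as data plus metadata plus an identifier can be sketched in a few lines of Python. This is a toy in-memory model, not the API of S3 or any real product; `ObjectStore`, `StoredObject`, `put` and `get` are hypothetical names chosen for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str                      # identifier of the object
    data: bytes                   # the payload itself
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy object store: a flat mapping from identifier to object."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # Objects are written as a whole; metadata travels with the data.
        meta = dict(metadata)
        meta["size"] = len(data)
        meta["sha256"] = hashlib.sha256(data).hexdigest()
        self._objects[key] = StoredObject(key, data, meta)

    def get(self, key):
        return self._objects[key]


store = ObjectStore()
store.put("photos/cat.jpg", b"abc", content_type="image/jpeg")
obj = store.get("photos/cat.jpg")
```

Note that the slash in `photos/cat.jpg` is just part of the identifier in this sketch. There is no directory tree, only a flat namespace of keys.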
There are two main differences between object storage and both SAN and NAS. The first is consistency. Both SAN and NAS provide very strong consistency, so when you write to a file or block, you have the guarantee that the next read will return the latest data you wrote to the file or block. This consistency model is very intuitive and most applications rely on it. Object storage, however, typically has very relaxed consistency guarantees (also called eventual consistency), which effectively means that you have no guarantees. Your read might return any value that was previously written to the object, so applications have to be able to cope with this. As a result, object storage is mostly used for write-once data or archival only.
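The difference between the two consistency models can be illustrated with a small, deterministic Python simulation. It is a sketch under simplifying assumptions, not how any particular object store is implemented: writes land on one replica first, and the hypothetical `sync()` method stands in for background replication that would normally happen at an unpredictable time.

```python
class EventuallyConsistentStore:
    """Toy store whose replicas catch up only when sync() runs."""

    def __init__(self, num_replicas=3):
        self._replicas = [dict() for _ in range(num_replicas)]
        self._pending = []  # queued (replica_index, key, value) updates

    def put(self, key, value):
        # The write is acknowledged after reaching replica 0 only.
        self._replicas[0][key] = value
        for i in range(1, len(self._replicas)):
            self._pending.append((i, key, value))

    def get(self, key, replica):
        # A read may be served by any replica, including a stale one.
        return self._replicas[replica].get(key)

    def sync(self):
        # Stand-in for asynchronous replication finishing.
        for i, key, value in self._pending:
            self._replicas[i][key] = value
        self._pending.clear()


store = EventuallyConsistentStore()
store.put("report", "v1")
store.sync()
store.put("report", "v2")
stale = store.get("report", replica=1)   # replication has not run yet
store.sync()
fresh = store.get("report", replica=1)   # all replicas have converged
```

In this sketch `stale` is the older version even though the newer write was already acknowledged, which is the behavior applications on an eventually consistent store have to tolerate. A strongly consistent system would never return the older version after the write completed.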
The second major difference is protocol. Object storage is accessed via HTTP - the same protocol your web browser used to request the page you are reading right now. This makes it easy to access object storage from a variety of applications. However, HTTP was never designed for speed or efficiency, whereas SAN and NAS protocols are all about performance.
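To show what HTTP access looks like in practice, the following Python sketch runs a tiny in-process HTTP server that stores objects in memory and then writes and reads one object with plain PUT and GET requests. It uses only the standard library. The handler, the `roundtrip` helper and the URL layout are assumptions made for illustration; real object stores add authentication, buckets and many more operations.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

OBJECTS = {}  # in-memory stand-in for the storage backend


class ObjectHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        # The request path is the object key, the body is the object data.
        length = int(self.headers.get("Content-Length", 0))
        OBJECTS[self.path] = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = OBJECTS.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def roundtrip(key, data):
    """PUT an object, GET it back, return (put_status, returned_bytes)."""
    server = HTTPServer(("127.0.0.1", 0), ObjectHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Bypass any system proxy settings for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        url = f"http://127.0.0.1:{server.server_port}/{key}"
        put = urllib.request.Request(url, data=data, method="PUT")
        with opener.open(put) as resp:
            status = resp.status
        with opener.open(url) as resp:
            return status, resp.read()
    finally:
        server.shutdown()
        server.server_close()
```

Calling `roundtrip("bucket/photo.jpg", b"pixels")` stores and retrieves the object with nothing more than an HTTP client, which is why so many applications can talk to object storage easily. Each request also carries the full overhead of HTTP headers and request handling, which hints at the efficiency cost mentioned above.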
A performance tier determines the IOPS (input/output operations per second) and throughput that your storage delivers. Often, the terms SAN, NAS and object storage are misused to describe different performance and cost tiers of storage. SAN is used to describe expensive low-latency storage, NAS the general purpose, mid-performance tier, and object storage is synonymous with "cheap and deep". Unfortunately, this is largely based on historic attributes of the storage types. Today, NAS storage can be as low latency as SAN and as scalable as object storage.
So do these categories still matter? Honestly, no. The challenge in enterprise storage today is scale-out workloads like big data analytics (Hadoop, Spark, etc.), machine learning, image analysis and 3D rendering, just to name a few. These workloads require an unprecedented amount of storage capacity and also scalable performance. Rather than the access protocol, the question today is whether your storage system can scale out performance and capacity without bottlenecks, and on demand, to match the needs of your users and applications.
Similarly, the differentiation through access protocols has become less important, as good storage systems let your users access the same data through a range of protocols. An example of this is a Hadoop cluster, which should be able to access the same data as your machine learning applications or workstations.
But there are better ways to determine the right storage system for you. For example, you should always consider these three main things:
There’s no official right or wrong answer when choosing your enterprise storage solution; there is just the right answer for you and your needs. Quobyte gives you scale-out for modern workloads and supports multiple access protocols to serve a broad range of applications, while letting you combine flash and HDD to balance performance and cost.
Learn more about Quobyte - a distributed scale-out file system that runs on commodity servers with flash and hard disks.