File storage and object storage are two ways to store and share large amounts of data. Both have benefits and drawbacks, so how do they differ? The main differences between the two are the access protocol, performance, scalability, and the consistency guarantees they offer. But before we dive deeper into the differences and why they matter, we first need to understand what file storage and object storage really are.
File storage, sometimes referred to as NAS (network attached storage), is exactly what you may think it is. Much like how you store a file on your computer, file storage is where data is stored in a hierarchical folder structure. You can move files and folders around and also set access permissions on a folder for everything inside. In contrast, objects exist in a flat namespace. There are no folders and also no hierarchical access control. Let's look at the major differences between the two storage types:
File systems are typically accessed via NFS (Network File System) or other, more efficient binary protocols optimized for low latency and binary data transfer. In contrast, object storage is accessed via the Hypertext Transfer Protocol (HTTP), making it very easy to access objects from a wide range of applications, including web browsers. A browser can fetch objects, e.g., downloads or images, directly from an object store. To access a file on NFS or another file system over HTTP, you need a web server to serve the data on the file system's behalf.
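To make the web-server step concrete, here is a minimal, self-contained Python sketch using only the standard library: a temporary local directory stands in for the file system, and http.server plays the role of the web server that makes its files reachable over HTTP, the way an object store would be natively.

```python
import functools
import http.server
import os
import tempfile
import threading
import urllib.request

# A directory with one file stands in for the "file system".
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "report.txt"), "w") as f:
    f.write("hello from the file system")

# Object stores speak HTTP natively; a file system needs a web server
# in front of it. http.server plays that role for the local directory.
handler = functools.partial(
    http.server.SimpleHTTPRequestHandler, directory=tmpdir
)
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Now the file is reachable via a plain HTTP GET, browser-style.
port = server.server_address[1]
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/report.txt").read()
print(body.decode())  # hello from the file system
server.shutdown()
```

With an object store, the HTTP GET above would go straight to the storage system; here it only works because we put a server process in between.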
HTTP is a text-based protocol, making it slower and more expensive to process than binary file system protocols. So while object storage is great for easy access, it is not well suited to high-performance, low-latency applications. A truly unified storage system should allow you to access your files/objects via both protocols at the same time. That way, applications and users can choose the tradeoff that makes sense for them.
The design objectives for object storage were mainly cost, scalability, and ease of access - including access from web browsers. It's not surprising, then, that object storage is not the right choice for latency-sensitive applications or for workloads with millions of tiny objects, where access latency dominates. In contrast, modern file systems have been designed to take advantage of the performance and low latency of flash media.
The design goal for object storage was (and still is in many cases) the cost-effective storage of large amounts of data and not performance. For that reason, it is sometimes referred to as "cheap & deep" storage.
Another design goal for object storage was scalability; that is why object storage has a flat namespace, whereas POSIX file systems have a hierarchical folder structure. Modern distributed file systems, however, have overcome the scalability issues of the hierarchical structure. Similarly, Amazon added prefix and delimiter listings to the S3 API to mimic file system hierarchies. As it turns out, humans prefer folder structures.
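S3-style listings approximate folders by grouping flat keys on a delimiter, while the namespace itself stays flat. The following sketch (plain Python, not a real S3 client) emulates how such a delimiter listing splits keys into objects and "common prefixes" that look like folders:

```python
# Object keys live in a flat namespace; "folders" are an illusion
# created by listing with a prefix and a delimiter, as S3 does.
keys = [
    "photos/2023/beach.jpg",
    "photos/2023/city.jpg",
    "photos/2024/hike.jpg",
    "notes.txt",
]

def list_objects(keys, prefix="", delimiter="/"):
    """Group flat keys into (objects, common_prefixes), S3-style."""
    objects, common = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter reads like a folder.
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common)

print(list_objects(keys))                    # (['notes.txt'], ['photos/'])
print(list_objects(keys, prefix="photos/"))  # ([], ['photos/2023/', 'photos/2024/'])
```

Note that nothing hierarchical is stored: "descending" into a folder is just listing again with a longer prefix.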
Today, there is hardly any difference in scalability between an object storage system and a modern distributed file system.
One of the most significant differences between files and objects is the consistency guarantees each system provides. File systems are required to implement strong consistency to be POSIX compatible, even when they are distributed. Strong consistency means that whenever you write to a file and then read from that file, you get the data from the latest write. This model is very intuitive for users and application developers. Most Linux applications, such as databases and virtual machines (VMs), require this behavior.
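Read-after-write on a POSIX file system can be demonstrated in a few lines of Python against a local file: every read that follows a completed write observes that write, never an older version.

```python
import os
import tempfile

# Strong (read-after-write) consistency on a local POSIX file:
# a read that follows a completed write must see that write.
path = os.path.join(tempfile.mkdtemp(), "data.txt")

with open(path, "w") as f:
    f.write("version 1")
with open(path) as f:
    assert f.read() == "version 1"  # always the latest write

with open(path, "w") as f:
    f.write("version 2")
with open(path) as f:
    print(f.read())  # version 2 - never the stale "version 1"
```

Databases and VMs rely on exactly this guarantee; their on-disk state would corrupt if a read could return an older version of a block.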
Object stores were designed by Amazon for a very different use case. To enable scalability and cost-effectiveness, Amazon opted for eventual consistency. This means that eventually - and there is no time limit on what "eventually" means - a read will return the latest version of your write. In the meantime, however, a read might return any previous write. Applications and users have to be able to tolerate these stale reads. Unless your use case is archival or write-once, developing software that can tolerate eventual consistency and the resulting stale reads is a major challenge.
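To see why stale reads are hard on application code, consider this toy class (a hypothetical sketch, not a real object-store client): for a few reads after a write, get() may still return the previous version, so any caller has to be written to tolerate that.

```python
import itertools

class EventualStore:
    """Toy eventually consistent store: reads may lag behind writes."""

    def __init__(self, stale_reads=2):
        self.history = []
        self.stale_reads = stale_reads
        self._reads_since_write = itertools.count()

    def put(self, value):
        self.history.append(value)
        self._reads_since_write = itertools.count()

    def get(self):
        # The first few reads after a write may see an older version -
        # this is the "stale read" an application must tolerate.
        if (next(self._reads_since_write) < self.stale_reads
                and len(self.history) > 1):
            return self.history[-2]
        return self.history[-1]

store = EventualStore()
store.put("v1")
store.put("v2")
reads = [store.get() for _ in range(4)]
print(reads)  # ['v1', 'v1', 'v2', 'v2'] - stale first, then converged
```

A strongly consistent file system never produces the first two reads; under eventual consistency, every reader needs retry or reconciliation logic to cope with them.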
Consistency is also why it's easy to offer an object storage interface on top of a file system: when you relax consistency, i.e., offer weaker guarantees than the underlying system provides, you merely "take something away." The other way around is a very different story, as you would have to build strong consistency on top of an object store that only gives you much weaker guarantees.
Quobyte is a distributed parallel file system with unified storage access: A file can be an object as well as a file. You can access the same data through all interfaces at the same time (S3, file, Hadoop, etc.). Learn more about using Quobyte as an object store or how to combine flash and hard drives for optimal cost.