File storage and object storage are two ways to store and share large amounts of data. Both have benefits and drawbacks, so what is the difference? The main differences between the two are the access protocol, performance, scalability, and the consistency guarantees they offer. But before we dive deeper into the differences and why they matter, we first need to understand what file storage and object storage really are.
File storage, sometimes referred to as NAS (network-attached storage), is exactly what you may think it is. Much like how you store a file on your computer, file storage keeps data in a hierarchical folder structure. You can move files and folders around and set access permissions on a folder that apply to everything inside. In contrast, objects live in a flat namespace: there are no folders and no hierarchical access control. Let's look at the major differences between the two storage types:
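To make the namespace difference concrete, here is a small Python sketch (the paths and keys are made up for illustration). In a file system, every parent folder is a real entity you can rename or permission as a unit; in an object store, the "/" inside a key is just a character, so "renaming a folder" means rewriting every key that shares the prefix:

```python
from pathlib import PurePosixPath

# File storage: a real hierarchy; every parent folder exists as an entity
# you can move or set permissions on.
file_path = PurePosixPath("/mnt/nas/photos/2023/beach.jpg")
parents = [str(p) for p in file_path.parents]

# Object storage: one flat namespace of keys; "/" is just a character.
objects = {
    "photos/2023/beach.jpg": b"...",
    "photos/2023/sunset.jpg": b"...",
}
# There is no folder object "photos/2023/" to rename in one step; instead,
# every key sharing the prefix has to be rewritten.
renamed = {k.replace("photos/2023/", "archive/2023/", 1): v
           for k, v in objects.items()}
print(parents)
print(sorted(renamed))
```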
File systems are often accessed via NFS (Network File System) or other, more efficient binary protocols that are optimized for low latency and binary data transfer. In contrast, object storage is accessed via the Hypertext Transfer Protocol (HTTP), which makes it very easy to access objects from a range of applications, including web browsers. A browser can fetch objects, such as downloads or images, directly from an object store. To access a file on NFS or another file system over HTTP, you need a web server to serve the data on the file system's behalf.
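The two access paths can be sketched with Python's standard library alone. The temporary directory stands in for an NFS mount, and a throwaway local HTTP server stands in for an object store endpoint (file name, contents, and addresses are all made up):

```python
import http.server
import os
import tempfile
import threading
import urllib.request
from functools import partial

# "File storage": read through the file system, as you would on an NFS mount.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "report.pdf")
with open(path, "wb") as f:
    f.write(b"hello storage")
with open(path, "rb") as f:
    via_filesystem = f.read()

# "Object storage": the same bytes fetched over HTTP, like GET /bucket/key.
handler = partial(http.server.SimpleHTTPRequestHandler, directory=tmpdir)
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/report.pdf"
via_http = urllib.request.urlopen(url).read()
server.shutdown()

assert via_filesystem == via_http  # same data, two protocols
```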
HTTP is a text-based protocol, which makes it slower and more expensive to process than binary file system protocols. So while object storage is great for easy access, it is not well suited to high-performance, low-latency applications. A truly unified storage system should let you access your files/objects via both protocols at the same time, so that applications and users can choose the tradeoff that makes sense for them.
The design objectives for object storage were mainly cost, scalability, and ease of access, including from web browsers. It's not surprising that object storage is not the right choice for applications that are latency sensitive or that work with millions of tiny objects (where access latency matters). In contrast, modern file systems have been designed to take advantage of the performance and low latency of flash media.
Since the design goal for object storage was (and still is in many cases) the cost effective storage of large amounts of data, and not performance, it is sometimes referred to as "cheap & deep" storage.
One design goal for object storage was scalability. This is why object storage has a flat namespace where (POSIX) file systems have a hierarchical folder structure. Modern distributed file systems, however, have overcome the scalability issues of the hierarchical structure. Similarly, Amazon added prefix and delimiter options to S3's listing API to mimic file system hierarchies. As it turns out, humans prefer folder structures.
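The folder illusion comes from listing with a prefix and a delimiter: keys that share a longer prefix are collapsed into "common prefixes" that clients render as folders. The toy function below is my own simplified sketch of that idea, not Amazon's implementation, and the keys are invented:

```python
def list_objects(keys, prefix="", delimiter="/"):
    """Group flat keys into direct objects and folder-like common prefixes."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter becomes a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)

keys = [
    "photos/2023/beach.jpg",
    "photos/2023/sunset.jpg",
    "photos/readme.txt",
    "docs/plan.md",
]
objs, folders = list_objects(keys, prefix="photos/")
print(objs)     # ['photos/readme.txt']
print(folders)  # ['photos/2023/']
```

The namespace stays flat; only the listing operation pretends otherwise.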
Today, there is hardly any difference in scalability between an object storage system and a modern distributed file system.
One of the biggest differences between files and objects is the consistency guarantees each system gives you. File systems are required to implement strong consistency to be POSIX compatible, even when they are distributed. Strong consistency means that when you write to a file and then read from it, you get the data from the latest write. This model is very intuitive for users and application developers, and most Linux applications, such as databases and virtual machines (VMs), require this behavior.
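Read-after-write on a POSIX file system looks like this (the file is a temp file standing in for any application data):

```python
import tempfile

# Create a file and write an initial version.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    f.write(b"version 1")

# Overwrite, then read back through the same file system.
with open(path, "r+b") as f:
    f.write(b"version 2")  # same length, full in-place overwrite
    f.flush()
    f.seek(0)
    data = f.read()

# Strong consistency: a read after a completed write always sees that
# write, never the stale "version 1".
assert data == b"version 2"
```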
Object stores were designed by Amazon for a very different use case. To achieve scalability and cost effectiveness, they opted for eventual consistency. This means that eventually, and there is no time limit on what "eventually" means, a read will return the latest version of your write. In the meantime, however, you might get any previous write back. Applications and users have to be able to tolerate these stale reads. Unless your use case is archival or write-once, it is a major challenge to develop software that can tolerate eventual consistency and the resulting stale reads.
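A toy in-memory model (not a real object store client) shows why stale reads are hard to program against. A write lands on one replica immediately and reaches the others only when replication runs, while a read may hit any replica:

```python
import random

class EventuallyConsistentStore:
    """Toy model: one primary dict plus lagging replica dicts."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]

    def put(self, key, value):
        # The write lands on the first replica immediately...
        self.replicas[0][key] = value

    def replicate(self):
        # ...and reaches the others only "eventually".
        for r in self.replicas[1:]:
            r.update(self.replicas[0])

    def get(self, key):
        # A read may be served by any replica, so it can be stale.
        return random.choice(self.replicas).get(key)

store = EventuallyConsistentStore()
store.put("invoice.pdf", b"v1")
store.replicate()
store.put("invoice.pdf", b"v2")

# Until replication catches up, reads can return either version.
reads = {store.get("invoice.pdf") for _ in range(100)}

store.replicate()
latest = store.get("invoice.pdf")  # now every replica agrees
```

An application has to cope with the fact that `reads` may contain both `b"v1"` and `b"v2"` during the inconsistency window.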
Consistency is also the reason why it's easy to offer an object storage interface on top of a file system: when you relax consistency (i.e., offer less consistent storage), you are simply "taking something away". The other way around is a very different story, as you would have to build strong consistency on top of an object store that gives you much weaker guarantees.
Quobyte is a distributed parallel file system with unified storage access: a file is an object is a file. You can access the same data through all interfaces at the same time (S3, file, Hadoop, etc.). Learn more about how to use Quobyte as an object store or how to combine flash and hard drives for optimal cost.