As defined by Gartner, scalability is the measure of a system’s ability to increase or decrease in performance and cost in response to changes in application and system processing demands. When choosing a storage solution, considering scalability is essential, as prioritizing it from the start leads to lower maintenance costs, a better user experience, and higher agility. But before you can make a decision, you first need to understand the different ways you can scale your solution.
Scale-out refers to the ability of a system to scale certain dimensions when you add more components. In a storage or file system (sometimes also called scale-out NAS), these components are drives (hard disk, NVMe) and servers. The more interesting part is which dimensions of a storage system scale when you add more components:
A proper scale-out file system should scale in all of these dimensions. If you can only scale capacity, for example, as with many storage appliances, you will often run out of performance for the applications that need to access the growing amount of storage. Most use cases and applications grow performance and capacity in lockstep; archival storage is one of the few exceptions.
Another important aspect of a scale-out system is how far it can actually scale. All distributed systems have limits on the number of servers and/or drives they can have. Good systems have limits in the thousands or even tens of thousands of servers, so for them the limit is more of a theoretical issue.
Others, especially those where scale-out was added later, have much lower limits, such as 16 servers. Sixteen servers might be enough for you today, but are you ready to move to a new system when you need a 17th? Or even worse, to start a new cluster that is completely independent?
Similarly, it's important to look for practical scalability limits, which might be much lower than the theoretical limit. One example is file or storage systems that rely on so-called "consistent hashing" to determine the location of data. Whenever the storage cluster changes (outage, new or removed server), data needs to be moved. The more servers you have, the more outages or failures you'll see, and the data movement they trigger causes clusters to become unstable, which results in higher latencies and partial unavailability.
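To make the data movement concrete, here is a minimal sketch of a consistent-hash ring and of how many keys change location when one server joins. It is a toy model, not the placement logic of any particular product: the names `HashRing`, `locate`, and `moved_fraction`, the `server-N` and `block-N` labels, and the choice of 100 virtual nodes per server are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, vnodes=100):
        # Each server is placed on the ring at `vnodes` pseudo-random points.
        self._points = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def locate(self, key: str) -> str:
        """Return the server owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._keys, _hash(key)) % len(self._keys)
        return self._points[idx][1]


def moved_fraction(old_servers, new_servers, num_keys=10_000):
    """Fraction of sample keys whose owning server differs between two memberships."""
    old_ring, new_ring = HashRing(old_servers), HashRing(new_servers)
    keys = [f"block-{i}" for i in range(num_keys)]
    moved = sum(old_ring.locate(k) != new_ring.locate(k) for k in keys)
    return moved / num_keys


if __name__ == "__main__":
    servers = [f"server-{i}" for i in range(16)]
    # Going from 16 to 17 servers: every key that moves has to be copied.
    print(moved_fraction(servers, servers + ["server-16"]))
```

In this model a single membership change relocates only a share of the keys, but every change relocates some. The more often servers join, leave, or fail, the more of the cluster's time goes into moving data rather than serving it.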
Linear scaling is the term or feature to look out for. It means that when you double the resources, you also double the performance. This has to hold at any size: you double your performance when you go from 4 to 8 servers, and again when you go from 100 to 200. There are no diminishing returns on performance as the system scales.
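One way to check a linear-scaling claim against your own benchmark results is to compare the speedup you observe with the speedup you would expect from the added servers. The helper below is a minimal sketch; the function name `scaling_efficiency` and the throughput figures in the usage lines are made up for illustration, not measurements of any system.

```python
def scaling_efficiency(nodes_before, perf_before, nodes_after, perf_after):
    """Ratio of observed speedup to ideal (linear) speedup.

    1.0 means perfectly linear scaling; values below 1.0 indicate
    diminishing returns. Performance can be in any unit (GB/s, IOPS),
    as long as both measurements use the same one.
    """
    ideal_speedup = nodes_after / nodes_before
    observed_speedup = perf_after / perf_before
    return observed_speedup / ideal_speedup


if __name__ == "__main__":
    # Hypothetical numbers: 4 -> 8 servers, throughput doubles (linear).
    print(scaling_efficiency(4, 10.0, 8, 20.0))
    # Hypothetical numbers: 100 -> 200 servers, throughput grows by only 1.6x.
    print(scaling_efficiency(100, 250.0, 200, 400.0))
```

Running the same comparison at a small and a large cluster size shows whether a system keeps its efficiency as it grows or only scales well in the small.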
The file system itself should scale linearly, but so should the access layer. If you use a protocol that doesn't have native support for parallel I/O and load balancing, like NFS, your access layer and gateway nodes will quickly become a bottleneck. What use is a scalable storage system when clients just cause congestion at the NFS gateways?
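The gateway bottleneck can be illustrated with a deliberately simple model in which the throughput clients actually get is capped by the slowest stage in the path. This is a back-of-the-envelope sketch, not a performance model of NFS or of any product; the function name `deliverable_throughput` and all figures are assumptions, with throughput in GB/s.

```python
def deliverable_throughput(client_demand, storage_nodes, per_node,
                           gateways=None, per_gateway=0.0):
    """Throughput that reaches clients, limited by the slowest stage.

    With `gateways=None`, clients talk to the storage nodes directly.
    Otherwise all traffic also has to pass through the gateway nodes.
    """
    limits = [client_demand, storage_nodes * per_node]
    if gateways is not None:
        limits.append(gateways * per_gateway)
    return min(limits)


if __name__ == "__main__":
    # Hypothetical cluster: 20 storage nodes at 2 GB/s each, clients want 100 GB/s.
    print(deliverable_throughput(100.0, 20, 2.0))
    # Same cluster behind 2 gateways at 5 GB/s each.
    print(deliverable_throughput(100.0, 20, 2.0, gateways=2, per_gateway=5.0))
    # Doubling the storage nodes does not help while the gateways stay the same.
    print(deliverable_throughput(100.0, 40, 2.0, gateways=2, per_gateway=5.0))
```

In this model, adding storage servers only raises what clients see if the access layer grows with them.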
Finally, the question is how the resources that you add benefit the file systems and users on the system. Ideally, the new resources should increase the performance of all file systems (sometimes called exports or shares) on the storage system. However, if resources are pre-allocated to a specific file system, your ability to scale is not uniform. This is a big issue because you then have to add significantly more resources to scale all file systems.
The lack of thin provisioning or over-subscription, or a requirement to pre-allocate storage resources (drives, groups, or servers) to a single file system, are all warning signs that the file system will not allow you to scale out uniformly. Most block-based distributed file systems lack thin provisioning and have very static resource allocations, making uniform scale-out a complex, manual task or even impossible.
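The cost of non-uniform scaling can be sketched with a small calculation: how many servers does it take for every file system to be able to reach a given peak throughput? The model below assumes that file systems do not all peak at the same time (that is, the shared pool is over-subscribed) and that throughput grows linearly with servers. The function name `servers_needed` and the numbers are hypothetical.

```python
import math


def servers_needed(target_peak, per_server, file_systems, preallocated):
    """Servers required so that each file system can reach `target_peak`.

    Shared pool: any file system can use all servers, so one pool
    sized for the peak is enough (assuming peaks do not coincide).
    Pre-allocated: each file system needs its own dedicated servers.
    """
    per_file_system = math.ceil(target_peak / per_server)
    return per_file_system * file_systems if preallocated else per_file_system


if __name__ == "__main__":
    # Hypothetical: 5 file systems, each should be able to peak at 20 GB/s,
    # with servers that deliver 2 GB/s each.
    print(servers_needed(20.0, 2.0, 5, preallocated=False))
    print(servers_needed(20.0, 2.0, 5, preallocated=True))
```

Under these assumptions the pre-allocated layout needs as many times more servers as there are file systems, which is the "significantly more resources" effect described above.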
Quobyte is a distributed scale-out file system that scales performance and capacity linearly and uniformly with the number of resources.