To understand the basic difference between “scale-up” and “scale-out” technologies, think of residential housing. You’ve got a lot of people who need shelter, and minimal constraints on space. A scale-up approach would be to make one building as tall as possible. However, the higher you build, the more physics and cost come into play. Soon, the economics of scaling elevation prove self-limiting, which is why fewer than a dozen buildings in the world are taller than 500 meters (1,640 feet). A scale-out approach emphasizes decentralization, spreading the population among many shorter buildings. Without geographic boundary limitations, the scale-out approach will likely prove more cost-efficient and more resilient against accident or attack.

Twenty years ago, scale-up approaches dominated IT. If you wanted more performance or capacity, you built a bigger server with faster chips and/or larger numbers of more capacious drives. This principle stemmed from Moore’s Law, which ingrained the idea that more of a resource (transistors, in the case of CPUs) meant more speed. As with skyscrapers, that idea ultimately ground to a halt before the realities of physics. Given modern fabrication technologies, transistors could only be packed so densely and run so fast before heat build-up became impossible to dissipate through practical means.
Such physical limitations forced Intel, AMD, and others to prioritize aggregate compute power over raw clock speed. Rather than running a single, ever-faster core per processor, CPUs integrated multiple interconnected cores at lower clock speeds, ultimately scaling performance by doing more work per unit of time and per watt. This is effectively scale-out core design within the confines of a CPU package.
A similar development path played out with disk storage. Flash-based SSDs effectively killed off 10K and 15K RPM hard drives, capping the per-drive performance of high-capacity disks. However, Seagate, WD, and Toshiba continue to improve areal density on drive platters, so capacity keeps increasing within the same physical space. Individual hard drives aren’t getting faster, but their capacity per square inch keeps growing. Concurrently, thanks to improving protocols and network fabrics (which allow scale-out solutions to flourish), decentralized disk storage continues to advance in total performance and energy efficiency.
Now, in the 2020s, most aspects of the IT world have embraced scale-out principles. Supercomputers are clusters of regular servers; when more performance is needed, you simply add more servers. In hyperscale storage, platforms such as Quobyte don’t require users to stack every proprietary appliance in adjacent racks. Rather, storage can scale out across commodity hardware into adjacent rooms, buildings, or even continents. Capacity, performance-per-watt efficiency, and solution-level TCO all improve.
Scale-Out Benefits
Once deployed, scale-out’s benefits quickly become undeniable. Unfortunately, old habits die hard, especially when capex costs have been amortized and IT budgets remain flat. Some organizations feel content to sit tight with their aging scale-up storage solutions. Many of these groups have legitimate reasons for their static strategies, but none of those reasons will dam the inexorable tide of data growth. Infrastructure built for a terabyte world will buckle in an era of petabytes. Only scale-out solutions will accommodate these rising data loads in a cost-efficient manner.
If data trends alone aren’t enough to persuade you that scale-up can’t meet future storage needs, consider:
- Cost. A scale-up strategy requires ever-faster hardware. In comparison, scale-out excels with cheaper, average-performance components that can be networked in massive quantities.
- Balance. In the triad of low cost, capacity, and speed, scale-up can give you capacity and speed, but only by sacrificing low cost. Scale-out excels at capacity and strikes a fair balance between speed and cost, one that improves as workloads increase.
- Ceiling. Scale-up is limited by hardware technologies and often runs up against hard limits, such as the 6 Gbit/s drive interface cap. Scale-out aggregates bandwidth across nodes for significantly higher headroom and the ability to grow over time (see the sketch after this list). Thus, the right software paired with the right architecture allows scale-out to grow practically without limit.
- Upgrading. Scale-up requires “forklift upgrades,” wherein large portions of the infrastructure must be replaced all at once. That usually means guessing in advance how much performance or capacity will be needed; guess wrong and you either overprovision at great expense or underprovision and face another costly, large installation later. Scale-out is inherently modular, allowing just enough storage and/or servers to be added when needed. Even if a purchase underprovisions, the cost of growing from that point is much lower.
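To make the ceiling point concrete, here is a minimal back-of-envelope sketch in Python. Every figure in it (per-drive throughput, controller and NIC limits, node counts) is a hypothetical round number chosen for illustration, not a benchmark of any particular product.

```python
# Illustrative back-of-envelope comparison of scale-up vs. scale-out bandwidth headroom.
# All figures are hypothetical round numbers, not benchmarks of any product.

DRIVE_REAL_GBPS = 2  # roughly what a single high-capacity HDD sustains (~250 MB/s)

def scale_up_ceiling(drives: int, controller_gbps: int) -> int:
    """A single array tops out at whatever its controller/interface can move."""
    return min(drives * DRIVE_REAL_GBPS, controller_gbps)

def scale_out_aggregate(nodes: int, drives_per_node: int, nic_gbps_per_node: int) -> int:
    """A cluster aggregates bandwidth: every added node brings its own NIC and drives."""
    per_node = min(drives_per_node * DRIVE_REAL_GBPS, nic_gbps_per_node)
    return nodes * per_node

if __name__ == "__main__":
    # One big array vs. sixteen (then seventeen) modest servers.
    print("scale-up ceiling    :", scale_up_ceiling(drives=60, controller_gbps=40), "Gbit/s")
    print("scale-out, 16 nodes :", scale_out_aggregate(16, 12, 25), "Gbit/s")
    print("scale-out, 17 nodes :", scale_out_aggregate(17, 12, 25), "Gbit/s")
    # Adding a node raises the cluster total; the array is already at its ceiling.
```

The array stays pinned at its controller’s limit no matter how many drives sit behind it, while each added node lifts the cluster’s aggregate, which is the “higher headroom” described above.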
Hitting the Right Scale-Out Notes
An effective scale-out storage system will prioritize and balance performance and capacity. Capacity alone may be fine for archival use, but it won’t suffice for most business applications. Conversely, going all-in on scale-up SSD arrays will deliver incredible performance, but costs will be stratospheric or, if capacity is no object, downright galactic. Scale-out offers the most cost-effective solution for non-real-time, hyperscale applications.
Additionally, with any form of IT scaling comes the question of rebuilds. Rebuilds tie up resources, so it follows that rebuild time should be kept to the barest minimum. Thus, rebuilds at scale should be “declustered,” meaning the whole cluster participates in the rebuild rather than concentrating the work, and the performance hit, on a single subgroup. Avoid narrow, fixed RAID, error-correction, or replication groups: when a component in one of those structures dies, the rebuild load falls entirely on that small subgroup, and the performance impact within it can be massive.
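As a rough illustration of why declustering matters, the sketch below models the time to re-create one failed drive’s worth of data under the simplifying assumption that rebuild speed scales with how many surviving drives participate. The capacity and per-drive throughput figures are hypothetical.

```python
# Simplified rebuild-time model: time to re-create one failed drive's data,
# assuming rebuild speed is limited by how many surviving drives can contribute.
# Per-drive figures are hypothetical round numbers.

DRIVE_CAPACITY_TB = 20
REBUILD_MBPS_PER_DRIVE = 50  # throughput each surviving drive can spare for rebuild work

def rebuild_hours(participating_drives: int) -> float:
    total_mb = DRIVE_CAPACITY_TB * 1_000_000               # TB -> MB (decimal units)
    rate_mbps = participating_drives * REBUILD_MBPS_PER_DRIVE
    return total_mb / rate_mbps / 3600

if __name__ == "__main__":
    print(f"8-drive RAID group     : {rebuild_hours(7):5.1f} h")    # only the group rebuilds
    print(f"declustered, 200 drives: {rebuild_hours(199):5.1f} h")  # whole cluster pitches in
```

Confined to a small group, the rebuild drags on for many hours and hammers those few drives; spread across the whole cluster, the same work finishes in a fraction of the time with a negligible per-drive burden.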
Similarly, you want a storage solution with a scalable access protocol. NFS was designed more than 30 years ago around a single-storage-server paradigm. Making NFS your primary access protocol severely limits scalability and introduces performance bottlenecks.
Quobyte is a distributed file system that delivers on all of the above advice and storage best practices. The platform combines your existing servers with flash or hard drives into a reliable scale-out storage system.