The cloud has changed users’ expectations for enterprise storage: Today, users expect instant provisioning of new storage, “unlimited” scalability, 24/7 uptime, and storage that can handle a wide variety of workloads, from traditional databases to massive scale-out workloads like machine learning or big data analytics.
Enterprise storage systems need to keep up with user expectations and business needs. They must adapt to rapidly changing business and application requirements, support new workloads, and scale dynamically when experiments turn into production workloads.
What is IBM SpectrumScale / GPFS?
SpectrumScale, initially called GPFS, was developed over 25 years ago as the enterprise’s first distributed software storage solution. Its primary purpose was to coordinate concurrent access to a SAN (storage area network) from many clients.
This history explains the architecture of SpectrumScale: A SAN is block storage, almost like a giant hard drive. SpectrumScale’s architecture is very similar to that of a local file system, just distributed across machines. Reliably storing blocks of data was the task of the SAN; SpectrumScale provided the shared file system layer on top.
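To make this shared-disk division of labor concrete, here is a minimal, purely conceptual Python sketch (not SpectrumScale code, and the class names are illustrative assumptions): every client writes directly to the same shared block device, and only a cluster-wide lock service stands between clients and the shared free-space metadata.

```python
# Conceptual sketch of a shared-disk (SAN) file system, NOT SpectrumScale code.
# The SAN stores blocks; the distributed file system only coordinates access.

from threading import Lock


class SharedBlockDevice:
    """Stands in for the SAN: a flat array of fixed-size blocks seen by all clients."""
    def __init__(self, num_blocks, block_size=4096):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * num_blocks

    def write(self, index, data):
        self.blocks[index] = data.ljust(self.block_size, b"\0")

    def read(self, index):
        return self.blocks[index]


class DistributedLockManager:
    """Stands in for the cluster-wide lock service that serializes block allocation."""
    def __init__(self):
        self._lock = Lock()  # in a real cluster this is a network service, not a local mutex

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, *exc):
        self._lock.release()


class Client:
    """A file system client: local-FS logic, but allocation goes through the lock manager."""
    def __init__(self, device, dlm, free_list):
        self.device, self.dlm, self.free_list = device, dlm, free_list

    def append_block(self, data):
        with self.dlm:                       # cluster-wide coordination point
            block_no = self.free_list.pop()  # allocate from the shared free list
        self.device.write(block_no, data)    # data goes straight to the shared device
        return block_no


# Example: two clients sharing one device, one lock service, one free list.
san = SharedBlockDevice(num_blocks=1024)
dlm = DistributedLockManager()
free_blocks = list(range(1024))
client_a = Client(san, dlm, free_blocks)
client_b = Client(san, dlm, free_blocks)
print(client_a.append_block(b"hello"), client_b.append_block(b"world"))
```

The point of the sketch is that the data path is simple (clients write blocks directly), while correctness depends entirely on the central lock and allocation service, which is also where the scaling bottlenecks mentioned in the table below come from.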
Later on, SpectrumScale added support for a shared-nothing architecture, in which the file system layer handles data redundancy across regular servers with local drives. However, this layer was “bolted on” to an existing architecture that had been designed for a completely different purpose. This dated architecture, never designed for shared-nothing storage, is one of the primary reasons SpectrumScale/GPFS lacks the flexibility and scalability of modern software storage competitors.
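For contrast, here is an equally conceptual sketch of the shared-nothing model (again, illustrative Python, not any vendor’s code): there is no shared device at all, so the file system layer itself must copy every block to several servers with local drives and read from any surviving replica.

```python
# Conceptual sketch of a shared-nothing file system layer, NOT any vendor's code.
# Redundancy is the file system's job, not the hardware's.

class StorageServer:
    """A commodity server that only knows about its own local drives."""
    def __init__(self, name):
        self.name = name
        self.local_blocks = {}

    def store(self, block_id, data):
        self.local_blocks[block_id] = data

    def fetch(self, block_id):
        return self.local_blocks.get(block_id)


class SharedNothingFileSystem:
    """The file system layer places and repairs replicas across independent servers."""
    def __init__(self, servers, replicas=3):
        self.servers = servers
        self.replicas = replicas

    def write_block(self, block_id, data):
        # Simplified placement: spread copies over `replicas` distinct servers.
        start = block_id % len(self.servers)
        targets = [self.servers[(start + i) % len(self.servers)]
                   for i in range(self.replicas)]
        for server in targets:
            server.store(block_id, data)
        return [s.name for s in targets]

    def read_block(self, block_id):
        # Any replica will do; a real system would also verify and re-replicate.
        for server in self.servers:
            data = server.fetch(block_id)
            if data is not None:
                return data
        raise IOError(f"block {block_id} lost on all replicas")


# Example: three-way replication across five servers that have only local drives.
fs = SharedNothingFileSystem([StorageServer(f"node{i}") for i in range(5)])
print(fs.write_block(42, b"payload"))
print(fs.read_block(42))
```

A system designed around this model from the start bakes placement, replication, and repair into the data path; retrofitting it onto a shared-disk design, as described above, is what produces the architectural friction the comparison below highlights.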
| Capability | SpectrumScale / GPFS | Modern software storage |
|---|---|---|
| Anyone can download and install | no | yes |
| Software on any x86 server from any vendor | yes, but limited HCL | yes |
| Mix-and-match hardware | limited, uniform capacity per pool | yes |
| Free edition | non-commercial, single-node only | yes, up to 150 TB |
| Scale-out without NFS bottlenecks | yes | yes |
| Linear performance scaling | no, lock service and block allocation bottleneck | yes |
| Maximum cluster capacity | 8 EB | unlimited, 2 EB per file |
| Native high-performance drivers | Linux, Windows | Linux, Windows, macOS |
| Clients cannot disrupt the cluster | no | yes |
| File and object (S3) in the same namespace | limited | yes |
| Low-cost flash (QLC) | yes | yes |
| Combine flash and HDD in the same file | no | yes |
| On-premises or colocation | yes | yes |
| Data protection | erasure coding, synchronous replication | erasure coding, synchronous replication |
| Multi-tenancy and oversubscription | no | yes |
| 4k random I/O | limited (fixed block size) | yes |
| X.509 certificate support | no | yes |
| No kernel modules | no, requires custom kernel module | yes |
| Deploy on Kubernetes (k8s) | no | yes |
| Secured access with user-provided credentials | no | yes |