NFS, It’s Time for You to Call It a Night

Posted by Quobyte

We recently wrote about Hadoop’s HDFS and how it was once mighty and instrumental in big data, but today the file system’s flaws leave it buckling under modern analytics and AI workloads. Anyone working in those spaces needs to be aware of these disadvantages. But what about the rest of the enterprise storage market? For that, we need to discuss the Network File System (NFS), which, like HDFS, has grown geriatric, overrated, and insufficient.

NFS emerged from Sun Microsystems in 1984 to satisfy a growing market need. The exploding number of business client systems needed a protocol that could provide centralized file access from one storage server while feeling to users like access to local files. NFS version 3 arrived in 1995 and moved from 32-bit to 64-bit file sizes and offsets, which allowed for files larger than 2GB. Version 4 (2000), 4.1 (2010), and 4.2 (2016) continued to make improvements. In a few key ways, though, NFS enhancements have resembled putting new walls and fresh coats of paint within the same aging house. The foundations of NFS have remained constant, and now they’re showing cracks. Let’s examine three in particular.


Crack #1: There Can Be Only One…Unfortunately

OK, NFS v1 predated Highlander by two years, but the core idea remains: many clients talking to a single server. That was NFS’s beauty, and now it is its main flaw. Keep in mind that 10-megabit 10BASE-T Ethernet wouldn’t be standardized for another few years. Networks of the day were slow, but the arrangement worked because datasets were relatively small and there weren’t yet many clients contending for access to that data.

Today, we’re working on 100-gigabit networks, with potentially thousands of machines trying to access the same data. In these circumstances, the NFS many-to-one model creates a critical bottleneck. Designers try to work around this limitation by implementing gateways. Instead of a single server, the organization has a scaled storage system behind multiple NFS gateways, which serve to branch the traffic load across several hubs. This offers some improvement, but it still leaves bottlenecked communication between the gateways and the storage system.

Moreover, NFS can’t load balance across these gateways because, after all, there wasn’t much load to balance in the 20th century. The storage system can scale up to the moon, but effective distribution for modern datasets requires scaling out with intelligent traffic management. Otherwise, you get what NFS exhibits: some gateways going underutilized and some being swamped to the point of crippling performance. NFS has no way to push a client to an alternative gateway. You can terminate the connection before routing the client to a new gateway connection, but this will result in all kinds of client errors.

In a scale-out storage solution like Quobyte, data is decentralized across multiple servers. Clients can seek data directly from the server that has the necessary file with no need to pass through an intermediate. By implementing a many-to-many architecture, Quobyte eliminates NFS’s top bottleneck.
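To make the difference between many-to-one and many-to-many concrete, here is a minimal sketch. It is not Quobyte's placement logic: the server names, the `server_for` helper, and the hash-based mapping are all hypothetical, chosen only to show that a client can work out which server holds a file without asking an intermediary.

```python
import hashlib
from collections import Counter

# Hypothetical storage servers; a real cluster would discover these dynamically.
SERVERS = ["storage-1", "storage-2", "storage-3", "storage-4"]


def server_for(path, servers=SERVERS):
    """Map a file path to one server using a stable hash (illustrative only)."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def gateway_load(paths):
    """Many-to-one: every request funnels through a single endpoint."""
    return Counter({"nfs-gateway": len(paths)})


def direct_load(paths):
    """Many-to-many: each request goes straight to the server holding the file."""
    return Counter(server_for(p) for p in paths)


paths = [f"/data/file-{i}.bin" for i in range(1000)]
print(gateway_load(paths))  # all 1000 requests land on one endpoint
print(direct_load(paths))   # requests are spread across the storage servers
```

The point of the sketch is the shape of the traffic, not the hash function: in the first case one machine sees every request, while in the second the load is divided among however many servers exist.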


Crack #2: NFS Failover Fails…Over and Over

Failover capability has been bolted onto NFS in fits and starts over the years; it’s not native to the file system. As a result, failover functionality often feels like a slapped-together workaround. For example, if an NFS gateway (“A”) goes down, another gateway (“B”) will assume control of gateway A’s IP traffic and announce to the network that it’s now responsible for that address. It then takes several seconds for the network to receive and assimilate this information. Clients will then talk to gateway B, which may not have the same state as gateway A. This produces stale file handles. If applications do not adequately accommodate this mismatch, it will produce errors.

Again, NFS’s roots come back to undermine it. Back in the era of single servers, there was little thought for failover because there were no other servers to fail to. It’s like trying to build a geodesic dome on a square foundation; one was never meant to function on the other.


Crack #3: Checksum Doesn’t Check Out

We’re not saying it’s aliens, but…it could be aliens. More likely, it’s cosmic rays, materials degradation, a buggy router, or some similar cause. While the source of randomly flipped bits — when a 0 changes unbidden to a 1 or vice versa in data in transit or at rest — often remains opaque, the effects can be devastating. The wrong flipped bit can throw off a spreadsheet, destroy a database, or take down a system. In storage, flipped bits strike both magnetic disk and SSD media. According to The Register, consumer HDDs experience an error every 12.5TB. An enterprise HDD will have a flipped bit every 125TB. In the mid-1990s, when NFS was gaining traction, the average hard drive size was about 1 gigabyte.

A December 2019 Trusted CI technical report titled “An Examination and Survey of Random Bit Flips and Scientific Computing” found flipped bits throughout computing systems, including long-term storage. The report notes, “In a recent study of 1.53 million disk drives over 41 months … 400,000 blocks across 3,855 drives (0.25% of the drives) had checksum mismatches.” If 0.25% sounds a lot more common than manufacturers’ stated “disk error rate of 10^15 bits,” think again of today’s vastly more dense, high-capacity drives. We fit a lot more bits onto every platter these days.
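The figures quoted above are easy to sanity-check. The short calculation below assumes the 12.5TB and 125TB numbers correspond to error rates of one per 10^14 and one per 10^15 bits respectively (the first rate is inferred from the arithmetic, not stated in the source), and that a terabyte is the decimal 10^12 bytes that drive vendors use.

```python
BITS_PER_BYTE = 8
TB = 10**12  # decimal terabyte, as drive vendors count capacity


def terabytes_per_error(bits_per_error):
    """Convert 'one error per N bits' into terabytes read per expected error."""
    return bits_per_error / BITS_PER_BYTE / TB


consumer_tb = terabytes_per_error(10**14)    # 12.5
enterprise_tb = terabytes_per_error(10**15)  # 125.0

# Share of drives with checksum mismatches in the study cited above.
affected_percent = round(3_855 / 1_530_000 * 100, 2)  # 0.25

print(consumer_tb, enterprise_tb, affected_percent)
```

Seen this way, a single full read of a 12.5TB consumer drive is already expected to hit about one error at the assumed rate.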

One way of detecting bit errors is to use checksum algorithms, which derive a short checksum value from the input data. If a given data block yields one checksum at a certain time or place and a different checksum later, then you know something (like a flipped bit) has changed in the data. There are many ways to compute checksums, but back in the 1990s, these methods were computationally taxing. In fact, Intel only added hardware 32-bit cyclic redundancy check (CRC) support with 2008’s Nehalem architecture, as part of the SSE4.2 instruction set. Before this, CRC32 checking (which is very common in modern storage) had to be performed in software. Additionally, checksum computation can now be offloaded to network cards to help improve system performance.
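As a small illustration of the idea, the sketch below uses Python's standard `zlib.crc32` to catch a single flipped bit. The `flip_bit` helper and the sample data are made up for the example. Note that `zlib.crc32` uses the classic CRC-32 polynomial, whereas the SSE4.2 instruction computes the CRC-32C variant; the detection principle is the same.

```python
import zlib


def flip_bit(data, bit_index):
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


block = b"quarterly revenue: 1048576"
stored_checksum = zlib.crc32(block)  # computed once, when the data is written

damaged = flip_bit(block, 42)  # simulate a random bit flip at rest or in transit

intact_ok = zlib.crc32(block) == stored_checksum
damage_detected = zlib.crc32(damaged) != stored_checksum

print(intact_ok, damage_detected)  # True True
```

A CRC is guaranteed to catch any single-bit error, which is why the second comparison always fails for the damaged copy.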

NFS itself does not checksum the data it carries, which endangers the integrity of stored data. TCP does include a checksum, but it is only 16 bits wide, so it can tell apart just 64K (65,536) values and lets some corruption slip through unnoticed. Back in the days of 10-megabit networks, that was fine. Now, enterprise storage needs something far more powerful.
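To see why TCP's built-in protection is weak, consider how it works: the TCP checksum is a 16-bit ones' complement sum of the data, as described in RFC 1071. The sketch below implements that sum (the function name and sample payloads are illustrative) and shows one kind of corruption it cannot see: two 16-bit words swapping places. A CRC32 over the same bytes does notice.

```python
import zlib


def internet_checksum(data):
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78" + b"payload!"
reordered = b"\x56\x78\x12\x34" + b"payload!"  # first two 16-bit words swapped

same_tcp_checksum = internet_checksum(original) == internet_checksum(reordered)
crc_differs = zlib.crc32(original) != zlib.crc32(reordered)

print(same_tcp_checksum, crc_differs)  # True True
```

Because addition does not care about order, any reordering of 16-bit words passes the TCP-style check, which is one reason storage systems layer a stronger checksum on top.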

With Quobyte storage, when data arrives from the operating system, the first order of business is to checksum that data. The checksum is tied to that data forever. This is possible because Quobyte uses its own protocol and isn’t bound to NFS’s limitations. When the data is read back, Quobyte computes the checksum again. Any corruption will be discovered, whether the data is on-disk or in-flight, and corrections pulled from replica storage. Quobyte does this because users should be free to use whatever off-the-shelf infrastructure they please. Quobyte’s integrity checking allows enterprise-grade protection on commodity-class hardware.
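The write, verify, and repair cycle described above can be sketched in a few lines. This is a toy model under stated assumptions, not Quobyte's implementation: `StoredBlock`, `write_block`, `read_block`, the use of CRC32, and the three in-memory replicas are all hypothetical stand-ins.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class StoredBlock:
    data: bytes
    checksum: int  # computed at write time and kept alongside the data


def write_block(data, replica_count=3) -> List[StoredBlock]:
    """Checksum the data once on arrival, then store it on every replica."""
    checksum = zlib.crc32(data)
    return [StoredBlock(data, checksum) for _ in range(replica_count)]


def is_valid(block):
    return zlib.crc32(block.data) == block.checksum


def read_block(replicas):
    """Return verified data, repairing any corrupt replica from a good one."""
    good = next((r for r in replicas if is_valid(r)), None)
    if good is None:
        raise IOError("all replicas failed checksum verification")
    for replica in replicas:
        if not is_valid(replica):
            replica.data, replica.checksum = good.data, good.checksum
    return good.data


replicas = write_block(b"important data")
replicas[0].data = b"importanu data"  # simulate a flipped bit on one replica

print(read_block(replicas))             # b'important data'
print(all(is_valid(r) for r in replicas))  # True, the bad copy was repaired
```

The reader never sees the corrupted copy: the mismatch is caught on read and the damaged replica is overwritten from a healthy one.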



After all this criticism, you may ask, “If NFS is so bad, why is it the de facto standard for today’s enterprise storage?”

True enough, NFS is supported across Linux, Unix, Windows, macOS, and a host of other operating systems. In answer, why do we still run cars on gasoline? Why do we teach kids cursive? Habit. Momentum. Ongoing monetization of capital investments. It’s easier to keep on than to level up.

Quobyte represents what NFS could have and should have been with 30 years of proper evolution. For better or worse, NFS is stuck with its flaws and shows no sign of ever correcting them. Quobyte has remedied these shortcomings more thoroughly and affordably than any other solution on the market.

Quobyte enables companies to run their storage with Google-like efficiency and ease. It offers fully automated, scalable, high-performance software for implementing enterprise-class storage infrastructures for all workloads on standard server hardware.