
Machine Learning in the Enterprise Requires the Right Storage

By Björn Kolbeck

While Artificial Intelligence and Machine Learning have become the buzziest of buzzwords over the past year or so, I was using Machine Learning for protein-folding prediction and drug design as a research student 15 years ago. Back then you needed a compute cluster or, better yet, a vector machine to train complex neural networks, since distributed neural networks are a difficult beast. Today, with the broad availability of GPUs, you can easily add to a PC the massively parallel computing power that was once reserved for supercomputers like the early Crays, at a fraction of the cost. Thus ML has moved beyond the domain of scientific research to become an indispensable tool across many industries, among them insurance, automotive, and quantitative trading.

Storage Is As Important As GPUs

However, immense parallel computing power is only one half of the high-performance computing equation. Many enterprises forget to account for the other half: massive scale-out storage, which is just as crucial to the most successful Artificial Intelligence and Machine Learning projects. For one, storage needs to deliver fast read and write access to the data; for another, there is a huge amount of data that needs to be stored efficiently without costs exploding.

Owning large quantities of data, and being able to turn it into quality insights by quickly analyzing and learning from it, has become a competitive advantage. Nowadays, tens of petabytes of data are considered normal for an ML setup, with hundreds of petabytes common when working with images and videos. The GPUs need to be fed data at high rates to stay busy, which can only be done if sufficient storage resources are available to stream data to multiple client machines simultaneously.

Tiering Is Not A Viable Strategy

What makes ML workloads “special” when it comes to storage is that all data is more or less hot data. This presents ML users with a dilemma: the usual strategy of using cheaper HDDs for “cold” or archival data and SSDs for “hot” data won’t work in this case, so tiering is not a viable strategy. On the other hand, all-flash arrays are often too expensive, especially at scale. Considering capacity-to-throughput ratios and the large capacities that ML often requires, I think HDDs remain a good choice in many cases. Scaling performance linearly by adding storage servers is the way out: it ensures that you can deliver the throughput you need, even when you grow from just a few GPUs to hundreds.
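To make the capacity-to-throughput argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not a Quobyte sizing tool: the function name and every figure in the example (per-GPU ingest rate, per-drive streaming rate, drive size, drives per server) are illustrative assumptions, and the model ignores network limits, redundancy overhead, and random-access penalties. It only shows how the number of storage servers follows from whichever requirement is larger, capacity or throughput.

```python
def size_cluster(num_gpus, mb_s_per_gpu, capacity_tb,
                 drives_per_server, mb_s_per_drive, tb_per_drive):
    """Estimate how many HDD storage servers a setup needs.

    All inputs are integers and purely illustrative assumptions.
    Returns the larger of the server counts needed for throughput
    and for capacity, assuming performance scales linearly.
    """
    # Aggregate read rate required to keep every GPU busy.
    need_mb_s = num_gpus * mb_s_per_gpu

    # What a single storage server contributes.
    server_mb_s = drives_per_server * mb_s_per_drive
    server_tb = drives_per_server * tb_per_drive

    # Integer ceiling division: -(-a // b) rounds up.
    for_throughput = -(-need_mb_s // server_mb_s)
    for_capacity = -(-capacity_tb // server_tb)

    return max(for_throughput, for_capacity)


if __name__ == "__main__":
    # Hypothetical numbers: 10 PB of data, 12 x 16 TB drives per server,
    # 150 MB/s streaming per drive, 500 MB/s ingest per GPU.
    for gpus in (8, 200):
        servers = size_cluster(gpus, 500, 10_000, 12, 150, 16)
        print(f"{gpus} GPUs -> {servers} storage servers")
```

Under these assumed numbers, capacity rather than throughput sets the server count for a small GPU pool, so the drives needed just to hold the data already supply a large aggregate streaming rate. Only with many more GPUs does throughput become the limiting factor, and at that point adding servers raises both.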

We created Quobyte because we saw that enterprises were starting to run HPC-style workloads and needed the right storage system to support them. We combine the performance and scalability of an HPC system with the reliability of an enterprise storage system and the manageability at scale that we learned while working at Google.

Quobyte provides you with just the right technology mix to run HPC workloads like ML and is ready to run in an enterprise setting with all the security requirements that entails. And because Quobyte is a POSIX file system (which also speaks S3), you can move all of your applications seamlessly to Quobyte and use standard, cost-efficient hardware. So it won’t require much change to your existing setup and there’s no need for specialized hardware components.
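As a small illustration of what POSIX access means in practice, the following Python sketch reads a directory tree with nothing but standard file I/O and reports the read rate it observed. The mount point in the example is a hypothetical path, not a Quobyte default, and the function is a generic measurement helper rather than anything Quobyte-specific. The same code would run unchanged against a local disk.

```python
import os
import time


def read_throughput(root, chunk_size=4 * 1024 * 1024):
    """Read every file under `root` sequentially with plain POSIX I/O.

    Returns (total_bytes, mb_per_second). `root` can be any directory,
    for example the mount point of a shared file system.
    """
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as f:
                # Read in fixed-size chunks until end of file.
                while chunk := f.read(chunk_size):
                    total_bytes += len(chunk)
    elapsed = time.perf_counter() - start

    if elapsed > 0:
        mb_per_second = total_bytes / (1024 * 1024) / elapsed
    else:
        mb_per_second = 0.0
    return total_bytes, mb_per_second


if __name__ == "__main__":
    # Hypothetical mount point; replace with your own data directory.
    total, rate = read_throughput("/mnt/quobyte/training-data")
    print(f"read {total} bytes at {rate:.1f} MB/s")
```

Keep in mind that a single-threaded reader on one client shows only a fraction of what a scale-out system can deliver, and that repeated runs over a small data set may mostly measure the client's page cache.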

The Best Price-Performance Storage For Machine Learning

Quobyte delivers the throughput you require from your storage and lets you take full advantage of your GPU investment. Quobyte helps you leverage HDD and flash to get the best price-performance ratio without cumbersome tiering. And, most importantly, you’ll benefit from full flexibility when growing your storage: you get the throughput you require and can add capacity when you need it. As ML project requirements change, oftentimes more quickly than you anticipate, your Quobyte installation will adapt. Just add disks or servers when you need more performance and capacity, which can be done without any interruption.

Regardless of whether it’s the next big thing now or in 15 years, the reality is that any computing innovation will require data storage that can handle the performance and capacity requirements of any given project. For Artificial Intelligence and Machine Learning workloads today, Quobyte is the right tool to overcome the challenges of scale, throughput, and availability in order to get faster results and ensure GPUs are leveraged to their full capabilities.

Next Steps

  1. Go to https://www.quobyte.com/get-quobyte
  2. Download the latest Quobyte software and try it with your Artificial Intelligence and Machine Learning project – for free.
  3. Contact Quobyte to let us know how you would like to get started with a licensed deployment.

Written by

Björn is Quobyte’s co-founder and CEO.