In a recent post, we discussed the technological advances happening around protein folding, crystallography, atomic-resolution visualization, and modeling. Success in this space carries the promise of life sciences labs being able to study existing, unmapped proteins with unprecedented speed, which in turn assists new predictive modeling techniques. Collectively, these advances may lead to cures for terrible diseases that have afflicted mankind for centuries.
As noted in that prior post, cryo-electron microscopy (cryo-EM) of a single protein at raw, atomic-scale resolution can generate hundreds of terabytes for storage. Performing machine learning (ML)-driven analysis and modeling on this data requires storage with sufficient capacity and throughput to make workloads feasible on a time scale that won’t bottleneck project workflow. Said differently, if large-scale image processing pipelines are not kept sufficiently filled with data, system cost-effectiveness and viability break down.
Cryo-EM is the backbone of next-generation protein visualization. As noted, though, cryo-EM generates prodigious data quantities. Quobyte can help accelerate cryo-EM workflows and simplify storage operations in five key ways.
#2: Flexible, Tiered Media for Cost Optimization
Applications such as cryo-EM can easily scale into petabytes of storage for a single project. At the same time, potentially large segments of that total data must be available for analysis at extremely high speeds. Hard disk remains the medium of choice for reasonably performant mass storage, but Quobyte adds an NVMe-based flash storage tier above this to keep the data pipeline to GPUs full. The amount of NVMe flash and hard disk storage can be tuned to specific applications and workloads for maximum cost efficiency. This tiered approach provides cryo-EM cluster solutions with just the right amount of “hot” storage needed for analytics and modeling while simultaneously providing ample nearline and long-term storage at an attractive per-terabyte price point.
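To make the cost trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is not a Quobyte tool or API. The function name, the capacity, the hot fraction, and both per-terabyte prices are hypothetical placeholders chosen only to show the arithmetic, so substitute figures from your own quotes.

```python
def tiered_cost(total_tb: float, hot_fraction: float,
                nvme_price_per_tb: float, hdd_price_per_tb: float) -> float:
    """Media cost when only the hot fraction of the data lives on NVMe flash.

    All inputs are illustrative assumptions, not vendor pricing.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    hot_tb = total_tb * hot_fraction      # capacity kept on the flash tier
    cold_tb = total_tb - hot_tb           # remainder kept on hard disk
    return hot_tb * nvme_price_per_tb + cold_tb * hdd_price_per_tb


# Hypothetical project: 1 PB total, 10% hot, made-up prices per terabyte.
TOTAL_TB = 1000.0
NVME_PRICE = 100.0   # assumed price per TB of NVMe flash
HDD_PRICE = 20.0     # assumed price per TB of hard disk

tiered = tiered_cost(TOTAL_TB, 0.10, NVME_PRICE, HDD_PRICE)
all_flash = tiered_cost(TOTAL_TB, 1.0, NVME_PRICE, HDD_PRICE)
savings_ratio = 1.0 - tiered / all_flash

print(f"tiered: {tiered:,.0f}  all-flash: {all_flash:,.0f}  "
      f"saved: {savings_ratio:.0%}")
```

Varying `hot_fraction` shows how quickly media cost climbs as more of the project has to sit on flash, which is the knob the tiering decision turns on.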
#3: 24/7 Uptime and Easy Maintenance
Life science labs can only drive return on their equipment investments when that equipment is in use. Every time sysadmins take down storage infrastructure for upgrades, patching, or other servicing, the downtime takes another slice out of ROI. Quobyte’s integral redundancy and robustness ensure around-the-clock operation. Admin tasks can be done at any time, whether scheduled or ad hoc, without disrupting users or their applications. This gives organizations far more flexibility in how they choose to perform admin operations and schedule their IT labor.
#5: Strong Data Protection and Security
Over the last few years, spending on data security has continued to compound at anywhere from 10% (IDC) to 16% (MarketsandMarkets). The need to keep data safe from theft continues to rise alongside growth in total data volumes, and life sciences data, where even one advance can be worth many billions of dollars, is no exception. Quobyte ensures data protection on two fronts. First, the platform employs end-to-end checksums to verify that the data arriving at one end of a communication is exactly the data that was sent from the other end. (This guards against random bit errors as well as intentional tampering.) Second, Quobyte uses government-grade encryption for all data, whether at rest or in transit. With encryption, any intercepted data registers as gibberish to unauthorized third parties.
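The general idea behind an end-to-end checksum can be sketched in a few lines of Python. This is only an illustration of the technique, not Quobyte's implementation: the choice of SHA-256, the function names, and the sample payload are all assumptions made for the example.

```python
import hashlib


def checksum(payload: bytes) -> str:
    """Return a hex digest of the payload (SHA-256 is assumed for this sketch)."""
    return hashlib.sha256(payload).hexdigest()


def send(payload: bytes) -> tuple[bytes, str]:
    """Writer side: compute the checksum before the data leaves the client."""
    return payload, checksum(payload)


def receive(payload: bytes, expected: str) -> bool:
    """Reader side: recompute the checksum and compare it with the one sent."""
    return checksum(payload) == expected


# Hypothetical data block standing in for part of a micrograph file.
block, digest = send(b"cryo-EM micrograph block 0001")

intact_ok = receive(block, digest)

# Flip a single bit to simulate corruption somewhere between the two ends.
corrupted = bytes([block[0] ^ 0x01]) + block[1:]
corrupted_ok = receive(corrupted, digest)

print("intact block verified:", intact_ok)
print("corrupted block verified:", corrupted_ok)
```

Because the checksum is computed at one end and checked at the other, a change introduced anywhere along the path (network, memory, or disk) shows up as a mismatch when the data is read back.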
Cryo-EM and similar breakthroughs now enable long-awaited advances in protein folding and the broader life sciences. A robust, cost-effective storage platform like Quobyte will allow researchers to retain data at full resolution for the most accurate results and faster processing of petascale projects.