The UK’s Science and Technology Facilities Council (STFC) needed to upgrade the storage infrastructure for the JASMIN super data cluster. With storage demand projected to grow to as much as 300 PB over the next few years, it required a storage system that was massively scalable, high-performance, and easy to manage and maintain.
Scalability and operational efficiency at scale
Unified file and S3 object storage
High performance, low latency
Digital Infrastructure for a World-Leading Environmental Data Facility
The United Kingdom recognizes that digital infrastructure underpins the UK economy, facilitating growth, job creation and innovation. One of the key UK facilities with major digital infrastructure is JASMIN, a “super-data-cluster” which delivers infrastructure for environmental science data.
JASMIN is designed, built, and managed by STFC’s Scientific Computing Department (SCD) for the Natural Environment Research Council (NERC). In technical terms it is part supercomputer and part data center, with far more storage than computing, and it provides a globally unique computational environment.
The JASMIN infrastructure provides a compute and storage cloud for researchers in the UK, linked together by a very high bandwidth network in a unique topology. It combines significant compute power with higher bandwidth than is usual in data centers, and its network topology is of a kind more typically found in the largest global-scale data centers.
JASMIN has been in operation for seven years. However, the needs of environmental researchers are ever-increasing, driven by new high-resolution satellite observations and increasingly complex high-resolution models. Manipulating these ever-larger datasets demands vast amounts of massively scalable storage. As part of the JASMIN Phase 4 incremental upgrade, an additional 45 PB of storage has been added. This upgrade will help to ensure that JASMIN continues to be a world-leading environmental data analysis facility.
At the foundation of the JASMIN 45 PB storage upgrade is the Quobyte Data Center File System – a fully-featured parallel POSIX file system designed for massive scalability and operational efficiency.
Unrestricted Pipeline Unleashes Collaboration
Scalability and flexibility in supporting the various JASMIN users were important for the storage upgrade. The users require fast (parallel) file system storage, fast metadata, a scale-out capability, and a high-performance object (S3) interface to the file system. Quobyte supports each of these storage needs with a single storage platform and namespace. With the Quobyte file system, JASMIN users now have a choice: they can move freely between file system and object storage, accessing the same files and data through either interface with little impact on performance. Users are now able to develop modern data analysis workflows that exploit the benefits of object storage, whilst retaining backward compatibility for legacy POSIX-based codes. Additionally, the object interface enables users to collaborate more easily, nationally and internationally, without using legacy file transfer tools.
Quobyte ticks the boxes for massive capacity and performance scalability using software-defined storage. STFC runs Quobyte on commodity hardware, which frees it from proprietary hardware solutions and makes it much easier to scale capacity, as well as performance, as necessary.
Easy-to-Use Delivers Operational Efficiency
With software this powerful managing such massive capacity, one would expect Quobyte to be complex to manage, but ease of management and operational efficiency are where Quobyte really shines. In large HPC environments, it can be a struggle to know where the load is, but Quobyte's monitoring and analytics identify the exact nodes involved. When running thousands of physical and virtual machines, it can be difficult to track down the one with an issue. With Quobyte it is easy, because the monitoring shows you.
At 42 PB, JASMIN is the UK’s largest-capacity Quobyte cluster. Collected environmental data is never deleted, because the exact environment from which it was collected can’t be repeated. As a result, the volume of environmental data will only grow, and the projected demand is for 300 PB to be stored within the next few years. Keeping, managing, and providing subsets of this data for analysis is made possible by Quobyte.