Covering Scientific & Technical AI | Saturday, January 25, 2025

Software Defined Storage for Dummies 
Provided by IBM

A starting point for your software-defined-storage strategy that scales with the enterprise

HPC has long driven innovation in computing, systems and software, and many of the concepts behind an HPC infrastructure have carried over into mainstream computing and the modern enterprise. This is largely due to the massive amounts of data involved and how that data is being used: real-time analysis for faster time to insight and, of course, storage and retrieval that supports organizational objectives at every level.

A software-defined storage (SDS) strategy should be as important as the hardware decisions that drive compute power and the software that makes the most of available computing resources, which together form the foundation of HPC.

The latest edition of Software-Defined Storage for Dummies, IBM Limited Edition, authored by Chris Saul, tells a story that’s important to organizations of all types and sizes. It helps readers better understand what SDS is and the important roles this rapidly evolving collection of technologies can play in both enterprise and HPC environments. Over the past decade, SDS innovation and capability have improved dramatically, and the growing collection of data services and storage tools has become a foundational element of nearly every data storage environment, from garage-sized startups to the fastest supercomputer installations today.

The new edition of Software-Defined Storage for Dummies does what all Dummies books are intended to do: help make a complex topic more accessible to those who have real interest but not deep domain knowledge. The book approaches this in a unique way, by telling the story from the perspective of the market-leading IBM Spectrum Storage family of SDS solutions.

As the book notes: “IBM Spectrum Storage helps you manage all your data, of all types, wherever it resides, with a comprehensive portfolio of SDS applications. You can unify your storage across on-premises and multicloud environments, leverage the power of the family to more easily and effectively implement and manage important business tools such as analytics and artificial intelligence (AI), and reduce costs while increasing business agility.”

To dig deeper, Software-Defined Storage for Dummies tackles crucial topics such as data protection or file system management by relating them to the members of the IBM Spectrum Storage family that are designed to address the challenges of managing, moving, and processing as fast as possible the enormous data sets generated by activities such as scientific research, genomics mapping and risk analysis.

Here are some examples of organizations and modern enterprises deploying an SDS strategy for their infrastructure.

Summit and Sierra, two of the world’s fastest supercomputers (#1 and #2 on the Top500 survey), were built in collaboration between Oak Ridge National Laboratory, Lawrence Livermore National Laboratory and IBM. These two new supercomputers, built using standard SDS components, highlight some emerging application workloads where HPC is playing a major role. The first is artificial intelligence (AI). Summit and Sierra leverage AI-optimized IBM Power servers that include significant GPU resources and multiple HPC-oriented IBM Spectrum Computing tools to manage workload scheduling and even facilitate data movement to and from the cloud. But a key characteristic of many AI workloads is their enormous and rapidly growing unstructured datasets, which demand the highest system performance available. Learn more about the storage behind Summit and Sierra here.

Autonomous driving (AD) offers an excellent example. The one thing AD initiatives all have in common is data – miles and miles of it – sensor data, weather data, satellite data, behavioral and other personal data, diagnostic data, and more. Each connected car generates from a few megabytes to sometimes gigabytes of data per day. When the car is a test vehicle used to train AI/AD models, data volumes can reach terabytes per car per day and hundreds of exabytes across entire AD initiatives.

Blockchain offers another example of a rapidly expanding new application workload. Because of the extreme growth rates in blockchain implementations, and the challenges of coordinating off-chain and on-chain systems, the underlying IT infrastructure supporting blockchain implementations must provide extreme levels of security, availability, system performance, and scalability.

AI and blockchain implementations are two use cases where HPC and enterprise environments are merging, and where SDS solutions such as IBM Spectrum Scale shine. IBM Elastic Storage Server (ESS) solutions, which are built on Spectrum Scale, offer essentially unlimited scalability; you simply add nodes as needed to increase storage capacity, performance, and resilience. Plus, the massively parallel IBM Spectrum Scale ESS architecture provides the system performance AI and blockchain solutions demand.

Data protection and security form another domain that highlights the importance of storage technology in HPC environments. Though Software-Defined Storage for Dummies stays true to its title and simply focuses on introducing SDS topics from the IBM Spectrum Storage point of view, it’s easy to see how the sections on IBM Spectrum Protect and IBM Spectrum Protect Plus are relevant from an HPC perspective. In the past, HPC installations were perhaps not as inherently concerned about data protection as commercial environments were. After all, stealing raw research data hardly seems enticing to cyber thieves. But as HPC and enterprise use cases converge, more and more datasets are coming under the jurisdiction of various governmental regulations related to privacy and archiving, among other concerns.

The healthcare industry provides plenty of such examples. Now, medical and pharmaceutical research groups such as the team at Thomas Jefferson University are using HPC capabilities to mine publicly accessible databases for trends that can guide laboratory initiatives. These techniques speed time to insights, but they can move research into areas where data security becomes essential. HPC installations like those at Thomas Jefferson University leverage the SDS capabilities of IBM Spectrum Protect to provide everything from encryption to replication and even copy management services.

Software-Defined Storage for Dummies, IBM Limited Edition, is the smart way to learn about a topic that’s important for both enterprise and HPC users. It’s a much quicker read than Gone with the Wind, and much easier to consume and understand than your high school calculus textbook. If you are a business executive making important infrastructure decisions for your company, a researcher leveraging AI to analyze enormous datasets, or anyone hoping to learn more about technologies that affect our daily lives, be one of the first to download your free edition of Software-Defined Storage for Dummies.

 

AIwire