ScaleMatrix and Nvidia Launch ‘Deploy Anywhere’ DGX HPC and AI in a Controlled Enclosure 

HPC and AI in a phone booth: At the SC19 conference in Denver today, ScaleMatrix and Nvidia announced a joint offering that puts up to 13 petaflops of Nvidia DGX-1 compute power in an air-conditioned, water-cooled ScaleMatrix Dynamic Density Control (DDC) “clean room” cabinet. Built for modular deployments and designed for high-demand AI workloads, the ruggedized cabinet can be erected “anywhere power and a roof exist,” ScaleMatrix said, and it includes biometric security and fire suppression.

At the high end of the product line is a composable SKU built on the Nvidia DGX-1 system: a single rack running at 42kW that contains 13 DGX-1 units and delivers 13 Pflops of throughput. Another configuration comes as a DGX POD deployment with four DGX-2s, runs at 43kW and delivers 8 Pflops of compute, the companies said. The units will be sold with storage and networking that follow DGX POD reference architecture designs, such as NetApp’s ONTAP AI solution. Microway will provide hardware and software integration services for the DGX systems, NetApp ONTAP storage and Mellanox switching, including the Nvidia DGX software stack and deep learning and AI framework containers.
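
For readers who want to compare the two configurations, the quoted figures reduce to some simple per-unit arithmetic. The Python sketch below is illustrative only: the `RackConfig` class and its field names are hypothetical, and it assumes each quoted rack power figure covers the whole cabinet, so the per-unit power numbers are upper bounds rather than vendor specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackConfig:
    """One cabinet configuration, using only the figures quoted in the article."""
    name: str
    units: int       # number of DGX systems in the rack
    rack_kw: float   # quoted rack power, in kW (assumed to cover the whole rack)
    pflops: float    # quoted aggregate throughput, in Pflops

    @property
    def pflops_per_unit(self) -> float:
        return self.pflops / self.units

    @property
    def kw_per_unit(self) -> float:
        # Upper bound: any storage or networking gear in the rack also draws power.
        return self.rack_kw / self.units

    @property
    def pflops_per_kw(self) -> float:
        return self.pflops / self.rack_kw


CONFIGS = [
    RackConfig("13x DGX-1", units=13, rack_kw=42.0, pflops=13.0),
    RackConfig("4x DGX-2 (DGX POD)", units=4, rack_kw=43.0, pflops=8.0),
]

for cfg in CONFIGS:
    print(
        f"{cfg.name}: {cfg.pflops_per_unit:.2f} Pflops/unit, "
        f"up to {cfg.kw_per_unit:.2f} kW/unit, "
        f"{cfg.pflops_per_kw:.2f} Pflops/kW"
    )
```

By this rough measure, the quoted figures work out to about 1 Pflops per DGX-1 and 2 Pflops per DGX-2, with the 13-unit DGX-1 rack delivering more aggregate throughput per kilowatt of rack power.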

ScaleMatrix-Nvidia DDC

Chris Orlando, ScaleMatrix co-founder and principal, said the DDC S-Series cabinet technology has been used in the company’s cloud and colocation data centers since 2010. “With DDC technology we have a mature platform that solves the density challenges that other complex liquid cooling systems are trying to solve, but without the mess and hassle of immersion cooling or risky hardware modifications to expensive chips,” he said.

ScaleMatrix made DDC commercially available last year, and Orlando said today’s announcement constitutes the first major engineering upgrade to the platform, including a 63 percent increase in density capacity and more precise control of temperature inside the enclosure.

“In traditionally cooled data centers,” he told us, “you want to understand what the temperature and air flow are in front of, say, rack 14 row 7, right, because that airflow and temperature management become so much more important to the performance of the actual hardware platform itself. And today’s data center systems and platforms aren’t really set up well to provide great feedback to that type of environment. Because of the enclosed nature of our cabinets, because of the number of sensors deployed and the level of precision control that we have got, the DDC client portal is where you have access to all of that information, and it helps the system make real-time, live decisions about air pressure, volume and temperature, which enables it to achieve these seemingly high thermal ratios.”

At SC19, live demos of the Nvidia-ScaleMatrix products will be shown at the ScaleMatrix/DDC booth (#2131), Monday, Nov. 18-Thursday, Nov. 21.

“Quickly building enterprise-grade AI infrastructure can be a challenge for some organizations which may not have an AI-ready data center,” said Charlie Boyle, Nvidia vice president and general manager of DGX Systems. “Nvidia DGX systems provide world-leading AI compute performance, and DDC technology extends the value of DGX systems in a ‘deploy-anywhere’ form-factor that overcomes the challenge of finding the right facilities to host the infrastructure.”

AIwire