Tech Giants Ally to Bring Coherence to Data Center Workloads
Overcoming one of the most vexing technical problems hampering performance in the data center – “cache incoherence,” the inability of different devices (CPUs, GPUs, FPGAs) to work together without copying data back and forth – is the target of a joint effort bringing together several major technology companies.
Aligned to develop the Cache Coherent Interconnect for Accelerators (CCIX) are Advanced Micro Devices, ARM, Huawei, IBM, Mellanox, Qualcomm Technologies and Xilinx. The companies hope to build a single interconnect technology specification, scheduled for availability in 2017, that ensures processors using different instruction set architectures (ISAs) can share data with accelerators and that enables efficient heterogeneous computing – improving compute efficiency, data sharing and latency for servers running data center workloads.
Gilad Shainer, vice president of marketing at interconnect technology company Mellanox, said there is no single standard today supported by multiple accelerator and processor companies to deliver scalable performance for high-bandwidth, low-latency, coherent data movement. He also said that while it’s difficult to quantify the improvements that CCIX is projected to deliver, “a major impact on performance” is expected.
“The way data sharing is done today requires data copies because you don’t have cache coherency,” Shainer told EnterpriseTech. “Moving copies of data takes time. But with cache coherency, essentially when you have one copy in one place you’ve got a duplicate copy in another place. How much it’s going to save per application is really application dependent.”
Power, space and time-to-results requirements have placed a premium on accelerating applications in the data center and the cloud. Big data analytics, search, machine learning, NFV, wireless 4G/5G, in-memory database processing, video analytics, and network processing all benefit from moving data efficiently among the various system components. The strategy behind CCIX is to allow these components to access and process data without software intervention, regardless of where the data resides and without the need for complex programming environments. This will enable both off-load and bump-in-the-wire inline application acceleration while leveraging existing server ecosystems and form factors, lowering software barriers and costs.
“This leads to the real benefit,” Shainer said, “which is simplicity in the software model for data-sharing. Software does not need to manage data movement anymore, data-movement happens automatically and seamlessly. This new software model does not require runtime drivers, DMAs or interrupts.” The CCIX link layer will also be optimized for latency by avoiding store-and-forward and ordering bottlenecks, according to Shainer.
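The difference between the two software models Shainer describes can be sketched in a few lines of Python. This is a toy illustration only, not CCIX code: plain Python lists stand in for host and device memory, and the function names (scale_in_place, offload_with_copies, offload_coherent) are hypothetical and come from no real driver or specification. It assumes the simplest possible case, a single buffer handed to one accelerator.

```python
# Toy model only: Python lists stand in for host and device memory.
# All names are hypothetical; nothing here calls a real CCIX or driver API.

def scale_in_place(buf, factor):
    """Stand-in for an accelerator kernel: multiply every element by factor."""
    for i in range(len(buf)):
        buf[i] *= factor


def offload_with_copies(host_data, factor):
    """Copy-based model: software stages data to the device and back."""
    elements_copied = 0
    device_buf = list(host_data)        # explicit copy, host -> device
    elements_copied += len(device_buf)
    scale_in_place(device_buf, factor)  # accelerator works on its private copy
    host_data[:] = device_buf           # explicit copy, device -> host
    elements_copied += len(device_buf)
    return elements_copied


def offload_coherent(host_data, factor):
    """Coherent model: the accelerator operates on the shared buffer directly."""
    scale_in_place(host_data, factor)   # no software-managed copies
    return 0


a = [1, 2, 3, 4]
b = [1, 2, 3, 4]
copied_a = offload_with_copies(a, 2)
copied_b = offload_coherent(b, 2)
print(a == b, copied_a, copied_b)  # prints: True 8 0
```

Both paths produce the same result, but the copy-based path moves every element twice under software control, while the coherent path leaves data movement to the hardware. As Shainer notes, how much that saves in practice depends on the application.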
“A ‘one size fits all architecture’ approach to data center workloads does not deliver the required performance and efficiency,” said Lakshmi Mandyam, director of server systems and ecosystems at ARM. “CCIX enables more optimized solutions by simplifying software development and deployment of applications that benefit from specialized processing and hardware off-load, delivering higher performance and value to data center customers.”