Covering Scientific & Technical AI | Sunday, January 19, 2025

Rackspace Ramps Up Open Compute Iron 

The Open Compute open source hardware project got its start more than three years ago, when social media giant Facebook, which at the time was using custom server designs it had created in conjunction with Dell, sat down with Rackspace Hosting and a number of other big datacenter operators. Together they worked out a means of open sourcing hardware designs to drive innovation in servers, storage, switches, and datacenters while driving down the cost of that infrastructure.

After years of development and the formation of a supply chain to make and support OCP server and storage designs, many large-scale datacenter operators are finally beginning deployments of OCP iron in their datacenters, and Rackspace is one of the early adopters along with companies like Goldman Sachs and Fidelity Investments, which are a bit more secretive about their precise plans as major financial services firms tend to be.

Rackspace, on the other hand, has always been upfront about its server fleet and publishes statistics about that fleet and its datacenter capacity every quarter when it puts out its financial results. And building custom machines is nothing new to the company, either. Rackspace was founded in 1998, which was coincidentally (or perhaps not) when both Linux and rack-based X86 servers first started taking off in datacenters. At the time, Rackspace built its own machines, but as it went after more and more enterprise customers, it was compelled to offer its hosting services on machines made by Hewlett-Packard and Dell because these were the systems that enterprises themselves used. Rackspace was one of the first companies invited to join the OCP effort, and two years later it was rolling out test versions of systems made by Quanta and Wiwynn. After a little more than a year of testing the whole process, from design to manufacturing to installation to support, COO Mark Roenigk tells EnterpriseTech that the company is rolling out production versions of OCP machinery right on schedule.

Roenigk probably knows the IT supply chain better than most people on the planet. About a decade before Rackspace was founded, he joined Compaq as a manufacturing engineer and eventually took control of the PC and server giant's supply chain. A decade later, he was put in charge of Microsoft's software licensing, and when the company got into hardware manufacturing he took over those operations, including the supply chain. Roenigk worked at Intuit and eBay for a bit and came to Rackspace in January 2010 to become its COO. At the time, Rackspace was making the shift from regular-stock servers from Dell and HP to custom machines that fit its workloads and datacenters better, so it had some experience with tailoring machines prior to the OCP move. Back in January, at the Open Compute Summit, Rackspace was one of a number of users (it is hard to tell the customers from the users sometimes) showing off their initial OCP systems, and in this case it was four systems--two server enclosures and two storage enclosures, one each from Quanta and Wiwynn.

Speaking to EnterpriseTech at the Rackspace::Solve conference in New York this week, Roenigk gave an update on where the company stands in terms of adopting OCP iron in its datacenters. As we previously reported, Rackspace is using an OCP server design as the basis of its OnMetal "bare metal" cloud service. As the name suggests, the OnMetal service controls physical servers and deploys their Linux or Windows operating systems with the same cloud provisioning tools that are used to deploy virtual machines and operating systems atop the XenServer hypervisor that Rackspace uses in its public cloud. This is a kind of best-of-all-worlds approach, in that you get the entire machine like a hosting customer but with per-minute pricing and fast configuration and scaling like cloud customers.

Roenigk confirmed that the OnMetal service is indeed running on hardware derived from OCP's "Winterfell" three-node Xeon E5 server design, and said further that Rackspace is in the midst of deploying seven different OCP-derived "cabinet configurations" in three of its datacenters. The machines are manufactured in Asia with system and rack integration facilities in Fremont, California, and Nashville, Tennessee (with the latter being the primary one), and in addition to leveraging Quanta and Wiwynn, Rackspace has forged direct partnerships with component suppliers Seagate Technology, Samsung Electronics, LSI, and Avago Technologies. Roenigk said that Rackspace had "many more strategic partners on deck" but none of them were yet shipping OCP gear, and he did not want to elaborate further. It would be interesting to see if Rackspace ends up being one of the early adopters of hyperscale machinery from the HP-Foxconn partnership that was announced back in April.

The OnMetal service was first announced with general availability in late July in Rackspace's Virginia datacenter. There are three different OnMetal configurations, and each has its own rack setup. Here is what the OnMetal Compute racks look like:

[Figure: OnMetal Compute rack configuration]

And here is what the OnMetal High Memory versions look like:

[Figure: OnMetal High Memory rack configuration]

This is the configuration of a rack for the High I/O OnMetal configurations:

[Figure: OnMetal High I/O rack configuration]

Rackspace has also retrofitted OCP iron into its Cloud Server Performance Cloud and Cloud Server Memory and Compute Optimized services. Here is what the Performance Cloud racks look like:

[Figure: Cloud Server Performance Cloud rack configuration]

And this is the rack configuration for the memory and compute optimized versions of the Cloud Servers:

[Figure: Cloud Server Memory and Compute Optimized rack configuration]

It is not clear at press time how many of these Cloud Server machines have been installed using OCP iron.

All told, Rackspace has deployed 2,600 OCP servers thus far and the number is growing. OCP servers will be rolling out in London for the European market and in Hong Kong and Australia for the Asia/Pacific market in the first half of 2015.

As you can see, Quanta is the main supplier of OCP compute nodes, and at the moment Wiwynn is mostly making JBOD arrays for Rackspace according to Roenigk. There were early concerns about the quality of OCP iron versus machines from tier one server and storage array makers, but Rackspace has found these machines to be slightly more reliable. "If you have fewer components, you have fewer things to break," Roenigk said.

The servers at Rackspace have a very long life, akin to what you see among large enterprise customers. Roenigk said that most of the hyperscale datacenters are turning their servers every 15 months or so, which is a very rapid pace and pretty much in keeping with the Xeon rollouts from Intel. Rackspace is supporting a much more diverse workload and is not sitting on billions of dollars in profits like Google and Microsoft, so it has to squeeze more use out of its machines. A machine serves in its primary job for somewhere between 36 and 40 months. Once a machine is moved out of that job, Rackspace can usually find a second purpose for it and squeeze another 12 to 18 months of use out of it, for somewhere between 48 and 58 months in total. With the rapid adoption of the OnMetal service and the desire of these customers to stay on the cutting edge, that could mean a faster cycle time on server refreshes for this service and therefore a faster cascade upgrade cycle to other services in the Rackspace infrastructure as these machines are repurposed. For instance, the first machines based on Intel's new "Haswell" Xeon E5-2600 v3 processors will be rolling into Rackspace datacenters in the fourth quarter.
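To make the lifecycle arithmetic concrete, here is a minimal Python sketch that adds the two service-life ranges quoted above and compares the result with the 15-month hyperscale refresh cadence. The function and variable names are illustrative and do not come from Rackspace; only the month figures are taken from the article.

```python
def add_ranges(a, b):
    """Add two (low, high) ranges endpoint by endpoint."""
    return (a[0] + b[0], a[1] + b[1])


# Figures quoted in the article, in months.
primary_months = (36, 40)        # time in a machine's primary job
secondary_months = (12, 18)      # extra time in a second purpose
hyperscale_refresh_months = 15   # refresh cadence cited for hyperscalers

total_months = add_ranges(primary_months, secondary_months)

# How many hyperscale refresh cycles fit into one Rackspace server lifetime.
refresh_cycles = tuple(
    round(months / hyperscale_refresh_months, 1) for months in total_months
)

print("Total service life (months):", total_months)
print("Equivalent hyperscale refresh cycles:", refresh_cycles)
```

On these figures, a Rackspace server stays in service for roughly three to four hyperscale refresh cycles.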

In Rackspace's experience, moving to OCP hardware shaves somewhere between 12 and 18 percent off the cost compared to buying general purpose systems from the world's top server makers, and Roenigk says that when you look at power, space, and cooling efficiencies, the total cost of ownership advantage is more like 30 to 35 percent. And looking at it holistically at the datacenter level, where Rackspace builds in 8 to 10 megawatt chunks, adopting new datacenter technologies every two years or so in new facilities can take another 30 percent or so out of the aggregate cost of the facilities with each new datacenter design. These are big numbers, and ones that Rackspace will have to chase if it hopes to compete with human-derived services atop its infrastructure.
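The percentages above can be turned into a rough worked example. The Python sketch below applies the quoted savings ranges to a hypothetical baseline cost index of 100 and, as an assumption that the article does not state, treats the 30 percent facility saving as compounding multiplicatively with each new datacenter design. All names are illustrative.

```python
def cost_after_saving(baseline, pct_low, pct_high):
    """Return (best_case, worst_case) cost after a saving of pct_low..pct_high percent."""
    best = baseline * (1 - pct_high / 100)
    worst = baseline * (1 - pct_low / 100)
    return (round(best, 1), round(worst, 1))


def facility_cost_fraction(generations, reduction=0.30):
    """Fraction of the original facility cost left after a number of designs.

    Assumes each new datacenter design removes the same share of the previous
    design's cost (an assumption made for illustration, not a claim from Rackspace).
    """
    return (1 - reduction) ** generations


baseline = 100.0  # hypothetical cost index for tier one general purpose systems

acquisition_cost = cost_after_saving(baseline, 12, 18)  # hardware cost only
ownership_cost = cost_after_saving(baseline, 30, 35)    # with power, space, cooling

print("Acquisition cost index:", acquisition_cost)
print("Total cost of ownership index:", ownership_cost)
for generation in range(1, 4):
    remaining = facility_cost_fraction(generation)
    print(f"Facility cost after {generation} design(s): {remaining:.0%} of original")
```

Under the compounding assumption, two successive datacenter designs would leave about half of the original facility cost, which shows why the facility-level savings matter as much as the server-level ones.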
