Facebook’s ‘Building Blocks’ Mantra Includes Intel
Facebook executives touted the advantages of the company's datacenter "building blocks" approach during the Open Compute Project (OCP) summit as a way to achieve flexibility and faster scaling. The company also detailed datacenter processing and storage projects with chipmaker Intel Corp.
Jay Parikh, Facebook's (NASDAQ: FB) vice president of engineering and infrastructure, noted during the OCP summit this week in San Jose that the social media giant's early datacenter infrastructure was mostly a combination of off-the-shelf hardware and software components. "We were faced with a big, big decision [several] years ago, which was: Do we stay going down the path of continuing to buy these off-the-shelf solutions or do we kind of break loose and go down a different path, and start to build up expertise and a team that is going to allow us to design and build and operate all of our own equipment," Parikh explained.
"We chose door number two."
Since then, the social media giant has adopted a comprehensive, full-stack approach. "We look at the technology at the datacenter level: the cooling, the electrical. We look at the servers that go into it and all the different types of workloads that we have to process and handle on a daily basis," Parikh said, along with the networks and software code that run a growing list of social media applications.
Then there is scaling. Facebook estimates it currently has 1.6 billion users, 800 million using its Messenger service and 400 million on Instagram. "We continue to really prioritize scalability as we think about these [datacenter] designs and this infrastructure we're building," Parikh said, adding that OCP has created efficiencies that have saved Facebook "several billions of dollars."
Along with efficiency, Parikh emphasized the need for flexible infrastructure designs. "It's a tradeoff: There's a struggle between being efficient and being flexible. These two things are kind of always in tension, and it’s a good tension if you can manage it carefully in how you think about your infrastructure."
The result for hyper-scalers like Facebook has been a "menu of building blocks." For example, Parikh cited Facebook's work with Intel (NASDAQ: INTC) on the Yosemite platform "which brings the next generation of general-purpose compute to our infrastructure from a building-block perspective." The partners predict Yosemite will deliver a 40-percent increase in efficiency and processing performance.
Facebook worked with Intel on the design of its Xeon D system-on-chip while redesigning its server infrastructure so the design could be shared with OCP members. "The result was a one-processor server with lower-power CPUs, which worked better than the two-processor server for our web workload and is better suited overall to datacenter workloads," Facebook engineers noted in a blog post.
Another goal was to "avoid the flattening performance trajectory" of previous processor technologies while operating with the same datacenter power budget, Facebook noted.
Storage remains a limiting factor in datacenter scaling, Parikh noted. "We’re really kind of stuck with this paradigm where things are kind of scaling out and getting bigger, but from a performance perspective we're not getting what we actually need."
Hence, Facebook also has been working with Intel on nonvolatile memory (NVM) technologies, specifically its 3D XPoint architecture unveiled last July. The approach is seen as providing much faster memory, better endurance than flash and the promise of lower latency. NVM "gives us the ability to think with this 'tiered' mindset for our applications so we can…scale out things for performance or for capacity or for optimizing price."
Meanwhile, Parikh stressed the "disaggregation" of building blocks like Facebook's Wedge networking switches and the ability to "mix and match" those components for different workloads. "The software is the same, so you just change the networking silicon in this platform and you can easily move from 40 [Gbit/s] to 100 [Gbit/s] so that we can keep up with the demands of our applications."
A new capability resulting from Facebook's mixing and matching of compute, storage and networking building blocks is a live streaming video feature that was launched and scaled in a few months. "We've been able to support millions of simultaneous viewers on these live broadcasts because of these infrastructure building blocks we've been building over time," Parikh noted.
George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).