Modernizing ‘Cranky’ Rendering Code for HPC-based Animation
When Miguel – the 12-year-old protagonist of “Coco,” the recently released film from Disney and Pixar – first sees the Land of the Dead, it’s a beautiful moment in animated filmmaking. The city, which appears to go on forever, floats in space. Because it’s always night in the Land of the Dead, millions of light sources were needed to illuminate the scene, including street lights, lights in the plazas, architectural lighting on buildings, trolley track lights, headlights on moving vehicles and lights on construction cranes. “Coco” illustrates the next wave of animated film creation, which uses the power of high performance computing (HPC) to create new, complex and captivating visuals that were not possible until now.
HPC, in turn, is providing the impetus to modify and modernize existing code to run on today’s highly parallel multicore processors. It’s also the reason that Pixar teamed up with Intel.
“For over two years now, Intel engineers have been helping us modernize a very old, very cranky code base,” says Allan Poore, VP and GM engineering for RenderMan products at Pixar. “We parallelized and vectorized the software to substantially improve performance. One of the early wins for the joint team was to parallelize our de-noiser [“noise” is random error or disturbance in a signal; in a rendered image it shows up as graininess – ed.], which resulted in a 2x to 4x speedup in processing.
“We were also able to switch from a proprietary threading system to the Intel Threading Building Blocks (Intel TBB), a more standard approach that makes the whole process much easier, including minimizing thread contention. Intel TBB is a C++ template library developed by Intel for parallel programming on multicore processors.”
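To make the idea of parallelizing a de-noiser concrete, here is a minimal sketch in C++. It is not Pixar's code: the 3-tap box blur stands in for a real de-noising filter, and the names `blur_row` and `denoise_parallel` are hypothetical. It uses plain `std::thread` so it stays self-contained; with Intel TBB the same row loop would typically be handed to `tbb::parallel_for` rather than split by hand. Each worker writes only to its own rows, so no locks are needed and the threads do not contend with one another.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a de-noising filter: a 3-tap horizontal box blur
// applied to one row of a row-major grayscale image.
void blur_row(const std::vector<float>& src, std::vector<float>& dst,
              std::size_t width, std::size_t row) {
    for (std::size_t x = 0; x < width; ++x) {
        const std::size_t lo = (x == 0) ? 0 : x - 1;          // clamp at left edge
        const std::size_t hi = std::min(x + 1, width - 1);    // clamp at right edge
        float sum = 0.0f;
        for (std::size_t k = lo; k <= hi; ++k) sum += src[row * width + k];
        dst[row * width + x] = sum / static_cast<float>(hi - lo + 1);
    }
}

// Splits the image into contiguous bands of rows, one band per worker thread.
// Bands are disjoint, so workers never write to the same memory.
std::vector<float> denoise_parallel(const std::vector<float>& src,
                                    std::size_t width, std::size_t height,
                                    unsigned num_threads) {
    assert(src.size() == width * height);
    assert(num_threads > 0);
    std::vector<float> dst(src.size());
    std::vector<std::thread> workers;
    const std::size_t rows_per = (height + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min<std::size_t>(t * rows_per, height);
        const std::size_t end = std::min(begin + rows_per, height);
        if (begin == end) continue;  // more threads than rows: nothing to do
        workers.emplace_back([&src, &dst, width, begin, end] {
            for (std::size_t r = begin; r < end; ++r) blur_row(src, dst, width, r);
        });
    }
    for (auto& w : workers) w.join();  // wait for every band to finish
    return dst;
}
```

Because every row is filtered independently, the result is identical whatever the thread count; only the wall-clock time changes. A library scheduler such as TBB adds load balancing on top of this pattern, which matters when some rows are more expensive than others.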
Poore said Intel helped bring about changes in LLVM, a collection of modular and reusable compiler and toolchain technologies on which the implementation of the Open Shading Language (OSL) depends. OSL is widely used for rendering in the digital content creation (DCC) industry.
OSL was developed by Sony Pictures Imageworks as an open source tool for programmable shading. It is easy to adopt using a simple API and runtime library and is designed for production rendering when combined with software, such as RenderMan. OSL includes a complete language specification, a compiler from OSL to an intermediate bytecode, a runtime library that executes the shaders (including JIT [Just in Time] machine code generation using LLVM), and an extensive shader function library. Because these capabilities exist as libraries with common C++ APIs, OSL is easily integrated into existing renderers.
The OSL used at Pixar has been vectorized to reduce render time. It leverages Intel Advanced Vector Extensions 512 (Intel AVX-512), a set of single instruction, multiple data (SIMD) instructions available on recent Intel CPUs such as the Intel Xeon Scalable processor, to execute up to 16 shading operations concurrently.
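The figure of 16 follows from the register width: a 512-bit register holds sixteen 32-bit floating-point values. The sketch below illustrates the general idea of batched shading in C++ and is not taken from OSL or RenderMan. The `ShadePoints` layout, the `shade_diffuse` function and the simple diffuse term are all assumptions chosen for illustration. Data is stored one array per attribute so that an optimizing compiler can map the fixed-width inner loop onto SIMD instructions; whether it actually does depends on compiler and flags.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 512-bit register / 32-bit float = 16 lanes.
constexpr std::size_t kLanes = 512 / 32;
static_assert(kLanes == 16, "sixteen floats fit in one 512-bit register");

// Hypothetical structure-of-arrays layout: one array per attribute, so that
// 16 consecutive shading points sit next to each other in memory.
struct ShadePoints {
    std::vector<float> nx, ny, nz;  // unit surface normals
    std::vector<float> albedo;      // surface reflectance
};

// Illustrative "shader": diffuse term albedo * max(0, N . L) for one point.
inline float diffuse_one(const ShadePoints& p, std::size_t i,
                         float lx, float ly, float lz) {
    const float ndotl = p.nx[i] * lx + p.ny[i] * ly + p.nz[i] * lz;
    return p.albedo[i] * std::max(0.0f, ndotl);
}

// Shades all points against one light direction (lx, ly, lz).
std::vector<float> shade_diffuse(const ShadePoints& p,
                                 float lx, float ly, float lz) {
    const std::size_t n = p.albedo.size();
    assert(p.nx.size() == n && p.ny.size() == n && p.nz.size() == n);
    std::vector<float> out(n);
    std::size_t i = 0;
    // Full batches: the same arithmetic on 16 independent points, which is
    // the shape of loop a compiler can turn into 16-wide SIMD instructions.
    for (; i + kLanes <= n; i += kLanes)
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            out[i + lane] = diffuse_one(p, i + lane, lx, ly, lz);
    // Remainder: leftover points that do not fill a whole batch.
    for (; i < n; ++i) out[i] = diffuse_one(p, i, lx, ly, lz);
    return out;
}
```

With 256-bit registers the same loop would process eight floats at a time, which is one reason a wider instruction set can be expected to raise throughput on code that is already vectorized.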
“Shading networks are very complicated with many layers required to create forms,” said Poore. “This is also very expensive in terms of compute time to get the desired effect. A large percentage of the rendering time on any film deals with OSL components – for example, 60 percent of the rendering time spent on ‘Coco’ was for shading. In order to speed up the rendering, we vectorized the code, allowing us to process the huge datasets involved very quickly. With Intel AVX2 we are currently seeing a 2x speedup; we will realize even better numbers when we start working with Intel AVX-512.”
In “Coco,” Miguel Rivera is transported by accident to the Land of the Dead and has to find his way back to the land of the living. Among the visuals is a marigold bridge that connects the lands of the living and the dead. The bridge, made of illuminated marigold petals, uses what Pixar calls “particle light,” allowing the creation of many points of light and the rendering of individual petals that glow as the characters walk through them. Illuminated petals also drift down from above.
Miguel first sees the city of the dead from the bridge. The city is illuminated by 7 million lights, all created by the renderer. Assisted by new software, the team was able to group the lights, which previously would have needed to be placed and adjusted individually. But “Coco” is so visually complex that frames budgeted at 50 hours of rendering sometimes required 150 hours. All of those 7 million lights had to be calculated individually to see how they interact with the other objects in the scene – and there could be tens of millions of objects involved. The Pixar team not only maxed out the cores internally available for processing, but also had to borrow additional compute power from other sources, such as Lucas and Disney. About 50,000 cores were required.
The rendering process is going to get another major speedup this summer when RenderMan 22 is released.
Traditionally, RenderMan has been used offline, Poore said. When the artist sent a file out for rendering, it often took hours or even days before the results were obtained. Relevant work in the pipeline was often on hold until those results were available.
RenderMan 22, unveiled last year, takes advantage of the power of the new processors and the parallelization and vectorization of the code base. According to Pixar, this version of RenderMan provides a variety of interactive rendering capabilities. The software features “always-on” rendering embedded in artist applications, responding instantly to geometry, camera, light and material edits. “This new live rendering mode delivers incredible interactive frame rates with the same renderer used for batch renders,” Pixar said in a prepared announcement.
Other RenderMan 22 technologies, like fast vectorized OSL shader network evaluation on Intel Xeon Scalable CPUs, were also unveiled last year.
Poore said the optimized hardware and software will enable artists to create digital effects with huge volumes, such as clouds, oceans or multiple explosions, inside various settings to see how they interact.
People time is expensive, he said, so production managers are constantly looking for ways to streamline the process without compromising the artists’ creative reach. This includes using the new highly parallel hardware and software to improve “the time to first decision” in new projects.
Another challenge for streamlined filmmaking is to use the same renderer throughout the pipeline. Today, different renderers are used for different pieces of the film – from OpenGL for interactive tools to the complexities of final frame rendering. Each stage of the rendering process has to be translated from one renderer to another in the course of making the film. As final shading and rendering happen late in the pipeline, obvious issues are often not discovered until the final render. However, with everyone using the same renderer, the team can catch errors early in the process or do away with them completely.
The changes to Pixar’s filmmaking processes are generating increasingly large and complex workloads. In response, the company’s IT organization is supporting its users’ need for ever-higher volumes of compute power by evolving a strategy that includes cloud computing – HPC in the Cloud.
Currently, when Pixar’s IT organization runs low on resources, it borrows HPC systems from external sources, such as its business partners. This hybrid cloud approach blends the use of private and off-premises cloud infrastructures and provides cloud bursting capabilities to ensure the production teams have adequate capacity for their rendering.
“When working on a film, no matter how complex, our teams of directors, artists and editors, along with our Intel team members, want speed, stability and simplicity,” Poore said. “This provides a base of support that allows them to express the full range of their creativity.”