
Nvidia Rethinks Graphics Fundamentals for 3D Internet 

Nvidia is making big changes to fundamental graphics technologies as it pursues a central role in converting the internet from a 2D universe into an interactive 3D world.

Some technologies being cooked up in Nvidia's labs are focused on establishing a new foundation for creating, rendering and manipulating 3D images and video, and on delivering that content to devices.

The technologies, which interlace AI with video and image rendering, are being pitched as the plumbing for a new generation of 3D graphics tools.

To push some of those ideas forward, Nvidia is presenting more than a dozen papers at this week's SIGGRAPH conference in Vancouver, Canada.

"Graphics is really reinventing itself with AI, leading to significant advances in this field," said Sanja Fidler, vice president of AI research at Nvidia, in a briefing with the press.

Nvidia fits its new 3D graphics enhancements under "neural graphics," a term it coined to cast AI as a key ingredient in the graphics workstream. The company already ships AI technologies like DLSS (Deep Learning Super Sampling), which uses a neural network to upscale lower-resolution frames in real time, boosting performance in demanding workloads such as ray tracing.
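Nvidia has not published DLSS's internals, but the core idea of deep-learning super sampling (render at low resolution, then let a trained network reconstruct a high-resolution frame) can be sketched with a sub-pixel convolution upscaler in PyTorch. The layer sizes below are illustrative, not DLSS's actual architecture:

```python
import torch
import torch.nn as nn

class TinyUpscaler(nn.Module):
    """Toy ESPCN-style 4x upscaler; illustrative only, not DLSS's real network."""
    def __init__(self, scale: int = 4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            # Predict scale^2 sub-pixel channels per output color channel...
            nn.Conv2d(32, 3 * scale * scale, 3, padding=1),
        )
        # ...then rearrange them into a frame scale-times larger per dimension.
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, low_res: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.body(low_res))

# A frame rendered at quarter resolution (480x270) comes out at full 1080p.
frame = torch.rand(1, 3, 270, 480)
print(TinyUpscaler()(frame).shape)  # torch.Size([1, 3, 1080, 1920])
```

The payoff is the same one DLSS targets: the expensive rendering work happens at a fraction of the output resolution, and a comparatively cheap network inference fills in the rest.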

Neural graphics workflow depiction (Credit: Nvidia)

The company envisions on-the-fly creation and enhancement of images for virtual 3D worlds, with a healthy dose of AI for superior results. Desktops could provide the first layer of enhancements when graphics, like real-world images, are fed into the cloud. Online service providers would then refine those graphics further with AI technologies running in the cloud or in data centers.

Nvidia is also focused on reducing file sizes to pack in more 3D imaging data, which will cut the overhead and time required to process graphical information.

"I can take all the game content and... create much more powerful visualizations of today's graphics, and provide new opportunities for artists and creators that maybe haven't even been imagined in the past," Fidler said.

But there are strings attached: all the processing for the proposed AI-based 3D graphics techniques has to go through Nvidia's GPUs, which include the hooks for such acceleration. Nvidia takes a closed-source approach in which software enhancements work best on its own hardware.

"Even though [3D] is one more dimension than 2D … you might think it's just 50% harder. It's hundreds or thousands of times harder, and it has an insatiable appetite for computing power," said Rev Lebaredian, vice president of Omniverse and simulation technology at Nvidia, on the call with press.

Graphics for the 3D internet will move into the cloud, as customers are "always going to be constrained by the amount of computing you can have on a device, especially one that you put on your head or put in your pocket," Lebaredian said.

Nvidia's proposals solve some of the challenges of implementing AI in today's 2D graphics workstream. AI-based upscaling of 2D graphics from companies like Topaz Labs is becoming popular because of its simple GUI: users upload a video, and the AI does the rest.

But more customizable denoising and upscaling workflows require knowledge of neural networks, video datasets, and scripting. On top of that, they call for high-end GPUs, which can be expensive for desktops. A Video2X notebook on Google Colab provides access to cloud-based GPUs for video upscaling, but it isn't as straightforward as uploading a file. Nvidia offers access to its own cloud-based GPUs through its LaunchPad service, but that is mostly targeted at enterprise users.
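For a taste of the scripting involved, here is a minimal single-frame upscaling pass using OpenCV's dnn_superres module. It assumes opencv-contrib-python is installed and that a pretrained model file (here ESPCN_x4.pb, available from the OpenCV model zoo) has been downloaded separately; the file paths are placeholders:

```python
import cv2

# Requires opencv-contrib-python; the pretrained .pb weights (e.g. ESPCN_x4.pb)
# must be fetched separately. All paths below are placeholders.
sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("ESPCN_x4.pb")   # load the pretrained super-resolution network
sr.setModel("espcn", 4)       # select the algorithm and a 4x scale factor

frame = cv2.imread("frame_low_res.png")
upscaled = sr.upsample(frame)  # run the network on one frame
cv2.imwrite("frame_4x.png", upscaled)
```

For video, this has to be wrapped in a frame-by-frame loop with decoding and re-encoding on either side, which is exactly the kind of plumbing that one-click tools hide and custom workflows expose.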

Nvidia's Fidler shared more details about the neural-graphics processing pipeline, which ingests 2D data and delivers enhanced 3D content. The pipeline funnels 2D elements of graphics, such as animation or lighting, through GPU cores dedicated to shaders, ray tracing and AI, which together help construct the 3D scene. Tools like DLSS can then be used to enhance the resulting 3D image or video.

At SIGGRAPH, Nvidia is addressing the challenges of content creation for virtual worlds. 3D worlds can be reconstructed from real-world 2D or 3D images fed into the pipeline. The tools for that already exist, but the process is somewhat cumbersome for artists and involves many different tools, Fidler said.

Nvidia said it will talk about tools to speed up the process, so "developers can do this in a matter of seconds. This is a really major breakthrough," Fidler said, adding "you can just plug it into whatever graphics software you have. And this is just a matter of seconds and it looks beautiful."

The company is tying its 3D content creation tools to the metaverse, a parallel animated universe that duplicates the real world. AI is "existential for 3D content creation, especially for metaverse where we just don't have enough experts to populate all the content we need for the metaverse," Fidler said.

Nvidia researchers will also present a paper on applying reinforcement learning to automate the creation of virtual characters, which learn how to move in a physically simulated environment by observing human motion data. AI controls each character, which reacts to the physics of its environment.

"At a high level this is pretty much how babies learn, by looking at their parents move around," Fidler said, adding "It just kind of brings a significant leap in animation quality and capability, modeling different skills for the character.”

Nvidia is also targeting the user experience of the metaverse in hardware such as glasses and VR headsets, and will discuss breakthroughs in optimizing the design of optics for 3D hardware.

"In this latest work by Nvidia, we show that this optimized design can deliver full-color 3D holographic images in less than half the size of the existing VR displays. This is ... [a] really breakthrough achievement on the experience side," Fidler said.

Nvidia is developing a common set of tools to quickly deliver 3D graphics to headsets by reducing file sizes. The company is pushing NeuralVDB, an AI-infused evolution of OpenVDB, the Academy Award-winning industry standard for storing sparse volumetric data.

The AI and GPU optimizations in NeuralVDB reduce the memory footprint of sparse volume datasets by up to one hundred times. "Using machine learning, NeuralVDB... dramatically reduces the memory footprint, which means that we can now represent much, much higher resolution of 3D data," Fidler said.

NeuralVDB works much like image compression: "we can take PNG images and convert them to JPEG and they're much smaller, and if you do it right, it doesn't really look different than the original," Lebaredian said.
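Setting NeuralVDB's implementation details aside, the underlying trick of trading explicit voxels for learned weights can be illustrated with a coordinate network: a small MLP is fit to reproduce density(x, y, z), after which only the network's weights need to be stored. This is a conceptual sketch in PyTorch, not the NeuralVDB API:

```python
import torch
import torch.nn as nn

# Stand-in volume: a dense 64^3 density field (a soft sphere). Real datasets
# are far larger and sparse; this just makes the sketch self-contained.
N = 64
axis = torch.linspace(-1, 1, N)
x, y, z = torch.meshgrid(axis, axis, axis, indexing="ij")
coords = torch.stack([x, y, z], dim=-1).reshape(-1, 3)
density = torch.exp(-4 * (coords ** 2).sum(-1, keepdim=True))  # target values

# Small coordinate MLP mapping (x, y, z) -> density. Storing its ~4,500
# weights replaces storing 64^3 = 262,144 voxels: that gap is the compression.
mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)

for step in range(500):  # fit the network to the volume
    idx = torch.randint(0, coords.shape[0], (4096,))
    loss = nn.functional.mse_loss(mlp(coords[idx]), density[idx])
    opt.zero_grad(); loss.backward(); opt.step()

print(f"reconstruction MSE: {loss.item():.5f}")
# To render, query mlp at any (x, y, z), at any resolution.
```

A side benefit the quote hints at: because the network is queried at arbitrary coordinates, the "decompressed" volume is not locked to the original grid resolution.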

Nvidia also announced investments in USD (Universal Scene Description), a file format that the company sees as the foundation of the 3D internet, much as HTML was for the 2D internet.

Nvidia's Avatar Cloud Engine interface (credit: Nvidia)

The USD file format was originally developed at Pixar, but Nvidia is promoting it aggressively to get an early lead in an impending metaverse file-format battle. Many companies already create 3D content around the Khronos Group-backed glTF, a rival format that has been called the JPEG of the 3D universe.
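For a sense of what USD authoring looks like, here is a minimal scene written with Pixar's open-source pxr Python bindings (installable via the usd-core pip package); the prim names are arbitrary:

```python
from pxr import Usd, UsdGeom

# Create a new USD stage; .usda is the human-readable text encoding.
stage = Usd.Stage.CreateNew("hello_world.usda")

# Define a transform prim with a sphere underneath it.
world = UsdGeom.Xform.Define(stage, "/World")
ball = UsdGeom.Sphere.Define(stage, "/World/Ball")
ball.GetRadiusAttr().Set(2.0)

stage.SetDefaultPrim(world.GetPrim())
stage.GetRootLayer().Save()  # writes hello_world.usda to disk
```

The hierarchy-of-prims model, with layers that can be composed and overridden non-destructively, is what makes USD attractive for multi-tool, multi-vendor pipelines of the kind a 3D internet would require.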

Nvidia is packaging all its metaverse offerings in a platform called Omniverse.

"We've been incrementally updating USD working with Pixar to change its APIs and add new features. Omniverse and our Omniverse kit and simulation engine and toolkit is built on top of USD," Lebaredian said.

The company also announced numerous updates to Omniverse creation, simulation and rendering tools. One new offering is Omniverse Avatar Cloud Engine (ACE), which can create more realistic avatars. ACE runs in public clouds (Azure and Oracle appeared on a slide) or in embedded systems. The avatar offering combines Nvidia's AI technologies for computer vision, natural language processing, speech, recommendation and animation.
