
Nvidia’s Avatar Creation Tool Follows in the Footsteps of ChatGPT and DALL-E 2

Nvidia has not been very clear about how it is helping build the "plumbing" for the metaverse, a grand concept that critics have derided as unrealistic.

The graphics chipmaker this week shared insights into how it creates avatars and gives them interactive speech and gesture abilities. Avatars are a key asset in helping the digital parallel universe get off the ground. The company also made metaverse announcements during a taped keynote from Nvidia executives, which was first shown at CES 2023 and later made available on YouTube.

The keynote blended new GPU announcements with reminders of the metaverse software tools already offered by the company. But there was more clarity this time on how those tools, which include Audio2Face and Audio2Gesture, are being used to create the core interactive elements in the metaverse.

AI has become accessible to everyday users through offerings like ChatGPT, an AI tool that can generate lengthy narratives from simple questions, and DALL-E 2, which can generate images from text. Nvidia's goal is to build more complex generative AI spanning text, gesture, speech and graphics that will be as easy to use as the OpenAI tools.

Nvidia announced it was providing early access to its Omniverse ACE (Avatar Cloud Engine), with which users can create digital avatars that companies can deploy as animated customer service representatives or chauffeurs in cars.

Creating an avatar is like creating a new character in the metaverse and giving it intelligence. The avatar itself can be built with Nvidia's own graphics tools or with popular software such as Ready Player Me. ACE then gives the avatar its character by linking together a range of Nvidia's natural language, text-to-speech and graphics generation tools. These tools, which are offered as microservices, make the avatar interactive.

Once an avatar or animation is created, the add-on tools Audio2Face and Audio2Gesture can generate interactive animation capabilities from an audio file. For example, these AI tools can make the avatar lip-sync or produce hand gestures based on the audio in the file. The Riva speech AI can recognize speech or convert text to speech, and link the result to the animation.
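In practice, those pieces form a pipeline: recognize the user's speech, generate a reply, synthesize it as audio, then drive the face and gestures from that audio. The sketch below illustrates the flow only; the service URLs, endpoints and response fields are hypothetical placeholders, not Nvidia's actual ACE or Riva interfaces.

    # Hypothetical sketch of an ACE-style avatar pipeline. The service URLs and
    # response fields below are placeholders, not Nvidia's actual APIs.
    import requests

    ASR_URL = "http://localhost:8001/recognize"    # speech-to-text microservice (assumed)
    TTS_URL = "http://localhost:8002/synthesize"   # text-to-speech microservice (assumed)
    A2F_URL = "http://localhost:8003/audio2face"   # audio-to-animation microservice (assumed)

    def respond_to_user(audio_in: bytes) -> dict:
        """Turn a user's spoken question into avatar speech plus facial animation."""
        # 1. Recognize the user's speech.
        text = requests.post(ASR_URL, data=audio_in).json()["transcript"]

        # 2. Produce a reply (a real deployment would call a dialogue or LLM service here).
        reply = f"You said: {text}"

        # 3. Synthesize the reply as audio.
        audio_out = requests.post(TTS_URL, json={"text": reply}).content

        # 4. Drive lip-sync and gesture animation from the synthesized audio.
        animation = requests.post(A2F_URL, data=audio_out).json()

        return {"audio": audio_out, "animation": animation}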

Nvidia also released a few experimental AI plugins that could aid in the creation of realistic avatars. The company's AI ToyBox includes GET3D, which helps transform 2D images into metaverse-ready 3D models. A plug-in from Move.AI for Omniverse automates the transformation of real-world gestures into animation.

The creation of an animation is part of Nvidia's goal to blend AI and graphics into a new form of fluid interaction between digital beings and objects. Omniverse ACE relies on a drag-and-drop workflow to create the interactive avatar and to define how it interacts in a digital world.

These tools are available to content creators in organizations or at home, and will connect into Nvidia's larger metaverse platform, Omniverse, which in many cases requires the power of an Nvidia GPU in a local desktop or server. Omniverse tools cannot take advantage of GPU acceleration from AMD or Intel.

Nvidia views the metaverse as the 3D iteration of the internet, and an advance over the 2D version of the web built for PCs and mobile devices. The company is backing the USD (Universal Scene Description) file format, which describes scenes as graphics building blocks that can be shared and modified. Nvidia also announced that a new version of the 3D creation software Blender is now available in Omniverse.
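The building-block idea is easiest to see in USD's open-source Python bindings (the pxr module from Pixar's USD project). The minimal example below illustrates the format rather than anything Nvidia-specific; the file name and prim paths are arbitrary.

    # Minimal USD example using the open-source Python bindings (the pxr module).
    # The file name and prim paths are arbitrary illustrations.
    from pxr import Usd, UsdGeom

    # Create a new USD "stage" (a scene description that can be shared and layered).
    stage = Usd.Stage.CreateNew("hello_world.usda")

    # Define a transform node with a sphere beneath it.
    UsdGeom.Xform.Define(stage, "/Hello")
    sphere = UsdGeom.Sphere.Define(stage, "/Hello/World")

    # Set the sphere's radius and write the scene to disk.
    sphere.GetRadiusAttr().Set(2.0)
    stage.GetRootLayer().Save()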

The chipmaker also announced new consumer GPUs based on the Ada Lovelace architecture, a consumer spin-off of the Hopper GPU architecture, as well as enhancements to its automotive and robotics AI platforms.

Header image: Violet, a cloud-based avatar developed using Nvidia Omniverse Avatar Cloud Engine (ACE). Credit: Nvidia.
