Imagine a game where you could have intelligent, unpredictable and dynamic conversations with non-playable characters (NPCs). These characters would have persistent personalities that evolve over time, along with accurate facial animations and expressions, and all in the player’s own language.
The power of generative AI is making this a reality, and at COMPUTEX 2023, NVIDIA presented a new technology: NVIDIA Avatar Cloud Engine (ACE) for Games. NVIDIA ACE for Games is a custom AI model foundry service that aims to transform computer games by bringing intelligence to non-playable characters.
“Generative AI has the potential to revolutionise how players interact with NPCs and dramatically increase immersion in games,” said John Spitzer, vice president of developer and performance technologies at NVIDIA. “Building on our expertise in AI and decades of experience working with game developers, NVIDIA is pioneering the use of generative AI in games.”
NVIDIA ACE for Games AI models
Software, tool and PC game developers will be able to use NVIDIA ACE for Games to create and deploy customised speech, conversation and animation models in their software and games, both in the cloud and on PCs.
The optimised AI foundation models include the following technologies (a simplified sketch of how they fit together follows the list):
- NVIDIA NeMo, which provides foundation language models and customisation tools so developers can further tailor the models to their game characters. These customisable large language models (LLMs) allow developers to give characters backstories and personalities that fit the world of their game.
- NVIDIA Riva, which provides automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to enable live spoken conversation with NeMo-driven characters.
- NVIDIA Omniverse Audio2Face, which instantly creates expressive facial animation for game characters using only an audio source. Audio2Face has Omniverse Connectors for Unreal Engine 5, so developers can add facial animation directly to MetaHuman characters.
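Taken together, the three services form a speech-in, animated-speech-out loop. The sketch below is purely illustrative: the helper functions (riva_transcribe, nemo_generate_reply, riva_synthesize, audio2face_animate) and the CharacterPersona structure are hypothetical stand-ins for the roles of Riva ASR, a NeMo language model, Riva TTS and Audio2Face, not real NVIDIA API calls.

```python
from dataclasses import dataclass

# --- Placeholder stubs. These stand in for the roles of Riva ASR, a NeMo LLM,
# --- Riva TTS and Audio2Face; they are NOT real NVIDIA API calls.

def riva_transcribe(audio: bytes) -> str:
    return "Where can I find good ramen around here?"          # stubbed ASR result

def nemo_generate_reply(prompt: str) -> str:
    return "Right here, friend. Best bowl in the city."        # stubbed LLM reply

def riva_synthesize(text: str) -> bytes:
    return text.encode()                                        # stubbed TTS audio

def audio2face_animate(audio: bytes) -> list[dict]:
    return [{"jawOpen": 0.4, "mouthSmileLeft": 0.2}]            # stubbed blendshape frame


@dataclass
class CharacterPersona:
    """Backstory and personality used to condition the language model (NeMo's role)."""
    name: str
    backstory: str
    traits: list[str]


def npc_dialogue_turn(persona: CharacterPersona, player_audio: bytes):
    """One conversational turn: player speech in, animated NPC speech out."""
    # 1. Speech recognition (Riva ASR): the player's spoken question -> text.
    player_text = riva_transcribe(player_audio)

    # 2. Language model (NeMo): an in-character reply conditioned on the persona.
    prompt = (
        f"You are {persona.name}. Backstory: {persona.backstory}. "
        f"Traits: {', '.join(persona.traits)}.\nPlayer: {player_text}\n{persona.name}:"
    )
    reply_text = nemo_generate_reply(prompt)

    # 3. Text-to-speech (Riva TTS): reply text -> audio in the character's voice.
    reply_audio = riva_synthesize(reply_text)

    # 4. Facial animation (Audio2Face): audio -> per-frame blendshape weights
    #    the game engine can apply to the character's face rig.
    face_frames = audio2face_animate(reply_audio)

    return reply_audio, face_frames


if __name__ == "__main__":
    jin = CharacterPersona(
        name="Jin",
        backstory="Owner of a small ramen shop in a cyberpunk city.",
        traits=["friendly", "talkative", "proud of his broth"],
    )
    audio, frames = npc_dialogue_turn(jin, player_audio=b"")
    print(audio, frames)
```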
How does NVIDIA ACE for Games work?
To demonstrate the power of NVIDIA ACE for Games and show how developers will be creating NPCs in the near future, NVIDIA has partnered with Convai, an NVIDIA Inception startup building a platform for creating and deploying AI characters in PC games and virtual worlds.
“With NVIDIA ACE for Games, Convai tools can achieve the latency and quality needed to make AI NPCs accessible to almost any developer, and in a cost-effective way,” said Purnendu Mukherjee, founder and CEO of Convai.
The Kairos demonstration used NVIDIA Riva for speech-to-text and text-to-speech, NVIDIA NeMo to power the conversational AI, and Audio2Face to provide AI-driven facial animation from the voice input. These modules were integrated into the Convai service platform and used with Unreal Engine 5 and MetaHuman to bring to life an NPC named Jin.
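The key integration point in the demo is that the facial animation is generated from the synthesised voice and played back in lockstep with it on the character's face rig. The following is a minimal, hypothetical sketch of that playback step; the FaceRig class and set_blendshape method are placeholders for illustration, not an actual Unreal Engine, MetaHuman or Audio2Face interface.

```python
import time


class FaceRig:
    """Minimal stand-in for a game-engine face rig (e.g. a MetaHuman in UE5).

    A real engine exposes its own blendshape / morph-target controls; this class
    is a hypothetical placeholder used only to illustrate the data flow.
    """

    def set_blendshape(self, name: str, weight: float) -> None:
        print(f"{name} = {weight:.2f}")


def play_facial_animation(rig: FaceRig, frames: list[dict], fps: float = 30.0) -> None:
    """Apply per-frame blendshape weights (as produced by an Audio2Face-style
    model) while the NPC's synthesised voice audio plays."""
    frame_time = 1.0 / fps
    for frame in frames:
        for name, weight in frame.items():
            rig.set_blendshape(name, weight)
        time.sleep(frame_time)  # in an engine this would be the frame tick


# Example: two frames of mouth movement while a short spoken line plays.
play_facial_animation(FaceRig(), [
    {"jawOpen": 0.35, "mouthSmileLeft": 0.10},
    {"jawOpen": 0.10, "mouthSmileLeft": 0.20},
])
```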
Jin and the ramen shop scene were created by the NVIDIA Lightspeed Studios art team and rendered entirely in Unreal Engine 5, using NVIDIA RTX Direct Illumination (RTXDI) for ray-traced lighting and shadows, as well as DLSS for the highest possible frame rate and image quality.
This real-time demonstration shows what is possible with GeForce RTX graphics cards, NVIDIA RTX technologies and NVIDIA ACE for Games, and hints at what future games built on this technology could look like.
NVIDIA ACE: bringing NPCs to life
Several game developers are already using NVIDIA’s generative AI technologies. For example, GSC Game World is using Audio2Face in the highly anticipated S.T.A.L.K.E.R. 2: Heart of Chornobyl. Indie developer Fallen Leaf is using Audio2Face to animate character faces in Fort Solis, a science-fiction thriller set on Mars. And Charisma.ai, a company that creates AI-driven virtual characters, is using Audio2Face to power the animation in its conversation engine.
Future perspectives
Overall, NVIDIA ACE for Games has the potential to revolutionise the PC gaming experience by allowing NPCs to generate unique conversations, emotions and even voice intonations in real time, tailored to each game situation and player action. This could make game worlds much more alive and engaging, as each NPC could become a unique personality with its own story and reactions.
In the future, the technology could evolve to integrate more complex emotional patterns, richer personality traits and even the ability to learn from interactions with players, creating ever more realistic and unpredictable virtual worlds that offer a unique experience to each player.