Building on the success of AlphaStar, Google DeepMind has introduced the Scalable Instructable Multiworld Agent (SIMA), a generalist AI agent for 3D virtual settings. The system can carry out tasks in a range of video game environments by following natural-language instructions, marking a shift towards a more general approach in AI gaming research.
Read the full paper here.
Training Method
Google DeepMind collaborated with eight game studios to train and test SIMA on nine different video games, such as No Man’s Sky by Hello Games and Teardown by Tuxedo Labs. Each game in SIMA’s portfolio opens up a new interactive world with its own range of skills to learn, from simple navigation and menu use to mining resources, flying a spaceship, or crafting a helmet.
Google DeepMind also used four research environments, including a new environment built with Unity called the Construction Lab. In this environment, agents must build sculptures from building blocks, which tests their object manipulation and intuitive understanding of the physical world.
The agent combines pre-trained vision models with a central model equipped with memory, which interprets image and language inputs to generate keyboard and mouse actions for gameplay.
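To make that input-to-output flow concrete, here is a minimal Python sketch. It is not SIMA's implementation: every name in it (`Action`, `encode_frame`, `encode_text`, `ToyAgent`) is hypothetical, the encoders are trivial stand-ins for pre-trained models, and the policy is a hand-written rule rather than a learned one. It only illustrates the shape of the loop, in which a screen frame and a text instruction go in and a keyboard and mouse action comes out.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Tuple


@dataclass
class Action:
    """Hypothetical action type: keys held this step plus relative mouse motion."""
    keys: List[str]
    mouse_delta: Tuple[int, int]


def encode_frame(frame: List[List[int]]) -> float:
    """Stand-in for a pre-trained vision encoder: mean pixel intensity."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def encode_text(instruction: str) -> List[str]:
    """Stand-in for a language encoder: lowercase whitespace tokens."""
    return instruction.lower().split()


@dataclass
class ToyAgent:
    # Bounded memory of recent frame features (size 8 is an arbitrary assumption).
    memory: Deque[float] = field(default_factory=lambda: deque(maxlen=8))

    def step(self, frame: List[List[int]], instruction: str) -> Action:
        # Store the encoded frame; a learned model would condition on this
        # history, whereas the toy rule below ignores it.
        self.memory.append(encode_frame(frame))
        tokens = encode_text(instruction)
        keys = ["w"] if "forward" in tokens else []
        if "right" in tokens:
            dx = 5
        elif "left" in tokens:
            dx = -5
        else:
            dx = 0
        return Action(keys=keys, mouse_delta=(dx, 0))


agent = ToyAgent()
frame = [[0, 255], [255, 0]]  # a tiny 2x2 grayscale "screenshot"
action = agent.step(frame, "Go forward and turn right")
print(action)
```

The point of the sketch is the interface: the agent sees only pixels and text and acts only through keyboard and mouse outputs, which is what lets the same loop run across different games.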
SIMA’s performance relies on language
SIMA has been evaluated across 600 basic skills spanning navigation, object interaction, and menu use, with a focus on simple tasks that can be completed in about 10 seconds. An agent trained across many games outperforms specialised agents trained on a single game, because it generalises knowledge across environments, and SIMA also shows competence in games it has not seen before.
The agent’s reliance on language is evident in control tests, in which withholding language instructions significantly changes its behaviour. SIMA’s ability to follow instructions is assessed on nearly 1,500 unique in-game tasks, with human judges contributing to the evaluations.
With SIMA, Google DeepMind aims to pioneer a new breed of generalist AI agents driven by language input that can be useful across diverse tasks, both online and in the real world.
Although still at an early stage, this research lays the groundwork for future AI systems. It aims to bridge the gap between virtual environments and practical applications, ultimately leading to more helpful and adaptable AI.