DeepMind recently released a framework that will enable the creation of AI agents that can understand human instructions and perform actions.
Existing AI frameworks have been criticised for ignoring the situational understanding inherent in how humans use language. For example, DALL·E 2, the text-to-image generator, received a lot of flak for failing to understand the syntax of text prompts. For a simple prompt like ‘a spoon on a cup’, the output would typically contain a spoon and a cup without reflecting the relationship between the two stated in the text.
To overcome this problem and build agents that can follow instructions and safely perform actions in open-ended conditions, the researchers at DeepMind created a new model within a video game environment.
The new framework moves away from training AI agents against a score of wins and losses calculated by computer code, as is done in games like StarCraft and Dota. Instead, people create tasks and score the AI agents based on their behaviour.
Although still in its infancy, the new research paradigm produces agents that can, in real time, navigate, talk, interact with people, search for information, ask questions, control items, and do various other tasks.
The game is modelled on a child’s “playhouse”, in which humans and agents each control an avatar that can interact with other avatars and manipulate the objects around it. The framework has four steps. First, human-human interaction provides the data for training initial agents by imitation learning. Next comes a cycle of human-agent interaction, human judgement of the agents’ performance, and optimisation against these judgements, which improves the agents through reinforcement learning (RL).
The new research is built on DeepMind’s previously published work demonstrating the role of imitation learning in creating AI agents that can capture the diversity of human behaviour well. In the new work, a reinforcement learning model is used to improve AI systems based on scores obtained via human evaluation.
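The loop described above (imitate human demonstrations, collect human judgements of agent behaviour, then improve the agent against those judgements) can be illustrated with a toy sketch. This is not DeepMind's implementation: the tabular policy, the state and action names, the averaging "reward model" and the exponential reweighting step are all assumptions chosen to keep the example small and runnable.

```python
import math
from collections import Counter, defaultdict

# Hypothetical action set, for illustration only.
ACTIONS = ["pick_up", "put_down", "ask"]

def behaviour_clone(demos):
    """Imitation learning as smoothed counting over (state, action) demos."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    policy = {}
    for state, c in counts.items():
        total = sum(c.values()) + len(ACTIONS)  # add-one smoothing
        policy[state] = {a: (c[a] + 1) / total for a in ACTIONS}
    return policy

def fit_reward_model(judgements):
    """Average the human scores (1 = good, 0 = bad) per (state, action)."""
    sums, n = defaultdict(float), defaultdict(int)
    for state, action, score in judgements:
        sums[(state, action)] += score
        n[(state, action)] += 1
    return {key: sums[key] / n[key] for key in sums}

def improve_policy(policy, reward, temperature=0.5, default=0.5):
    """Reweight each action by exp(reward / temperature) and renormalise.

    Unjudged (state, action) pairs fall back to a neutral default reward.
    """
    improved = {}
    for state, probs in policy.items():
        weights = {
            a: p * math.exp(reward.get((state, a), default) / temperature)
            for a, p in probs.items()
        }
        z = sum(weights.values())
        improved[state] = {a: w / z for a, w in weights.items()}
    return improved

# Made-up human demonstrations: (state, action) pairs.
demos = [("see_cup", "pick_up")] * 3 + [("see_cup", "ask")] * 2
policy = behaviour_clone(demos)

# Made-up human judgements of the agent: (state, action, score).
judgements = [
    ("see_cup", "pick_up", 1),
    ("see_cup", "pick_up", 1),
    ("see_cup", "ask", 0),
    ("see_cup", "put_down", 0),
]
reward = fit_reward_model(judgements)
improved = improve_policy(policy, reward)

print(policy["see_cup"])
print(improved["see_cup"])
```

In this sketch, the action that human judges scored well gains probability relative to the imitation-only policy. A real system would use learned neural policies and reward models rather than lookup tables.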
The framework has also been described as useful for building digital and robotic assistants, and for creating safe AI.