DeepMind Introduces A New Benchmark For Meta Reinforcement Learning

  • Alchemy is a 3D, first-person perspective video game implemented in the Unity game engine.

Recently, a team of researchers from DeepMind and University College London released Alchemy, a principled benchmark for meta-reinforcement learning (meta-RL) research. The benchmark combines structural richness with structural transparency.

Meta-reinforcement learning (meta-RL) has gained momentum over the last few years as an approach to increasing the flexibility and sample efficiency of reinforcement learning. Meta-RL is defined as any process that yields faster learning, on average, with each new draw from the task distribution.

According to the researchers, whereas deep reinforcement learning requires a single task, meta-RL requires a task distribution: a large set of tasks with a shared structure. Progress in this area has been held back by a scarcity of adequate benchmark tasks, and by existing benchmarks that are too ill-defined to support principled analysis. The researchers developed the new meta-RL benchmark to address these hurdles.

Behind Alchemy

The DeepMind Alchemy environment is a meta-reinforcement learning benchmark that presents tasks sampled from a task distribution with deep underlying structure. Alchemy is a 3D, first-person perspective video game implemented in the Unity game engine. According to the researchers, the benchmark was created to test the ability of agents to reason and plan via latent state inference, as well as useful exploration and experimentation.

Alchemy is highly structured, with a non-trivial latent causal structure, and playing it well requires knowledge-based experimentation and strategic action sequencing. Because the latent causal structure is resampled procedurally from episode to episode, the game affords structure learning, online inference, hypothesis testing and action sequencing based on abstract domain knowledge.

The researchers stated, “Because Alchemy levels are procedurally created based on a fully accessible generative process with a well-defined parameterisation, we are able to implement a Bayesian ideal observer as a gold standard for performance.”
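To make the idea of an ideal observer concrete, here is a minimal sketch of Bayesian inference over a small, fully enumerable set of candidate rule sets. It is a toy illustration only, not DeepMind's implementation: the potion colours, the three-axis stone encoding and every function name below are assumptions introduced for this example.

```python
from itertools import permutations, product

# Hypothetical toy setup (not Alchemy's real parameterisation):
# each potion moves a stone one step along its own latent axis.
POTIONS = ("red", "green", "blue")
AXES = (0, 1, 2)


def all_chemistries():
    """Enumerate every candidate rule set: one axis and one direction per potion."""
    chemistries = []
    for axes in permutations(AXES):
        for signs in product((-1, 1), repeat=len(POTIONS)):
            chemistries.append({p: (a, s) for p, a, s in zip(POTIONS, axes, signs)})
    return chemistries


def apply_potion(chemistry, stone, potion):
    """Return the stone that results from using `potion` under `chemistry`."""
    axis, sign = chemistry[potion]
    moved = list(stone)
    moved[axis] += sign
    return tuple(moved)


def uniform_prior():
    """Start with equal belief in every candidate chemistry."""
    chemistries = all_chemistries()
    return [(c, 1.0 / len(chemistries)) for c in chemistries]


def bayes_update(posterior, stone, potion, observed):
    """Bayes' rule with a deterministic (0 or 1) likelihood, then renormalise."""
    consistent = [(c, w) for c, w in posterior
                  if apply_potion(c, stone, potion) == observed]
    total = sum(w for _, w in consistent)
    if total == 0:
        raise ValueError("observation is impossible under every hypothesis")
    return [(c, w / total) for c, w in consistent]


# Two experiments on a stone at the origin of the toy feature space.
belief = uniform_prior()
belief = bayes_update(belief, (0, 0, 0), "red", (1, 0, 0))
belief = bayes_update(belief, (0, 0, 0), "green", (0, 0, -1))
print(len(belief), "hypotheses remain")
```

In this toy, two experiments cut the 48 candidate rule sets down to 2. This is possible only because the generative process can be enumerated, which is the property the researchers highlight in the quote above.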

How It Works

The Alchemy environment is played in a series of ‘trials’, which fit together into ‘episodes’. Within each trial, the goal is to use a set of potions to transform each of a collection of visually distinctive stones into more valuable forms, collecting points when the stones are dropped into a central cauldron. The value of each stone is tied to its perceptual features, but this relationship changes from episode to episode. Hence, the implicit challenge within each episode is to diagnose the current chemistry within the available time, and then to leverage this diagnosis to manufacture the most valuable stones possible.
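The trial and episode structure described above can be sketched as a toy simulation. This is an illustrative simplification, not the actual game logic: the potion names, the three-feature stone encoding, the clamping range and the scoring rule (every stone is dropped into the cauldron at the end of the trial) are all assumptions made for this example.

```python
import random

POTIONS = ("red", "green", "blue")  # hypothetical potion colours


def sample_chemistry(rng):
    """Resample the hidden rules once per episode (toy version)."""
    axes = [0, 1, 2]
    rng.shuffle(axes)
    # Each potion shifts one stone feature up or down by one step.
    effects = {p: (axis, rng.choice((-1, 1))) for p, axis in zip(POTIONS, axes)}
    # How each perceptual feature contributes to a stone's value.
    value_weights = [rng.choice((-1, 1)) for _ in range(3)]
    return effects, value_weights


def stone_value(stone, value_weights):
    """Value is a weighted sum of the stone's features."""
    return sum(f * w for f, w in zip(stone, value_weights))


def run_trial(effects, value_weights, stones, plan):
    """Apply each (stone_index, potion) step, then score all stones."""
    stones = [list(s) for s in stones]
    for idx, potion in plan:
        axis, sign = effects[potion]
        # Keep every feature inside the toy range [-1, 1].
        stones[idx][axis] = max(-1, min(1, stones[idx][axis] + sign))
    return sum(stone_value(s, value_weights) for s in stones)


# One episode: the chemistry is fixed across its trials, then resampled.
rng = random.Random(0)
effects, value_weights = sample_chemistry(rng)
for trial in range(3):
    score = run_trial(effects, value_weights, [(0, 0, 0)], [(0, "red"), (0, "blue")])
    print("trial", trial, "score", score)
```

Because `effects` and `value_weights` stay hidden from the agent and change between episodes, a fixed potion plan that scores well in one episode may score poorly in the next. That is what forces the agent to experiment first and exploit afterwards.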

Benefits Of Alchemy

The researchers said Alchemy brings two desirable features:

  • Structural Interestingness: Alchemy demands experimentation, structured inference and strategic action sequencing.
  • Structural Accessibility: This is conferred by Alchemy's explicitly defined generative process, which furnishes an interpretable prior and supports the construction of a Bayes-optimal reference policy.

Wrapping Up

To validate the 3D environment, the researchers evaluated two powerful reinforcement learning agents on Alchemy. In both cases, despite mastering the basic mechanical aspects of the task, neither agent showed any appreciable signs of meta-learning.

Alchemy thus proved to be a challenging benchmark for meta-RL, and the researchers expect it to be useful to the wider community. They open-sourced both the full 3D and symbolic versions of the Alchemy benchmark environment, along with a suite of benchmark policies, analysis tools, and episode logs on GitHub.

To use this benchmark environment, one needs Docker, Python 3.6.1 and an x86-64 CPU with SSE4.2 support. The benchmark is intended to be run on Linux and is not officially supported on Mac and Windows.
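Before installing, the requirements above can be checked with a short script. This is a rough sketch rather than an official installer check. It uses only the Python standard library, treats Python 3.6.1 as a minimum version (an assumption), and reads `/proc/cpuinfo`, which exists only on Linux. The function names are made up for this example.

```python
import platform
import shutil
import sys
from pathlib import Path


def python_ok(minimum=(3, 6, 1)):
    """True if the running interpreter is at least `minimum` (assumed to be a floor)."""
    return sys.version_info[:3] >= minimum


def docker_on_path():
    """True if a `docker` executable is found on PATH (does not test the daemon)."""
    return shutil.which("docker") is not None


def cpu_has_sse42():
    """True/False from /proc/cpuinfo flags, or None if it cannot be determined."""
    cpuinfo = Path("/proc/cpuinfo")
    if not cpuinfo.is_file():
        return None
    try:
        text = cpuinfo.read_text()
    except OSError:
        return None
    # Linux lists the SSE4.2 CPU flag as the token "sse4_2".
    return "sse4_2" in text.split()


def report():
    """Collect all checks into one dictionary."""
    return {
        "linux": platform.system() == "Linux",
        "x86_64": platform.machine() in ("x86_64", "AMD64"),
        "python": python_ok(),
        "docker": docker_on_path(),
        "sse4_2": cpu_has_sse42(),
    }


if __name__ == "__main__":
    for name, result in report().items():
        print(f"{name:8s} {result}")
```

A `None` for `sse4_2` means the script could not tell, not that the feature is missing. Finding `docker` on the PATH likewise does not guarantee that the Docker daemon is running.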

Click here to install Alchemy.


Copyright Analytics India Magazine Pvt Ltd