The use of machine learning in video games to generate images and situations is almost yesterday’s news. However, using video games as a platform to improve machine learning algorithms, such as those used in reinforcement learning, is still an active area of research. The usual strategy is to teach an agent to complete a task or win a game, reward it for doing so, and use those rewards to incentivise more wins. In a similar vein, researchers at Microsoft propose a novel approach that uses interactive games to make machines better at understanding language.
From a machine learning perspective, Interactive Fiction games exist at the intersection of natural language processing and sequential decision making. Like many NLP tasks, they require natural language understanding. Unlike most NLP tasks, however, Interactive Fiction games are sequential decision-making problems: actions change the subsequent world states of the game, and choices made early in a game may have long-term effects on the eventual ending.
Interactive Fiction (IF) games are fully text-based simulation environments where a player issues text commands to effect change in the environment and progress through the story. IF games combine the challenges of combinatorial action spaces, language understanding, and commonsense reasoning.
Interactive Fiction games are rich narrative adventures that challenge even skilled human players. In contrast to other video game environments, IF games stress natural language understanding and commonsense reasoning, and feature combinatorial action spaces. To aid in the study of these environments, researchers have introduced Jericho, an experimental platform whose key feature is the extraction of game-specific action templates and vocabulary.
Overview Of Jericho
Jericho is an open-source, Python-based IF environment that provides an OpenAI-Gym-like interface for connecting learning agents to IF games. Jericho supports a set of human-made IF games that cover a variety of genres: dungeon crawl, sci-fi, mystery, comedy, and horror. Games were selected from classic Infocom titles such as Zork and Hitchhiker’s Guide to the Galaxy, as well as newer, community-created titles like Anchorhead and Afflicted. Supported games use a point-based scoring system, which serves as the agent’s reward.
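To make the idea of a Gym-like text interface concrete, here is a minimal sketch of the reset/step loop such an environment exposes. The `ToyTextEnv` class, its two rooms, and its point values are invented for illustration; this is not Jericho's API or any of its supported games, only the general shape of the interaction, where the game score acts as the reward.

```python
class ToyTextEnv:
    """Hypothetical two-room text game (illustration only, not part of Jericho)."""

    def reset(self):
        # Start a new episode and return the first text observation.
        self.room = "hall"
        self.has_key = False
        return "You are in a hall. A key lies on the floor. A door leads north."

    def step(self, action):
        """Apply a text command and return (observation, reward, done)."""
        reward, done = 0, False
        if action == "take key" and self.room == "hall" and not self.has_key:
            self.has_key = True
            reward = 5  # points for picking up the key
            obs = "Taken."
        elif action == "go north" and self.room == "hall":
            if self.has_key:
                self.room = "vault"
                reward, done = 10, True  # points for finishing the game
                obs = "You unlock the door and enter the vault."
            else:
                obs = "The door is locked."
        else:
            obs = "Nothing happens."
        return obs, reward, done


def run_episode(env, actions):
    """Feed a fixed list of text commands to the environment; return the total reward."""
    env.reset()
    total = 0
    for action in actions:
        _, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    print(run_episode(ToyTextEnv(), ["go north", "take key", "go north"]))
```

A learning agent would replace the fixed action list with a policy that reads each observation and chooses the next command.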
Using the extracted templates and vocabulary, the researchers propose a novel template-based action space, which reduces the complexity of full-scale language generation. Building on this space, they introduce the Template-DQN agent (TDQN), which generates actions by first selecting a template and then filling in the blanks with words from the vocabulary.
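The template-filling step can be sketched in a few lines. The example below assumes templates are plain strings with an `OBJ` placeholder for each blank; the placeholder name, the sample templates, and the tiny vocabulary are all illustrative assumptions. It only enumerates the action space and does not attempt to reproduce how TDQN scores templates and words.

```python
import itertools


def fill_template(template, words):
    """Replace each 'OBJ' slot, left to right, with the next word from `words`."""
    remaining = list(words)
    filled = []
    for token in template.split():
        if token == "OBJ":
            filled.append(remaining.pop(0))
        else:
            filled.append(token)
    return " ".join(filled)


def enumerate_actions(templates, vocab):
    """List every action obtainable by filling each template's blanks from vocab."""
    actions = []
    for template in templates:
        num_blanks = template.split().count("OBJ")
        # product(..., repeat=0) yields one empty tuple, so blank-free templates
        # such as "north" contribute exactly one action.
        for combo in itertools.product(vocab, repeat=num_blanks):
            actions.append(fill_template(template, combo))
    return actions


if __name__ == "__main__":
    templates = ["take OBJ", "put OBJ in OBJ", "north"]  # hypothetical templates
    vocab = ["key", "box"]                               # hypothetical vocabulary
    for action in enumerate_actions(templates, vocab):
        print(action)
```

An agent working in this space makes two smaller choices (which template, then which words) instead of generating a sentence word by word.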
How NLU Systems Can Benefit
Combinatorial spaces pose extremely difficult exploration problems for existing agents. For example, an agent generating a four-word sentence from a modest vocabulary of 700 words is effectively exploring a space of 700^4 ≈ 240 billion possible actions.
Knowledge representation, in turn, poses a mapping problem. Because many games contain a large number of locations, human players often draw maps to navigate efficiently and avoid getting lost.
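The arithmetic is easy to check, and it also shows why a template-based space is so much smaller. In the sketch below, the vocabulary size and sentence length come from the example above; the template count (200) and the limit of two blanks per template are assumptions chosen for illustration.

```python
vocab_size = 700
sentence_length = 4

# Free-form generation: any of 700 words in each of 4 positions.
free_form_actions = vocab_size ** sentence_length

# Template-based generation (assumed figures): pick one of 200 templates,
# then fill at most two blanks from the same vocabulary. This is an upper
# bound, since templates with fewer blanks yield fewer actions.
num_templates = 200
max_blanks = 2
template_actions = num_templates * vocab_size ** max_blanks

print(f"{free_form_actions:,}")   # 240,100,000,000
print(f"{template_actions:,}")    # 98,000,000
```

Under these assumptions, the template space is smaller than the free-form space by a factor of 2,450.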
In particular, because connectivity between locations is not necessarily Euclidean, agents need to detect when a navigational action has succeeded or failed and whether the location reached was previously seen or new.
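A minimal sketch of that bookkeeping follows. It assumes the agent can already obtain a stable identifier for its current location, which in a real game it would have to infer from the text; the `LocationMap` class and the location names are hypothetical.

```python
class LocationMap:
    """Hypothetical map builder that records which action leads where."""

    def __init__(self, start):
        self.visited = {start}
        self.edges = {}  # (from_location, action) -> to_location

    def update(self, prev, action, curr):
        """Record one navigation attempt and classify its outcome."""
        if curr == prev:
            return "failed"  # the agent did not move
        self.edges[(prev, action)] = curr
        if curr in self.visited:
            return "revisited"  # moved, but to a known location
        self.visited.add(curr)
        return "new"


if __name__ == "__main__":
    world = LocationMap("house")
    print(world.update("house", "north", "forest"))   # new
    print(world.update("forest", "north", "forest"))  # failed
    # Non-Euclidean layout: going east from the forest leads back to the house.
    print(world.update("forest", "east", "house"))    # revisited
```

Storing edges per (location, action) pair, rather than as grid coordinates, is what lets the map cope with layouts where going north and then east can return the player to the starting point.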
Reinforcement learning has studied agents that operate in discrete or continuous action space environments. However, IF games require the agent to operate in the combinatorial action space of natural language.
Interactive fiction (IF) games are software environments in which players observe textual descriptions of the simulated world, issue text actions, and receive score as they progress through the story.
Here are a few key takeaways, according to the authors of the original paper:
- The researchers introduce Jericho, a learning environment for human-made IF games.
- They introduce a template-based action space that is appropriate for language generation.
- They conduct an empirical evaluation of learning agents across a large set of human-made games.
Beyond games, real-world applications such as voice-activated personal assistants can also benefit from advances in these capabilities at the intersection of natural language understanding, natural language generation, and sequential decision making.
These real-world applications require the ability to reason with ungrounded natural language (unlike multimodal environments, which provide visual grounding for language), and IF games provide an excellent suite of environments for tackling these challenges.
Read the original paper here.