Can Reinforcement Learning Be Used For Better Economic Policies?

AI Economist combines machine learning and AI-driven economic simulation to overcome current challenges.

Income inequality is one of the major problems in economics, and taxation is one of the main tools policymakers use to address it. In the simplest terms, the government collects money from people according to their income and redistributes it, either directly or indirectly. But designing the best tax policy is a major challenge; despite economists’ long efforts, it remains an open problem.
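The collect-and-redistribute idea described above can be sketched in a few lines of Python. This is only an illustration: the bracket thresholds, the marginal rates, and the equal per-person rebate are assumptions made for the example, not figures or rules taken from the study.

```python
from typing import List, Sequence, Tuple

# Hypothetical brackets: (lower bound of the bracket, marginal rate).
# These numbers are assumptions for illustration only.
BRACKETS: List[Tuple[float, float]] = [
    (0.0, 0.10),
    (20_000.0, 0.20),
    (50_000.0, 0.40),
]


def tax_owed(income: float, brackets: Sequence[Tuple[float, float]] = BRACKETS) -> float:
    """Marginal-rate tax: each slice of income is taxed at its own bracket's rate."""
    owed = 0.0
    for i, (lower, rate) in enumerate(brackets):
        # The bracket ends where the next one starts; the last bracket is open-ended.
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if income > lower:
            owed += (min(income, upper) - lower) * rate
    return owed


def redistribute(incomes: Sequence[float]) -> List[float]:
    """Collect tax from everyone, then hand the total back as an equal rebate."""
    taxes = [tax_owed(x) for x in incomes]
    rebate = sum(taxes) / len(incomes)
    return [x - t + rebate for x, t in zip(incomes, taxes)]
```

Calling `redistribute([10_000, 60_000])` leaves total income unchanged but moves money from the higher earner to the lower earner. Which brackets and rates produce the best overall outcome is the open question the article is concerned with.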

Economic methodology is limited by a lack of counterfactual data, simplistic behavioural models, and few opportunities to experiment with policies. Machine learning-based economic simulation could be a powerful framework for policy and mechanism design that helps overcome these limitations.

To this end, researchers from institutions including Salesforce and Harvard University have attempted to design an optimal economic policy via two-level deep reinforcement learning.

AI Economist using Deep Reinforcement Learning

Policy optimisation poses a mechanism design challenge: the government aims to find a policy under which the rational behaviour of the affected economic agents yields the desired social outcome. However, theoretical approaches to policy design are limited by analytical tractability and fail to capture the complexity of the real world.

While machine learning and computational techniques for automated mechanism design hold promise for overcoming these challenges, there has so far been no general computational approach to policy design. Such an approach has to solve a highly non-stationary, two-level, sequential decision-making problem in which all the actors are learning: economic agents learn rational, utility-maximising behaviours, while the government learns to optimise its objective through its policy choices.

The authors of the study, “Optimal Economic Policy Design via Two-level Deep Reinforcement Learning”, introduce a new framework called the AI Economist. It builds on AI-driven economic simulations and two-level reinforcement learning as a new paradigm for economic policy design.

The study shows that AI-driven simulations can capture features of real-world economies without hand-crafted behavioural rules or simplifications made for analytical tractability. The researchers used a single-step economy and a multi-step, micro-founded economic simulation called Gather-Trade-Build. The latter features multiple heterogeneous economic agents in a two-dimensional spatial environment, includes trading between agents, and simulates the economy over extended periods of time. It serves as a rich testbed for AI-driven policy design and is more complex than traditional tax frameworks.

The AI Economist uses two-level deep reinforcement learning: learning happens at the level of individual agents within the economy and at the level of the social planner. Both the agents and the social planner use deep neural networks to implement their policy models. Two-level RL is natural in many contexts, including mechanism design, the principal-agent problem, and regulating systems with unethical incentives. The system compares the performance of billions of economic designs.
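The two-level structure can be illustrated with a deliberately small toy model. To be clear about what is assumed: this sketch does not use reinforcement learning or neural networks. The inner level is replaced by a closed-form best response, the outer level by a grid search, and the skill values, the quadratic labour cost, the flat tax with an equal rebate, and the log-consumption objective are all hypothetical choices made for the example.

```python
import math
from typing import Iterable, List, Sequence

# Hypothetical agent skill levels (income earned per unit of labour).
SKILLS: List[float] = [1.0, 2.0, 4.0]


def best_labour(skill: float, tax_rate: float) -> float:
    """Inner level: an agent's best response to a given flat tax rate.

    Assumed utility: (1 - t) * skill * l + rebate - l**2 / 2, with the rebate
    treated as fixed by each agent. Setting the derivative to zero gives
    l = (1 - t) * skill.
    """
    return (1.0 - tax_rate) * skill


def planner_objective(tax_rate: float, skills: Sequence[float] = SKILLS) -> float:
    """Outer level: the planner's score once agents have adapted to the tax.

    Assumed objective: sum of log consumption, which rewards both higher
    output and a more even split of it.
    """
    incomes = [s * best_labour(s, tax_rate) for s in skills]
    rebate = tax_rate * sum(incomes) / len(incomes)
    consumption = [(1.0 - tax_rate) * y + rebate for y in incomes]
    return sum(math.log(c) for c in consumption)


def best_tax_rate(candidates: Iterable[float]) -> float:
    """Grid search standing in for the planner's learning process."""
    return max(candidates, key=planner_objective)
```

Searching a grid such as `[i / 20 for i in range(20)]` with `best_tax_rate` picks an interior rate: a higher tax evens out consumption but also reduces how much the agents choose to work. In the framework the article describes, both levels are instead learned simultaneously, which is what makes the problem non-stationary.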

The AI Economist uses learning curricula and entropy-based regularisation to provide a tractable and scalable solution to the two-level problem. This approach stabilises training using two assumptions:

  • The agent and social planner should be encouraged to explore and co-adapt.
  • Agents should not face high utility costs that discourage exploration during learning.
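As a rough illustration of how entropy-based regularisation encourages exploration, the sketch below adds an entropy bonus to a policy's expected return. The function names and the `entropy_coef` weight are hypothetical, and this is the generic form of the technique rather than the exact loss used in the study.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn action scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; largest when all actions are equally likely."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularised_objective(
    expected_return: float, logits: Sequence[float], entropy_coef: float
) -> float:
    """Expected return plus an entropy bonus weighted by entropy_coef."""
    return expected_return + entropy_coef * entropy(softmax(logits))
```

With a positive `entropy_coef`, a policy that spreads probability over several actions scores higher than an equally rewarded policy that has collapsed onto one action, so the learner is nudged to keep exploring while the other level is still adapting.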

This approach offers the following advantages:

  • By design, it considers actors that co-adapt with economic policy. Because of this, it does not suffer from the Lucas critique, which holds that one cannot predict the effects of an economic policy change based only on relationships observed in historical data.
  • The use of reinforcement learning provides rational agent behaviour.
  • Since the simulation framework is flexible, it supports a configurable number of agents and offers various choices in economic processes.
  • The designer can choose any policy objective, and it needn’t be analytically tractable or differentiable.
  • The use of RL does not require knowledge of the simulation or of economic theory.
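To make the point about policy objectives concrete, here is one objective of that kind: a Gini-based equality score multiplied by total income. It relies on absolute differences, so it is not smooth everywhere, yet it can still be used as a reward signal. The choice of this particular objective, and the function names, are assumptions for illustration; the article does not say which objective the researchers optimised.

```python
from typing import Sequence


def gini(incomes: Sequence[float]) -> float:
    """Gini index via the mean absolute difference (0 means perfectly equal)."""
    n = len(incomes)
    total = sum(incomes)
    if n == 0 or total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in incomes for b in incomes)
    return diff_sum / (2 * n * total)


def equality(incomes: Sequence[float]) -> float:
    """Rescaled so that 1 is perfect equality and 0 is one agent holding everything."""
    n = len(incomes)
    if n < 2:
        return 1.0
    return 1.0 - gini(incomes) * n / (n - 1)


def social_welfare(incomes: Sequence[float]) -> float:
    """Hypothetical planner objective: equality multiplied by total income."""
    return equality(incomes) * sum(incomes)
```

Under this objective, `social_welfare([5, 5])` scores higher than `social_welfare([0, 10])` even though total income is the same, so a planner rewarded with it has a reason to trade some productivity for a more even distribution.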

Previous Solutions

This isn’t the first time Salesforce has explored the idea of an AI Economist. In fact, the team introduced it in 2020. That version applied reinforcement learning to tax research to provide a simulation-based, data-driven solution. It used a collection of AI agents designed to simulate how real people might react to different taxes.

In the simulation, each agent earned money by collecting and trading resources and building houses. The agents here maximise their utility by adjusting movement, building, and trading behaviour. Simultaneously, the AI Economist optimises taxes and subsidies to promote global objectives.

Shraddha Goled
I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at