# Hands-On Guide to CompilerGym: A Reinforcement Learning Toolkit for Compiler Optimizations

CompilerGym is a Python toolkit by Facebook: a reinforcement learning package for compiler optimization problems. The framework's motivation is that a compiler's decisions have a large impact on performance and must be efficient for the target software. Applying AI to compiler optimization is a growing field, but the dynamic nature of compilers makes experimentation difficult. The key idea is to let AI researchers experiment with compiler optimization methods without diving into compiler internals, and to help compiler developers explore new AI-driven optimization problems.

## Vision of CompilerGym


The vision is to ease program optimization without writing a single line of C/C++. The goals of CompilerGym are:

1. Create an “ImageNet for Compilers”: an open-source OpenAI Gym environment for experimenting with compiler optimizations using real-world software and datasets.
2. Provide a common platform for comparing different compiler optimization techniques.
3. Give users control over all the decisions that the compiler makes (see the roadmap for details).
4. Make CompilerGym ready for easy deployment, making production use more efficient.

## Key Concepts in CompilerGym

CompilerGym frames compiler optimization problems as environments for reinforcement learning, using the OpenAI Gym interface to the standard agent-environment loop.

In this loop:

• Environment: defines a compiler optimization task. The environment contains an instance of the compiler and the program being compiled. Each agent-environment interaction can change the compiler's current state.
• Action Space: the set of actions that can be applied to the current environment state.
• Observation: a view of the current environment state.
• Reward: a metric indicating how well the previous action performed.

A single instance of the agent-environment loop represents the compilation of one particular program; the goal is to develop an agent that maximizes the cumulative reward from these environments.
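The loop above can be sketched with a minimal stand-in environment. This is a hypothetical toy that mimics the Gym-style `reset()`/`step()` protocol CompilerGym follows; it is not the real CompilerGym API.

```python
import random

class ToyCompilerEnv:
    """Hypothetical stand-in mimicking the Gym-style protocol;
    NOT the real CompilerGym API."""

    def __init__(self, n_actions=5, horizon=10):
        self.actions = list(range(n_actions))  # the action space
        self.horizon = horizon                 # steps per episode

    def reset(self):
        self.steps = 0
        return 0  # initial observation of the "compiler state"

    def step(self, action):
        self.steps += 1
        observation = self.steps            # new view of the state
        reward = random.uniform(-1, 1)      # quality of the last action
        done = self.steps >= self.horizon   # episode ends at the horizon
        return observation, reward, done, {}

env = ToyCompilerEnv()
env.reset()
episode_reward = 0.0
done = False
while not done:
    action = random.choice(env.actions)              # agent picks an action
    observation, reward, done, info = env.step(action)
    episode_reward += reward                         # accumulate reward
```

A learning agent would replace the random action choice with a policy that uses the observations and rewards.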

## Installation

CompilerGym can be installed via PyPI.

!pip install compiler_gym

For installation from other sources, check the official tutorial.

## Structure of CompilerGym Code

import gym
import compiler_gym                     # imports the CompilerGym environments
env = gym.make("llvm-autophase-ic-v0")  # starts a new environment
env.reset()                             # starts a new compilation session with a random program
env.render()                            # prints the IR of the program
env.step(env.action_space.sample())     # applies a random optimization, updates state/reward/actions 

## Demo – CompilerGym Basics

1. Import the compiler_gym library. Importing compiler_gym automatically registers the compiler environments.

import gym
import compiler_gym

Its version can be checked by:

compiler_gym.__version__

We can list all the available environments in CompilerGym with the command below:

compiler_gym.COMPILER_GYM_ENVS

2. Select an environment.

CompilerGym environments are named in one of the following formats:

• <compiler>-<observation>-<reward>-<version>
• <compiler>-<reward>-<version>
• <compiler>-<version>

where:

• <compiler> is the compiler optimization task
• <observation> is the default observation provided and,
• <reward> is the reward signal

Check compiler_gym.views for more details. For this demo, the llvm-autophase-ic-v0 environment will be used.

You can create an instance of this environment by using the code below:

env = gym.make("llvm-autophase-ic-v0")
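The naming scheme above can be illustrated with a small parser. This is a hypothetical helper for illustration only, and it assumes each name component contains no hyphens:

```python
def parse_env_id(env_id):
    """Split a CompilerGym-style environment ID into named parts.
    Hypothetical helper; assumes no component contains a hyphen."""
    parts = env_id.split("-")
    if len(parts) == 4:
        keys = ("compiler", "observation", "reward", "version")
    elif len(parts) == 3:
        keys = ("compiler", "reward", "version")
    elif len(parts) == 2:
        keys = ("compiler", "version")
    else:
        raise ValueError(f"unrecognized environment ID: {env_id}")
    return dict(zip(keys, parts))

print(parse_env_id("llvm-autophase-ic-v0"))
# {'compiler': 'llvm', 'observation': 'autophase', 'reward': 'ic', 'version': 'v0'}
```

Applied to the demo environment, this reads: LLVM is the compiler task, Autophase the default observation, and ic the reward signal.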

3. Installing Benchmarks

In CompilerGym, the input programs are known as benchmarks, and collections of benchmarks are grouped into datasets. You can use a pre-defined benchmark or create your own.

The benchmarks available to the current environment can be queried through env.benchmarks:

env.benchmarks

It returns an empty list if no benchmarks are installed. You can also use predefined programs. For this example, we will use the NAS Parallel Benchmarks dataset:

env.require_dataset("npb-v0")
env.benchmarks

4. The compiler environment
• The CompilerGym environment is very similar to the OpenAI Gym environment. You can check the documentation of any method via the help() function. For example:

help(env.reset)

• Action Space: CompilerGym exposes the action space as env.action_space.
• Observation Space: The observation space is described by env.observation_space.
• The upper and lower bounds of the reward signal are described by env.reward_range.
• Before using a CompilerGym environment, call env.reset() to reset the environment state.
5. Interaction with the environment: it is the same as with an OpenAI Gym environment.

To print the Intermediate Representation (IR) of the program in the current state, we use:

env.render()

env.step() runs an action. It returns four values: a new observation, a reward, a boolean indicating whether the episode has ended, and a dictionary of additional information.

observation, reward, done, info = env.step(0)

An example of optimization is shown below, where the reward indicates the reduction in code size compared to the previous state. A cumulative reward greater than one means the sequence of optimizations performed yields better results than LLVM’s default optimizations. Let’s run 100 random actions and see how close we can get:

env.reset(benchmark="benchmark://npb-v0/50")
episode_reward = 0
for i in range(1, 101):
    observation, reward, done, info = env.step(env.action_space.sample())
    if done:
        break
    episode_reward += reward
    print(f"Step {i}, quality={episode_reward:.2%}")
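A single random episode like this is a weak baseline; random search usually repeats the episode several times and keeps the best result. A minimal sketch, using a hypothetical stand-in reward stream rather than the real environment (which would supply rewards via env.step()):

```python
import random

def run_episode(n_steps=100, seed=None):
    """Simulate one random episode with a stand-in reward stream;
    hypothetical: the real rewards would come from env.step()."""
    rng = random.Random(seed)
    # small per-step rewards, mildly positive on average
    return sum(rng.uniform(-0.01, 0.03) for _ in range(n_steps))

# random search: keep the best cumulative reward over 20 episodes
best = max(run_episode(seed=s) for s in range(20))
```

With the real environment, each episode would call env.reset() and replay random actions, recording the action sequence that achieved the best cumulative reward.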

The sequence of actions applied so far can be reproduced from the command line; env.commandline() prints the equivalent command:

env.commandline()

You can also save the compiled program for later use:

env.write_bitcode("/tmp/program.bc")
!file /tmp/program.bc

Finally, never forget to close the environment, which ends that particular compiler instance:

env.close()
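To guarantee cleanup even when an error interrupts the session, the standard Python patterns apply. A sketch with a hypothetical stand-in class (Gym-style environments generally support the with statement, but verify this for your version):

```python
class StandInEnv:
    """Hypothetical stand-in illustrating cleanup; not the real API."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        self.close()  # always release the compiler instance

# try/finally guarantees close() runs even on error
env = StandInEnv()
try:
    pass  # interact with the environment here
finally:
    env.close()

# or, equivalently, as a context manager
with StandInEnv() as env2:
    pass  # env2.close() is called automatically on exit
```

Either pattern ensures the underlying compiler instance is released when the session ends.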

## Conclusion

In this article, we have covered CompilerGym, a reinforcement learning toolkit for compiler optimization. The basic code structure and usage are described in the sections above. A Colab notebook for the demo, along with the official code and documentation, is available from the CompilerGym GitHub repository.
