Is Reproducibility In AI A Big Deal?

Any decent research paper will consist of the following:

  1. A how-to part that describes the procedure so that it can be repeated
  2. What to do to get results that are consistent in fresh experiments

The same holds true for machine learning research papers. However, the last decade has seen a steep rise in the number of publications per year. A few hundred papers are being published every day, and keeping track of them has itself become a challenge.


Checking these papers, or at least those that claim state-of-the-art status, is a tedious job. Edward Raff, a machine learning researcher at Booz Allen Hamilton, shouldered the gigantic responsibility of testing papers for reproducibility.

His work, the culmination of eight years of research, covered 255 papers published between 1984 and 2017. Raff presented his paper, in which he compiled his findings, at the prestigious NeurIPS conference last year, and he also released the findings on a popular portal for machine learning.

Overview Of Raff’s Experiment

Here are a few excerpts from Raff’s paper on reproducibility and the factors that were considered in selecting the papers:

  • The papers selected for checking reproducibility proposed at least one new algorithm or method that was the subject of reproduction.
  • A paper was excluded from the analysis if the available source code for it had been successfully reproduced before.
  • A paper was excluded if its authors had any significant relationship with the reproducers (for example, academic advisor, coworker or close friend), because the ability to communicate more regularly could bias the results.
  • A paper was regarded as reproducible if the majority (75% or more) of its claims could be confirmed with code written independently.

Raff was left with 255 papers, of which 162 (63.5%) were successfully replicated, and 93 were not.
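The 75% criterion and the replication rate quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `is_reproducible` and `replication_rate` are hypothetical and do not come from Raff's paper, and counting claims as simple integers is an assumption made for the example.

```python
from fractions import Fraction


def is_reproducible(confirmed_claims: int, total_claims: int,
                    threshold: Fraction = Fraction(3, 4)) -> bool:
    """Return True if at least `threshold` of a paper's claims were confirmed.

    Hypothetical helper: the 75% threshold is taken from the article,
    everything else is an assumption made for illustration.
    """
    if total_claims <= 0:
        raise ValueError("total_claims must be positive")
    return Fraction(confirmed_claims, total_claims) >= threshold


def replication_rate(reproduced: int, total: int) -> float:
    """Percentage of papers that were successfully reproduced."""
    return 100 * reproduced / total


# Counts reported in the article.
reproduced, not_reproduced = 162, 93
total = reproduced + not_reproduced
rate = replication_rate(reproduced, total)

print(f"{reproduced} of {total} papers reproduced: {rate:.1f}%")
print("3 of 4 claims confirmed ->", is_reproducible(3, 4))
print("2 of 4 claims confirmed ->", is_reproducible(2, 4))
```

Using `Fraction` for the threshold avoids floating-point edge cases when a paper sits exactly on the 75% boundary.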

Importance of Being Independently Reproducible

Before we go further, we need to understand what reproducibility in the context of machine learning really means. 

A work is said to be reproducible when a reader who follows the procedure listed in the paper ends up with the same results as shown in the original work. Many machine learning papers nowadays come with code, which lets developers validate the paper directly.

However, the heavy emphasis on papers with code has also raised an essential question: why should the code be needed at all? A paper, when appropriately explained, should be enough for a reader to devise a procedure that gives the same results. This is now called independently reproducible work.

Independent reproducibility has another advantage: reimplementing a method from scratch can lead to a more efficient solution to the same problem.

According to Raff, his findings can be summarised as follows:

  • More math in a paper does not help reproducibility.
  • Open-sourcing of code is at best a weak indicator of reproducibility.
  • Too many simplifications and analogies can hamper reproducibility.

Science Of Doing Meta Science

Late last year, machine learning researcher Joelle Pineau brought the whole ML community's attention to reproducibility. In an interview published by Nature, Pineau discussed the problem in detail.

In reinforcement learning, says Pineau, if you do two runs of some algorithms with different initial random settings, you can get very different results. And if you do a lot of runs, you are able to report only the best ones.

Results from people who have the computing power to do more runs will therefore look better.

“Papers don’t always say how many runs were performed. But it makes a big difference to the conclusions you draw,” added Pineau.
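Pineau's point about run counts can be illustrated with a small simulation. The sketch below is not a reinforcement learning experiment: `run_experiment` is a hypothetical stand-in that draws a noisy score from a fixed distribution, and the mean of 100 and spread of 20 are arbitrary assumptions. It only shows why the best score out of many runs tends to look better than the best out of a few, and why reporting the number of runs matters.

```python
import random
import statistics


def run_experiment(seed: int) -> float:
    """Hypothetical stand-in for one training run.

    The score depends only on the random seed, mimicking an algorithm
    whose outcome is sensitive to its initial random settings.
    """
    rng = random.Random(seed)
    return rng.gauss(100.0, 20.0)  # assumed mean and spread, for illustration


def summarize(n_runs: int) -> dict:
    """Run seeds 0..n_runs-1 and report the best score next to the mean and spread."""
    scores = [run_experiment(seed) for seed in range(n_runs)]
    return {
        "runs": n_runs,
        "best": max(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),
    }


small_budget = summarize(5)   # a lab that can afford 5 runs
large_budget = summarize(50)  # a lab that can afford 50 runs

for result in (small_budget, large_budget):
    print(f"runs={result['runs']:>2}  best={result['best']:.1f}  "
          f"mean={result['mean']:.1f}  stdev={result['stdev']:.1f}")
```

Because the 50-run budget includes the same seeds as the 5-run budget, its best score can never be lower. Reporting the mean and standard deviation together with the number of runs gives a fairer picture than the best score alone.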

This new obsession with reproducibility did not sit well with researchers such as Misha Denil of DeepMind, who responded to Nature's interview on Twitter.

As noted earlier, being able to replicate the original paper may increase its credibility, but it does not encourage new solutions. If machine learning is to be thought of as a science, independent reproducibility is indeed crucial going forward.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
