Any decent research paper consists of two parts:
- A description of how the work was done, so that it can be repeated
- A description of what to do to obtain results consistent with fresh experiments
The same holds true for machine learning research papers. However, the last decade has seen a steep rise in the number of publications per year. A few hundred papers are published every day, and merely keeping track of them has become a challenge.
Vetting these papers, at least those that claim state-of-the-art status, is a tedious job. Edward Raff, a machine learning researcher at Booz Allen Hamilton, shouldered this gigantic responsibility of testing papers for reproducibility.
His work, the culmination of eight years of research, covered 255 papers published between 1984 and 2017. Raff presented his findings at the prestigious NeurIPS last year and also released them on a popular machine learning portal.
Overview Of Raff’s Experiment
Here are a few excerpts from Raff’s paper on reproducibility and the factors that were considered in selecting the papers:
- Each paper selected for the reproducibility check proposed at least one new algorithm or method, which was the subject of reproduction.
- A paper was excluded from analysis if it had already been successfully reproduced using its available source code.
- A paper was excluded if its authors had any significant relationship with the reproducer (e.g., academic advisor, coworker, close friend), since the ability to communicate more regularly could bias the results.
- A paper was regarded as reproducible if the majority (75%+) of its claims could be confirmed with independently written code.
Raff was left with 255 papers, of which 162 (63.5%) were successfully replicated and 93 were not.
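The selection criteria above boil down to simple arithmetic. The sketch below (illustrative only; the function name and the per-paper tallies are hypothetical, not from Raff's actual pipeline) shows how the 75%-of-claims threshold classifies individual papers and how the overall replication rate falls out:

```python
def is_reproducible(claims_confirmed: int, claims_total: int) -> bool:
    """A paper counts as reproducible if 75%+ of its claims were
    confirmed with independently written code."""
    return claims_confirmed / claims_total >= 0.75

# Hypothetical per-paper tallies of (confirmed claims, total claims).
papers = [(4, 4), (3, 4), (1, 3), (2, 2), (0, 5)]

replicated = sum(is_reproducible(c, t) for c, t in papers)
rate = replicated / len(papers)
print(f"{replicated}/{len(papers)} replicated ({rate:.1%})")
# → 3/5 replicated (60.0%)

# Raff's reported figure checks out the same way:
print(f"{162 / 255:.1%}")  # → 63.5%
```

Note that a paper with 3 of 4 claims confirmed (exactly 75%) just clears the bar, while 1 of 3 does not.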
Importance of Being Independently Reproducible
Before going further, we need to understand what reproducibility really means in the context of machine learning.
A work is said to be reproducible when a reader who follows the procedure described in the paper ends up with the same results as the original work. Machine learning papers nowadays often ship with code, so developers can validate a paper simply by running it.
However, the growing emphasis on papers with code raises an essential question: should the code even be necessary? A properly explained paper should, on its own, contain enough detail to devise a procedure that yields the same results. Work that clears this bar is called independently reproducible.
Independent reproducibility has another advantage: re-implementing a method from scratch can uncover a more efficient solution to the same problem.
According to Raff, his findings can be summarised as follows:
- More math in a paper does not make it more reproducible.
- Open-sourcing of code is at best a weak indicator of reproducibility.
- Too many simplifications and analogies can hamper reproducibility.
Science Of Doing Meta Science
Late last year, machine learning researcher Joelle Pineau brought the whole ML community's attention to reproducibility. In an interview published by Nature, Pineau addressed these issues in detail.
In reinforcement learning, says Pineau, two runs of the same algorithm with different initial random seeds can produce very different results. And if you do a lot of runs, you can choose to report only the best ones.
Results from people with more computing power, who can afford more runs, will therefore look better.
“Papers don’t always say how many runs were performed. But it makes a big difference to the conclusions you draw,” added Pineau.
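Pineau's point is easy to demonstrate with a toy simulation (purely illustrative; the `train` function below is a stand-in for a real RL training run, not any particular algorithm): the same code run under different seeds scatters noticeably, so quoting only the best run inflates the apparent result.

```python
import random

def train(seed: int) -> float:
    """Stand-in for an RL training run: the final score depends on the
    random seed, mimicking seed sensitivity in real experiments."""
    rng = random.Random(seed)
    # Sum of simulated noisy per-episode returns.
    return sum(rng.gauss(1.0, 0.5) for _ in range(100))

scores = [train(seed) for seed in range(10)]
print(f"best:  {max(scores):.1f}")   # the number a cherry-picked paper reports
print(f"mean:  {sum(scores) / len(scores):.1f}")
print(f"worst: {min(scores):.1f}")
```

Reporting the mean (ideally with the spread and the number of runs) rather than the best run is exactly the kind of disclosure Pineau is asking for.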
This new obsession with reproducibility didn't sit well with researchers such as Misha Denil of DeepMind, who pushed back against Nature's interview on Twitter.
As noted earlier, replicating the original paper with its released code may increase the paper's credibility, but it doesn't encourage new solutions. If machine learning is to be treated as a science, independent reproducibility is indeed crucial going forward.