
‘I don’t really trust papers out of top AI labs anymore’

Findings that can't be replicated are intrinsically less reliable.


The role of scientific research in pushing the frontiers of artificial intelligence cannot be overstated. The researchers working at MIT’s Computer Science and Artificial Intelligence Laboratory, Stanford Artificial Intelligence Laboratory, Oxford University and many other top labs are shaping the future of humanity. In addition, most top AI labs, even the private players such as DeepMind and OpenAI, publish on preprint servers to democratise and share knowledge.

But how useful are these papers for the community at large?

Are top AI labs trustworthy?

Recently, a Reddit user published a post titled ‘I don’t really trust papers out of “Top Labs” anymore’. In the post, the user asked: “Why should the AI community trust these papers published by a handful of corporations and the occasional university? Why should I trust that your ideas are even any good? I can’t check them; I can’t apply them to my own projects.”

Citing the research paper titled ‘An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems’, the Reddit user said, “It’s 18 pages of talking through this pretty convoluted evolutionary and multitask learning algorithm; it’s pretty interesting, solves a bunch of problems. But two notes. One, the big number they cite as the success metric is 99.43 on CIFAR-10, against a SotA of 99.40.”

The Reddit user also referred to a chart towards the end of the paper detailing how many TPU core-hours were consumed just by the training regimens that produced the final results.

“The total is 17,810 core-hours. Let’s assume that for someone who doesn’t work at Google, you’d have to use on-demand pricing of USD 3.22 per hour. This means that these trained models cost USD 57,348.

“Strictly speaking, throwing enough compute at a general enough genetic algorithm will eventually produce arbitrarily good performance, so while you can read this paper and collect interesting ideas about how to use genetic algorithms to accomplish multitask learning by having each new task leverage learned weights from previous tasks by defining modifications to a subset of components of a pre-existing model […],” he said.
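For context, the arithmetic behind the user’s cost estimate is straightforward. Below is a minimal sketch in Python, using the core-hour count and the on-demand rate quoted in the post (both figures are the user’s assumptions, not an official Google price list):

# Back-of-the-envelope training cost, using the figures quoted in the post.
tpu_core_hours = 17_810        # total TPU core-hours from the paper's chart
usd_per_core_hour = 3.22       # on-demand rate assumed by the Reddit user

total_cost = tpu_core_hours * usd_per_core_hour
print(f"Estimated training cost: USD {total_cost:,.2f}")
# -> Estimated training cost: USD 57,348.20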

Jathan Sadowski, a senior fellow at Monash University’s Emerging Technologies Research Lab, responded: “AI/ML research at places like Google and OpenAI is based on spending absurd amounts of money, compute, and electricity to brute force arbitrary improvements. The inequality, the trade-offs, the waste—all for incremental progress toward a bad future.”

The Reddit post sparked much debate on social media. Many suggested creating a new journal for papers whose results can be replicated in under eight hours on a single GPU.

Findings that can’t be replicated are intrinsically less reliable, and the fact that the ML community is maturing towards decent scientific practices instead of anecdotes is a positive sign, said Leon Derczynski, associate professor at the IT University of Copenhagen.

Replication crisis

A replication crisis arises when scientific studies are difficult or impossible to reproduce. The problem has gripped the broader scientific community for years, and the AI domain is grappling with it too, largely because researchers often don’t share their source code.

According to a 2016 Nature survey, more than 70 percent of researchers have tried and failed to reproduce another scientist’s experiments. Further, more than 50 percent of them have failed to reproduce their own experiments.

Reproducibility is the basis of quality assurance in science as it enables past findings to be independently verified. 

The scientific and research community strongly believes that withholding important aspects of studies, especially in domains where the larger public good and societal well-being are concerned, does science a great disservice.

According to the 2020 State of AI report, only 15 percent of AI studies share their code, and industry researchers are often the culprits. The report criticises OpenAI and DeepMind, two of the world’s best AI research labs, for not open-sourcing their code.

In 2020, Google Health published a paper in Nature describing how AI was leveraged to look for signs of breast cancer in medical images. But Google drew flak for providing little information about its code and how it was tested. Many questioned the paper’s credibility, and a group of 31 researchers published a response in Nature titled ‘Transparency and reproducibility in artificial intelligence’. Benjamin Haibe-Kains, one of its authors, called Google’s paper an advertisement for cool technology with no practical use.

However, things are changing. NeurIPS now asks authors to submit a ‘reproducibility checklist’ along with their papers. The checklist covers information such as the number of models trained, the computing power used, and links to code and datasets. Another initiative, the ‘Papers with Code’ project, was started with a mission to create a free and open resource of ML papers, code and evaluation tables.
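To illustrate, the metadata such a checklist captures might look something like the sketch below (the field names and values are hypothetical, not NeurIPS’s actual schema):

# Hypothetical sketch of the metadata a reproducibility checklist records;
# field names and values are illustrative only, not NeurIPS's actual schema.
reproducibility_checklist = {
    "models_trained": 12,                      # number of models trained for the reported results
    "compute_used": "1,200 GPU-hours (V100)",  # computing power consumed
    "code_url": "https://github.com/example/repo",  # placeholder link to source code
    "dataset_url": "https://example.com/data",      # placeholder link to the dataset
}

for field, value in reproducibility_checklist.items():
    print(f"{field}: {value}")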

Pritam Bordoloi

I have a keen interest in creative writing and artificial intelligence. As a journalist, I deep dive into the world of technology and analyse how it’s restructuring business models and reshaping society.