How AI Is Tackling the Fake Academic Research Plaguing the Scientific Community

Across the globe, scientific reputations are being damaged by reports of inexperienced researchers who, under pressure to publish, commit research fraud by fabricating or falsifying data and reporting incorrect findings. This trend is eroding trust in the integrity of scientific research.

A recent Finnish study indicated that between 2010 and 2014, the number of articles published by “predatory” journals rose from 53,000 to half a million. Predatory journals are publications that charge authors money for publication while providing only fake peer review. Reportedly, the Chinese government cracked down on more than 400 authors for damaging China’s scientific reputation with such fraudulent research papers.


In a bid to build up their resumes, young researchers are pushed to publish research based on fabricated data, which hurts the community and makes it difficult for journals to sort through the noise. Reportedly, almost 2% of scientists surveyed admitted to falsifying data, and almost 34% admitted to using questionable research practices. The survey further indicated that 14.2% had resorted to outright falsification. In addition, tools like SCIgen are being used to generate valid-looking articles. In 2013, the IEEE reportedly pulled 120 papers from its publications because they were computer-generated.

Interestingly, a decade ago, researchers Jeremy Stribling, Dan Aguayo and Max Krohn of MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) built SCIgen, a computer science paper generator that could stitch together nonsense papers complete with impressive-looking graphs. Papers produced by the SCIgen software were accepted at major conferences and even by reputed journals. Reportedly, scores of academics have used SCIgen to publish journal articles and conference proceedings with reputed publishers such as Springer and the Institute of Electrical and Electronics Engineers (IEEE, US).

However, French computer scientist Cyril Labbé has spent a considerable amount of time tracking down these 120 SCIgen papers with ML tools.
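To give a feel for how generated text might be flagged automatically, here is a minimal Python sketch. It is not a description of Labbé's actual tooling, which the article does not detail. It assumes, purely for illustration, that a suspect passage can be compared against a handful of labelled reference texts by word-frequency similarity; the function names and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Lower-case the text and count alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_label(candidate: str, labelled_samples: list[tuple[str, str]]) -> tuple[str, float]:
    """Return the label of the most similar reference text and its score.

    labelled_samples is a list of (label, text) pairs, e.g. known
    generated papers and known genuine papers (hypothetical data).
    """
    cand = word_counts(candidate)
    best_label, best_score = "", -1.0
    for label, text in labelled_samples:
        score = cosine_similarity(cand, word_counts(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score
```

In practice a real detector would need far larger reference sets and a carefully chosen threshold; a nearest-neighbour comparison like this only shows the general idea of matching a paper's vocabulary profile against known examples.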

Now, AI and machine learning techniques are being used to improve the way research is conducted and published. Some of the main ways AI is being used to benefit the scientific community are:

  • NLP tools are being used to fight plagiarism and identify sections that have been reworded.
  • AI and machine learning techniques are being deployed to find data that is flawed or misreported. AI can examine how statistics were applied to arrive at a certain outcome, and can help detect whether data was manipulated to get the desired result.
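As a concrete illustration of checking whether reported statistics are even possible, the sketch below implements a simple granularity check in the spirit of the published GRIM test. The article does not name this technique, so treat it as an assumed example rather than the method any particular tool uses. It assumes the underlying data are whole numbers (for example, survey scores), and the function name and parameters are hypothetical.

```python
import math


def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if some whole-number total divided by n rounds to reported_mean.

    Assumes every individual data point is an integer, so the sum of
    the sample must be an integer too.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    scale = 10 ** decimals
    target = round(reported_mean * scale)
    approx_total = reported_mean * n
    # Only the two integer totals nearest to mean * n can reproduce the mean.
    for total in (math.floor(approx_total), math.ceil(approx_total)):
        if round(total / n * scale) == target:
            return True
    return False
```

For instance, a mean of 5.19 reported for 28 whole-number responses would be flagged, because the nearest possible totals, 145 and 146, give means of 5.18 and 5.21. Rounding conventions differ between authors, so borderline cases deserve a manual look before anyone draws conclusions.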

Other Tools

Machine Box: To counter the menace of fake research, Machine Box packages ML capabilities into Docker containers so that developers can quickly incorporate NLP, facial detection, object recognition and similar features into their apps. It is also positioned as cheaper than cloud services, and the data does not leave the organisation's or individual's infrastructure.

Algorithmic Tool to mine research papers: In a similar vein, Daniel Acuna, Assistant Professor at Syracuse University’s School of Information Studies, and his team revealed how they successfully implemented an algorithmic approach to mine nearly 800,000 biomedical papers and 2 million images for duplication. According to Acuna, ML can be used to detect duplicate images even when they have been rotated or skewed in some way.
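The idea of matching images despite rotation can be sketched in a few lines of Python. This is not Acuna's method, which the article does not describe. It is a simplified stand-in that assumes images have already been reduced to small, equal-sized square grids of grayscale values, and it only handles 90-degree rotations and mirror flips, not arbitrary skew. All function names are hypothetical.

```python
def rotate90(grid):
    """Rotate a 2D grid of pixel values 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def orientations(grid):
    """Return all 8 rotated and mirrored versions of the grid."""
    result = []
    current = [list(row) for row in grid]
    for _ in range(4):
        result.append(current)
        result.append(flip(current))
        current = rotate90(current)
    return result


def average_hash(grid):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Number of positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(img_a, img_b):
    """Smallest hash distance between img_a and any orientation of img_b."""
    hash_a = average_hash(img_a)
    return min(hamming(hash_a, average_hash(o)) for o in orientations(img_b))
```

A distance of zero, or close to it, suggests that one image may be a rotated or mirrored copy of the other and is worth a human look. Handling skew, cropping or rescaling would need more robust features than this simple hash.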

How AI Helps In Automating the Peer Review Process

Besides combating research fraud, AI can also help automate the peer review process. Machine translation tools such as Google Translate, which uses neural machine translation, can support peer review. While computers may play an increasingly useful role in editorial and peer review processes, those processes will still require a human in the loop.

However, peer reviewers cannot be replaced by machines: humans are needed to review language competence. As machines are only as good as the people who programmed them, machine intelligence on its own is unlikely to keep pace with scientific research. Hence, humans are needed to evaluate and provide feedback on manuscripts and to feed information to computers to help them improve. In a similar vein, administrators handling research manuscripts will continue to be necessary for dealing with the unexpected: answering questions and managing projects.

Richa Bhatia
Richa Bhatia is a seasoned journalist with six years' experience in reportage and news coverage and has had stints at Times of India and The Indian Express. She is an avid reader, mum to a feisty two-year-old and loves writing about the next-gen technology that is shaping our world.
