Last updated September 15, 2022

Risks of Letting AI Experts Experiment with Healthcare

Researchers have used AI/ML algorithms to achieve groundbreaking scientific results, such as predicting protein structures and controlling fusion, but many doctors and consumers fear the tech industry’s mantra of “fail fast and fix it later”.


Published on September 15, 2022

by Mohit Pandey


“We do not want schizophrenia researchers knowing a lot about software engineering,” said Amy Winecoff, data scientist at Princeton’s Centre for IT Policy. Research suggests that a basic understanding of machine learning and other software engineering principles may be a desirable trait for medical practitioners, but these skills should not come at the expense of domain expertise.

Many new startups and enterprises sell their products by boasting that they incorporate AI/ML techniques. Though this is an issue in the developer and business market, the bigger worry is the misapplication of AI/ML algorithms in science and healthcare, where it has real-world consequences.

Sayash Kapoor and Arvind Narayanan of Princeton University published a research paper, “Leakage and the Reproducibility Crisis in ML-based Science”, pointing out the problem of “data leakage” in studies that use pools of data to train and test the performance of their models.
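To make the idea concrete, here is a minimal sketch of one common form of leakage: records from the same patient ending up in both the training and the test set, so the model is partly tested on cases it has already seen. The records, the `patient_id` field, and the helper names are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
import random

def split_by_group(records, group_key, test_fraction=0.25, seed=0):
    """Split so that each group (e.g. a patient) is entirely in train or entirely in test."""
    groups = sorted({r[group_key] for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return train, test

def leaked_groups(train, test, group_key):
    """Return the groups that appear on both sides of the split."""
    return {r[group_key] for r in train} & {r[group_key] for r in test}

# Hypothetical data: 8 patients with 3 scans each.
records = [
    {"patient_id": pid, "scan": f"scan_{pid}_{i}"}
    for pid in range(8)
    for i in range(3)
]

# Naive row-level split: every fourth scan goes to the test set,
# regardless of which patient it belongs to.
naive_test = records[::4]
naive_train = [r for i, r in enumerate(records) if i % 4 != 0]

# Group-aware split: whole patients are assigned to one side only.
grouped_train, grouped_test = split_by_group(records, "patient_id")

print("patients shared by naive split:  ",
      sorted(leaked_groups(naive_train, naive_test, "patient_id")))
print("patients shared by grouped split:",
      sorted(leaked_groups(grouped_train, grouped_test, "patient_id")))
```

Checking for shared patients in this way catches only this one kind of leakage. Other kinds, such as preprocessing the data before splitting it, need separate checks.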

What is the crisis?

For a long time, AI/ML-powered health products like virtual doctor apps, sensors, and doctor chatbots have been making the rounds in healthcare. Pilar Ossorio, professor at the University of Wisconsin, said that most AI devices, even those deployed in healthcare, do not require FDA approval, and that because they are new they need much more careful oversight.

Recently, a Reddit user posted what he described as a groundbreaking algorithm that uses AI to detect skin cancer through a phone’s camera. In the same thread, several doctors pointed out that the inaccuracy and risks of such algorithms could lead people to trust “false negatives” generated by the Snapchat filter. Another user said that while this might not be the end result of the research, because the algorithm relies on past images of skin cancer it could produce many inaccurate results and fall short of the accuracy of a biopsy.
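The doctors’ concern about false negatives can be illustrated with a small calculation. The sketch below uses made-up labels, not data from the Reddit thread or any real study, to show how a screening tool can report high overall accuracy while still missing a large share of actual cancers.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of all cases where the prediction matches the true label."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)

def false_negative_rate(true_labels, predicted_labels):
    """Fraction of truly positive cases that the tool labelled negative."""
    positives = [(t, p) for t, p in zip(true_labels, predicted_labels) if t == 1]
    if not positives:
        raise ValueError("no positive cases to evaluate")
    missed = sum(1 for _, p in positives if p == 0)
    return missed / len(positives)

# Hypothetical screening results: 1 = cancer, 0 = no cancer.
# 5 of 20 people have cancer; the tool flags only 3 of them.
true_labels = [1] * 5 + [0] * 15
predicted_labels = [1, 1, 1, 0, 0] + [0] * 15

print("accuracy:", accuracy(true_labels, predicted_labels))
print("false-negative rate:", false_negative_rate(true_labels, predicted_labels))
```

In this invented example the tool is right in 18 of 20 cases, yet it misses 2 of the 5 people who have cancer, and those are the people most likely to be harmed by a reassuring result.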

Although researchers have used AI/ML algorithms to achieve groundbreaking scientific results, such as predicting protein structures and controlling fusion, many doctors and consumers fear the tech industry’s mantra of “fail fast and fix it later”. Small incremental improvements might lead developers to hail their inventions as replacements for medical practitioners, but this can put patients at risk.

Sayash Kapoor said that researchers often rush to apply machine learning in their work without a comprehensive understanding of its techniques and limitations. The idea that AI is the most promising tool in research has led scientists and researchers to bet heavily on it.

Kapoor points out that researchers may take only a four-hour online course in machine learning before applying it in their research, without considering how it could go wrong. Momin Malik, data scientist at Mayo Clinic, says that researchers apply machine learning to various jobs that do not necessarily require it.

How risky is it?

Liz Szabo, writing in Scientific American, highlights the problem that many researchers do not submit their papers for peer review, a step that would eliminate most of the risks because other scientists could scrutinise the work. Most of these algorithms are trusted because of the amount of data they are trained on, not because they have been tested in randomised clinical trials. They are often described as “black boxes” because even their developers do not understand exactly how they work.

For example, researchers produced misleading results with Google Flu Trends, a tool developed in 2008 to identify flu outbreaks by applying machine learning to the history of search queries typed into Google’s search engine. The same algorithm was used to predict the 2013 flu season but failed badly because it relied on seasonal search terms and data, which reduced the reproducibility of its results and eroded trust in it.

Another important example, mentioned by Dr Michael Roberts of Cambridge University, is the dozens of papers published during the COVID-19 pandemic that claimed to use machine learning to fight the spread of the virus but relied on skewed data coming from different imaging machines. Northwestern University professor Jessica Hullman compared these machine learning problems to studies in psychology that are impossible to replicate because researchers use very little data and misread the statistical significance of outcomes.

Bob Kocher, partner at VC firm Venrock, said that most AI products have little to no evidence to support their claims. Until AI algorithms are used on a large pool of patients, their effectiveness will remain unmeasured, and they are likely to bring unintended consequences and risks. Ron Li, medical informatics director for artificial intelligence integration at Stanford Health Care, said using unproven software on patients is like turning “patients into unwitting guinea pigs.”

What is the reason?

Industry analysts suggest that AI developers focus on building their technology rather than on conducting expensive and time-consuming trials. Large tech companies like Google, Microsoft, and Apple have huge resources to develop and test their models and algorithms, which new and small developers do not have.

A Reddit thread points out that developers no longer trust research from “Top Labs” because differences in resources, testing equipment, and dataset size make the results and models impossible to reproduce. Several of these big companies do not make their code publicly available, which closes off research in a given field. As a result, developers try things out without prior knowledge of, or access to, the datasets or code, often creating incomplete and risky prototypes.

Dr Mildred Cho, professor of paediatrics at Stanford University, said that many AI developers cull huge amounts of data from electronic health records, which are built for billing rather than patient care, so they end up building on a broken system that is also full of errors. With AI gaining attention in medicine and healthcare, clinicians also have a responsibility to learn about these products, assess their potential risks, and put a stop to their unregulated use, for the sake of patients.
