Last updated December 5, 2023

Is GPT-4 Really Better than Radiologists?

“Radiology report summaries created by GPT-4 are comparable, and in some cases, even preferred over those written by experienced radiologists.”

Illustration by Raghavendra Rao

Published on December 4, 2023

by Siddharth Jindal

There is a dearth of radiologists in India. Data shows that there are approximately 20,000 radiologists for a population of over 1.4 billion people. That works out to roughly one radiologist for every 70,000 individuals, a ratio significantly below the benchmarks recommended by global healthcare organisations.
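
As a quick back-of-the-envelope check, the ratio can be recomputed from the two rounded figures quoted above. This is only a sketch: the function name is illustrative, and the exact result depends on which headcount and population estimates are used.

```python
def people_per_radiologist(population: int, radiologists: int) -> float:
    """Return how many people each radiologist would have to serve."""
    return population / radiologists

# Rounded figures quoted in the article (approximations, not exact counts).
ratio = people_per_radiologist(1_400_000_000, 20_000)
print(f"{ratio:,.0f} people per radiologist")  # prints "70,000 people per radiologist"
```

With these inputs the quotient is about 70,000 people per radiologist; a lower radiologist count or a higher population estimate pushes it up further.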

This shortage, coupled with physician burnout, continues to plague the industry, underscoring the urgent need for scalable and practical solutions that improve process efficiency amid the growing demand for diagnostic imaging.

GPT-4 to the Rescue

Microsoft recently published a paper, ‘Exploring the Boundaries of GPT-4 in Radiology’, which assesses the performance of GPT-4 in text-based applications for radiology reports.

One of the primary applications of GPT-4 in radiology lies in its ability to process and comprehend the reports radiologists write about medical images, ranging from X-rays to MRIs. The paper said, “The radiology report summaries created by GPT-4 are comparable, and in some cases, even preferred over those written by experienced radiologists.”

GPT-4 for radiology. Far from perfect, but state-of-the-art performance on some tasks: “Surprisingly, we found radiology report summaries generated by GPT-4 to be comparable and, in some cases, even preferred over those written by experienced radiologists” https://t.co/bi6RwqgeHw
— Greg Brockman (@gdb) November 28, 2023

Microsoft collaborated with Nuance, its healthcare AI subsidiary, for this study and found that for some tasks, GPT-4 achieved about a 10 percent improvement over existing models. One of Nuance’s generative AI products is PowerScribe Smart Impression, which automatically drafts radiology reports.

Now, with the incorporation of GPT-4 Vision, GPT-4 is truly multimodal, boosting its capabilities and broadening its range of applications. The ChatGPT app is currently available on both Android and iOS. Radiologists only need to scan the radiology reports, and GPT-4 interprets them, offering summaries, medication suggestions, and diagnoses. Alternatively, medical professionals can access the same capability through the GPT-4 Vision API.

Another encouraging aspect of GPT-4 is its ability to automatically structure patient reports. These reports, based on the radiologist’s interpretation of medical images like X-rays and the patient’s clinical history, are often complex and unstructured, making them difficult to interpret.

Moreover, research suggests that organising these reports can make them easier for healthcare professionals to understand and can improve the searchability of information for research and quality improvement.
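
To make the idea of a structured report concrete, here is a minimal sketch in Python. It is not how GPT-4 or the Microsoft paper does it: it is a simple rule-based stand-in that splits free text on section headings into a fixed schema. The field names (`history`, `findings`, `impression`) and the sample report are assumptions chosen for illustration, and real reporting templates vary by institution. A language model would be asked to produce the same shape from far less regular text.

```python
import re
from dataclasses import dataclass, asdict

@dataclass
class StructuredReport:
    # Hypothetical schema; not taken from the paper.
    history: str = ""
    findings: str = ""
    impression: str = ""

# Matches a known heading followed by a colon at the start of a line.
HEADING = re.compile(r"(?im)^[ \t]*(history|findings|impression)[ \t]*:")

def structure_report(text: str) -> StructuredReport:
    """Split free text on 'Heading:' markers into the fixed fields above."""
    fields = {}
    matches = list(HEADING.finditer(text))
    for i, match in enumerate(matches):
        # A section runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = " ".join(text[match.end():end].split())  # collapse whitespace
        fields[match.group(1).lower()] = body
    return StructuredReport(**fields)

report = structure_report(
    "HISTORY: Cough for two weeks.\n"
    "FINDINGS: Lungs are clear. No pleural effusion.\n"
    "IMPRESSION: No acute cardiopulmonary abnormality."
)
print(asdict(report))
```

Once reports share a schema like this, searching across them becomes a matter of querying one field, for example every report whose `impression` mentions an effusion.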

Additionally, using GPT-4 to structure and standardise radiology reports can further support efforts to augment real-world data (RWD) and its use for real-world evidence (RWE). This can complement more robust and comprehensive clinical trials and, in turn, accelerate the application of research findings into clinical practice.

Interestingly, Microsoft is not the only one experimenting with generative AI in radiology. A retrospective study recently published in JAMA Network Open found that chest radiograph reports created by a generative AI model matched the quality and accuracy of reports by in-house radiologists. The study used an open-source model from Hugging Face to interpret 500 chest radiographs.

Challenges and Limitations

GPT-4 can function as a valuable assistant to radiologists, but it shouldn’t be viewed as a total replacement for human judgment. Ultimately, it’s a language model trained on vast amounts of data and is not immune to errors, biases, or manipulation. Radiologists shouldn’t blindly rely on GPT-4’s outputs without verifying them. They must check the sources, references, and evidence behind its texts.

Moreover, effective prompting plays a key role when utilising GPT-4, especially in healthcare. Microsoft recently published a report stating that with proper prompting, a generalist GPT-4 model can perform comparably to a specialist model on medical challenge problem benchmarks.

GPT-4 is clearly an asset in radiology, and its capabilities extend further, to translating medical reports into more empathetic and comprehensible formats for patients and other healthcare professionals.

A user on X shared, “GPT-4 is reading scans and creating radiology reports — sometimes better than experienced doctors! Imagine this being applied to blood tests and all other kinds of diagnostic tests. Medicine is going to change forever.”

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and to put forward ideas worth pondering in the era of artificial intelligence.