At its annual health event, “The Check Up”, Google announced Med-PaLM 2, the latest version of its medical large language model, along with new health initiatives and partnerships.
According to the Med-PaLM 2 team, the model scored 85% on medical exam questions (USMLE MedQA), which is comparable to the level of an “expert” doctor. This is an improvement of 18 percentage points over the original Med-PaLM and, the team says, surpasses similar AI models such as GPT-4.
The team also obtained results on other benchmarks such as MedMCQA and MMLU clinical topics.
The evaluators, a mix of clinicians and non-clinicians from diverse backgrounds and countries, tested the models against 14 criteria, including scientific accuracy, precision, agreement with medical consensus, reasoning, bias, and potential for harm.

Google said it had identified significant disparities, but pledged to collaborate with researchers and healthcare professionals to narrow them and enhance healthcare services.
Google Research and DeepMind released the initial model, Med-PaLM, in December 2022. It was evaluated on MultiMedQA, a new open-source medical question-answering benchmark.
The AI system achieved a passing score of over 60% on multiple-choice questions similar to those used in U.S. medical licensing exams, the first time such a system had done so.
To create the model, the researchers used PaLM, a large language model with 540 billion parameters, and its instruction-tuned variant, Flan-PaLM. They evaluated these models on MultiMedQA.
In a related development, Google also launched the PaLM API shortly before OpenAI released GPT-4. The API lets businesses and developers build applications on Google’s state-of-the-art large language model, which is identical to the one used in Search, YouTube, and Gmail. This is the first time Google has offered access to its underlying models.