Microsoft-backed OpenAI’s ChatGPT is known for its versatility: the chatbot can switch from generating complex code to composing songs with ease. Its capabilities extend beyond the literary and technical realms, and it now has an array of impressive academic accomplishments to show off.
Besides acing the MBA program at the University of Pennsylvania and the law exam at Minnesota Law School, the chatbot passed the United States Medical Licensing Examination (USMLE) in one go. Aspiring doctors usually need close to four years, plus over two years of clinical rotations, to clear it.
After ChatGPT’s failed attempt at UPSC, AIM decided to test its medical prowess. This time around, we experimented with NEET (UG), one of the most difficult entrance examinations for medical aspirants. Candidates often need more than one attempt to get into the top medical schools in India. A record 18 lakh students appeared for the exam in 2022.
Did ChatGPT Pass or Fail?
We grilled ChatGPT on all 200 questions from the NEET 2022 paper available online. The paper consists of 180 multiple-choice questions (MCQs) from Physics, Chemistry, and Biology (Botany and Zoology). NEET also has 20 extra ‘bonus/trial questions’ that carry no marks but are included to assess candidates’ knowledge and understanding of the subject.
Every right answer would fetch our examinee four marks, while every wrong answer would cost it one negative mark. We skipped 10 questions from Physics, 15 from Chemistry and 1 from Biology as they were based on graphs and diagrams.
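The marking scheme is simple enough to sketch in a few lines of Python. The function names here are our own, and the 26 right / 14 wrong split is hypothetical: we did not record a per-subject breakdown, and this is just one split of 40 attempted questions that would produce the Physics score reported below.

```python
def neet_score(correct: int, wrong: int) -> int:
    """Marks under the scheme above: +4 per right answer, -1 per wrong one."""
    return 4 * correct - wrong


def max_marks(attempted: int) -> int:
    """Best possible score if every attempted question is answered correctly."""
    return 4 * attempted


# Hypothetical split: 40 attempted questions, 26 right and 14 wrong.
physics = neet_score(correct=26, wrong=14)
print(physics, "out of", max_marks(40))  # 90 out of 160
```

Because a wrong answer costs a mark, a blind guess is not free: each miss cancels a quarter of a correct answer.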
Although ChatGPT is trained on data only up to 2021, no question in the paper was based on current affairs.
For NEET 2022, the cutoff mark was 50% for the general category. And ChatGPT managed to just pass NEET with 50.14% (357 out of 712 in all). It secured 90/160 in Physics, 81/156 in Chemistry and 186/396 in Biology.
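The overall percentage follows directly from the three subject scores. A quick check, using only the figures quoted above (the variable names are ours):

```python
# (marks obtained, maximum marks) for each subject, as reported above.
scores = {
    "Physics": (90, 160),
    "Chemistry": (81, 156),
    "Biology": (186, 396),
}

obtained = sum(got for got, _ in scores.values())
maximum = sum(out_of for _, out_of in scores.values())
percentage = round(100 * obtained / maximum, 2)

print(f"{obtained}/{maximum} = {percentage}%")  # 357/712 = 50.14%
```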
Extremely Weak in Botany
ChatGPT is a weak student when it comes to Biology, especially Botany. It answered almost half of the questions in Biology incorrectly, most of which were from Botany.
As seen in the screen grab above, the chatbot could not answer the question. The answer, according to the internet and the solution paper, is maize. However, if you tweak the prompt, it can give the right answer.
So, it was clear that a lot depends on the prompt. When we altered the prompts in the second attempt, the chatbot answered many more questions correctly. Every time it gave a wrong answer, it offered its own analysis, which it readily corrected when prompted.
This takes us back to the fact that LLMs are prone to hallucinations. Google Bard, Microsoft Bing, and Meta’s Galactica have all given incorrect responses, leading to major setbacks.
ChatGPT is an Average Kid
ChatGPT relies on probability distributions: LLMs are next-token predictors, naturally non-deterministic, and not “understanders”. Speaking to AIM about LLMs and hallucination, Debarghya Das, founding engineer at Glean, said that LLMs struggle with basic maths involving large numbers because they are focused on predicting the next token rather than computing the answer.
“LLMs may not always provide accurate information as they rely on probability distributions and may be influenced by examples of incorrect information. Retrieval-augmented generation techniques can be used to solve this problem, where the model generates answers based on information from reliable web sources,” he added.
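To make the "next token predictor" idea concrete, here is a toy sketch in Python. It is not how ChatGPT is implemented: the three candidate tokens and their probabilities are invented for illustration, and a real model would compute such a distribution over tens of thousands of tokens at every step.

```python
import random

# Hypothetical next-token probabilities for the prompt "2 + 2 =".
# These numbers are made up for illustration, not taken from any model.
next_token_probs = {"4": 0.90, "5": 0.06, "22": 0.04}


def sample_next_token(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_next_token(next_token_probs, rng) for _ in range(1000)]
counts = {token: samples.count(token) for token in next_token_probs}
print(counts)
```

Most draws return the likeliest token, but some do not. The model is sampling a plausible continuation rather than computing an answer, which is one way a confident wrong response can appear.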
For humans, the exam is far harder than it is for a chatbot. Unlike the human brain, ChatGPT runs on a model with a staggering 175 billion parameters. We spoke to several medical students to understand why it takes more than a year to crack NEET. Imagine taking a test where success depends not on your intelligence or critical thinking but on your ability to memorise facts, especially in Biology. That is the reality aspiring medical students in India face to pass NEET.
Annjali Sarkar, a third-year medical student at the prestigious RG Kar Medical College and Hospital, Kolkata, said, “Adding to the peril is the stress that comes with answering 180 questions manually in 180 minutes allotting 1 minute for each. The answering scheme makes it irreversible to make changes. So, no matter how ChatGPT passes medical, it surely can’t compete with the real struggles that we face.”
Meta AI chief Yann LeCun notes that ChatGPT could be “useful and fun”, but that it can’t compete with human intelligence and is a writing aid at most. Even OpenAI head Sam Altman has acknowledged that the chatbot is incredibly limited and should not be relied on for factual queries. But OpenAI is constantly working towards making it better.
So what if ChatGPT can’t be a public servant in India? It can be an average doctor, for sure.