IIT Hyderabad Professor Believes Indic Data is All You Need

“One of the things that we are focused on is developing models that reach a vast audience, even in remote villages,” said Professor Maunendra Sankar Desarkar.
Image by Raghavendra Rao
While fine-tuning existing English models on Indic language tokens is a viable approach, building foundational models from scratch offers several advantages, and that is what BharatGPT is aiming to do. “Existing models may not adequately represent the Indian cultural and linguistic diversity, which can lead to biases and limitations in their applicability,” said Professor Maunendra Sankar Desarkar from IIT Hyderabad, who is also a core team member of the BharatGPT initiative. “Moreover, fine-tuning may not fully address the unique linguistic challenges posed by Indic languages,” he added. He further said that building foundational models tailored to the Indian context can ensure greater inclusivity and effectiveness across diverse linguistic communities, which would deliver AI in India in the best possible way. “We're sourcing data from various repositories available on the web, including digitised books and datasets,” Desarkar added. He said

Mohit Pandey
Mohit writes about AI in simple, explainable, and often funny words. He's especially passionate about chatting with those building AI for Bharat, with the occasional detour into AGI.