
Meet ಕನ್ನಡ Llama

Adarsh Shirawalmath is a 2nd year B.Tech student at Vellore Institute of Technology who built Kannada Llama.


Meet the Creator of Kannada Llama

Given the rise of Indic LLMs in Tamil, Telugu, Hindi, and Odia, it was only a matter of time before a Kannada language model came up, and it’s finally here. Meet Kannada Llama, aka Kan-LLaMA, a 7-billion-parameter Llama 2 model that is LoRA pre-trained and fine-tuned on Kannada tokens, built by a Mumbai-based company called Tensoic.

“It all started when Meta’s Llama dropped and then we saw all these Indic language models coming up,” said Adarsh Shirawalmath, a 2nd year B.Tech student at Vellore Institute of Technology and the creator of Kan-LLaMA, in an exclusive interaction with AIM.

“Our college had a couple of GPU clusters and we got access to them, and started messing around with them,” Shirawalmath said. He and his co-founder, Raghav Ravishankar, began collaborating with people online, determined to build their own language model on the college GPUs.

That is also how they got in touch with Adithya Kamath and Bharat Shetty Barkur, who contributed significantly to the project.

When the AWS Campus Fund was announced at VIT, Shirawalmath and Ravishankar were excited about getting funded to build AI models. But the minimum requirement was having a registered company. “I said that if we’re planning on building this venture, let’s just do it,” he said, recounting how they registered a company within 15 days.

“We randomly came up with the name Tensoic, which means ‘Tensor’ plus ‘Logic’, and we had no goals then, we were just fishing stuff.” 

Bound to happen

The team is still working on the research paper, as Shirawalmath said he wants to make it perfect. “I don’t even know how to read Kannada,” he laughed. “We got experts of the Kannada language for the data curation part, while we worked on the model.”

Born in Davanagere in Karnataka, Shirawalmath has lived most of his life in Mumbai and later began studying at VIT. “I used to do ethical hacking being a bug bounty for Shopify and many other companies,” he added. “I have always been interested in AI but the ChatGPT frenzy got me wondering about the business use cases of generative AI and LLMs.”

ChatGPT is a general-purpose conversational chatbot rather than a specialised one, and the team saw many use cases for multilingual models. “Customising something like ChatGPT for a dynamic country like India was a very huge task,” he said. Even so, the team decided to build it from scratch and to make the model and dataset open source on Hugging Face.

The pre-training ran on a single NVIDIA A100 80GB instance, took approximately 50 hours, and cost an estimated $170. The resulting LoRA adapter is around 1.1GB in size.

In their blog post, the researchers said that they pre-trained Llama 2 on approximately 600 million Kannada tokens from the well-known CulturaX dataset, which comprises diverse de-duplicated multilingual dumps obtained from popular scrapes such as mC4 and OSCAR.
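To put the reported figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (about 50 hours, about $170, about 600 million tokens). The function names are illustrative, and the results are simple averages, not measurements published by Tensoic.

```python
# Figures as reported in the article; all are approximate.
TRAINING_HOURS = 50          # pre-training time on one A100 80GB
TOTAL_COST_USD = 170         # estimated total cost
TOTAL_TOKENS = 600_000_000   # Kannada tokens used for pre-training


def cost_per_hour(total_cost: float, hours: float) -> float:
    """Average hourly cost of the GPU instance."""
    return total_cost / hours


def tokens_per_second(tokens: int, hours: float) -> float:
    """Average throughput, assuming the whole run was spent on these tokens."""
    return tokens / (hours * 3600)


def cost_per_million_tokens(total_cost: float, tokens: int) -> float:
    """Average cost of processing one million tokens."""
    return total_cost / (tokens / 1_000_000)


if __name__ == "__main__":
    print(f"Cost per GPU hour:       ${cost_per_hour(TOTAL_COST_USD, TRAINING_HOURS):.2f}")
    print(f"Average throughput:      {tokens_per_second(TOTAL_TOKENS, TRAINING_HOURS):,.0f} tokens/s")
    print(f"Cost per million tokens: ${cost_per_million_tokens(TOTAL_COST_USD, TOTAL_TOKENS):.2f}")
```

On these assumptions the run averages out to roughly $3.40 per GPU hour, about 3,300 tokens per second, and about $0.28 per million tokens. The real throughput would differ if the data was seen for more than one pass or if the 50 hours included time not spent on training.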

More models coming soon

Currently, the model is built on top of Llama 2. Shirawalmath said that the team also plans to build it on top of Mistral’s models, but the dataset is a little messy and not ready for Indic models yet.

“We are also planning to build a Gujarati Llama soon, but it’s just the beginning phase,” he added about future plans and the possibility of releasing more Indic models in the coming months.

Shirawalmath said that there are various use cases for Indic LLMs, and the ones Tensoic is focused on are mainly in the healthcare and defence sectors.

“When Sam Altman said that it is impossible to compete with OpenAI, we decided to join the movement for Indic languages acceleration,” Shirawalmath added about his motivation, and how models like Tamil Llama, Telugu Llama, and Sarvam AI’s OpenHathi pumped his enthusiasm even further.

Given the battle between the US and China over AI models, Shirawalmath said that he was surprised that there were no models coming out of India. “I think we should follow what Japan is doing in terms of AI policies,” he added, stressing the need for less red tape and bureaucracy so that India can develop AI rapidly and grind out LLMs.

“I think it’s crucial for India to have technologies like what the US has. That is the main motive that we have,” Shirawalmath concluded, saying that India’s generative AI moment is just getting started and the team is looking for further collaboration.


Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.