Gyan AI Unveils Smaller-Scale Maths LLM, Paramanu-Ganita, Outperforming LLaMA, Falcon

The model utilises an Auto-Regressive (AR) decoder that processes information sequentially, making it particularly adept at solving complex mathematical problems through logical reasoning.

Gyan AI has recently unveiled Paramanu-Ganita – a mathematical language model with 208 million parameters.

Despite its relatively modest size, about 35 times smaller than 7-billion-parameter LLMs, it outperforms generalist models such as LLaMA and Falcon, and specialised models such as Minerva, by significant margins on the GSM8k benchmark.

The model’s success highlights the efficiency of developing domain-specific models from scratch rather than adapting general LLMs to specific domains.

The research team consists of Mitodru Niyogi, founder and chief executive officer of Gyan AI, and Arnab Bhattacharya, computer science and engineering professor at IIT Kanpur, India, and AI advisor at Gyan AI. Niyogi is also associated with Abu Dhabi’s MBZUAI as an AI Researcher. 

Training Method 

The model was trained on a unique, high-quality mathematical corpus curated by the researchers, consisting of textbooks, lecture notes, and web-sourced materials. Training took only 146 A100 GPU hours.

Paramanu-Ganita’s success can be attributed to its training regimen and its specialisation in mathematics. The model utilises an Auto-Regressive (AR) decoder that processes information sequentially, making it particularly adept at solving complex mathematical problems through logical reasoning. It was trained on a variety of mathematical texts and source code, with the aim of building a broad grasp of mathematical logic and problem-solving.
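To illustrate what sequential, auto-regressive decoding means in general, here is a minimal Python sketch of greedy decoding. It is not Gyan AI's code: the function names, the toy lookup table, and the `<eos>` marker are all hypothetical, and the toy scorer looks only at the last token, whereas a real AR decoder conditions on the entire prefix.

```python
from typing import Callable, Dict, List


def greedy_decode(
    next_token_scores: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    max_new_tokens: int = 8,
    eos: str = "<eos>",
) -> List[str]:
    """Generate tokens one at a time, each chosen given the tokens so far."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)   # scores for every candidate next token
        best = max(scores, key=scores.get)   # greedy: take the highest-scoring one
        if best == eos:                      # stop when the model signals the end
            break
        tokens.append(best)                  # the new token becomes part of the context
    return tokens


# Hypothetical stand-in for a trained model: a hand-written score table.
TOY_TABLE: Dict[str, Dict[str, float]] = {
    "2": {"+": 0.9, "<eos>": 0.1},
    "+": {"3": 0.8, "2": 0.2},
    "3": {"=": 0.95, "<eos>": 0.05},
    "=": {"5": 0.99, "<eos>": 0.01},
    "5": {"<eos>": 1.0},
}


def toy_scores(tokens: List[str]) -> Dict[str, float]:
    # Simplification: only the last token is used as context.
    return TOY_TABLE.get(tokens[-1], {"<eos>": 1.0})


if __name__ == "__main__":
    print(" ".join(greedy_decode(toy_scores, ["2"])))
```

Each step feeds the growing sequence back in as context, which is the "sequential" behaviour the article describes.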

The model’s performance was rigorously evaluated using perplexity metrics and benchmarks, confirming its effectiveness in handling complex mathematical problems efficiently.
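For readers unfamiliar with perplexity, the standard definition is the exponential of the average negative log-probability a model assigns to each token in a text. Lower is better. The sketch below shows only that textbook formula. The function name and the sample probabilities are illustrative and are not taken from the researchers' evaluation.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity = exp(mean negative log-probability of the observed tokens)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that gives every observed token probability 0.25 is, on average,
    # as uncertain as a uniform choice among four options.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

A perplexity of 1 would mean the model predicted every token with certainty.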

The implications of such a specialised tool are vast. Paramanu-Ganita offers a reliable, efficient, and less resource-intensive alternative to larger, more generalised language models for industries and sectors relying heavily on mathematical calculations and modelling. 

It also shows that smaller, domain-focused models can match or even exceed the performance of their larger counterparts without the need for massive computational power or financial investment.

Previously, the researchers had come up with Paramanu, a series of language models tailored for ten Indian languages, including Assamese, Bangla, Hindi, and others, using five different scripts. These models range from 13.29M to 367.5M parameters and were developed on a single GPU with a context size of 1024. The lineup features monolingual, bilingual, and multilingual configurations, the latter avoiding the “curse of multilinguality” by using typologically similar corpora. 

Shritama Saha

Shritama (she/her) is a technology journalist at AIM who is passionate about exploring the influence of AI on different domains, including fashion, healthcare and banking.