Another Indic language model is here, and this time it’s for Malayalam. Vishnu Prasad J, an AI engineer at Examroom.AI, has introduced MalayaLLM, a model focused exclusively on the Malayalam language, along with a video demonstration.
Built on Meta’s Llama 2 7B architecture, MalayaLLM aims to revolutionise AI capabilities for Malayalam speakers. The model underwent continual pretraining and fine-tuning exclusively on Malayalam tokens.
Click here to check out the model on Hugging Face.
Unlike generic language models catering to multiple languages, MalayaLLM is dedicated to optimising performance for specific tasks such as content generation and question answering in Malayalam.
The model is described as a 7B Llama 2 model pretrained on Malayalam, with language support for both Malayalam and English. Its source model is meta-llama/Llama-2-7b-hf.
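Since the model is published on Hugging Face and derives from meta-llama/Llama-2-7b-hf, it can presumably be loaded with the standard transformers causal-LM API. The sketch below assumes that workflow; the base model id is the one stated above, and the Malayalam prompt is an illustrative example only.

```python
# Hypothetical sketch of loading a Llama 2-derived model such as MalayaLLM
# via Hugging Face transformers. Requires transformers, torch, and access to
# the model weights; the repo id below is the stated base model, not the
# fine-tuned MalayaLLM checkpoint.

BASE_MODEL_ID = "meta-llama/Llama-2-7b-hf"  # source model named in the article


def generate_text(prompt: str, model_id: str = BASE_MODEL_ID) -> str:
    """Generate a short completion from a causal language model."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Example Malayalam prompt ("What is Malayalam?")
    print(generate_text("മലയാളം എന്താണ്?"))
```

Swapping `model_id` for the MalayaLLM checkpoint id on Hugging Face would load the fine-tuned weights instead of the base model.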
Currently in its early stages, the model is undergoing continued training and fine-tuning on a comprehensive Malayalam dataset to improve its performance. Regular updates are promised as the model matures, to ensure its effectiveness across language-related tasks.
The development of MalayaLLM draws on datasets such as AI4Bharat and CulturaX for pre-training and tokenisation, as well as the Alpaca_Instruct_Malayalam dataset for fine-tuning. These datasets underpin the model’s ability to understand and generate content in Malayalam.
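Alpaca-derived instruction datasets are conventionally rendered into a fixed prompt template before fine-tuning. The sketch below shows that standard Alpaca layout; whether Alpaca_Instruct_Malayalam uses this exact template is an assumption.

```python
# Minimal sketch of the conventional Alpaca-style prompt template used when
# fine-tuning on Alpaca-derived instruction datasets. The exact template used
# for Alpaca_Instruct_Malayalam is assumed, not confirmed by the article.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)


def format_example(instruction: str, response: str = "") -> str:
    """Render one training (or, with empty response, inference) example."""
    return ALPACA_TEMPLATE.format(instruction=instruction, response=response)
```

At inference time the response field is left empty, so the model continues the prompt from the "### Response:" marker.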