As promised last week, Elon Musk's xAI has released the open-source version of Grok, its fully trained large language model.
xAI has unveiled the base model weights and network architecture of Grok-1, a colossal language model with 314 billion parameters, of which roughly 25% are active on any given token.
Click here to check out the GitHub repository.
Developed as a Mixture-of-Experts model, Grok-1 marks a significant milestone in AI research and development. The release includes the raw base model checkpoint from Grok-1’s pre-training phase, which concluded back in October 2023.
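To make the Mixture-of-Experts idea concrete, here is a minimal NumPy sketch of top-2 expert routing for a single token. All names and shapes are illustrative assumptions for the general technique, not xAI's actual implementation (which is built on JAX and Rust):

```python
import numpy as np

def top2_moe(x, gate_w, expert_ws):
    """Route one token through the 2 highest-scoring of n experts.

    x: (d,) token activation; gate_w: (d, n) router weights;
    expert_ws: (n, d, d) one weight matrix per expert.
    Illustrative only -- not xAI's API.
    """
    logits = x @ gate_w                      # router score per expert
    top_idx = np.argsort(logits)[-2:]        # indices of the top-2 experts
    top_vals = logits[top_idx]
    weights = np.exp(top_vals - top_vals.max())
    weights /= weights.sum()                 # softmax over the chosen 2 only
    # Only the 2 selected experts are evaluated -- this is where the
    # compute savings of a sparse MoE come from.
    outs = np.einsum('d,kdm->km', x, expert_ws[top_idx])
    return weights @ outs                    # (d,) weighted combination
```

Because only the selected experts run per token, a sparse MoE can hold far more total parameters than it pays for in per-token compute.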
Notably, this model has not been fine-tuned for any specific application, such as dialogue, offering immense potential for diverse uses across various domains.
Released under the Apache 2.0 license, xAI has made both the weights and architecture of Grok-1 available to the public, inviting developers and researchers to explore its capabilities.
Grok is developed from scratch by xAI using a proprietary training stack leveraging JAX and Rust.
With 2 of its 8 experts active per token on the 314-billion-parameter model, Grok-1's roughly 86 billion active parameters still exceed the total parameter count of Meta's largest Llama 2 model.
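A back-of-envelope check shows why the active count is about 86 billion rather than a naive 25% of 314 billion: some parameters (attention, embeddings) are shared and always active, while only the expert weights are sparsely activated. The split below is an illustrative estimate derived from the two published figures, not a number xAI has released:

```python
total_b = 314.0   # total parameters, in billions (per xAI)
active_b = 86.0   # approximate active parameters per token (per xAI)
frac = 2 / 8      # 2 of 8 experts active -> 25% of expert weights

# Assume s billion shared params and e billion expert params, so that
#   s + e == total_b   and   s + frac * e == active_b.
# Subtracting gives (1 - frac) * e == total_b - active_b.
e = (total_b - active_b) / (1 - frac)  # estimated expert parameters
s = total_b - e                        # estimated shared parameters
print(f"experts ≈ {e:.0f}B, shared ≈ {s:.0f}B")  # experts ≈ 304B, shared ≈ 10B
```

Under these assumptions, roughly 304B parameters sit in the experts and about 10B are shared, which is consistent with both the 314B total and the ~86B active figure.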
All of this comes amid Musk's lawsuit against OpenAI for not open-sourcing its models and turning into a for-profit company.