Google has unveiled Gemma, a new family of open models, leveraging the research and technology behind the existing Gemini models. The Gemma open models are released in two sizes, Gemma 2B and Gemma 7B, each offering pre-trained and instruction-tuned variants.
Users can start working with Gemma today via free access on Kaggle and the free tier of Colab notebooks. Additionally, first-time Google Cloud users receive $300 in credits, and researchers can apply for up to $500,000 in Google Cloud credits to accelerate their projects.
According to Google's published benchmark results, Gemma outperforms Llama 2 on several benchmarks, including MMLU, HellaSwag, and HumanEval.
To support developer innovation and responsible use, Google is also releasing a Responsible Generative AI Toolkit alongside the models, providing guidance and essential tools for building safer AI applications with Gemma.
To facilitate widespread adoption, Gemma is compatible with major frameworks, including JAX, PyTorch, and TensorFlow via native Keras 3.0 support. The release includes ready-to-use Colab and Kaggle notebooks, as well as integration with popular tools such as Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM.
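As an illustration of what working with the instruction-tuned variants involves, the sketch below builds a prompt in Gemma's turn-based chat format. The `<start_of_turn>`/`<end_of_turn>` markers follow the format described in Gemma's model documentation; treat the exact strings as an assumption and verify against the official model card before use.

```python
def format_gemma_prompt(user_message: str) -> str:
    # Gemma's instruction-tuned variants expect a turn-based chat format.
    # Marker strings are taken from the Gemma model card; verify against
    # the official documentation for your model version.
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma_prompt("Explain open models in one sentence.")
print(prompt)
```

In practice, a string like this would be passed to the model through whichever framework you load Gemma with (for example, a Hugging Face text-generation pipeline), which typically applies this template for you via its chat-template utilities.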
Gemma models can run on various platforms, from laptops and workstations to Google Cloud, with optimisation for industry-leading performance on NVIDIA GPUs and Google Cloud TPUs.
This development comes after Google recently introduced Gemini 1.5 with a 1 million token context window — the largest announced for any major language model at the time. By comparison, GPT-4 Turbo has a 128K context window, and Claude 2.1 has a 200K context window.