
Apple Releases Four Open Source LLMs with OpenELM Series of Models

The models come in 270M, 450M, 1.1B, and 3B parameter sizes, in both pretrained and instruction-tuned variants.


Apple has open sourced OpenELM, a family of Efficient Language Models (ELMs). OpenELM utilises a layer-wise scaling approach that distributes parameters non-uniformly across the layers of the transformer model, improving accuracy for a given parameter budget.
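The idea behind layer-wise scaling can be sketched in a few lines: rather than giving every transformer layer the same number of attention heads and the same feed-forward width, both are interpolated from the first layer to the last. The function below is an illustrative sketch of that idea only; the variable names and the linear schedule are assumptions for illustration, not Apple's exact configuration.

```python
# Illustrative sketch of layer-wise scaling: attention heads and the FFN
# width multiplier grow linearly from the first layer to the last, so the
# same overall parameter budget is spent unevenly across depth.
# (Assumed names/schedule; not the exact OpenELM formula.)

def layer_wise_scaling(num_layers, min_heads, max_heads,
                       min_ffn_mult, max_ffn_mult):
    """Return per-layer (num_heads, ffn_multiplier) pairs."""
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1)  # 0.0 at the first layer, 1.0 at the last
        heads = round(min_heads + t * (max_heads - min_heads))
        ffn = min_ffn_mult + t * (max_ffn_mult - min_ffn_mult)
        configs.append((heads, round(ffn, 2)))
    return configs

# A uniform model would use the same pair at every layer.
print(layer_wise_scaling(4, 4, 8, 1.0, 4.0))
```

Contrast this with the standard uniform design, where a single heads/FFN setting is repeated at every layer regardless of depth.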

The models are available on Hugging Face.

OpenELM models were pre-trained using the CoreNet library. The models come in 270M, 450M, 1.1B, and 3B parameter sizes, in both pretrained and instruction-tuned variants.

The pre-training dataset consists of RefinedWeb, deduplicated PILE, a subset of RedPajama, and a subset of Dolma v1.6, totalling approximately 1.8 trillion tokens. Please review the licence agreements and terms of use for these datasets before utilising them.


For instance, with a parameter budget of around one billion, OpenELM demonstrates a 2.36% improvement in accuracy compared to OLMo, while requiring only half the pre-training tokens.

In benchmarking, modern, consumer-grade hardware was used, with BFloat16 as the data type. CUDA benchmarks were conducted on a workstation equipped with an Intel i9-13900KF CPU, 64 GB of DDR5-4000 DRAM, and an NVIDIA RTX 4090 GPU with 24 GB of VRAM, running Ubuntu 22.04.

To benchmark OpenELM models on Apple silicon, an Apple MacBook Pro with an M2 Max system-on-chip and 64 GiB of RAM, running macOS 14.4.1, was employed.

Token throughput was measured in tokens processed per second, covering both prompt processing (pre-fill) and token generation. All models were benchmarked sequentially, beginning with a full “dry run” generating 1024 tokens for the first model, as this warm-up was found to significantly increase generation throughput for subsequent models.
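The measurement procedure described above can be sketched as a small timing harness: perform a warm-up dry run first, then time generation and report tokens per second. This is a simplified stand-in that times generation only; `generate_fn` and `dummy_generate` are hypothetical placeholders for the real model pipeline used in the paper's benchmarks.

```python
import time

# Minimal sketch of the throughput measurement described above: a warm-up
# "dry run" is executed first so caches and code paths are hot, then the
# timed run reports tokens per second. generate_fn is a hypothetical
# stand-in for whatever call produces tokens.

def benchmark_tokens_per_sec(generate_fn, num_tokens=1024, warmup=True):
    """Return measured throughput in tokens/second for generate_fn."""
    if warmup:
        generate_fn(num_tokens)  # dry run; result deliberately discarded
    start = time.perf_counter()
    generated = generate_fn(num_tokens)
    elapsed = time.perf_counter() - start
    return generated / elapsed

# Dummy generator so the sketch runs without a model: pretends to emit
# tokens at a fixed cost per token.
def dummy_generate(n):
    time.sleep(n * 1e-5)
    return n

tps = benchmark_tokens_per_sec(dummy_generate, 1024)
```

Discarding the dry-run result mirrors the benchmarking note above: only the runs after warm-up are representative of steady-state throughput.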

The entire framework, including training logs, multiple checkpoints, pre-training configurations, and MLX inference code, has been made open-source, aiming to empower and strengthen the open research community, facilitating future research efforts.


Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.