Apple recently treated AI developers to its M3 chips, which let them work seamlessly with large transformer models with billions of parameters on a MacBook. “Support for up to 128GB of memory unlocks workflows previously not possible on a laptop,” said Apple in its blog post.
Currently, only the 14-inch MacBook Pro supports all three of the M3, M3 Pro, and M3 Max chips; the 16-inch MacBook Pro supports only the M3 Pro and M3 Max configurations. Apple also claimed that its enhanced neural engine accelerates powerful machine learning (ML) models while preserving privacy.
“What a time to be alive,” said Yi Ding, noting that developers can now run the biggest open-source LLM, Falcon with 180 billion parameters, with little quality loss on a 14-inch laptop.
However, running open-source LLMs on laptops isn’t new. AI folks have previously tried it on the M1 as well. Anshul Khandelwal, the co-founder and CTO of invideo, experimented with a 65-billion-parameter open-source LLM on his M1-powered MacBook. He said that everything in this space now changes about once a week. “A future where every techie runs a local LLM is not too far,” he added.
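A quick back-of-the-envelope calculation shows why 128GB of unified memory matters for models of this size. The sketch below is a simplification: it counts only the quantised weights (using the 1GB = 10⁹ bytes convention) and ignores the KV cache, activations, and runtime overhead, so real memory usage is higher.

```python
def model_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Rough weight-storage footprint in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead, so actual
    usage during inference is higher than this estimate.
    """
    return n_params * bits_per_weight / 8 / 1e9

# Falcon-180B at 16-bit precision needs ~360GB of weights alone --
# far beyond any laptop.
print(model_memory_gb(180e9, 16))  # 360.0

# Quantised to 4 bits per weight, the weights shrink to ~90GB,
# which fits within 128GB of unified memory.
print(model_memory_gb(180e9, 4))   # 90.0
```

This is also why quantisation trades some model quality for memory: dropping from 16 bits to 4 bits per weight cuts the footprint by 4x, at the cost of the “low quality loss” mentioned above.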
Perplexity.ai co-founder and chief Aravind Srinivas joked that once MacBooks pack enough FLOPs per M-series chip, a large organisation with everyone on MacBooks and a high-speed intranet would be subject to regulation and would need to report its existence to the government.
M3 for AI Workloads
Apple claimed that the M3 family of chips is 15% faster than the M2 family and 60% faster than the M1 family. Clearly, the difference between the M2 and M3 is incremental rather than glaring in terms of performance and other specs. The latest chips have the same core count as their predecessors but a different balance of performance and efficiency cores (six of each, versus eight performance and four efficiency) and support up to 36GB of memory rather than 32GB.
The M3 family boasts support for up to 128GB of unified memory, double the capacity available in the M1 and M2 generations. This expanded memory capacity is especially critical for AI/ML workloads that demand extensive memory to train and run large language models and complex algorithms.
In addition to the enhanced neural engine and expanded memory support, the M3 chip features a redesigned GPU architecture.
This architecture is purpose-built for superior performance and efficiency, incorporating dynamic caching, mesh shading, and ray tracing capabilities. These advancements are specifically designed to expedite AI/ML workloads and optimise overall computational efficiency.
The new M3 prominently features a GPU with “Dynamic Caching”, which, unlike traditional GPUs, allocates local memory in real time, improving GPU utilisation and significantly boosting performance in demanding pro apps and games.
For game developers and users of graphics-intensive apps like Photoshop or photo-related AI tools, the GPU’s capabilities will be beneficial. Apple claimed speeds of up to 2.5 times the M1 family of chips, with hardware-accelerated mesh shading and improved performance at lower power.
Apple vs the World
Apple is not alone: other players such as AMD, Intel, Qualcomm, and NVIDIA are also investing heavily in edge capabilities, making it possible for users to run large AI workloads on laptops and personal computers.
For instance, AMD recently introduced AMD Ryzen AI, including the first built-in AI engine for x86 Windows laptops, and the only integrated AI engine of its kind.
Intel, on the other hand, is banking on 14th Gen Meteor Lake. It is the first Intel processor to use a tiled architecture, which allows it to mix and match different types of cores, such as high-performance cores and low-power cores, to achieve the best balance of performance and power efficiency.
Recently, Qualcomm also introduced the Snapdragon X Elite. The company’s chief, Cristiano Amon, claimed superior performance over Apple’s M2 Max chip, matching its peak performance at 30% less power consumption. Meanwhile, NVIDIA is also investing in edge use cases and is quietly designing Arm-based CPUs compatible with Microsoft’s Windows OS.
AI developers are increasingly running and experimenting with language models locally, and it’s truly fascinating to watch this space evolve. Given the latest advancements in the space, Apple is slowly but surely becoming the go-to favourite for AI developers.