
Yandex Unveils YaFSDP for 26% Faster LLM Training 

The open-source tool reduces GPU use by 20%

Russian technology multinational Yandex has introduced YaFSDP, an open-source tool designed to make training large language models (LLMs) more efficient. The method improves GPU communication and reduces memory usage, offering a speedup of up to 26% over existing tools.

YaFSDP outperforms the traditional FSDP method, showing improvements in training speed, especially for large models. For example, YaFSDP achieved a 21% speedup on Llama 2 with 70 billion parameters and a 26% speedup on Llama 3 with the same number of parameters. These enhancements make YaFSDP a valuable tool for AI developers working with large, complex models.

By optimising GPU consumption, YaFSDP can save developers and companies significant amounts of money, potentially hundreds of thousands of dollars per month.

“Currently, we’re actively experimenting with various model architectures and parameter sizes to expand YaFSDP’s versatility,” said Mikhail Khruschev, senior developer at Yandex and part of the team behind YaFSDP.

The open-source tool is available on GitHub.

Benefits and Implementation of YaFSDP

LLM training requires substantial computing power and resources, often resulting in high costs and extended training times. YaFSDP addresses these challenges, leading to faster training and reduced resource consumption.

For example, in scenarios involving models with 70 billion parameters, YaFSDP can save the equivalent of about 150 GPUs. This translates to potential monthly savings of $0.5 million to $1.5 million, depending on the GPU provider. The tool is particularly effective during the most communication-intensive stages of LLM training, such as pre-training, alignment, and fine-tuning.
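To put these figures in perspective, the short Python sketch below back-calculates the per-GPU hourly price implied by saving about 150 GPUs for $0.5 million to $1.5 million a month. The 730-hour month and the assumption that the saved GPUs would otherwise run around the clock are ours, not Yandex's, and the function name is purely illustrative.

```python
# Rough sanity check of the savings figures quoted in the article.
# Assumption (not from Yandex): an average month has 730 hours
# (24 * 365 / 12) and the saved GPUs would otherwise run continuously.
HOURS_PER_MONTH = 730


def implied_hourly_rate(monthly_savings_usd, gpus_saved,
                        hours_per_month=HOURS_PER_MONTH):
    """Return the per-GPU hourly price implied by a monthly savings figure."""
    return monthly_savings_usd / (gpus_saved * hours_per_month)


GPUS_SAVED = 150  # "the equivalent of about 150 GPUs"

low = implied_hourly_rate(500_000, GPUS_SAVED)     # $0.5 million per month
high = implied_hourly_rate(1_500_000, GPUS_SAVED)  # $1.5 million per month

print(f"Implied GPU price: ${low:.2f} to ${high:.2f} per GPU-hour")
```

Under these assumptions the quoted range corresponds to roughly $4.57 to $13.70 per GPU-hour, which is why the estimate depends so heavily on the GPU provider.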

The company has previously developed and shared several other open-source tools, including DataLens, CatBoost, YTsaurus, AQLM, and Petals.


Shritama Saha
Shritama (she/her) is a technology journalist at AIM who is passionate about exploring generative AI, with a special focus on big tech, databases, healthcare, DE&I, hiring in tech and more.