Last updated December 9, 2021

Google Launches A Tool That Can Scale and Parallelize Neural Networks

GSPMD separates programming an ML model from parallelization and is capable of scaling most deep learning network architectures

Published on December 9, 2021
by Meeta Ramnani

Google AI has launched GSPMD (General and Scalable Parallelization for ML Computation Graphs) to address the challenges of scaling neural networks. GSPMD can scale most deep learning network architectures and has already been applied to many deep learning models, including GShard-M4, BigSSL, LaMDA, ViT, and MetNet-2. It has also been integrated into multiple ML frameworks, including TensorFlow and JAX, which use XLA as a shared compiler.

The solution separates the task of programming an ML model from the challenge of parallelization. It allows model developers to write programs as if they were run on a single device with very high memory and computation capacity. The user only needs to add a few lines of annotation code to a subset of critical tensors in the model code to indicate how to partition the tensors. With GSPMD, developers may employ different parallelism algorithms for different use cases without the need to reimplement the model.
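As a rough illustration of what such annotations can look like, here is a minimal sketch using JAX's public sharding API. The mesh axis names `data` and `model`, the function `dense_layer`, and the tensor shapes are assumptions made for this example and do not come from the GSPMD paper or blog post.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Arrange all visible devices into a logical 2-D mesh. The axis names
# "data" and "model" are illustrative labels, not names from the article.
devices = np.array(jax.devices()).reshape(-1, 1)
mesh = Mesh(devices, axis_names=("data", "model"))


def dense_layer(x, w):
    # The model is written as ordinary single-device code.
    h = jnp.dot(x, w)
    # One annotation on a critical tensor: split the batch dimension over
    # the "data" axis and leave the feature dimension unpartitioned.
    h = jax.lax.with_sharding_constraint(h, NamedSharding(mesh, P("data", None)))
    return jax.nn.relu(h)


# Keep the batch size a multiple of the device count so it divides evenly.
batch = 4 * devices.shape[0]
x = jax.device_put(jnp.ones((batch, 16)), NamedSharding(mesh, P("data", None)))
w = jax.device_put(jnp.ones((16, 4)), NamedSharding(mesh, P(None, "model")))

y = jax.jit(dense_layer)(x, w)
print(y.shape, y.sharding)
```

Changing the `PartitionSpec` arguments is enough to try a different partitioning; the body of `dense_layer` stays the same, and the compiler inserts whatever cross-device communication is needed.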

Separating model programming from parallelism lets developers minimize code duplication. GSPMD is designed to support a wide variety of parallelism algorithms with a uniform abstraction and implementation, and it also supports nested patterns of parallelism. The solution facilitates innovation in parallelism algorithms by letting performance experts focus on the algorithms that best utilize the hardware, rather than on implementation details that involve extensive cross-device communication.

In the recent MLPerf set of performance benchmarks, the team applied GSPMD to parallelize a BERT-like encoder-only model with ~500 billion parameters over 2048 TPU v4 chips. The run yielded highly competitive results, utilizing up to 63% of the peak FLOPS that the TPU v4 chips offer. As a shared, robust mechanism for different parallelism modes, GSPMD allows users to conveniently switch between modes in different parts of a model. This is especially valuable for models whose components have distinct performance characteristics, such as multimodal models that handle both images and audio.
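Switching modes between parts of a model can be sketched in JAX as follows. This is a toy example under stated assumptions: the two-branch function `two_tower`, the helper `constrain`, the mesh axis names, and all shapes are hypothetical and are not taken from the benchmarked model.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Logical device mesh; the axis names are illustrative assumptions.
devices = np.array(jax.devices()).reshape(-1, 1)
mesh = Mesh(devices, axis_names=("data", "model"))


def constrain(t, spec):
    # Hypothetical helper: attach a partitioning annotation to a tensor.
    return jax.lax.with_sharding_constraint(t, NamedSharding(mesh, spec))


def two_tower(image, audio, w_img, w_aud):
    # Image branch: partition both the batch and the feature dimension.
    img = constrain(image @ w_img, P("data", "model"))
    # Audio branch: plain data parallelism, features kept whole.
    aud = constrain(audio @ w_aud, P("data", None))
    return img.sum(axis=-1) + aud.sum(axis=-1)


# Keep the batch size a multiple of the device count so it divides evenly.
batch = 2 * devices.shape[0]
out = jax.jit(two_tower)(
    jnp.ones((batch, 6)),  # toy image features
    jnp.ones((batch, 3)),  # toy audio features
    jnp.ones((6, 4)),      # image projection weights
    jnp.ones((3, 2)),      # audio projection weights
)
print(out.shape)
```

Each branch carries its own annotation, so the partitioning of one component can be changed without touching the other or rewriting the model logic.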

“As this often requires building larger and even more complex models, we are pleased to share the GSPMD paper and the corresponding open-source library to the broader research community, and we hope it is useful for efficient training of large-scale deep neural networks,” wrote Yuanzhong Xu and Yanping Huang, Software Engineers at Google Research, Brain Team, in the blog post.


Meeta Ramnani

Meeta’s interest lies in finding real, practical applications of technology. At AIM, she writes stories that question new inventions and the need to develop them. She believes that technology has changed the world, and will continue to change it, very fast, and that it is no longer ‘cool’ to be ‘old-school’. If people don’t keep up with technology, they will surely be left behind.
