TensorFlow v2.4 Released: Roundup Of Major Features & Updates

TensorFlow New Release

TensorFlow released version 2.4 earlier this week. The new version promises better support for distributed training, a new NumPy frontend, and tools for monitoring and diagnosing performance bottlenecks. Notably, the release arrives around the same time TensorFlow celebrated five years of existence, and it comes on the heels of rival PyTorch's recent v1.7.1 release.

Some of the major new features and updates in this release are discussed below.

New Features For Distributed Training

Parameter Server Strategy: The tf.distribute module, TensorFlow's API for distribution strategies, gains experimental support in v2.4 for asynchronous training of models with ParameterServerStrategy. Experimental APIs are candidates for eventual inclusion in TensorFlow's stable API, but they are not yet covered by backwards-compatibility guarantees and may change.

Parameter Server Strategy is a common data-parallel method for scaling up a machine learning model across several machines. A cluster consists of workers and parameter servers: the workers read and update variables that are created on the parameter servers. Since each worker reads and updates these variables independently, without synchronising with the others, this approach is also called asynchronous training.
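
A minimal sketch of how the experimental API can be wired up in v2.4, assuming the cluster of workers and parameter servers is already described by a TF_CONFIG environment variable:

```python
import tensorflow as tf

# Assumes TF_CONFIG already describes the cluster's workers and
# parameter servers.
cluster_resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
strategy = tf.distribute.experimental.ParameterServerStrategy(cluster_resolver)

# Variables created under the strategy's scope are placed on the
# parameter servers.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
    optimizer = tf.keras.optimizers.SGD()

# The coordinator dispatches training steps to workers, which run them
# asynchronously, reading and updating shared variables independently.
coordinator = tf.distribute.experimental.coordinator.ClusterCoordinator(strategy)
```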

Multi Worker Mirrored Strategy: MultiWorkerMirroredStrategy implements distributed training with synchronous data parallelism, like its counterpart MirroredStrategy. The difference between the two is that the former can train across multiple machines, each running several GPUs. To keep variables in sync, MultiWorkerMirroredStrategy uses CollectiveOps, a single operation that can automatically choose an all-reduce algorithm based on the hardware and the network topology.

In the new version, MultiWorkerMirroredStrategy has graduated from experimental status and is now part of the stable API.
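
A brief sketch of the now-stable entry point; the multi-machine cluster itself is assumed to be configured via TF_CONFIG on each worker:

```python
import tensorflow as tf

# Stable in v2.4: no longer under tf.distribute.experimental.
strategy = tf.distribute.MultiWorkerMirroredStrategy()

# Variables created in this scope are mirrored and kept in sync
# across workers via CollectiveOps all-reduce.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(8,))])
    model.compile(optimizer="sgd", loss="mse")

# model.fit(dataset) then runs synchronous data-parallel training.
```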

New Updates For Keras

Mixed Precision: While most TensorFlow models use the 32-bit floating-point type, some lower-precision models use the 16-bit type to reduce memory use. Mixed precision uses both 16-bit and 32-bit floating-point types during training to make the model run faster and use less memory. The mixed precision API can improve model performance by more than three times on GPUs and by up to 60% on TPUs.

In v2.4, the Keras mixed precision API has moved out of the experimental phase and is now stable.
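
A minimal example of the stable API: setting a global mixed_float16 policy so that Keras layers compute in float16 while keeping their variables in float32:

```python
import tensorflow as tf

# Stable in v2.4: tf.keras.mixed_precision, no 'experimental' in the path.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Layers now compute in float16 but keep float32 variables; the final
# activation is kept in float32 for numeric stability.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Activation("softmax", dtype="float32"),
])
```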

Optimizers: In TensorFlow, optimizers are classes that encapsulate the logic for updating a model's weights during training, with the goal of improving its speed and performance. Examples of optimizers include SGD, RMSprop, Adam, Adadelta, Adagrad, Adamax, Nadam, and Ftrl.

The new TensorFlow release refactors the Keras Optimizer class to make custom training loops easier to write: training code written against the base optimizer interface now works with any optimizer. Additionally, all optimizer subclasses accept gradient_transformers and gradient_aggregator arguments, which make it easy to define custom gradient transformations, as sketched below.
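
A short sketch of the new arguments; the clipping function here is purely illustrative:

```python
import tensorflow as tf

# An illustrative transformer that clips each gradient by norm before
# the update step, passed via the new gradient_transformers argument.
def clip_gradients(grads_and_vars):
    return [(tf.clip_by_norm(g, 1.0), v) for g, v in grads_and_vars]

optimizer = tf.keras.optimizers.SGD(
    learning_rate=0.01,
    gradient_transformers=[clip_gradients],
)
```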

Improvements to Functional API Model: The new release also includes a major refactoring of the internals of the Keras Functional API. This update is expected to improve the memory consumption of functional models and simplify their triggering logic.

Experimental Support For NumPy API

TensorFlow v2.4 introduces experimental support for a subset of the NumPy API, exposed as tf.experimental.numpy, which allows existing NumPy-style code to run on TensorFlow. Because this NumPy frontend is built on top of TensorFlow, it interoperates seamlessly with it, giving access to all TensorFlow APIs along with optimised execution through compilation and auto-vectorisation.
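
A small example of the new frontend, using only a handful of functions from the supported subset:

```python
import tensorflow as tf
import tensorflow.experimental.numpy as tnp

# NumPy-style code executed on TensorFlow's backend.
x = tnp.reshape(tnp.arange(6), (2, 3))
y = tnp.ones((2, 3))
print(tnp.sum(x * y))    # familiar NumPy semantics
print(tf.reduce_sum(x))  # the same arrays work with regular TensorFlow ops
```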

GPU Support

TensorFlow 2.4 runs with CUDA 11 and cuDNN 8, enabling support for the newly introduced NVIDIA Ampere GPU architecture. For the uninitiated, CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform and programming model, and cuDNN is its library of GPU-accelerated primitives for deep neural networks, built on CUDA.
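
A quick way to confirm that an installation can see a GPU and was built with CUDA support:

```python
import tensorflow as tf

# List the GPUs visible to TensorFlow and check that the build links
# against CUDA (the v2.4 wheels target CUDA 11 / cuDNN 8).
print(tf.config.list_physical_devices("GPU"))
print(tf.test.is_built_with_cuda())
```
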
The full release notes are available on TensorFlow's GitHub repository.
