A Beginner's Guide to Federated Learning

Google has introduced Federated Learning, a technique for training machine learning models that improves its services while keeping user data secure, rather than relying solely on processing data in its cloud infrastructure.

In Federated Learning, a model is trained from users' interactions with their mobile devices. Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on the device, decoupling machine learning from the need to store data in the cloud. This goes beyond local models that make predictions on-device, such as the Mobile Vision API or On-Device Smart Reply, by bringing model training to the device as well. A device downloads the current model, improves it by learning from the data on that phone, and then summarizes the changes as a small, focused update. Only this update is sent to the cloud, over an encrypted connection, where it is immediately averaged with updates from other users to improve the shared model.
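The client-side step described above can be sketched in a few lines of NumPy. Everything here (the linear model, the data shapes, the learning rate) is illustrative and stands in for whatever on-device training a real system performs:

```python
import numpy as np

def local_update(global_weights, local_data, local_labels, lr=0.1):
    """Compute a small, focused update on one device.

    A single gradient step of linear regression stands in for
    the real on-device training procedure.
    """
    preds = local_data @ global_weights
    grad = local_data.T @ (preds - local_labels) / len(local_labels)
    new_weights = global_weights - lr * grad
    # Only this delta leaves the device -- never the raw data.
    return new_weights - global_weights

rng = np.random.default_rng(0)
X, y = rng.normal(size=(20, 3)), rng.normal(size=20)
w_global = np.zeros(3)
delta = local_update(w_global, X, y)
print(delta.shape)  # the update has the same shape as the model itself
```

Note that the update has the same dimensions as the model, but none of the dimensions of the raw data, which is what lets it be shipped and averaged without exposing user examples.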

All the training data remains on the individual device, and no individual updates are stored in the cloud. Federated Learning distributes model training across many devices, making it possible to take advantage of machine learning while minimizing the need to collect user data. The resulting models can then perform on-device inference. Tech conglomerates today are trying to bring their machine learning applications to users' devices to improve privacy and stability at the same time.



How can it be used? 

In each round, participating devices download the current model and compute an updated model on the device itself, through edge computing and using local data. These locally trained models are then sent from the devices back to the central server, where they are aggregated (for example, by averaging their weights), and a single consolidated, improved global model is sent back to the devices. Federated Learning allows machine learning algorithms to gain experience from a broad range of data sets held at different locations.
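The server-side aggregation step ("averaging of weights") amounts to a weighted mean of the client models, typically weighted by how many examples each client trained on. A minimal NumPy sketch, with made-up client weights and sizes:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """FedAvg-style aggregation: average the client models,
    weighted by each client's number of training examples."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)   # shape: (num_clients, num_params)
    coeffs = sizes / sizes.sum()         # normalized per-client weights
    return coeffs @ stacked              # weighted average of the models

clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [10, 10, 20]
global_model = federated_average(clients, sizes)
print(global_model)  # -> [3.5 4.5]
```

Weighting by client size keeps a device with very little data from pulling the global model as hard as a device with a lot of data.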

This approach enables multiple organizations to collaborate on developing a model without directly sharing sensitive data with each other. Over the course of several training iterations, the shared model is exposed to a significantly wider range of data than any single organization holds. Federated Learning therefore decentralizes machine learning by removing the need to pool data in a single location; instead, the same model is trained through multiple iterations at different locations. The risk involved in organizations sharing sensitive data, such as clinical records, with each other is greatly reduced.


The Federated Learning Process

In each training round, a device trains the model locally and returns the trained model to the server. Popular model families such as deep neural networks and support vector machines can be trained this way; once trained, a model encodes the statistical patterns of the data in its numerical parameters and no longer needs the training data for inference. So when the device sends the trained model back to the server, it contains no raw user data. Once the server receives the models from its user devices, it updates the base model with the aggregated parameter values of the user-trained models. This federated learning cycle must be repeated several times before the model reaches a level of accuracy the developer finds satisfactory. Once the model is ready, it can be distributed to all users at once for on-device inference.
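The repeated cycle (broadcast the model, train locally, aggregate on the server) can be simulated end to end in plain NumPy. The three synthetic "devices", the linear model, and the round count below are all illustrative assumptions, not part of any federated learning library:

```python
import numpy as np

rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0])

# Three synthetic "devices", each holding its own local data.
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.01 * rng.normal(size=50)
    clients.append((X, y))

w = np.zeros(2)                        # shared global model
for round_num in range(50):            # federated learning cycles
    local_models = []
    for X, y in clients:
        w_local = w.copy()
        # One local gradient step of linear regression per round.
        grad = X.T @ (X @ w_local - y) / len(y)
        w_local -= 0.1 * grad
        local_models.append(w_local)
    w = np.mean(local_models, axis=0)  # server-side averaging

print(np.round(w, 2))
```

After enough rounds the averaged model approaches the weights that generated the data, even though the server never saw any client's raw examples.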


Practical Paradigms of Using Federated Learning

Several federated learning tasks, such as federated training or evaluation of existing machine learning models, can be easily implemented using TensorFlow Federated (TFF). TFF lets you apply Federated Learning without requiring prior knowledge of how it works under the hood, and it also offers components for evaluating the implemented Federated Learning algorithms on a variety of existing models and data.

The interfaces offered consist of the following three key parts:

  • Models: Classes and helper functions that wrap existing models for use with TFF. Wrapping a model can be done by calling a single wrapping function, i.e. tff.learning.from_keras_model, or by defining a subclass of the tff.learning.Model interface for full customizability.
  • Federated Computation Builders: These are helper functions that help construct federated computations for training or evaluation, using the existing models.
  • Datasets: Collections of data that you can download and access in Python for simulating federated learning scenarios. Although federated learning is designed for decentralized data that cannot simply be downloaded to a central location, data that can be downloaded and manipulated locally is especially useful for initial research experiments. Two example tasks for testing with these datasets are image classification and text generation.

Example: Visualizing MNIST digits for a particular client

import collections

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_federated as tff

# Load the federated EMNIST data and select one client's local dataset.
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()
example_dataset = emnist_train.create_tf_dataset_for_client(
    emnist_train.client_ids[0])


Example 2: Creating a list of datasets from a given set of users 

def make_federated_data(client_data, client_ids):
  # Build one tf.data.Dataset per client id.
  return [
      client_data.create_tf_dataset_for_client(x)
      for x in client_ids
  ]

TensorFlow also offers two levels of aggregation for Federated Learning:

  • Local aggregation: This refers to aggregation across multiple batches of examples owned by an individual client.
  • Federated aggregation: This refers to aggregation across multiple clients (devices) in the system.
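The two levels can be illustrated with plain NumPy: each client first aggregates a metric locally across its own batches, then the server aggregates those per-client results across clients. The client names, losses, and batch counts below are made up for illustration:

```python
import numpy as np

# Per-batch losses observed on each of two clients (made-up numbers).
client_batch_losses = {
    "client_a": [0.9, 0.7, 0.5],   # three local batches
    "client_b": [0.4, 0.2],        # two local batches
}

# Local aggregation: each client averages over its own batches.
local_means = {c: float(np.mean(v)) for c, v in client_batch_losses.items()}

# Federated aggregation: the server combines the per-client results,
# here weighted by how many batches each client contributed.
sizes = np.array([len(v) for v in client_batch_losses.values()], dtype=float)
federated_mean = float(sizes @ np.array(list(local_means.values())) / sizes.sum())

print(local_means)      # per-client (local) results
print(federated_mean)   # single global (federated) result
```

Only the locally aggregated values cross the network; the per-batch (and per-example) values never leave the client.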

Drawbacks Of Federated Learning 

Federated Learning currently can't solve every machine learning problem, for example, learning to recognize different dog breeds by training on carefully labelled examples. If the model is too large to run on end users' devices, a developer may have to find other ways to preserve user privacy. And because the training data stays on users' devices, data engineers may have no way to inspect the data and verify that it will actually benefit the application; for this reason, federated learning is currently limited to applications where the user data does not need preprocessing. Federated learning currently seems better suited to applications such as language modelling, where the labels arise naturally from user interaction. But Google continues to advance its state-of-the-art for cloud-based ML and to research ways of expanding the range of problems Federated Learning can solve. It also works on improving the language models that power keyboards based on what you actually type on your phone, and on providing photo rankings based on the kinds of photos people look at, share, or delete.


Federated Learning seems to have opened a new era of safe and secure AI, with a lot of potential still to be explored. It provides a way to protect sensitive user information while still aggregating results and identifying common patterns across many devices, which in turn makes the model more robust. It can train on each user's data while keeping that data private. As the technique is still in its early stages and faces numerous challenges in design and deployment, it is too soon to deliver a final verdict.

End Notes

In this article, we looked at what Federated Learning is, how it works under the hood, and how it can be used. You can try Federated Learning hands-on with the sample notebooks provided by TensorFlow Federated.

Happy Learning!



Victor Dey
Victor is an aspiring Data Scientist & is a Master of Science in Data Science & Big Data Analytics. He is a Researcher, a Data Science Influencer and also an Ex-University Football Player. A keen learner of new developments in Data Science and Artificial Intelligence, he is committed to growing the Data Science community.