Federated learning lets edge devices use state-of-the-art machine learning without centralising data, offering privacy by default.
Traditionally, intelligence lives on the server: AI algorithms typically require centralising data on a single machine or server. For example, when a company wants to build an AI model that learns its users’ personalised preferences, it trains successive iterations of the model on data accumulated from the website or mobile application through which users accessed its services.
In federated learning, by contrast, a central server first selects a handful of nodes and sends each of them an initialised copy of the model’s parameters. Each node then trains that model on its own local data, producing a local version of the model.
A federated learning round therefore starts with the model on the server: the server distributes it to a subset of clients, each client trains it locally on its own data and sends the locally trained model back. The server averages these updates, and the combined model reflects the training from every client. That combined model becomes the starting point for the next round, and with each round it gets a little better thanks to the data held by all the clients.
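A minimal sketch of this loop in plain Python and NumPy is shown below. The linear model, toy client data and hyperparameters are illustrative assumptions rather than any particular production setup, but the structure is the federated averaging round just described: distribute, train locally, average.

```python
import numpy as np

def local_update(global_weights, client_data, client_labels, lr=0.1, epochs=5):
    """Train a simple linear model on one client's private data (plain SGD)."""
    w = global_weights.copy()
    for _ in range(epochs):
        preds = client_data @ w
        grad = client_data.T @ (preds - client_labels) / len(client_labels)
        w -= lr * grad
    return w

def federated_round(global_weights, clients):
    """One round: send the model out, train on each client, average the results."""
    local_weights, sizes = [], []
    for data, labels in clients:                  # each client uses only its own data
        local_weights.append(local_update(global_weights, data, labels))
        sizes.append(len(labels))
    # Weight each client's update by how many examples it holds (federated averaging).
    return np.average(local_weights, axis=0, weights=np.array(sizes, dtype=float))

# Toy example: three clients, each holding private (x, y) samples of y = 2x.
rng = np.random.default_rng(0)
clients = []
for _ in range(3):
    x = rng.normal(size=(20, 1))
    clients.append((x, (2.0 * x).ravel()))

weights = np.zeros(1)                             # initial global model on the server
for round_num in range(10):                       # the combined model improves each round
    weights = federated_round(weights, clients)
print(weights)                                    # approaches [2.0]
```

Note that only model weights travel between the clients and the server; the raw samples never leave the device that generated them, which is the core privacy property of the scheme.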
Federated Learning Is Ideal For Edge & Mobile AI
With server-side ML models, clients connect to the centralised server to make predictions, and all the data accumulates on the server itself. For edge devices, this back-and-forth communication can hurt the user experience because of network latency, patchy connectivity, battery drain and the exposure of sensitive data. Edge computing, the proliferation of IoT devices and growing data privacy concerns are paving the way for federated learning, which allows quicker iteration on AI models with low latency and low power usage while also preserving the user’s privacy.
A lot of data is born at the edge: billions of phones and IoT devices generate data that can enable better products and smarter models. That data can be used locally through on-device inference, which is not only fast and battery-saving but also better for data privacy, because the server does not need to be in the loop for every interaction you have with locally generated data.
For example, Google’s Gboard aims to be a privacy-forward keyboard by keeping an on-device cache of local interactions, such as touchpoints, typed text and context, and using this data exclusively for federated learning and computation. Google’s TensorFlow Federated framework lets companies experiment with computation on decentralised data.
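To give a flavour of what such decentralised computation looks like, here is a sketch modelled on the introductory TensorFlow Federated tutorial example: each client contributes a local value and only the aggregate leaves the device. Exact API names can differ between TFF releases, so treat this as an illustration rather than a definitive recipe.

```python
import tensorflow as tf
import tensorflow_federated as tff

# A federated computation: every client holds its own float (e.g. a metric
# computed on local data), and the server only ever sees the mean.
@tff.federated_computation(tff.type_at_clients(tf.float32))
def federated_average(client_values):
    return tff.federated_mean(client_values)

# In simulation, a Python list stands in for values held on separate devices.
print(federated_average([68.5, 70.3, 69.8]))  # ~69.53
```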
Addressing Data Privacy Issues Of Training AI/ML Models
With privacy regulations becoming more widespread, training AI models while maintaining privacy is becoming a key focus area, and federated learning finds natural use cases here. Apple, for instance, has been at the centre of controversy over the potential misuse of voice recordings collected to improve assistants like Siri.
To address these data privacy issues, Apple published a paper describing how federated learning can train machine learning algorithms on separate local datasets without exchanging data samples with a centralised server, helping it improve Siri while maintaining user privacy. Similarly, Google’s Federated Learning of Cohorts (FLoC) proposed targeting ads at “flocks” of thousands of people with similar interests, instead of tracking users individually.