Google has built the first federated learning system that combines secure aggregation with distributed differential privacy to provide formal privacy guarantees against an honest-but-curious server. However, a fully malicious server could still bypass those guarantees by manipulating the public key exchange or by injecting a sufficient number of fake clients, researchers said in a recent blog post.
In 2021, Google started using federated learning to train Smart Text Selection models, an Android feature that makes it easy to select and copy text by predicting what text users want to select and then automatically expanding the selection. Since the launch, Google has improved the models’ privacy by combining secure aggregation (SecAgg) with a distributed version of differential privacy.
The guarantees hold under an honest-but-curious threat model: a server that follows the protocol but could try to glean information about users from the data it receives. Smart Text Selection models trained with this system show memorization reduced by more than a factor of two. Data minimization is a key principle here: the server receives only an aggregate model update and learns nothing about individual updates, a property the SecAgg protocol enforces through cryptographic guarantees.
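To make that concrete, here is a minimal, illustrative Python sketch of the pairwise-masking trick at the heart of SecAgg. It is not Google’s production protocol, which additionally handles client dropouts and derives masks via real key agreement; the seeds and dimensions below are made up for the demo.

```python
import numpy as np

def masked_update(client_id, update, clients, pair_seeds):
    """Mask one client's model update with pairwise random vectors.

    Each pair of clients derives a shared mask from a common seed
    (a stand-in for the Diffie-Hellman key agreement in real SecAgg).
    The lower-id client adds the mask, the higher-id client subtracts
    it, so every mask cancels in the server-side sum.
    """
    masked = update.astype(np.float64).copy()
    for peer in clients:
        if peer == client_id:
            continue
        seed = pair_seeds[frozenset((client_id, peer))]
        mask = np.random.default_rng(seed).normal(size=update.shape)
        masked += mask if client_id < peer else -mask
    return masked

# Toy run: three clients, four-dimensional updates.
rng = np.random.default_rng(0)
clients = [0, 1, 2]
updates = {c: rng.normal(size=4) for c in clients}
pair_seeds = {frozenset((i, j)): 1000 * i + j
              for i in clients for j in clients if i < j}

aggregate = sum(masked_update(c, updates[c], clients, pair_seeds)
                for c in clients)
assert np.allclose(aggregate, sum(updates.values()))
# Each masked update looks random on its own; only the sum is meaningful.
```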
For Smart Text Selection, all updates and metrics are aggregated via SecAgg using TensorFlow Federated, and stored in Android’s Private Compute Core. This enhances privacy because unaggregated model updates and metrics are not visible to any part of the server infrastructure.
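In TensorFlow Federated, wiring a secure, differentially private aggregator into training amounts to swapping the model aggregator. The sketch below assumes TFF’s `ddp_secure_aggregator` factory from the distributed-DP release and a user-supplied `model_fn`; API names and signatures vary across TFF versions, so treat this as orientation rather than copy-paste code.

```python
import tensorflow as tf
import tensorflow_federated as tff

# model_fn is assumed to exist and return a tff.learning.Model.
# The aggregator discretizes, noises, and securely sums client updates.
aggregator = tff.learning.ddp_secure_aggregator(
    noise_multiplier=0.5,            # scale of the distributed integer noise
    expected_clients_per_round=100,  # used to calibrate per-client noise
)

learning_process = tff.learning.algorithms.build_unweighted_fed_avg(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.1),
    model_aggregator=aggregator,
)
```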
SecAgg minimizes data exposure, but on its own it does not guarantee that nothing unique to an individual can be revealed by the aggregate. This is where differential privacy (DP) comes in. Last week, Google also announced a new method through which the DP guarantees hold without requiring full trust in the server that controls the training process.
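For contrast, the classical central-DP recipe has a trusted server clip each raw update and add Gaussian noise to the sum itself; distributed DP exists precisely to remove that trust assumption. A minimal sketch of the central baseline, with illustrative parameter names:

```python
import numpy as np

def central_dp_aggregate(updates, clip_norm, noise_multiplier, rng):
    """Central DP: a *trusted* server clips each update to bound its
    sensitivity, sums them, then adds Gaussian noise calibrated to
    clip_norm * noise_multiplier. The server sees raw updates."""
    clipped = [u * min(1.0, clip_norm / max(np.linalg.norm(u), 1e-12))
               for u in updates]
    total = np.sum(clipped, axis=0)
    return total + rng.normal(scale=clip_norm * noise_multiplier,
                              size=total.shape)

rng = np.random.default_rng(1)
noisy_sum = central_dp_aggregate(
    [rng.normal(size=8) for _ in range(100)],
    clip_norm=1.0, noise_multiplier=0.5, rng=rng)
```

In the distributed version, the noise is instead split across clients, so no single party, including the server, ever handles an un-noised individual update.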
In practice, combining DP with SecAgg raises challenges of its own: SecAgg sums integers modulo a large value, so real-valued model updates must be discretized and the noise itself must be integer-valued. Google addressed this by introducing an approach for auto-tuning the discretization scale during training and by adding integer noise using the distributed discrete Gaussian and distributed Skellam mechanisms.
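A rough numpy sketch of that client-side recipe, using the Skellam mechanism (a Skellam sample is the difference of two Poisson draws, so the noise stays integer-valued and sums cleanly across clients). The modulus, scale, and noise level below are illustrative; the production system auto-tunes the scale and applies norm bounds and randomized rounding.

```python
import numpy as np

MODULUS = 2 ** 32   # SecAgg sums integers modulo a large value
SCALE = 2 ** 16     # discretization scale (auto-tuned in Google's system)

def prepare_client_update(update, skellam_mu, rng):
    """Discretize a real-valued update and add integer Skellam noise
    locally, before secure aggregation ever sees it."""
    quantized = np.round(update * SCALE).astype(np.int64)
    noise = (rng.poisson(skellam_mu, size=update.shape)
             - rng.poisson(skellam_mu, size=update.shape))
    return (quantized + noise) % MODULUS

rng = np.random.default_rng(42)
true_updates = [rng.normal(size=8) for _ in range(50)]

# The server only ever sees the modular sum of noised integer vectors.
agg = np.zeros(8, dtype=np.int64)
for u in true_updates:
    agg = (agg + prepare_client_update(u, skellam_mu=4.0, rng=rng)) % MODULUS

# Map back to a signed range and undo the discretization.
signed = np.where(agg >= MODULUS // 2, agg - MODULUS, agg)
estimate = signed / SCALE   # ≈ the true sum, plus the aggregate DP noise
```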
As privacy concerns grew among consumers, major tech companies like Apple and Google started investing heavily in this decentralised form of machine learning, which trains models without collecting raw data.
Federated learning is commonly used to power suggestion features and to rank suggested items in context. The term was coined by Google researchers in a 2016 paper titled ‘Communication-Efficient Learning of Deep Networks from Decentralized Data’; the privacy side of the work builds on the heavily cited ‘Deep Learning with Differential Privacy,’ co-authored by Google and OpenAI researchers.