Google Returns to Federated Learning Over Privacy Concerns

The tech giant has added formal privacy guarantees due to growing concerns.
Google has created the first federated learning and distributed differential privacy system with formal guarantees against an honest-but-curious server. However, a fully malicious server could still bypass the privacy guarantees by manipulating the public key exchange or introducing fake malicious clients, said researchers in a recent blog post. 

In 2021, Google started using federated learning to train Smart Text Selection models, an Android feature to select and copy text easily by predicting what text users want to select and then automatically expanding the selection. Since the launch, Google has improved the models’ privacy by combining secure aggregation (SecAgg) and a distributed version of differential privacy. 

The new system is designed to hold up against an honest-but-curious server, meaning one that follows the protocol but could try to gain insights about users from the data it receives. The Smart Text Selection models trained with this system have cut memorization by more than half. Data minimization is a key principle here: the server learns nothing about individual updates and only receives an aggregate model update. The SecAgg protocol ensures this through cryptographic guarantees.
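To make the aggregate-only idea concrete, here is a minimal toy sketch of pairwise masking, the basic trick behind secure aggregation. It is not Google's SecAgg implementation: the modulus, the function names, and the shared seeded random generator (standing in for real pairwise key agreement between clients) are all assumptions made for illustration.

```python
import random

MODULUS = 2**16  # hypothetical modulus chosen for this toy example


def mask_updates(updates, seed=0):
    """Return masked copies of integer client updates.

    For each pair (i, j) with i < j, client i adds a shared random mask
    and client j subtracts it, so every mask cancels in the final sum.
    """
    rng = random.Random(seed)  # assumption: stands in for pairwise key agreement
    n = len(updates)
    dim = len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] = (masked[i][k] + mask[k]) % MODULUS
                masked[j][k] = (masked[j][k] - mask[k]) % MODULUS
    return masked


def aggregate(masked):
    """Server-side sum modulo MODULUS over the masked vectors."""
    dim = len(masked[0])
    return [sum(m[k] for m in masked) % MODULUS for k in range(dim)]
```

Each masked vector on its own looks random to the server, yet summing all of them recovers the true total because the masks cancel pairwise. A production protocol also has to handle clients that drop out mid-round, which this sketch ignores.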

For Smart Text Selection, all updates and metrics are aggregated via SecAgg using TensorFlow Federated, and stored in Android’s Private Compute Core. This enhances privacy because unaggregated model updates and metrics are not visible to any part of the server infrastructure.

SecAgg minimizes data exposure, but it does not by itself guarantee that the aggregate reveals nothing unique to an individual. This is where differential privacy (DP) comes in. Last week, Google also announced a new method in which the DP guarantees depend on a trusted server controlling the process.

In practice, SecAgg brings challenges of its own, since it operates on integers rather than on the real-valued updates that models produce. Google has addressed them by introducing an approach for auto-tuning the discretization scale during training and by adding integer noise using the distributed discrete Gaussian and distributed Skellam mechanisms.
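The discretize-then-add-integer-noise step can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the mechanism as Google deployed it: the modulus, the fixed scale (rather than an auto-tuned one), the plain deterministic rounding, and the helper names are all hypothetical. The one property it relies on is standard: a Skellam variable is the difference of two independent Poisson variables.

```python
import math
import random

MODULUS = 2**20  # hypothetical modulus for the modular sum


def sample_poisson(lam, rng):
    """Knuth's algorithm; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_skellam(lam, rng):
    """Skellam noise: the difference of two independent Poisson draws."""
    return sample_poisson(lam, rng) - sample_poisson(lam, rng)


def encode(update, scale, lam, rng):
    """Client side: scale, round to integers, add Skellam noise, wrap."""
    return [(round(x * scale) + sample_skellam(lam, rng)) % MODULUS for x in update]


def decode(total, scale):
    """Server side: map the modular sum back to signed values and rescale."""
    signed = [t - MODULUS if t >= MODULUS // 2 else t for t in total]
    return [s / scale for s in signed]
```

Because every client adds only a small share of the noise, no single party needs to be trusted to add all of it, and the noise in the sum grows with the number of participants. A larger scale loses less precision to rounding but leaves less headroom before the sum wraps around the modulus, which is the trade-off that scale tuning has to balance.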

Read: For The Sake Of Privacy: Apple’s Federated Learning Approach

As privacy concerns grew among consumers, major tech companies like Apple and Google started investing heavily in this decentralised form of machine learning, which trains models without collecting raw data. 
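As a rough illustration of how a model can be trained without collecting raw data, the sketch below shows the server-side averaging step commonly used in federated learning: devices train locally and send back only model weights, which the server combines in proportion to how many examples each device used. The function name and the flat-list weight format are assumptions made for this example.

```python
def federated_average(client_weights, client_sizes):
    """Average client model weights, weighted by local example counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]
```

For instance, `federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])` gives the second client three times the influence of the first, because it trained on three times as many examples.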

Federated learning is commonly used to power suggestion features and rank suggested items in context. The term was coined by Google researchers in the 2016 paper ‘Communication-Efficient Learning of Deep Networks from Decentralized Data’. A related, heavily cited paper, ‘Deep Learning with Differential Privacy’, was co-authored by Google and OpenAI researchers.

Tasmia Ansari
Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.