8 Ways Google’s Newly Open-Sourced Differential Privacy Library Can Help Developers

In a recent announcement, Google released an open-source version of the differential privacy library that it currently uses to power some of its core products, such as Google Maps. The library lets developers and organisations implement privacy features that are otherwise difficult to build from scratch, promising ease of use and deployment. 

As the company explains, differential privacy strategically adds random noise to user information stored in databases so that companies can still analyse it without being able to single people out. Open-sourcing the library can, therefore, help other developers achieve that same level of differential privacy defence. The idea is to make it possible for companies to mine and analyse their database information without building invasive identity profiles or tracking individuals. The tech giant is hopeful that this can drastically mitigate the impact of data breaches. 
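The calibrated noise described above is classically drawn from a Laplace distribution whose scale depends on the privacy parameter epsilon. The sketch below shows the idea for a simple count query; it is illustrative Python rather than the library's actual C++ API, and the function names are my own.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Sample from a Laplace(0, scale) distribution via inverse-CDF sampling.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon=1.0):
    # A count query has sensitivity 1: adding or removing one person's
    # record changes the true answer by at most 1, so Laplace noise with
    # scale 1/epsilon yields epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

The noisy answer is close to the truth in aggregate, yet no single person's presence or absence can be confidently inferred from it.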

How It Works

Differentially private data analysis enables organisations to learn from their data while simultaneously ensuring that those results do not allow any individual’s data to be distinguished or re-identified. The goal of differential privacy for machine learning is to only “encode general patterns rather than facts about specific training examples.” 

This allows user data to remain private while the system overall still learns and can improve from general behaviour. The library not only offers the equations and models needed to set boundaries and constraints on identifying data, but also includes an interface that makes it easier for developers to actually implement the protections. 

Google has been working on this library alongside other privacy efforts such as Federated Learning and Google’s Responsible AI Practices. The company currently uses it to protect many different types of information, such as the location data generated by its Google Fi mobile customers.

The Features

  • It provides statistical functions that support most common data science operations, such as counts, averages, medians and percentiles
  • It includes an extensible ‘Stochastic Differential Privacy Model Checker’ library in addition to an extensive test suite, to help prevent mistakes
  • Query engines are a major analytical tool for data scientists, and Structured Query Language (SQL) is one of the most common ways for analysts to write queries. The release includes a PostgreSQL extension, along with common recipes to help developers get started, making it ready to use
  • Google researchers have also included other functionality such as additional noise mechanisms, aggregation functions and privacy budget management
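To illustrate the kind of aggregation the statistical functions support, here is a simplified sketch of a differentially private average: values are clipped to a known range so that any one record's influence on the sum is bounded, and Laplace noise is added to both the sum and the count. This is an illustrative Python sketch under those assumptions, not the library's actual algorithm, and the names are hypothetical.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Sample from a Laplace(0, scale) distribution via inverse-CDF sampling.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_average(values, lo, hi, epsilon=1.0):
    # Clip each value into [lo, hi] so that removing one record can change
    # the sum by at most (hi - lo) -- this bounds the query's sensitivity.
    clipped = [min(max(v, lo), hi) for v in values]
    half = epsilon / 2.0  # split the privacy budget between sum and count
    noisy_sum = sum(clipped) + laplace_noise((hi - lo) / half)
    noisy_count = len(values) + laplace_noise(1.0 / half)
    return noisy_sum / max(noisy_count, 1.0)
```

Clipping is the key design choice: without a bound on each contribution, a single outlier could dominate the sum and no finite amount of noise would hide it.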

8 Ways It Can Help Developers

  1. It will allow developers to build their own tools on top of the library, aggregating and analysing personal data inside or outside their companies without revealing personally identifiable information or compromising the privacy of the people whose data they are working with.
  2. It will put strong privacy protections in place, helping organisations make the most of their data while maintaining citizen trust.
  3. It will add to Google’s existing privacy offerings, such as TensorFlow Privacy and TensorFlow Federated, which the company announced last year. 
  4. TensorFlow Federated is an open-source framework that implements an approach called Federated Learning, allowing experimentation with machine learning and other computations on decentralised data. TensorFlow Privacy is an open-source library that lets developers train machine-learning models with privacy guarantees. 
  5. Its flexibility will ensure that it is applicable to as many database features and products as possible. 
  6. Differential privacy is complicated, and designing a system from scratch is difficult. Google hopes that its open-source tool will be easy enough to serve as a one-stop shop for developers.
  7. The system is able to capture most data analysis tasks based on aggregations, performs well for typical use cases, and provides a mechanism to deduce privacy parameters from accuracy requirements, allowing developers to make a principled decision.
  8. It can be used to produce aggregate statistics over numeric data sets containing private or sensitive information.
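The privacy budget management mentioned among the features rests on a basic composition property of differential privacy: the epsilons of successive queries over the same data add up, so a fixed total budget caps the cumulative privacy loss. A minimal sketch of such a budget tracker follows; the class and method names are hypothetical, not the library's API.

```python
class PrivacyBudget:
    """Tracks epsilon spent across queries. By sequential composition,
    total privacy loss is at most the sum of per-query epsilons."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon: float) -> None:
        # Refuse any query that would push spending past the overall budget.
        if self.spent + epsilon > self.total:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
```

Once the budget is exhausted, no further queries are answered; this is what prevents an analyst from averaging away the noise by repeating the same query many times.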

Srishti Deoras
Srishti currently works as Associate Editor at Analytics India Magazine. When not covering the analytics news, editing and writing articles, she could be found reading or capturing thoughts into pictures.
