MITB Banner

Facebook Gives Away This PyTorch Library For Differential Privacy For Free

Share

Recently, Facebook AI open-sourced a new high-speed library for training PyTorch models with differential privacy (DP) known as Opacus. The library is claimed to be more scalable than existing state-of-the-art methods.

According to the developers at the social media giant, differential privacy is a mathematically rigorous framework for quantifying the anonymisation of sensitive data. With the growing interest in the machine learning (ML) community, this framework is often used in analytics and computations. 

Differential privacy constitutes a strong standard for privacy guarantees for algorithms on aggregate databases. It is usually defined in terms of the application-specific concept of adjacent databases. The framework has several properties that make it particularly useful in applications, such as group privacy, robustness to auxiliary information, among others.

About Opacus

Opacus is a library which facilitates the training of PyTorch models with differential privacy. The library supports training with minimal code modifications required and has little impact on training performance. 

The library allows the client to track the privacy budget online at any given moment. The Opacus library also comprises pre-trained as well as fine-tuned models along with tutorials for large-scale models including the infrastructure that is designed for experiments in privacy researches. 

The developers stated that the core idea behind this algorithm is to protect the privacy of a training dataset. This can be achieved by intervening on the parameter gradients that the model will use to update its weights.

According to a blog post, the Opacus library is aimed mainly at two target audiences. 

  • Machine learning practitioners, who will find this library as an introduction to training a model with differential privacy.
  • Differential privacy scientists, who will find this easy to experiment as well as tinker with.

As mentioned by the developers, Opacus defines a lightweight API by introducing the PrivacyEngine abstraction. This takes care of tracking the privacy budget as well as working on the gradients of the model.

The Privacy Engine is attached to a standard PyTorch optimiser which makes training with Opacus easier than the traditional methods. After the training, the resulting artefact is a standard PyTorch model with no extra steps for deploying private models.

Features of this Library

Opacus provides a number of intuitive features, such as:

  • Speed: Opacus has the capability to compute batched per-sample gradients by leveraging the Autograd hooks in PyTorch. This results in an order of magnitude speedup when compared with the existing differential privacy libraries that rely only on micro-batching.
  • Safety: Opacus provides safety as this library uses a cryptographically safe pseudo-random number generator for its security-critical code. 
  • Flexibility: Using Opacus, developers can quickly prototype their ideas by mixing and matching the code with PyTorch code and pure Python code.
  • Productivity: Opacus comes with various tutorials as well as helper functions that have the ability to warn about the incompatible layers before the training of a model starts.
  • Interactivity: Opacus keeps track of how much of the privacy budget developers are spending at any given point in time by enabling early stopping and real-time monitoring. The private budget is referred to as a core mathematical concept in differential privacy.

Wrapping Up

According to the developers, the goal behind the development of Opacus is to preserve the privacy of each training sample while limiting the impact on the accuracy of the final model. The library accomplishes this by modifying a standard PyTorch optimiser in order to enforce (and measure) DP during training. 

Opacus is open-source for public use, and it is licensed under Apache-2.0. The latest release of Opacus can be installed via pip: pip install opacus
Know more here.

Share
Picture of Ambika Choudhury

Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
Related Posts

CORPORATE TRAINING PROGRAMS ON GENERATIVE AI

Generative AI Skilling for Enterprises

Our customized corporate training program on Generative AI provides a unique opportunity to empower, retain, and advance your talent.

Upcoming Large format Conference

May 30 and 31, 2024 | 📍 Bangalore, India

Download the easiest way to
stay informed

Subscribe to The Belamy: Our Weekly Newsletter

Biggest AI stories, delivered to your inbox every week.

AI Courses & Careers

Become a Certified Generative AI Engineer

AI Forum for India

Our Discord Community for AI Ecosystem, In collaboration with NVIDIA. 

Flagship Events

Rising 2024 | DE&I in Tech Summit

April 4 and 5, 2024 | 📍 Hilton Convention Center, Manyata Tech Park, Bangalore

MachineCon GCC Summit 2024

June 28 2024 | 📍Bangalore, India

MachineCon USA 2024

26 July 2024 | 583 Park Avenue, New York

Cypher India 2024

September 25-27, 2024 | 📍Bangalore, India

Cypher USA 2024

Nov 21-22 2024 | 📍Santa Clara Convention Center, California, USA

Data Engineering Summit 2024

May 30 and 31, 2024 | 📍 Bangalore, India

Subscribe to Our Newsletter

The Belamy, our weekly Newsletter is a rage. Just enter your email below.