Can machine learning and data privacy coexist?

ZKPs make it possible to use private datasets inside transparent systems such as public blockchain networks like Ethereum

On May 25, 2018, a historic regulation came into effect in Europe that changed the course of online privacy across the world. On that day, the General Data Protection Regulation (GDPR) became enforceable, tightening the rules for companies that collect data online. Among other requirements, they generally need consent from online users before acquiring their personal data.

With the continuous rise in awareness about data privacy, users are more hesitant about sharing data, which has made data gathering more difficult for companies. 

To better understand their customers’ needs and improve their services, businesses must perform analytics on user data. The more data they can collect, the more valuable their insights become. As a result, there is a significant incentive to obtain data from more users or from outside businesses. There is also a significant financial incentive to sell data to other businesses as an asset.


ZKPs: Get results without seeing data 

A zero-knowledge proof (ZKP) is a method by which one party can demonstrate cryptographically to another that a statement about some data is true, for example that an insight derived from collected data is accurate, without divulging the underlying data.

Let’s look at an example to better grasp this. Suppose you (the Prover) have a friend (the Verifier) who is colour-blind and cannot distinguish between a red ball and a green ball. He believes the balls are the same colour, so you need to convince him that they aren’t. He only needs to know that they are different; he doesn’t need to know which one is red and which one is green.

So, you give the balls to your friend and note which ball is in which hand. Then, your friend puts the balls behind his back and secretly decides whether or not to switch them. He then shows them to you again, and you must tell him whether or not the balls have swapped hands.

If you aren’t colour-blind, this is very easy to do because you can see the difference in colour. You can clearly tell when the red ball has moved from his left hand to his right, for instance. In fact, you can answer with 100% accuracy because, again, the colours are noticeably different.

But hang on!

Your friend is sceptical; the balls appear identical to him, and he suspects you are attempting to deceive him. After all, even if you could not see any difference, you would have a one-in-two chance of guessing correctly whether or not he swapped the balls. Those are reasonable odds, so you decide to repeat the experiment.

Your friend hides the balls behind his back, chooses whether or not to swap them, and then shows them to you again. Again, you can see whether he switched them. However, if the balls really were the same colour, you’d have to guess again, and your odds of guessing right both times would be cut in half, to 1 in 4 or 25%.

Repeat this ten times, and the chance of guessing right every time falls to 1 in 1,024, or about 0.1%. That is small enough to persuade your friend that the balls really are different colours; you could not plausibly be that lucky.
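The arithmetic behind these odds is simple enough to check directly. The short Python sketch below assumes each round is an independent fair coin flip for a prover who cannot see the colours; the function name `guess_success_probability` is made up for this illustration.

```python
def guess_success_probability(rounds: int) -> float:
    """Chance of answering every round correctly by pure guessing.

    Assumes each round is an independent 50/50 guess.
    """
    return 0.5 ** rounds


if __name__ == "__main__":
    # Print the odds for a few round counts mentioned in the story.
    for rounds in (1, 2, 10, 20):
        probability = guess_success_probability(rounds)
        print(f"{rounds:>2} rounds: 1 in {2 ** rounds:,} ({probability:.4%})")
```

Two rounds give 1 in 4, and ten rounds give 1 in 1,024, which is a little under 0.1%.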

Of course, you could keep going. Each time you repeat the process, the probability that you could have got this far by guessing alone is halved again.

So that’s it! We have “proven” to our friend that the balls have a different colour.

But this proof does not reveal the actual colours of the balls to our friend. Hence the name ‘zero knowledge proofs’.
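The ball game can also be simulated in a few lines of Python. This is only a toy model of the story above, not a cryptographic protocol: the function names, the fixed random seed, and the number of trials are all assumptions chosen for illustration. An honest prover (who sees the colours) always answers correctly, while a cheating prover (who cannot) has to guess and is rejected on the first wrong answer.

```python
import random


def run_protocol(prover_sees_colour: bool, rounds: int, rng: random.Random) -> bool:
    """Play the ball game once; return True if the verifier ends up convinced."""
    for _ in range(rounds):
        # The verifier secretly decides whether to swap the balls.
        swapped = rng.random() < 0.5
        if prover_sees_colour:
            # An honest prover sees which ball is where and answers correctly.
            answer = swapped
        else:
            # A cheating prover sees two identical balls and can only guess.
            answer = rng.random() < 0.5
        if answer != swapped:
            return False  # a single wrong answer exposes the prover
    return True


def acceptance_rate(prover_sees_colour: bool, rounds: int, trials: int, seed: int = 0) -> float:
    """Fraction of independent games in which the verifier is convinced."""
    rng = random.Random(seed)
    convinced = sum(run_protocol(prover_sees_colour, rounds, rng) for _ in range(trials))
    return convinced / trials


if __name__ == "__main__":
    print("honest prover:  ", acceptance_rate(True, rounds=10, trials=10_000))
    print("cheating prover:", acceptance_rate(False, rounds=10, trials=10_000))
```

The honest prover convinces the verifier every time, while the cheating prover should succeed in only roughly one game in a thousand. In neither case does the verifier's side of the code ever see a colour; it only sees whether each answer matched its own secret choice.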

ZKPs: Application in ML

ZKPs make it possible to use private datasets inside transparent systems such as public blockchain networks like Ethereum.

Blockchains are intended to be highly transparent: anyone running their own blockchain node can see and download all data stored on the ledger. Adding ZKP technology allows users and businesses to leverage their private datasets in the execution of smart contracts (code that specifies predetermined criteria that, when satisfied, trigger results) without revealing the underlying data.

It is well known that more data helps machine learning models improve. This gives businesses a significant incentive to buy and sell data from one another. Without special tooling, however, a buyer cannot easily assess the quality of a dataset unless the seller shares it. This is where ZKPs come in handy once more.

As a sanity check, ZKPs can be used to build a Prover that runs certain computations to verify whether the data satisfies particular constraints or attributes. The Prover keeps the data secret, and because the verification takes place on a public blockchain, fraud in the verification process would be hard to hide. This gives the buyer confidence in the dataset’s quality before purchasing it.
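To make the data flow concrete, here is a rough Python sketch of the two ingredients involved: a binding commitment to the private dataset and a constraint check run over it. It is deliberately not a zero-knowledge proof. It uses a plain salted SHA-256 commitment that the buyer can only verify after the data is revealed, whereas a real ZKP system would let the buyer verify the claim before seeing any data. All function names, field names, and constraints here are hypothetical.

```python
import hashlib
import json
import secrets


def commit(dataset: list[dict], salt: bytes) -> str:
    """Salted SHA-256 commitment to a dataset (assumed JSON-serialisable)."""
    payload = json.dumps(dataset, sort_keys=True).encode("utf-8")
    return hashlib.sha256(salt + payload).hexdigest()


def check_constraints(dataset: list[dict], min_rows: int, required_fields: list[str]) -> bool:
    """Example quality check: enough rows, and no missing required fields."""
    if len(dataset) < min_rows:
        return False
    return all(row.get(field) is not None for row in dataset for field in required_fields)


def open_and_verify(commitment: str, dataset: list[dict], salt: bytes) -> bool:
    """After the sale: confirm the delivered data matches the earlier commitment."""
    return commit(dataset, salt) == commitment


if __name__ == "__main__":
    # Seller side: commit to the private data and publish only the commitment
    # together with the claim that the constraints hold.
    private_data = [{"age": 31, "country": "IN"}, {"age": 45, "country": "DE"}]
    salt = secrets.token_bytes(16)
    published_commitment = commit(private_data, salt)
    claim = check_constraints(private_data, min_rows=2, required_fields=["age", "country"])

    # Buyer side, after purchase: the data and salt are revealed, so the buyer
    # can confirm nothing was swapped and re-run the same check.
    assert open_and_verify(published_commitment, private_data, salt)
    assert check_constraints(private_data, 2, ["age", "country"]) == claim
```

In a ZKP-based design, the constraint check would be expressed as a circuit or program, and the seller would publish a proof that the committed data passes it, so the second half of this sketch would happen without the data ever being revealed.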

The same applies to selling an ML model: ZKPs can be used to load a pre-trained model into a Prover. The buyer can then provide a test dataset and see whether the model performs well on it, without the model itself being revealed. This gives the buyer confidence in the model before purchasing it.

By opening up a variety of institutional use cases for public blockchain networks that were previously out of reach, ZKP technology can encourage innovation and help build a more effective global economy.

Tausif Alam
Tausif Alam has been covering technology for almost a decade now. He is keen about connecting dots and bringing a wholesome picture of the tech world.
