Booked: Should You Allow Cab Services To Use Your Data?

“Ridesharing apps usually store information about where users are going, like frequent destinations, how long they stay there, and origins of trips.”

All your favourite apps work the way they do because they rely heavily on personalisation. Algorithms devour the data that users provide and leverage users' behavioural patterns to offer a ‘5-star’ experience. There is nothing wrong with the internet knowing your favourite movie, but how willing are you to give up details of your most visited place? Ride-hailing services use location data, which in turn can reveal personal habits and preferences. Riders might not be keen to share the exact location of their origin and/or destination.

Masking location data might preserve the privacy of the users, but the loss of this information could lead to inefficiencies in the services. There is a trade-off between privacy and performance. In order to address this conundrum, a group of researchers from Istituto di Informatica e Telematica and MIT have teamed up to understand the price of privacy in terms of decreased efficiency of the mobility sharing system.

“In this paper, for the first time, we address the privacy issues under this point of view and show how location privacy-preserving techniques could affect the performance of mobility sharing applications, in terms of both System Efficiency and Quality of Service. To this extent, we first apply different data-masking techniques to anonymise geographical information, and then compare the performance of shareability networks-based trip matching algorithms for ride-sharing, applied to the real data and to the privacy-preserving data,” wrote the authors. 

The paper also cites instances in which some companies share only general usage statistics, while others send user-identifying data to third parties.

Who Will Bear The Cost

In order to anonymise location data and guarantee privacy to carpoolers, the authors used three different data-masking techniques, namely k-anonymity, obfuscation and cloaking, and fine-tuned their settings to achieve different levels of anonymisation granularity.
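To make the three techniques concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the grid cell size, the noise radius and the simple "suppress cells with fewer than k points" rule are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

METRES_PER_DEGREE = 111_320.0  # rough length of one degree of latitude


def cell_index(lat, lon, cell_deg=0.01):
    """Integer (row, col) of the grid cell that contains the point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def cloak(lat, lon, cell_deg=0.01):
    """Cloaking (illustrative): report the centre of the grid cell, not the point."""
    row, col = cell_index(lat, lon, cell_deg)
    return ((row + 0.5) * cell_deg, (col + 0.5) * cell_deg)


def obfuscate(lat, lon, radius_m=500.0, rng=random):
    """Obfuscation (illustrative): move the point to a uniformly random
    position within radius_m metres. Assumes the point is not near a pole."""
    r = radius_m * math.sqrt(rng.random())   # sqrt gives uniform density over the disc
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dlat = r * math.cos(theta) / METRES_PER_DEGREE
    dlon = r * math.sin(theta) / (METRES_PER_DEGREE * math.cos(math.radians(lat)))
    return (lat + dlat, lon + dlon)


def k_anonymise(points, k, cell_deg=0.01):
    """k-anonymity (simplified assumption): release a cloaked location only
    when at least k points fall in the same cell; otherwise suppress it (None)."""
    cells = [cell_index(lat, lon, cell_deg) for lat, lon in points]
    counts = Counter(cells)
    return [
        cloak(lat, lon, cell_deg) if counts[cell] >= k else None
        for (lat, lon), cell in zip(points, cells)
    ]
```

A larger `cell_deg`, `radius_m` or `k` corresponds to a coarser level of anonymisation, which is the kind of setting the authors tuned.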

The data used for the experiments were taken from a survey of commuters in the city of Pisa. The survey covered demographic data, workers' commuting data and their attitudes towards using the service.

The authors then applied Carpooling Shareability Networks-based trip matching algorithms. Explaining this choice, the authors wrote that the methodology accounts for shared travelled distance, detour time and flexibility (i.e. waiting time) while matching trips very efficiently, which makes it easy to compare data-masking techniques and different parameter settings.
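The idea of a shareability network can be sketched as a graph in which trips are nodes and an edge joins two trips that could be shared. The toy Python example below is only an illustration under stated assumptions: flat coordinates in kilometres, straight-line distances, a detour limit in kilometres rather than minutes, and a simple greedy pairing. None of these names or thresholds come from the paper, and the authors' actual matching algorithm is not reproduced here.

```python
import math
from itertools import combinations
from typing import NamedTuple, Tuple


class Trip(NamedTuple):
    name: str
    origin: Tuple[float, float]   # (x_km, y_km), hypothetical flat coordinates
    dest: Tuple[float, float]
    depart_min: float             # departure time in minutes


def dist(a, b):
    """Straight-line distance in km (a stand-in for real road distance)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def km_saved(driver, rider, max_detour_km, max_wait_min):
    """Km saved if driver picks up rider, or None if the pair is not shareable."""
    if abs(driver.depart_min - rider.depart_min) > max_wait_min:
        return None                                   # waiting time too long
    solo = dist(driver.origin, driver.dest)
    shared = (dist(driver.origin, rider.origin)
              + dist(rider.origin, rider.dest)
              + dist(rider.dest, driver.dest))
    if shared - solo > max_detour_km:
        return None                                   # detour too long for the driver
    return solo + dist(rider.origin, rider.dest) - shared


def build_network(trips, max_detour_km=2.0, max_wait_min=10.0):
    """Edges (km_saved, trip_a, trip_b) between trips that can share a car."""
    edges = []
    for a, b in combinations(trips, 2):
        options = [s for s in (km_saved(a, b, max_detour_km, max_wait_min),
                               km_saved(b, a, max_detour_km, max_wait_min))
                   if s is not None and s > 0]
        if options:
            edges.append((max(options), a.name, b.name))
    return edges


def greedy_match(edges):
    """Pair trips greedily, taking the edge with the largest saving first."""
    matched, pairs = set(), []
    for saved, a, b in sorted(edges, reverse=True):
        if a not in matched and b not in matched:
            pairs.append((a, b, saved))
            matched.update((a, b))
    return pairs
```

Running the same pipeline once on the real coordinates and once on masked ones, then comparing the total kilometres saved, gives a rough feel for the kind of comparison the article describes.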

The results show that the price of guaranteeing data privacy within a carpooling system is an increase in its total carbon footprint, i.e. the total travelled miles, but this effect could be mitigated if users are willing to spend (or “pay”) more time in order to share a trip on a daily basis.

The focus of this paper has been on the anonymisation of data. In this work, the authors have tried to:

  • Quantify the effects of privacy control on ride-sharing applications in terms of quality of service and system efficiency loss.
  • Compare different techniques and different levels of location data anonymisation granularity.
  • Show the trade-off between data location privacy and data utility.

The authors believe that if companies start offering users benefits to nudge them to opt out of the “privacy option”, this might lead to different levels of data privacy within the same privacy-preserving system. “Performing tests on the sensitivity of the system efficiency and QoS with respect to the riders penetration rate and their geographical distribution could be another interesting research direction to investigate,” added the authors.

Check the original paper here.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
