How Predictive Analytics is used to Win Elections

The use of predictive analytics in election campaigns is increasing by the day. With huge volumes of Social Big Data (SBD) available to political parties, data is now easier to access than ever before.

The 2019 Lok Sabha election was a game-changer for the BJP. The saffron party received 37.36% of the total vote share, the highest for any political party since 1989. In 2014 too, the BJP had secured 31% of the vote share. What lies behind this winning streak? A mix of the ‘Modi wave’, political strategising, big money and Big Data.

Former US President Barack Obama was the first to use data analytics on a massive scale in his election campaign. He used a computer program called Project Narwhal, which had features like rapid iteration, minimal barriers between developers and operations staff, heavy use of cloud technology, and constant testing. With the help of such tools, he could interact with his voter base on Reddit, place ad campaigns on unconventional media and gauge the attitude and movements of his voter base. 

In India, the BJP relied heavily on Big Data and political analysts to maximise its numbers.

Seven months before the 2014 general election, Modi had made a controversial statement – “build toilets before temples”. The BJP IT team noticed that 45% of the internet population agreed with the statement. This was the same group that fell into their prospective voter base. With the help of the data, the communication team of the BJP was able to convert that statement to one of the most used slogans India would see in the next decade: “Swachh Bharat”, which was received wholeheartedly by 68% of respondents. It was all possible due to the use of predictive analytics.

Predictive analytics is a tool that political parties occasionally employ to evaluate their potential voter base. In essence, it combines mathematical and statistical techniques to build a model that forecasts future outcomes from past trends.

Predictive analytics in political campaigns finds its base in SBD (Social Big Data). It can be used for voter modelling and personalisation, spam and social influence prediction, content segmentation and classification, and voter engagement. It has been widely reported that Cambridge Analytica used the profile data of as many as 87 million Facebook users to provide analytics assistance to political campaigns in the 2016 US election.

Twitter, too, is used to gain insight into public opinion. With around 500 million tweets a day, or roughly 200 billion a year, Twitter is one of the most sought-after sources of data for political parties around the globe. Unlike on other social media websites, almost all user activity on this platform is public. Twitter also aids data collection because its API is easily accessible.

The “user API” approach is one of the ways to gather data. It is carried out in two stages: the first involves mining the voter’s historical data, while the second involves gathering the voter’s current data. The historical data is then used to forecast users’ political inclinations as well as their future ideologies.
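The two stages can be pictured as splitting one user's timeline at a cutoff date. The sketch below is only an illustration under assumptions: the tweet dictionaries, the `created_at` field and the `split_history_and_current` helper are hypothetical, and no real Twitter API call is made.

```python
from datetime import datetime, timezone

def split_history_and_current(tweets, cutoff):
    """Split a user's tweets into (historical, current) around a cutoff time."""
    historical = [t for t in tweets if t["created_at"] < cutoff]
    current = [t for t in tweets if t["created_at"] >= cutoff]
    return historical, current

# Hypothetical, hand-written sample data (not fetched from any API).
sample_tweets = [
    {"text": "Budget debate tonight",
     "created_at": datetime(2018, 11, 2, tzinfo=timezone.utc)},
    {"text": "Voting day is here",
     "created_at": datetime(2019, 4, 11, tzinfo=timezone.utc)},
    {"text": "Nice weather today",
     "created_at": datetime(2019, 4, 20, tzinfo=timezone.utc)},
]

# Assumed cutoff: everything before 2019 counts as historical data.
cutoff = datetime(2019, 1, 1, tzinfo=timezone.utc)
historical, current = split_history_and_current(sample_tweets, cutoff)
print(len(historical), len(current))
```

In practice the cutoff would depend on the campaign calendar; the fixed date here is chosen only to keep the example small.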

Pre-processing procedures such as data cleaning and quality improvement are carried out before the dataset is fed into the prediction module. Only after that are a person’s political preferences assessed. The two criteria used to quantify them are continuity and knowledgeability.
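As a rough idea of what such cleaning might involve, the sketch below strips URLs and @mentions, normalises whitespace and case, and drops empty or duplicate tweets. The specific rules and the helper names `clean_tweet` and `clean_dataset` are assumptions made for illustration; the article does not say which steps the parties' pipelines actually use.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")

def clean_tweet(text):
    """Remove URLs, @mentions and '#' signs, then normalise spacing and case."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")
    return WHITESPACE_RE.sub(" ", text).strip().lower()

def clean_dataset(texts):
    """Clean every tweet, dropping empty results and exact duplicates."""
    seen = set()
    cleaned = []
    for raw in texts:
        text = clean_tweet(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned

# Hypothetical raw input.
raw_tweets = [
    "Great rally today! https://example.com/x @someone",
    "great rally today!",
    "   ",
    "#Election results soon",
]
cleaned = clean_dataset(raw_tweets)
print(cleaned)
```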

Continuity measures the number of political entities that can be identified from a user’s tweets during a specific time period. Knowledgeability describes how familiar a user is with politics and whether they have done any work or research in the area. Political entities annotated from the user’s profile and tweets are the main factors in determining the knowledgeability of a user. This is done by running certain commands, like those in the following table:

[Table not reproduced. Source: ResearchGate]
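To make the two criteria concrete, here is a minimal sketch of how they could be computed. Everything in it is an assumption for illustration: the small `POLITICAL_ENTITIES` lexicon, the simple token matching, and the choice to score knowledgeability as the number of distinct entities found in the profile and tweets. The source study may define and compute both measures differently.

```python
from datetime import date

# Hypothetical lexicon; a real system would use a proper entity annotator.
POLITICAL_ENTITIES = frozenset({"parliament", "election", "manifesto", "minister"})

def find_entities(text, entities=POLITICAL_ENTITIES):
    """Return the political entities mentioned in a piece of text."""
    tokens = (tok.strip(".,!?#") for tok in text.lower().split())
    return [tok for tok in tokens if tok in entities]

def continuity(tweets, start, end):
    """Count entity mentions in tweets posted between start and end (inclusive)."""
    return sum(
        len(find_entities(t["text"]))
        for t in tweets
        if start <= t["day"] <= end
    )

def knowledgeability(profile, tweets):
    """Count distinct entities found across the profile and all tweets."""
    found = set(find_entities(profile))
    for t in tweets:
        found.update(find_entities(t["text"]))
    return len(found)

# Hypothetical user data.
user_profile = "Student of public policy. Follows every election."
user_tweets = [
    {"text": "The election manifesto is out!", "day": date(2019, 3, 1)},
    {"text": "Parliament session was heated.", "day": date(2019, 3, 15)},
    {"text": "Lunch was great.", "day": date(2019, 3, 20)},
    {"text": "New minister sworn in", "day": date(2019, 6, 1)},
]

march_continuity = continuity(user_tweets, date(2019, 3, 1), date(2019, 3, 31))
score = knowledgeability(user_profile, user_tweets)
print(march_continuity, score)  # prints: 3 4
```

Calling `continuity` again with a different date window gives a per-period count, which is one way to track how a user's interest shifts over time.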

The list of political entities is also recounted regularly, mainly because interests change. A person might be influenced by another party and change their ideology, or their interest in politics may last only for the election season.

The collected and cleansed data is divided into two groups: 1) On-topic users, who are interested in politics and 2) Off-topic users, who show minimal or no interest in it. Shortlisted voters from both groups are then used to develop a predictive model, which predicts a particular voter’s inclination for a political ideology. 
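The grouping and modelling step might look something like the sketch below. It is a toy under stated assumptions: the `min_continuity` threshold, the `party_a`/`party_b` labels and the sample users are all hypothetical, and the small Naive Bayes text classifier is a stand-in chosen for brevity, since the article does not name the model that is actually used.

```python
import math
from collections import Counter, defaultdict

def split_by_topic(users, min_continuity=1):
    """Separate on-topic users from off-topic ones using an assumed threshold."""
    on_topic = [u for u in users if u["continuity"] >= min_continuity]
    off_topic = [u for u in users if u["continuity"] < min_continuity]
    return on_topic, off_topic

class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical users; "label" is the known inclination, None if unknown.
users = [
    {"text": "lower taxes free markets", "continuity": 3, "label": "party_a"},
    {"text": "free markets and trade", "continuity": 2, "label": "party_a"},
    {"text": "public healthcare welfare", "continuity": 4, "label": "party_b"},
    {"text": "welfare and public schools", "continuity": 1, "label": "party_b"},
    {"text": "weekend trade fair", "continuity": 0, "label": "party_a"},
    {"text": "cricket match tonight", "continuity": 0, "label": None},
]

on_topic, off_topic = split_by_topic(users)
# Shortlist users from both groups whose inclination is known.
shortlisted = [u for u in on_topic + off_topic if u["label"] is not None]
model = NaiveBayes().fit(
    [u["text"] for u in shortlisted],
    [u["label"] for u in shortlisted],
)
prediction = model.predict("free trade and lower taxes")
print(prediction)  # prints: party_a
```

With only a handful of made-up examples, the prediction says nothing about real voters; the point is the flow from grouping to shortlisting to fitting a model.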

Lokesh Choudhary
Tech-savvy storyteller with a knack for uncovering AI's hidden gems and dodging its potential pitfalls. 'Navigating the world of tech', one story at a time. You can reach me at: lokesh.choudhary@analyticsindiamag.com.
