Using Social Media For Predictive Data Analytics Is In Its Infancy

Social media increasingly captures social phenomena that were previously studied only through traditional survey techniques such as telephone or face-to-face interviews. Customers scrolling through Facebook who come across a post from one of their favorite apparel stores, showcasing a fantastic pair of jeans and some shirts, will “like” it, at times leave a comment saying they are eager to buy some or all of the items, and then scroll on.

What deserves attention is how that apparel store puts the likes and comments to work. At most, it will use them to shape its social media marketing strategy. The chances of it using that data to make informed operational decisions, such as how many pairs of those jeans to manufacture or whether to raise or cut prices, are slim.

Yet social media data can be used effectively to improve sales forecasts. In the example above, information from the apparel company’s Facebook interactions can be incorporated into predictive data analytics models, ultimately helping to estimate the purchases its customers are most likely to make. The world of data and analytics has changed, and changed in a big way.

Today, simply collecting social media data is not enough. Companies need to use the collected data to upgrade their forecasting techniques. Though still a subject of further research, it is speculated that social media data reflects how much attention customers are paying to a brand, which equates to good or bad word of mouth. Unfortunately, companies still follow the conventional approach of caring more about a technique’s effectiveness than about the mechanism that makes it effective. This article is an effort to show how social media can be used for predictive data analytics.

Social Media Data Types

Twitter and Google Trends have proved themselves when it comes to modelling stock prices. Their higher data volumes and immediacy put them ahead of Facebook. That does not diminish the value of Facebook, which is extensively used for modelling sales, human emotions, personalities and people’s relationships with a brand. At the same time, picture and video-based platforms such as Instagram, YouTube and Netflix are also gaining ground, and are expected to become more relevant for predictive analytics in the near future. Listed below are some of the types of data that predictive models work with:

  • Time series – sales per month or sales per day
  • Cross-sectional – individuals such as customers, for a given period of time
  • Longitudinal/panel – a combination of the former two such as a set of customers observed through several months
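
To make the three shapes concrete, here is a minimal Python sketch. The records, customer names and field layout are invented for illustration and do not come from any real dataset; the point is only that the same raw records can be arranged as a time series, a cross-section or a panel.

```python
from collections import defaultdict

# Hypothetical purchase records: (customer, month, units bought).
records = [
    ("ana", "2023-01", 2), ("ana", "2023-02", 1),
    ("ben", "2023-01", 0), ("ben", "2023-02", 3),
]

# Time series: one value per period, summed over all customers.
time_series = defaultdict(int)
for customer, month, units in records:
    time_series[month] += units

# Cross-sectional: one value per customer for a single period.
cross_section = {c: u for c, m, u in records if m == "2023-01"}

# Longitudinal/panel: each customer followed across periods.
panel = defaultdict(dict)
for customer, month, units in records:
    panel[customer][month] = units
```

The panel keeps both dimensions, which is why it is described above as a combination of the other two.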

Social media data processing

Once the decision is made to use social media data for prediction, the data collected at the level of individual actions must be pre-processed before it can enter a predictive model. Often the data is temporally aggregated to match the time aggregation level of the outcome, for example monthly data. Because much social media data comes as text, filtering, interpretation, and classification are also required. Organizations that want to use social media data in predictive models for sales forecasting find this preprocessing computationally challenging. Once individual actions such as posts and likes are classified and aggregated over time, what remains is a fairly limited set of potential explanatory factors. Because the outcome variables have fairly low frequencies, the modelling process deviates little from conventional approaches within predictive modelling.
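
The classify-then-aggregate step might look roughly like the sketch below. Everything in it is an assumption made for illustration: the sample actions, the `INTENT_WORDS` keyword list (a stand-in for a real text classifier) and the helper names `classify` and `monthly_features`.

```python
from collections import Counter
from datetime import date

# Hypothetical raw actions: (date, action type, comment text or "").
actions = [
    (date(2023, 1, 5), "like", ""),
    (date(2023, 1, 9), "comment", "I want to buy these jeans"),
    (date(2023, 1, 20), "comment", "Nice photo"),
    (date(2023, 2, 2), "like", ""),
    (date(2023, 2, 14), "comment", "Will order two shirts"),
]

# Assumed keyword list standing in for a real text classifier.
INTENT_WORDS = {"buy", "order", "purchase"}

def classify(action_type, text):
    """Label an action as 'like', 'intent_comment' or 'other_comment'."""
    if action_type == "like":
        return "like"
    words = set(text.lower().split())
    return "intent_comment" if words & INTENT_WORDS else "other_comment"

def monthly_features(rows):
    """Aggregate classified actions to counts per (month, label)."""
    counts = Counter()
    for day, action_type, text in rows:
        month = day.strftime("%Y-%m")
        counts[(month, classify(action_type, text))] += 1
    return counts

features = monthly_features(actions)
```

The result is a handful of monthly counts per label, which illustrates how a large stream of individual actions collapses into a small set of candidate explanatory variables.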

Model equation – theory-based vs data-driven

Ideally, linear, non-linear, parametric, non-parametric and semi-parametric models are all taken into consideration. One factor in the choice is that non-linear models require more data points (observations) than linear ones. There is a range of possible starting points for the search process.
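
As one possible starting point, the simplest linear, parametric case can be sketched as an ordinary least squares fit with a single predictor. The numbers are made up, and the choice of likes as the only explanatory variable is an assumption for illustration rather than a recommended specification.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical monthly data: likes on product posts and units sold.
likes = [100, 150, 200, 250]
sales = [30, 40, 50, 60]

intercept, slope = fit_line(likes, sales)
forecast = intercept + slope * 300  # predicted sales for 300 likes
```

With only two parameters to estimate, a model like this can be fitted on a few observations, whereas a more flexible non-linear form would need considerably more data.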

Using predictive models for forecasting purposes

Once the model is finalized, its implementation needs to be considered. One consideration is how often the model needs re-estimation or a specification update.

Given a robust general data pattern, the specification is expected to need updating only if new variables come into the picture. A second scenario is when enough data points become available to allow more complex structures to be formed. From a practical perspective, a combination of forecasts from different basic predictive models may take center stage.
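
A forecast combination can be as simple as a weighted average of two basic forecasts. The sketch below assumes a naive forecast and a moving average as the two base models and equal weights; both choices, and the sales figures, are illustrative only.

```python
def naive_forecast(history):
    """Next value equals the last observed value."""
    return history[-1]

def moving_average_forecast(history, window=3):
    """Next value equals the mean of the last `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def combined_forecast(history, weights=(0.5, 0.5)):
    """Weighted average of the two basic forecasts."""
    w_naive, w_ma = weights
    return (w_naive * naive_forecast(history)
            + w_ma * moving_average_forecast(history))

monthly_sales = [40, 44, 48, 52]  # hypothetical units sold
prediction = combined_forecast(monthly_sales)
```

In practice the weights could be tuned on past forecast errors, and a model using social media variables could be added as a third component.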

Fine-Grained Forecasts

The one catch is that social media data may not be applicable across all industries. It tends to be more relevant to products whose sales are uncertain and heavily influenced by trends, fashion, and entertainment, to name a few. It remains to be seen whether adding Facebook data would improve forecasts for consumer goods like breakfast cereals, where sales are already fairly predictable.

There is a long but definite road ahead in exploring why and how social media can improve forecasts. The next step could be to implement best practices for predicting sales of individual products, not only overall sales. A step further, with the help of data broken down by geographic area, the information could be used to estimate the penetration of a particular product or service in the coming days.

By strategizing their social media posts, companies can learn and make better decisions. This will help them draw out specific information to run their operations effectively. For instance, more and more companies could display potential products and decide what to manufacture based on customers’ reactions.


Predictive models have been highly successful at offering numerical forecasts and assessments, along with quantitative statements, to improve decisions in companies and public authorities. It is advisable to go with parsimonious, simple models that capture the most important features of the data. Such models should fulfill the model assumptions and provide a good fit both in sample and out of sample. Furthermore, it is important that the model’s performance is still monitored even during the phase where it is applied for its purpose.
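
One simple way to monitor a deployed model is to compare its recent forecast error with the out-of-sample error measured before deployment. The sketch below uses mean absolute error and a tolerance factor of 1.5; the figures, the threshold and the helper name `needs_review` are assumptions for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def needs_review(baseline_error, recent_error, tolerance=1.5):
    """Flag the model when recent error exceeds tolerance * baseline."""
    return recent_error > tolerance * baseline_error

# Hypothetical hold-out results recorded when the model was approved.
holdout_actual = [50, 55, 60]
holdout_predicted = [48, 56, 63]
baseline = mean_absolute_error(holdout_actual, holdout_predicted)

# Hypothetical results observed after deployment.
live_actual = [62, 70, 58]
live_predicted = [55, 60, 66]
recent = mean_absolute_error(live_actual, live_predicted)

flag = needs_review(baseline, recent)
```

A raised flag would be a cue to re-estimate the model or revisit its specification.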

Chirag Shivalker
Chirag Shivalker is the content head at Hi-Tech BPO and has helped enterprise level clients globally in managing their data and providing analytics; ultimately empowering them to make insightful business decisions, take bold action, and execute quickly. He regularly writes about the importance of data management for data analytics and the changing landscape of the business process management industry.
