Machine Learning Approach In Fantasy Sports: Cricket

I’m very conscious of data and analytics and understanding how our body works and the different loads that we put on it throughout the course of games and practices. It helps you make adjustments if you need to, helps you be smarter about your workouts, and I think it protects you from injuries to not over-exert yourself.

Stephen Curry

A fantasy sport (also known as rotisserie or roto) is a type of game in which players assemble fictional or simulated teams made up of proxies of actual players in a competitive sport. It is mostly played online. These teams compete based on their players’ statistical success in live games.

Decisions in sports have historically been qualitative, based on gut instinct or commitment to team culture and tradition. Sports analytics provides new ways to evaluate players’ and teams’ abilities. Their strengths and weaknesses can be analyzed using data to aid decision-making and to improve training sessions for better results.


This article analyzes performance in cricket in terms of player, team, and environment attributes. The need for it arises because people are investing their time and income in fantasy sports.

Whenever there is an event, everyone betting in fantasy sports faces the same questions:

1. Which players to choose?

2. How many runs will that player score?

3. How many wickets will that player take?

4. Will the player score a half-century or not?

Answering these questions summarizes the impact of an individual player in a given match and helps decide which player to choose.

These questions can be approached with:

Regression Analysis: To predict the runs scored by a particular player or team in a session, we formulate a hypothesis by selecting attributes such as Average, 50s, 100s, and Strike Rate as features, with Runs as the label. For a player:

Runs = w0 × Average + w1 × 50s + w2 × 100s + w3 × Strike Rate + … + wN × Parameter N
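The hypothesis above is a plain weighted sum of the features. As a minimal sketch, assuming made-up weights and feature values (the helper name `predict_runs` is hypothetical and nothing here is fitted to real data), it could be evaluated like this:

```python
def predict_runs(weights, features):
    """Return the weighted sum w0*x0 + w1*x1 + ... + wN*xN."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features))


# Feature order: Average, 50s, 100s, Strike Rate (illustrative values only).
features = [40.0, 10, 2, 130.0]
# Made-up weights; in practice a regression model learns these from data.
weights = [0.5, 0.2, 1.0, 0.1]

runs = predict_runs(weights, features)
print(runs)
```

In a real model the weights are not chosen by hand; they are learned from historical data by the regression step.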

Prediction of runs for an individual player in a single match or session:

Suppose you are given the following data set and have to predict a player’s score.

Let’s dive in:

In the above data set, all the attributes are features; from them we can calculate a new column named “Run” and treat it as the predicted column, the “Label”.

Creation of dummy variables for Player Name
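One way to create such dummy variables is one-hot encoding with pandas. The sketch below assumes a small placeholder table; the player names, column names, and numbers are invented for illustration and are not the article's data set.

```python
import pandas as pd

# Hypothetical rows; names and numbers are placeholders, not real statistics.
df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player A", "Player C"],
    "Average": [45.2, 38.1, 45.2, 29.7],
    "Ball Faced": [30, 22, 41, 15],
    "Strike Rate": [135.0, 120.5, 135.0, 110.0],
    "Run": [42, 25, 58, 14],
})

# One 0/1 indicator column per player; drop_first removes the redundant
# first category so the indicators are not perfectly collinear.
encoded = pd.get_dummies(df, columns=["Player"], drop_first=True, dtype=int)
print(encoded.columns.tolist())
```

Dropping the first category is a design choice: "Player A" is then represented by zeros in all remaining indicator columns.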

After creating the dummy variables, we divide the data into train and test sets, which are used to build the model with Linear Regression.

Creating a function to predict the run:
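A minimal sketch of the split, fit, and prediction function, assuming scikit-learn and a synthetic data set: the data, the assumed relationship between features and runs, and the helper name `predict_run` are illustrative assumptions, not the author's original code.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only; nothing here is a real player record.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Player": rng.choice(["Player A", "Player B", "Player C"], size=n),
    "Average": rng.uniform(20, 55, size=n),
    "Ball Faced": rng.integers(5, 60, size=n),
    "Strike Rate": rng.uniform(90, 160, size=n),
})
# Assumed relationship: runs = balls faced * strike rate / 100, plus noise.
df["Run"] = df["Ball Faced"] * df["Strike Rate"] / 100 + rng.normal(0, 3, size=n)

# Dummy variables for the player, then an 80/20 train/test split.
X = pd.get_dummies(df.drop(columns="Run"), columns=["Player"], dtype=int)
y = df["Run"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))


def predict_run(player, average, ball_faced, strike_rate):
    """Predict runs for one player, reusing the training column layout."""
    row = pd.DataFrame([{
        "Player": player,
        "Average": average,
        "Ball Faced": ball_faced,
        "Strike Rate": strike_rate,
    }])
    row = pd.get_dummies(row, columns=["Player"], dtype=int)
    # Add the dummy columns this single row lacks, in the training order.
    row = row.reindex(columns=X.columns, fill_value=0)
    return float(model.predict(row)[0])


print(predict_run("Player A", 40.0, 30, 130.0))
```

Reindexing the single-row frame against the training columns keeps the prediction input aligned with what the model saw during fitting.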


The features considered in the above model are:

  1. Player
  2. Average
  3. Ball Faced
  4. Strike Rate

Now let’s talk about predicting the runs a batsman scores over an overall session.

We change the label to the runs scored over the whole session and formulate the following function using the Linear Regression model:


The features considered in the above model are:

  1. Player
  2. Mat
  3. Inns
  4. Not Out
  5. Average
  6. Highest Score
  7. Ball Faced
  8. Strike Rate
  10. FIFTY
  11. Fours
  12. Six

The above analysis could help predict the score; however, we should also consider more attributes, such as climate and pitch condition, to get more realistic insights. Similarly, we could create a model to predict the number of wickets (using Linear Regression) and whether the player will score a fifty or a hundred (using a classification model). We should also place more emphasis on Grid Search CV and hyperparameter tuning to make the models perform better.
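As a rough sketch of the classification and tuning idea, the example below fits a logistic regression to predict whether a player scores a fifty and tunes its regularization strength with scikit-learn's GridSearchCV. The data is synthetic, and the label rule (higher averages make a fifty more likely) is an assumption made purely for the demonstration; logistic regression is one possible classifier, not necessarily the one the author had in mind.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features (Average, Strike Rate) and a made-up label rule.
rng = np.random.default_rng(1)
n = 400
average = rng.uniform(15, 60, size=n)
strike_rate = rng.uniform(90, 160, size=n)
X = np.column_stack([average, strike_rate])
# Assumption for the demo: higher averages make a fifty more likely.
prob_fifty = 1 / (1 + np.exp(-(average - 40) / 5))
y = (rng.uniform(size=n) < prob_fifty).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, then classify; the grid search tries each value of C
# with 5-fold cross-validation on the training set.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best C:", search.best_params_["logisticregression__C"])
print("Held-out accuracy:", search.score(X_test, y_test))
```

The same pattern extends to other estimators by swapping the final pipeline step and the keys in `param_grid`.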

Calculation of Impact of Performance of a Player

A simple approach to calculating the impact of a player’s performance in a given match is the Z-score, which measures how many standard deviations a value lies from the mean.

We could also use this value to estimate a player’s influence and, depending on it, decide whether to select that player for a fantasy team.
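The Z-score calculation can be sketched in a few lines. The runs below are hypothetical, the helper name `z_scores` is an assumption, and the population standard deviation is used here; the sample standard deviation would be an equally reasonable choice.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return (x - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("all values are identical; the Z-score is undefined")
    return [(x - mu) / sigma for x in values]


# Hypothetical runs scored by five batsmen in one match.
runs = {"Player A": 10, "Player B": 20, "Player C": 30, "Player D": 40, "Player E": 50}
scores = dict(zip(runs, z_scores(list(runs.values()))))

for player, z in scores.items():
    print(player, round(z, 2))
```

A positive score means the player performed above the match average and a negative score below it, so players can be ranked by this value when choosing a fantasy team.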

By analyzing players and the game better, teams can obtain a competitive advantage; research that provides a better understanding of the dynamics of the game is therefore of great importance.

Swetank Pathak
I am a Sports Data Scientist with extensive knowledge of sports science. I aim to improve sports teams and individual players through a data-driven approach, with a prime focus on athlete performance and injury management, and I have experience executing data-driven solutions to increase the efficiency, accuracy, and utility of internal data processing. I am currently pursuing the Post Graduate Program in Data Science and Business Analytics at The University of Texas at Austin, McCombs School of Business & Great Lakes Institute of Management.
