How To Use Predictive Analytics In Cricket

Cricket is one of many sports played with a spherical ball and a bat, and its set of rules makes the game distinct from the others. The game has evolved over the years, starting with Test matches and followed by one-day matches, and in the past few years T20 cricket has drawn a lot of attention. To date, however, the ICC Cricket World Cup, played in the limited 50-over format, has remained the most prestigious tournament of all.

The ICC Cricket World Cup is an international sporting event that has been held approximately every four years since its inception in 1975, with preliminary qualification rounds leading the teams up to the finals. Studies in cricket have examined the physiological, psychological, and physical demands on batsmen, wicketkeepers, spinners, and pace bowlers in different formats of play. More recently, a few studies have focused on the performance analysis of individual players or whole teams by calculating effect sizes. To the best of our knowledge, however, none has focused on developing a model that predicts the outcome of a match from the historical performance of both the team and its individual players.

We believe this predictive analysis strategy would be very useful for viewers, sponsors, and team strategists. It would also give cricket analysts and commentators insight into the features that play a crucial role in statistical analysis.



A model could have been built to predict the outcome of every match of the 2019 World Cup and, from those results, the winner of the tournament. In this context, we feel that a close study of the players’ historical performance in one-day international matches should let us assign a performance score to each player. A methodology is then needed to derive the team performance score from the individual players’ scores. The team performance scores in turn determine each team’s chances of winning a match. Through this article, we hope to identify the key player-level parameters that have a significant impact on the team’s outcome in a match.

Data Source

The data for this predictive analysis could be obtained from Wikipedia and the ESPN Cricinfo website. The ‘Statsguru’ service provided by ESPN Cricinfo could also be used to extract individual player statistics.

The variables that would be needed most, and that play a key role in measuring batting and bowling performance, are listed below.

Batting variables

Player – Name of the player
Mats – No of matches played
Inns – No of innings played
NO – No of not outs
Runs – No of runs scored
HS – Highest score
Ave – Batting average of the player
BF – No of balls faced
SR – Batting strike rate (runs scored per 100 balls faced)
100s – No of 100s scored
50s – No of 50s scored
0 – No of duck outs
4s – No of fours hit
6s – No of sixes hit

Bowling variables

Player – Name of the player
Mats – No of matches played
Inns – No of innings played
Overs – No of overs bowled
Mdns – No of maidens
Runs – No of runs given
Wkts – No of wickets taken
BBI – Best bowling figures in an innings
Ave – Bowling average (Runs/Wickets)
Econ – Bowling economy (runs given per over)
SR – Bowling strike rate (balls bowled per wicket)
5W – No of five-wicket hauls
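
As a rough illustration of how the derived columns relate to the raw counts, the sketch below recomputes the batting average, batting strike rate, bowling average, economy, and bowling strike rate using the standard cricket definitions. The class names, function names, and sample numbers are hypothetical and are not taken from Statsguru; bowling workload is stored as balls rather than overs to avoid the "10.3 overs" notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BattingRecord:          # hypothetical container, not a Statsguru schema
    inns: int                 # Inns: innings played
    not_outs: int             # NO: not outs
    runs: int                 # Runs: runs scored
    balls_faced: int          # BF: balls faced


@dataclass
class BowlingRecord:          # hypothetical container
    balls: int                # balls bowled (6 per completed over)
    runs: int                 # Runs: runs given
    wickets: int              # Wkts: wickets taken


def batting_average(rec: BattingRecord) -> Optional[float]:
    """Runs per dismissal; undefined (None) if the player was never dismissed."""
    dismissals = rec.inns - rec.not_outs
    return rec.runs / dismissals if dismissals > 0 else None


def batting_strike_rate(rec: BattingRecord) -> Optional[float]:
    """Runs scored per 100 balls faced."""
    return 100.0 * rec.runs / rec.balls_faced if rec.balls_faced > 0 else None


def bowling_average(rec: BowlingRecord) -> Optional[float]:
    """Runs given per wicket taken."""
    return rec.runs / rec.wickets if rec.wickets > 0 else None


def economy(rec: BowlingRecord) -> Optional[float]:
    """Runs given per over bowled."""
    return rec.runs / (rec.balls / 6) if rec.balls > 0 else None


def bowling_strike_rate(rec: BowlingRecord) -> Optional[float]:
    """Balls bowled per wicket taken."""
    return rec.balls / rec.wickets if rec.wickets > 0 else None


# Invented numbers, purely for illustration.
batter = BattingRecord(inns=50, not_outs=5, runs=1800, balls_faced=2000)
bowler = BowlingRecord(balls=1200, runs=900, wickets=30)

print(batting_average(batter), batting_strike_rate(batter))
print(bowling_average(bowler), economy(bowler), bowling_strike_rate(bowler))
```

Returning None for undefined ratios (for example, a batsman who has never been dismissed) keeps such players from silently receiving a zero or infinite score later in the pipeline.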

Toss, result, and ground details will also help the analysis, since a team playing at its home ground, or a team batting first, may hold an advantage over the opposition.


A team is a combination of batsmen and bowlers. Each squad has 15 selected players, but only 11 of them are in the playing eleven. So, we need to model 22 players per match to predict the winner. The playing eleven may change due to match tactics, injuries, venue, etc., so we cannot consider a fixed set of 11 players; the lineup needs to be revised for each scheduled match, and the prediction should take into account every individual actually playing. Apart from that, the outcome of the toss and the ground also play a major role in whether a team wins or loses the match. Supervised learning is applied here, as per the model diagram below.
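
The point about revising the playing eleven for each match can be sketched as follows. The article does not fix a method for combining player scores into a team score, so the plain average used here is an assumption made for illustration only; the function name, player names, and scores are all hypothetical.

```python
from statistics import mean
from typing import Dict, List


def team_strength(player_scores: Dict[str, float], playing_xi: List[str]) -> float:
    """Team score for one match, computed only from the 11 players who take the field.

    ASSUMPTION: the team score is the plain mean of the player scores.
    The article leaves the aggregation method open.
    """
    if len(playing_xi) != 11 or len(set(playing_xi)) != 11:
        raise ValueError("a playing eleven needs exactly 11 distinct players")
    missing = [p for p in playing_xi if p not in player_scores]
    if missing:
        raise KeyError(f"no performance score for: {missing}")
    return mean(player_scores[p] for p in playing_xi)


# Hypothetical 15-player squad with invented performance scores 1.0 .. 15.0.
squad_scores = {f"player_{i}": float(i) for i in range(1, 16)}

# Lineup for one match: the first eleven squad members.
xi_match_1 = [f"player_{i}" for i in range(1, 12)]

# Lineup for the next match: player_11 is replaced by player_15.
xi_match_2 = [f"player_{i}" for i in range(1, 11)] + ["player_15"]

print(team_strength(squad_scores, xi_match_1))
print(team_strength(squad_scores, xi_match_2))
```

Because the lineup is passed in per match, the same squad yields a different team score whenever the playing eleven changes, which is the behaviour the paragraph above asks for.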

Feature Construction

Choosing the right features plays a key role in the success of a prediction model. For the problem at hand, predicting the winner of the ODI Cricket World Cup, we choose two other important features along with the relative strength of one team against the other. The first is the venue of the match, and the second is the outcome of the toss. The venue is important because of the ‘home team advantage’: the team playing at its home ground has an advantage over the visiting team.

This advantage is attributed to the psychological support the home team gets from the crowd at the ground and to its familiarity with the ground, the environment, etc. The second feature, the outcome of the toss, has been observed and is widely believed to play a major role in deciding the outcome of a match. The value of winning the toss depends on the nature of the pitch and the conditions. For instance, a green pitch supports pace bowlers, so winning the toss and opting to bowl first could give a team the upper hand over its opponent. Similarly, in humid conditions it becomes difficult for bowlers to control the wet ball, so batting first is the better decision in that case.

Therefore, every match played between team A and team B in our dataset has three features: toss, venue, and StrengthA/B. Toss is a binary feature, venue is a categorical feature with three possible values, and StrengthA/B is numeric. The value of toss is 1 if team A has won the toss, and 0 otherwise. The value of venue is 1 if the match is played at a home ground of team A, 0 if it is played at a home ground of team B, and 2 otherwise. The value of StrengthA/B is the relative strength of team A against team B, calculated from the performance scores of the two teams.

The target variable, winner, is binary and defines the winner of a match. Its value is 1 if the winner of the match is team A, and 0 if the winner is team B. Notice that either of the two competing teams could be considered team A, in which case all the feature values and the target value would update accordingly.
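
A minimal sketch of this encoding, including the relabelling of team A and team B, might look like the following. The exact formula for StrengthA/B is not given in the text, so the simple ratio of the two team scores used here is an assumption; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchRow:
    toss: int        # 1 if team A won the toss, 0 otherwise
    venue: int       # 1 = team A's home ground, 0 = team B's home ground, 2 = neutral
    strength: float  # relative strength of team A against team B
    winner: int      # target: 1 if team A won, 0 if team B won


def relative_strength(strength_a: float, strength_b: float) -> float:
    """ASSUMPTION: relative strength is the ratio of the two team scores."""
    if strength_a <= 0 or strength_b <= 0:
        raise ValueError("team scores must be positive for a ratio")
    return strength_a / strength_b


def swap_teams(row: MatchRow) -> MatchRow:
    """Describe the same match with the roles of team A and team B exchanged."""
    swapped_venue = {1: 0, 0: 1, 2: 2}[row.venue]  # home grounds trade places, neutral stays
    return MatchRow(
        toss=1 - row.toss,
        venue=swapped_venue,
        strength=1.0 / row.strength,  # inverse of the ratio, under the assumption above
        winner=1 - row.winner,
    )


# Invented example: team A (score 6.0) beats team B (score 4.0) at a neutral
# venue after winning the toss.
row = MatchRow(toss=1, venue=2, strength=relative_strength(6.0, 4.0), winner=1)
mirrored = swap_teams(row)

print(row)
print(mirrored)
```

Adding the mirrored row for every match is one way to keep a model from learning anything from the arbitrary choice of which side was labelled team A.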


Several machine learning models can be implemented, mainly using R libraries. Naive Bayes and SVM are available in the e1071 package, decision trees in the rpart package, random forests in the randomForest package, logistic regression through glm, XGBoost in the xgboost package, and k-NN in the class package. Model performance can also be measured with the help of a confusion matrix.
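
To make the modelling and evaluation step concrete, here is a small self-contained sketch in plain Python rather than R. It is not the glm call named above: it hand-rolls logistic regression with batch gradient descent, purely to show how the three features feed a binary classifier and how a confusion matrix is tallied. The match rows are invented, the function names are hypothetical, and using the logarithm of the strength ratio is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def encode(toss: int, venue: int, strength: float) -> List[float]:
    """One-hot encode venue (neutral = baseline) and take log(strength),
    so that two equally strong teams map to 0. Both choices are assumptions."""
    return [float(toss), float(venue == 1), float(venue == 0), math.log(strength)]


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Predict 1 (team A wins) when the estimated probability is at least 0.5."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Invented rows: (toss, venue, StrengthA/B, winner). Not real match data.
matches = [
    (1, 1, 1.4, 1), (0, 1, 1.2, 1), (1, 2, 1.3, 1), (0, 1, 1.1, 1),
    (0, 0, 0.7, 0), (1, 0, 0.8, 0), (0, 2, 0.6, 0), (1, 2, 0.9, 0),
]
X = [encode(t, v, s) for t, v, s, _ in matches]
y = [row[3] for row in matches]

weights, bias = fit_logistic(X, y)
y_pred = [predict(weights, bias, x) for x in X]
tn, fp, fn, tp = confusion_matrix(y, y_pred)
print(f"tn={tn} fp={fp} fn={fn} tp={tp}")
```

The sketch scores the model on the same rows it was trained on, which only checks that the pieces fit together. A real evaluation would build the confusion matrix from matches held out of training.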

The above methodology could also be used for other sports.

This article is a part of the AIM Writers Programme.

Netali Agrawal
Netali Agrawal is a part of the AIM Writers Programme. She is a Business Analyst who loves to explore new ideas in different industries through machine learning and artificial intelligence. She holds a bachelor's degree in engineering along with a post-graduation certification in business analytics and business intelligence. She is working with an MNC as a business analyst and leading a project for machine learning and artificial intelligence. Netali loves to write about analytics, machine learning and artificial intelligence. She loves to explore data and mould it in the best possible shape to get all possible insights from the data. She resides in Hyderabad, India.
