“It’s (probably) coming home,” said Goldman Sachs’ Christian Schnittker on England’s chances of winning Euro 2020 (the tournament was postponed by a year because of the pandemic). Three Lions fans were hoping Goldman Sachs would get it right on its third attempt and England would win the cup (in the end, the trophy went to Rome).
The prediction model crunched data from about 6,000 football matches played since 1980 and took into account factors such as current team strength, recent performance, and home-field advantage.
Euro 2020 prediction
Goldman Sachs’ prediction model performed the following steps:
The algorithm starts by modelling the number of goals scored by each team, using a large dataset of international football matches. The number of goals is modelled as a function of the following factors:
- Squad strength: This is measured with the World Football Elo Rating. The Elo rating system calculates the relative skill levels of players in zero-sum games such as chess. The Elo rating used in Goldman Sachs’ model does not incorporate individual player information, but it correlates highly with other metrics such as the FIFA rankings and teams’ estimated transfer values.
- Goals scored and conceded in the last five matches: This data helps capture a team’s momentum in the run-up to, or during, the European Championship.
- Home advantage: This is a pivotal factor in the number of goals scored. Goldman Sachs’ team found that, on average, the home team scored 0.4 goals more. On this basis, England seemed to have an advantage, since both the semi-finals and the final were hosted at Wembley.
- Tournament effect: Some countries (Croatia, the Netherlands, and Germany) tend to punch above their weight at major tournaments relative to their Elo ratings, which makes this another crucial parameter.
Goldman Sachs said that while the prediction model captures the stochastic nature of the tournament, the forecasts are ‘highly uncertain’ because football is an unpredictable game.
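To make the four factors above concrete, here is a minimal sketch of how they could be combined into an expected number of goals for one team. It assumes a log-linear (Poisson-style) form, which the article does not confirm as Goldman Sachs’ actual specification. The `TeamForm` class, the `expected_goals` function, every coefficient, and the sample ratings are hypothetical values chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class TeamForm:
    """Inputs for one team; all field names are illustrative."""
    elo: float                       # World Football Elo Rating
    goals_scored_last5: int          # momentum: goals scored in last five matches
    goals_conceded_last5: int        # momentum: goals conceded in last five matches
    tournament_overperformer: bool = False


# Hypothetical coefficients, NOT estimates published by Goldman Sachs.
INTERCEPT = 0.1
ELO_COEF = 0.0025          # per point of Elo difference
SCORED_COEF = 0.02         # per goal the team scored recently
OPP_CONCEDED_COEF = 0.02   # per goal the opponent conceded recently
HOME_COEF = 0.25           # bonus when playing at home
TOURNAMENT_COEF = 0.1      # bonus for historical tournament overperformers


def expected_goals(team: TeamForm, opponent: TeamForm, at_home: bool) -> float:
    """Return the expected goals for `team` against `opponent`."""
    log_rate = (
        INTERCEPT
        + ELO_COEF * (team.elo - opponent.elo)
        + SCORED_COEF * team.goals_scored_last5
        + OPP_CONCEDED_COEF * opponent.goals_conceded_last5
        + HOME_COEF * at_home
        + TOURNAMENT_COEF * team.tournament_overperformer
    )
    # Exponentiating keeps the expected goal count positive.
    return math.exp(log_rate)


# Two made-up sides, not real ratings.
side_a = TeamForm(elo=1900, goals_scored_last5=9, goals_conceded_last5=3)
side_b = TeamForm(elo=1800, goals_scored_last5=6, goals_conceded_last5=5)

print(round(expected_goals(side_a, side_b, at_home=True), 2))
print(round(expected_goals(side_b, side_a, at_home=False), 2))
```

In a log-linear form like this, the home effect multiplies the expected goals rather than adding a fixed 0.4, so the coefficient would have to be fitted to match the reported average.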
Third time lucky?
In 2018, Goldman Sachs came up with a machine learning-based statistical model to predict the World Cup outcomes. The prediction fell flat.
The AI system ran one million simulations of the tournament. It originally predicted that Germany and Brazil would face each other in the final, but Germany made an early exit. The system then concluded that England would be the second finalist and that Brazil would win the cup. That prediction failed embarrassingly: France and Croatia were the two finalists, with France taking home the crown.
In 2014, Goldman Sachs used a simpler and less ambitious statistical model to predict the World Cup results. It used fewer parameters, such as the number of goals scored in the last ten official international matches and teams’ rankings. That model also failed to make accurate predictions.
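The idea of simulating a tournament many times can be sketched as a small Monte Carlo experiment. This is an illustration, not Goldman Sachs’ 2018 system. It assumes each team scores a Poisson-distributed number of goals at a fixed rate regardless of opponent, and it settles draws with a coin flip as a stand-in for extra time and penalties. The team labels, scoring rates, and function names are all hypothetical.

```python
import math
import random
from collections import Counter


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw a Poisson-distributed goal count using Knuth's algorithm."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def play_match(a: str, b: str, rates: dict, rng: random.Random) -> str:
    """Return the winner of a single knockout match."""
    goals_a = sample_poisson(rates[a], rng)
    goals_b = sample_poisson(rates[b], rng)
    if goals_a == goals_b:
        # Assumption: a draw is settled by a fair coin flip.
        return rng.choice([a, b])
    return a if goals_a > goals_b else b


def simulate_knockout(teams: list, rates: dict, runs: int, seed: int = 0) -> dict:
    """Estimate each team's title probability over `runs` simulated brackets.

    `teams` is in bracket order and its length must be a power of two.
    """
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(runs):
        remaining = list(teams)
        while len(remaining) > 1:
            # Adjacent teams in the list meet in each round.
            remaining = [
                play_match(remaining[i], remaining[i + 1], rates, rng)
                for i in range(0, len(remaining), 2)
            ]
        titles[remaining[0]] += 1
    return {team: titles[team] / runs for team in teams}


# Made-up expected goals per match for four placeholder teams.
rates = {"A": 1.8, "B": 1.2, "C": 1.5, "D": 1.0}
probabilities = simulate_knockout(["A", "B", "C", "D"], rates, runs=20_000)
print(probabilities)
```

Even the strongest team in this toy bracket wins the title in well under half of the simulated runs, which illustrates why a single tournament outcome can easily contradict the model’s favourite.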
Apart from Goldman Sachs, UBS and ING have also tried their hand at model-based predictions, but with no success. The Japanese bank Nomura, however, made a successful prediction for the 2018 FIFA World Cup. The bank used portfolio theory (profiling teams on the basis of players’ value, the momentum of team performance, and historical performance) to predict France as the winner, although it got the second runner-up wrong.
Why is it difficult to predict?
The advent of machine learning and data analytics has made the prediction game exciting. Predictors analyse historical data, run simulations, and use state-of-the-art statistical techniques to forecast match outcomes.
Programmers and AI researchers rely on quantifiable data to make observations. If the data is not reliable, the results are bound to be inaccurate. Further, models can’t account for intangible (read: unquantifiable) variables such as team dynamics, player emotions, and fan sentiment.
Experts believe predictions for football matches are inherently more complex than those for other sports. Director of analytics at Merkle, Debs Balme, said in an interview, “For sports like baseball or basketball where there are lots of games against the same opposition, it’s an easier solution to predict, as there is more data available. Baseball players play 162 games a season, for example. And the Mets and Yankees have played each other 115 times. So there is more history of performance [than in the World Cup], and a greater wealth of data to be able to make more accurate predictions.”