EURO predictions Day2
Predictions for the EUROPEAN FOOTBALL CHAMPIONSHIP 2024 based on Statistical Analytical Football Models
AUEB & Trieste Sports Analytics Research Group,
Athens University of Economics and Business and University of Trieste
This article was edited and co-authored by Ioannis Ntzoufras, Professor of Statistics at AUEB, and Argyro Damoulaki, PhD candidate in the same department. It is based on the analysis of the collaborating team in Trieste (Professors Leonardo Egidi and Nicola Torelli, PhD candidate Roberto Macrì Demartino, and data science master's student Giulio Fantuzzi), with the assistance of V. Palaskas (OpenBet, application development) and D. Karlis (AUEB Statistics, analysis consultant). The final result is a collaboration between the Sports Analytics research teams of the two universities.
***
The first matchday of the European Championship had a bit of everything: some comfortable victories (Germany, Switzerland, Spain and Romania), comebacks (Portugal prevailed over the Czech Republic), surprises (Slovakia beating Belgium), beautiful goals (Turkey scored impressively against Georgia) and some favourites who prevailed by "just getting the job done" (Italy, Netherlands, England and France). Looking ahead, we briefly review our predictions for the first matchday and present our predictions for the second matchday of the competition.
Reminder for friends of Statistics
The use of statistical techniques to predict football matches first appeared in the scientific literature in 1968 with the pioneering publication of Reep & Benjamin. The next real innovations appeared in the 1980s (with the work of Michael Maher) and the 1990s (with the work of Lee in 1997). However, the first major publications in the field, which introduced the models underpinning those we still use today, were the work of Dixon & Coles in 1997 and the bivariate Poisson model of Karlis and Ntzoufras in 2003 (two of the authors of this analysis). These two models formed the basis of modern models for predicting football match outcomes.
In this analysis we use the model of Karlis and Ntzoufras through the R package "footbayes", developed by Professor Leonardo Egidi of the University of Trieste with the assistance of Vasilis Palaskas (analyst at OpenBet and active member of the AUEB Sports Analytics Group). The model also includes parameters that capture the performance of each team and are allowed to change over time. To fit the model, all international matches of the 2020-2024 period were used. The main explanatory variable is the difference between the two teams in the Coca-Cola/FIFA ranking. The model, first proposed by Karlis & Ntzoufras in 2003, extends the usual bivariate Poisson model. Details of the statistical and machine learning model used can be found at the end of this article.
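To make the idea concrete, here is a minimal Python sketch of how a bivariate Poisson model turns three rate parameters into win, draw and defeat probabilities. It is not the footbayes implementation and does not reproduce the fitted model: the log-linear link in `rates_from_ranking_gap`, its coefficients `mu` and `gamma`, and the value of `lam3` are hypothetical choices made only for illustration.

```python
import math


def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """P(X = x, Y = y) when X = W1 + W3 and Y = W2 + W3,
    with W1, W2, W3 independent Poisson(lam1), Poisson(lam2), Poisson(lam3)."""
    total = 0.0
    for k in range(min(x, y) + 1):  # k = goals attributed to the shared term W3
        total += (
            lam1 ** (x - k) / math.factorial(x - k)
            * lam2 ** (y - k) / math.factorial(y - k)
            * lam3 ** k / math.factorial(k)
        )
    return math.exp(-(lam1 + lam2 + lam3)) * total


def rates_from_ranking_gap(gap, mu=0.1, gamma=0.4):
    """Hypothetical link: a positive ranking gap favours team A.
    mu and gamma are made-up values, not estimates from the article."""
    return math.exp(mu + gamma * gap / 2), math.exp(mu - gamma * gap / 2)


def outcome_probabilities(lam1, lam2, lam3, max_goals=15):
    """Sum the score grid (truncated at max_goals) into win A / draw / win B
    and track the single most likely scoreline."""
    win_a = draw = win_b = 0.0
    best_score, best_p = (0, 0), 0.0
    for x in range(max_goals + 1):
        for y in range(max_goals + 1):
            p = bivariate_poisson_pmf(x, y, lam1, lam2, lam3)
            if x > y:
                win_a += p
            elif x == y:
                draw += p
            else:
                win_b += p
            if p > best_p:
                best_score, best_p = (x, y), p
    return win_a, draw, win_b, best_score


lam1, lam2 = rates_from_ranking_gap(gap=1.0)
win_a, draw, win_b, best_score = outcome_probabilities(lam1, lam2, lam3=0.1)
print(f"win A {win_a:.3f}, draw {draw:.3f}, win B {win_b:.3f}, modal score {best_score}")
```

The shared term `lam3` induces positive correlation between the two teams' goals; setting it to zero gives two independent Poisson counts.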
Matchday 1 Report
The predictions for the first 12 matches (Matchday 1) and the final results are given in Table 1. Judged by the most probable outcome (win, draw or defeat), the model correctly predicted 9 of the 12 matches (75%), among which we single out the match between Hungary and Switzerland. This was the closest match according to our model, which gave a slight edge to Switzerland, who eventually prevailed. In two more matches the model rated the final result as quite likely, though not the most likely. More specifically, in the Slovenia-Denmark match the draw had a notable probability (27%), while in the Romania-Ukraine match Ukraine was the most likely winner (47%) but Romania's probability of winning was also substantial (25%). As for Slovakia's big surprise, the model gave it only an 8% chance of winning, as would any reasonable model built on data. It should be noted here that no reasonable statistical or machine learning model can anticipate surprises of this kind, which arise from randomness or from specific circumstances not captured by the model or by the data on which it was trained.
Table 1: Outcome probabilities, most likely result and final result for the Matchday 1 matches of the European Championship 2024.
| Team A | Team B | Win A (prob.) | Draw (prob.) | Win B (prob.) | Most likely result (probability) | Final result |
|---|---|---|---|---|---|---|
| Germany | Scotland | 0.579 | 0.243 | 0.178 | 1-0 (0.143) | 5-1 |
| Hungary | Switzerland | 0.326 | 0.329 | 0.345 | 0-0 (0.176) | 1-3 |
| Spain | Croatia | 0.455 | 0.289 | 0.256 | 1-0 (0.140) | 3-0 |
| Italy | Albania | 0.723 | 0.190 | 0.088 | 2-0 (0.148) | 2-1 |
| Poland | Netherlands | 0.156 | 0.214 | 0.630 | 0-2 (0.113) | 1-2 |
| Slovenia | Denmark | 0.186 | 0.270 | 0.543 | 0-1 (0.167) | 1-1 |
| Serbia | England | 0.107 | 0.212 | 0.681 | 0-1 (0.150) | 0-1 |
| Romania | Ukraine | 0.254 | 0.277 | 0.469 | 0-1 (0.137) | 3-0 |
| Belgium | Slovakia | 0.729 | 0.190 | 0.081 | 2-0 (0.158) | 0-1 |
| Austria | France | 0.170 | 0.243 | 0.588 | 0-1 (0.145) | 0-1 |
| Turkey | Georgia | 0.491 | 0.240 | 0.269 | 1-0 (0.097) | 3-1 |
| Portugal | Czech Republic | 0.693 | 0.196 | 0.111 | 2-0 (0.134) | 2-1 |
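The 75% hit rate quoted above can be checked directly from the figures in Table 1. The short Python sketch below copies the probabilities and final scores from the table, takes the most probable outcome as the prediction, and compares it with what happened; the variable and function names are ours, chosen for illustration.

```python
# (team A, team B, P(win A), P(draw), P(win B), final score) copied from Table 1
matchday1 = [
    ("Germany", "Scotland", 0.579, 0.243, 0.178, (5, 1)),
    ("Hungary", "Switzerland", 0.326, 0.329, 0.345, (1, 3)),
    ("Spain", "Croatia", 0.455, 0.289, 0.256, (3, 0)),
    ("Italy", "Albania", 0.723, 0.190, 0.088, (2, 1)),
    ("Poland", "Netherlands", 0.156, 0.214, 0.630, (1, 2)),
    ("Slovenia", "Denmark", 0.186, 0.270, 0.543, (1, 1)),
    ("Serbia", "England", 0.107, 0.212, 0.681, (0, 1)),
    ("Romania", "Ukraine", 0.254, 0.277, 0.469, (3, 0)),
    ("Belgium", "Slovakia", 0.729, 0.190, 0.081, (0, 1)),
    ("Austria", "France", 0.170, 0.243, 0.588, (0, 1)),
    ("Turkey", "Georgia", 0.491, 0.240, 0.269, (3, 1)),
    ("Portugal", "Czech Republic", 0.693, 0.196, 0.111, (2, 1)),
]


def actual_outcome(score):
    """Map a final score to 'A' (team A wins), 'D' (draw) or 'B' (team B wins)."""
    goals_a, goals_b = score
    if goals_a > goals_b:
        return "A"
    return "D" if goals_a == goals_b else "B"


hits, misses = [], []
for team_a, team_b, p_a, p_d, p_b, score in matchday1:
    probs = {"A": p_a, "D": p_d, "B": p_b}
    predicted = max(probs, key=probs.get)  # most probable outcome
    observed = actual_outcome(score)
    if predicted == observed:
        hits.append((team_a, team_b))
    else:
        # keep the probability the model gave to what actually happened
        misses.append((team_a, team_b, probs[observed]))

accuracy = len(hits) / len(matchday1)
print(f"{len(hits)}/{len(matchday1)} correct ({accuracy:.0%})")  # 9/12 correct (75%)
for team_a, team_b, p in misses:
    print(f"missed {team_a}-{team_b}: actual outcome had probability {p:.3f}")
```

The three misses are Slovenia-Denmark, Romania-Ukraine and Belgium-Slovakia, whose actual outcomes had probabilities of 0.270, 0.254 and 0.081 respectively.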
Matchday 2 predictions
We now move on, with optimism, to the second matchday; the model's predictions are presented in Table 2.
From this table we single out the match
- Slovakia - Ukraine
as the most evenly balanced.
The following teams stand out as favourites:
- Portugal with a 70% chance of winning against Turkey
- Belgium with a 65% chance of winning against Romania
- Croatia with a 62% chance of winning against Albania
- England with a 58% chance of winning against Denmark
- Switzerland with a 57% chance of winning against Scotland
- Czech Republic with a 53% chance of winning against Georgia
Finally, there are five more matches in which one of the two teams has only a slight edge. We consider the teams in these matches to be closely matched, and they may even draw owing to tactics and strategy. In particular, we have
- Austria (49%) prevailing over Poland (24%)
- Germany (48%) beating Hungary (24%)
- Spain (47.5%) prevailing over Italy (26%)
- France (47%) prevailing over the Netherlands (27%)
- Serbia (46%) prevailing over Slovenia (25%)
Table 2: Outcome probabilities and most likely result for the Matchday 2 matches of the European Championship 2024.
| Team A | Team B | Win A (prob.) | Draw (prob.) | Win B (prob.) | Most likely result (probability) |
|---|---|---|---|---|---|
| Croatia | Albania | 0.624 | 0.243 | 0.132 | 1-0 (0.170) |
| Germany | Hungary | 0.482 | 0.274 | 0.244 | 1-0 (0.137) |
| Scotland | Switzerland | 0.187 | 0.239 | 0.574 | 0-1 (0.133) |
| Slovenia | Serbia | 0.248 | 0.292 | 0.460 | 0-1 (0.158) |
| Denmark | England | 0.157 | 0.260 | 0.583 | 0-1 (0.169) |
| Spain | Italy | 0.475 | 0.264 | 0.261 | 1-0 (0.124) |
| Slovakia | Ukraine | 0.314 | 0.304 | 0.382 | 0-0 (0.140) |
| Poland | Austria | 0.244 | 0.268 | 0.489 | 0-1 (0.129) |
| Netherlands | France | 0.269 | 0.264 | 0.467 | 0-1 (0.123) |
| Georgia | Czech Republic | 0.227 | 0.242 | 0.530 | 0-1 (0.112) |
| Turkey | Portugal | 0.118 | 0.183 | 0.699 | 0-2 (0.110) |
| Belgium | Romania | 0.649 | 0.221 | 0.130 | 1-0 (0.147) |
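One way to make "most evenly balanced" and "favourite" precise is sketched below in Python, using the probabilities from Table 2. Both criteria are our own illustrative assumptions rather than the authors' stated method: balance is measured by the Shannon entropy of the three outcome probabilities, and a favourite is any team with a win probability of at least 0.5.

```python
import math

# (team A, team B, P(win A), P(draw), P(win B)) copied from Table 2.
# The Germany-Hungary pairing follows the article's text (Germany 48%, Hungary 24%).
matchday2 = [
    ("Croatia", "Albania", 0.624, 0.243, 0.132),
    ("Germany", "Hungary", 0.482, 0.274, 0.244),
    ("Scotland", "Switzerland", 0.187, 0.239, 0.574),
    ("Slovenia", "Serbia", 0.248, 0.292, 0.460),
    ("Denmark", "England", 0.157, 0.260, 0.583),
    ("Spain", "Italy", 0.475, 0.264, 0.261),
    ("Slovakia", "Ukraine", 0.314, 0.304, 0.382),
    ("Poland", "Austria", 0.244, 0.268, 0.489),
    ("Netherlands", "France", 0.269, 0.264, 0.467),
    ("Georgia", "Czech Republic", 0.227, 0.242, 0.530),
    ("Turkey", "Portugal", 0.118, 0.183, 0.699),
    ("Belgium", "Romania", 0.649, 0.221, 0.130),
]


def outcome_entropy(probs):
    """Shannon entropy (in nats); it is largest when all outcomes are equally likely."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Assumed criterion 1: the most balanced match has the highest entropy.
ranked = sorted(matchday2, key=lambda m: outcome_entropy(m[2:]), reverse=True)
most_balanced = ranked[0][:2]

# Assumed criterion 2: a favourite has a win probability of at least 0.5.
favourites = [
    team_a if p_a > p_b else team_b
    for team_a, team_b, p_a, p_d, p_b in matchday2
    if max(p_a, p_b) >= 0.5
]

print("most balanced:", " - ".join(most_balanced))
print("favourites:", ", ".join(favourites))
```

Under these assumptions the sketch singles out Slovakia - Ukraine and returns the same six favourites listed in the text.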
Diagram 1 shows in more detail the probability of each score for each of the 12 matches of Matchday 2.
Diagram 1: Probabilities of the possible scores for the Matchday 2 matches of the European Championship 2024.
Bibliography for reading fans
- Dixon, M.J. and Coles, S.G. (1997), Modelling Association Football Scores and Inefficiencies in the Football Betting Market. Journal of the Royal Statistical Society: Series C (Applied Statistics), 46, 265-280.
- Karlis, D. and Ntzoufras, I. (2003), Analysis of sports data by using bivariate Poisson models. Journal of the Royal Statistical Society: Series D (The Statistician), 52, 381-393.
- Lee, A.J. (1997), Modeling Scores in the Premier League: Is Manchester United Really the Best? Chance, 10, 15-19.
- Maher, M.J. (1982), Modelling association football scores. Statistica Neerlandica, 36, 109-118.
- Reep, C. and Benjamin, B. (1968), Skill and Chance in Association Football. Journal of the Royal Statistical Society: Series A (General), 131, 581-585.
The Magic Equations of the statistical model