2018 FIFA World Cup Final aftermathematics

Two years ago we were living different lives. All crammed in front of the TV, with no regard for social distancing (what is that?!), we were following the greatest sporting event there is, the 2018 FIFA World Cup in Russia… and Croatia was all the hype. Compared to other football powerhouses, Croatia is indeed a minuscule country. For comparison, there are more registered football players in Germany (DFB members) than there are people living in Croatia. Still, for some reason (I won’t go into details) we excel at sports (football included), in contrast to our abysmal economic score (again, I won’t go into details).

Anyway, we reached the final and unfortunately lost to the better team… France. I get it, you might wonder why “better team” is struck through; didn’t France win 4-2, what more proof do I need that it was indeed the better team? Well, not so fast, my friends! I won’t say France didn’t deserve it, they played a fantastic tournament and were very good in the final, but… they weren’t better, they were the luckier team that day, and that was the deciding factor! No way I can prove this? Well, hold my sore-loser beer.

Before we start, a little bit of theory. It’s safe to say that every sport involves a whole lot of skill; people train and practice for decades in order to compete at the highest level. However, results are not decided by skill alone… there’s something else, something we like to call “luck”. Think of it as the random factors outside your skill realm, everything you cannot control, such as when the referee makes a bad call, when a player gets injured or when the ball bounces off the goalpost. Depending on the sport there’s more or less luck involved, and that skill-vs-luck continuum is elegantly visualized in the following picture.

You can find many interesting articles and scientific papers on the role of luck, whether in real sports or fantasy ones (and finally settle the debate over whether fantasy league winners are just lucky bastards). However, before you begin to dig deeper, there’s this cool introductory video I would definitely recommend.

There’s one other important thing to note here. When we have a lot of talented and highly competent people competing for some “prize”, whether in sports or the economy, and the differences in skill or knowledge are minimal, it is actually luck or chance that very often plays the decisive role. So yeah, success is the sum of talent, hard work… aaand luck. Our lives are influenced by factors we cannot control, so it’s important to stay humble, especially when you’re successful and suddenly find it easy to attribute it all to your immense prowess and skill.

Anyway, back to the topic… even though goals are all that matters at the end of the game, there’s actually a more precise way to measure the quality of play. We can all remember numerous games where one team dominated the whole game but the opponent scored from their only shot and managed to grab the win. So were they the better team then? Well, it’s a deeply philosophical question… that day they were, but if that match were somehow replayed, they would lose 9 times out of 10. OK, so what’s this better way of measuring the quality of play? Well, there’s an approach called xG (eXpected Goals), an advanced analytical model that quantifies the quality of every chance/shot taken and estimates the probability that the shot results in a goal. So, for example, an xG of 0.5 indicates a 50-50 chance, which is a very good opportunity: the expectation is that attackers will turn a chance of this type into a goal in 50 percent of cases. A penalty kick is worth 0.76 xG, corresponding to an average penalty conversion rate of 76%. OK, but how can this even be quantified?
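To make the arithmetic concrete, here’s a tiny sketch of how per-shot xG values combine. The list of chances is made up for illustration (only the 0.76 penalty and the 0.5 “very good opportunity” figures come from the text above), the helper names are my own, and treating shots as independent is a simplifying assumption.

```python
from math import prod

def total_xg(shots):
    """Expected number of goals: simply the sum of the per-shot probabilities."""
    return sum(shots)

def prob_at_least_one_goal(shots):
    """Chance of scoring at least once, ASSUMING the shots are independent."""
    return 1 - prod(1 - p for p in shots)

# Hypothetical chances: one penalty (0.76) and one 50-50 opportunity (0.5)
shots = [0.76, 0.5]
print(round(total_xg(shots), 2))                # 1.26
print(round(prob_at_least_one_goal(shots), 2))  # 0.88
```

So a team with these two chances “deserves” about 1.26 goals on average, yet still fails to score at all roughly one time in eight.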

In short, StatsBomb, Opta, InStat, WyScout and other sports analytics services have been collecting detailed data from football matches for many years, recording every pass, dribble, touch of the ball, shot, player position and other so-called event (and tracking) data. So, if we take, let’s say, five years of data from the top five European leagues (Premier League, Serie A, La Liga etc.), we get more than 200,000 shots and the contextual parameters related to each of them: distance from goal, angle to goal, what preceded the shot (dribble, pass, rebound…), type of assist, defensive pressure, whether it came from a fast break, etc.

Each of these 200,000 data points means nothing in isolation, because a lot depends on chance, but such a large sample averages out the impact of luck and allows us to conclude how likely it is for any given shot to result in a goal. If you have such a large dataset, with a lot of contextual (independent) variables and a labeled target variable (goal/no goal), well, it’s fairly straightforward to train a classification model and get an estimated goal probability for every shot. In this way we can evaluate all the shots/opportunities in a particular game and see who actually played better and had the better chances, because in football the final result is not always the best measure of how a team has played. So what do you say we apply the xG model to the 2018 FIFA World Cup Final and see what it can tell us?
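To give a feel for what “train a classification model” means here, below is a minimal sketch using logistic regression on a single feature, shot distance. Everything in it is an assumption made for illustration: the shots are randomly generated toy data (not real event data), the distance-to-goal relationship is invented, and the function names are my own. Real xG models use many more features and proper libraries; this only shows the idea.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_toy_shots(n, rng):
    """Generate MADE-UP shots as (distance in metres, goal 0/1) pairs."""
    shots = []
    for _ in range(n):
        distance = rng.uniform(2.0, 35.0)
        true_p = sigmoid(1.0 - 0.15 * distance)  # invented relationship
        shots.append((distance, 1 if rng.random() < true_p else 0))
    return shots

def fit_logistic(shots, lr=0.5, epochs=300):
    """Fit P(goal) = sigmoid(w * distance / 10 + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(shots)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for distance, goal in shots:
            x = distance / 10.0           # rescale so gradient steps are stable
            err = sigmoid(w * x + b) - goal
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_xg(model, distance):
    """Estimated goal probability (toy xG) for a shot from this distance."""
    w, b = model
    return sigmoid(w * distance / 10.0 + b)

rng = random.Random(42)                   # fixed seed for reproducibility
model = fit_logistic(make_toy_shots(2000, rng))
print("toy xG from 6 m: ", round(predict_xg(model, 6.0), 2))
print("toy xG from 30 m:", round(predict_xg(model, 30.0), 2))
```

The fitted model assigns a higher probability to the close-range shot than to the long-range one, which is exactly the kind of pattern a real xG model learns from hundreds of thousands of labeled shots.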

As we can see in the picture, the total xG for Croatia is 1.3, compared to 1.05 for France (of which 0.76 comes from Griezmann’s penalty), which means that overall Croatia had slightly better chances and that France created very little from open play. We have not yet discovered how to travel through time, or whether there is a multiverse, but with the help of Monte Carlo simulation we can find out what would happen if the same game with the same opportunities were played 10,000 times. The result is that Croatia would win in almost 40% of cases, France in only 25%, and in 35% of cases the game would “end” in a draw after 90 minutes.
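If you’re wondering what such a simulation looks like, here’s a minimal sketch: every shot is treated as an independent coin flip weighted by its xG, and the match is replayed many times. Note the assumptions: the per-shot values below are hypothetical (I only made them add up to the totals quoted above, 1.3 and 1.05, keeping the 0.76 penalty), and the function name is my own, so the output won’t necessarily reproduce the percentages above, which depend on the actual shot-by-shot values.

```python
import random

def simulate_match(shots_a, shots_b, n_sims=10_000, seed=0):
    """Replay a match n_sims times; each shot scores with probability = its xG.

    Returns the fractions (team A wins, draws, team B wins).
    """
    rng = random.Random(seed)
    a_wins = draws = b_wins = 0
    for _ in range(n_sims):
        goals_a = sum(rng.random() < p for p in shots_a)
        goals_b = sum(rng.random() < p for p in shots_b)
        if goals_a > goals_b:
            a_wins += 1
        elif goals_a < goals_b:
            b_wins += 1
        else:
            draws += 1
    return a_wins / n_sims, draws / n_sims, b_wins / n_sims

# HYPOTHETICAL per-shot xG values, chosen only to match the quoted totals
croatia_shots = [0.35, 0.25, 0.20, 0.15, 0.10, 0.08, 0.07, 0.05, 0.05]  # sums to 1.30
france_shots = [0.76, 0.10, 0.08, 0.06, 0.05]                           # sums to 1.05

cro, draw, fra = simulate_match(croatia_shots, france_shots)
print(f"Croatia win: {cro:.1%}  draw: {draw:.1%}  France win: {fra:.1%}")
```

Simulating shot by shot, rather than just comparing the xG totals, matters because one 0.76 chance and many small chances behave very differently even when they add up to similar numbers.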

As already said, football is a game with a noticeable “luck” factor (which is why it is so damn interesting), and as you can see, the final score doesn’t do justice to Croatia this time. This whole game was a bit of a fluke; suffice it to say that, given the opportunities, the probability of this match ending 4-2 was less than 1% according to our Monte Carlo sim, and no final since 1958 has had more goals. Due to the very nature of the game, in particular the small number of goals, the luck factor in football can be very influential, especially when the difference in quality between the two teams is minimal. This is exactly what happened to Croatia: ref Pitana’s mistake and the wrongly awarded free kick, then Mandžukić’s own goal and then the clumsy/unfortunate/unfair(?) penalty gave France an advantage that proved insurmountable for Croatia. It would be unfair of me to say that France didn’t deserve this win, but the 4-2 score also gives the unfair impression that we were outclassed, because with a little more luck the final could have ended much differently.
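For a single scoreline you don’t even need to simulate; under the same “independent shots” assumption the probability can be computed exactly. Here’s a rough sketch with hypothetical per-shot xG values (made up so they add up to 1.3 for Croatia and 1.05 for France) and helper names of my own. It also ignores own goals, which a simple shot-based calculation like this cannot capture, so take the number it prints as an illustration, not as the figure quoted above.

```python
def goal_distribution(shots):
    """Return dist where dist[k] = P(exactly k goals), shots independent."""
    dist = [1.0]                      # with no shots: 0 goals for certain
    for p in shots:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)     # this shot is missed
            new[k + 1] += q * p       # this shot is scored
        dist = new
    return dist

def scoreline_probability(shots_a, shots_b, goals_a, goals_b):
    """P(team A scores goals_a AND team B scores goals_b)."""
    dist_a = goal_distribution(shots_a)
    dist_b = goal_distribution(shots_b)
    if goals_a >= len(dist_a) or goals_b >= len(dist_b):
        return 0.0                    # more goals than shots is impossible
    return dist_a[goals_a] * dist_b[goals_b]

# HYPOTHETICAL per-shot xG values, chosen only to match the quoted totals
croatia_shots = [0.35, 0.25, 0.20, 0.15, 0.10, 0.08, 0.07, 0.05, 0.05]
france_shots = [0.76, 0.10, 0.08, 0.06, 0.05]

p = scoreline_probability(france_shots, croatia_shots, 4, 2)
print(f"P(France 4-2 Croatia) with these toy shots: {p:.4%}")
```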

However, there’s no place for despair. We were third at the 1998 World Cup, and 20 years later we were second, so I did an extremely complex regression analysis and I can conclude without any doubt: we are winning the 2038 FIFA World Cup!