Predicting the Champions League group stage using club Elo ratings

Magnus Carlsen has an Elo rating of 2876, Liverpool has an Elo of 2058, while Novak Djokovic has an Elo of 2203… hell, even the girl/guy you just swiped on Tinder has an Elo rating, and if you’re on Tinder you have one as well. Haven’t heard of Elo ratings? Well, let me bring you up to speed. The Elo rating system is a method for calculating the relative skill levels of players or teams in zero-sum games. It was initially developed for chess, as an improvement over the previously used Harkness system, but its use has since spread, and today it is used extensively in football (i.e. soccer), both for club and national team rankings (e.g. the FIFA world ranking) and for predicting match outcomes, thanks to its statistical precision and stability.

If we are talking about football, the methodology works as follows: all clubs (or national teams) start with the same Elo rating, which is then updated after each game depending on the outcome and the strength (rating) of the opponent. When one club wins (and the other loses) they exchange points, so the winning team’s Elo rating goes up while the losing team’s goes down. The change in rating depends not only on the match outcome but also on the relative strength of the two teams. Beating a much stronger opponent is not the same as beating some total outsider; there can be a huge difference in points gained. Let us look at a few real-life cases to get an intuition for how Elo works, and what better example is there than Hajduk Split, the primary culprit for the rise in psychiatric patients in Dalmatia.

hajduk_elo2

In the last derby against Dinamo, Hajduk was the outsider (-125 Elo compared to the Zagreb club), so the win meant much more (+11.5 Elo points) than the win against Istra in the first league round (only 2.4 points, considering Hajduk was a +339-point favorite). Even more interesting was the loss against Gzira United, an amateur club from Malta. In a game where Hajduk was the extreme favorite (99.8%), the loss cost a whopping 20.4 Elo points, a serious setback for the Split club that also resulted in the manager’s sacking.
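To make the point exchange concrete, here is a minimal sketch of the textbook Elo update. The K-factor of 20 and the function names are my assumptions, not something taken from clubelo.com. The sketch also ignores home advantage and any other adjustments a rating site may apply, so it will not reproduce the Hajduk figures above exactly.

```python
K_FACTOR = 20  # assumed update weight; rating sites choose their own value


def expected_score(rating: float, opponent_rating: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def rating_change(rating: float, opponent_rating: float, score: float,
                  k: float = K_FACTOR) -> float:
    """Points gained (positive) or lost (negative) after one match."""
    return k * (score - expected_score(rating, opponent_rating))


# A win as a 125-point underdog versus a win as a 339-point favorite.
# The absolute ratings are placeholders; only the difference matters.
underdog_gain = rating_change(1500, 1625, score=1.0)
favorite_gain = rating_change(1500, 1161, score=1.0)
print(f"underdog win: {underdog_gain:+.1f}, favorite win: {favorite_gain:+.1f}")
```

With these assumptions the underdog win is worth roughly 13 points and the favorite win about 2.5, the same pattern as in the Hajduk examples. Whatever one side gains the other loses, which is what "exchanging points" means.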

I’ll digress a little into local (Croatian) matters and check the historical Elo ratings of two of Croatia’s most popular clubs. Dinamo Zagreb (or should I say Croatia Zagreb) had its peak Elo (1749) in 1998 under the management of Kranjčar/Zajec. However, the club had its peak ranking among European clubs in 1982 under Ćiro Blažević (who later led the Croatian national team to World Cup bronze in 1998), when Dinamo was the 23rd club in Europe with an Elo rating of 1729.

hist_dinamo

Hajduk Split had its peak Elo in 1984 under manager Pero Nadoveza (who inherited a great team/rating from Ante Mladinić). At that moment Hajduk had an Elo of 1732 and was the 23rd club in Europe. Hajduk had a very good decade and a half, from the start of the 70s to the mid-80s, especially during the tenures of Tomislav Ivić (the greatest Croatian coach), who led Hajduk to a 1700 rating multiple times and under whom Hajduk was regularly a top 30 club in Europe.

hist_hajduk.JPG

Ok, let’s get back to the present day. Other than providing a ranking methodology for clubs (even if they are not playing in the same league), Elo lets us calculate match outcome probabilities, something we will use to simulate the Champions League group stage. Elo has been found to provide the best predictive performance among ranking systems in association football (Lasek, Szlávik and Bhulai, 2013, “The predictive power of ranking systems in association football”). Still, don’t expect to beat the bookmakers and earn money 😉
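For the curious, this is roughly how ratings can be turned into win/draw/loss probabilities. Elo itself only gives an expected score, so the draw model here is purely my assumption for illustration, as are the home-advantage bonus and both constants. It is a sketch of the idea, not the exact method behind the numbers in this post.

```python
HOME_ADVANTAGE = 60  # assumed Elo bonus for the home side (hypothetical value)
DRAW_WEIGHT = 0.27   # assumed draw probability between evenly matched sides


def match_probabilities(home_elo: float, away_elo: float):
    """Return (P(home win), P(draw), P(away win)) from two Elo ratings."""
    diff = home_elo + HOME_ADVANTAGE - away_elo
    expected_home = 1.0 / (1.0 + 10 ** (-diff / 400.0))
    # Hypothetical draw model: draws are most likely between equal teams
    # and become rarer as one side turns into a clear favorite.
    p_draw = 4 * DRAW_WEIGHT * expected_home * (1 - expected_home)
    # Split the rest so that P(win) + 0.5 * P(draw) equals the Elo expected score.
    p_home = expected_home - p_draw / 2
    p_away = 1 - expected_home - p_draw / 2
    return p_home, p_draw, p_away


# Placeholder ratings, not real club data.
p_home, p_draw, p_away = match_probabilities(1750, 1700)
print(f"home {p_home:.1%}, draw {p_draw:.1%}, away {p_away:.1%}")
```

As long as DRAW_WEIGHT stays below 0.5, all three probabilities remain positive and sum to one, whatever the rating gap.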

EDS4e4CXYAEJded

The graph above displays Elo ratings for the Champions League clubs, by group, along with the mean Elo for each group. The Croatian representative, Dinamo Zagreb, is in Group C, the second “hardest” group based on mean Elo, but as we can see, that is primarily due to Manchester City’s crazy rating. Dinamo, Atalanta and Shakhtar are very close strength-wise (within 50-something points), so the matches against the Italian and Ukrainian clubs will decide Dinamo’s fate. We can already see that Groups G and H are the most balanced strength-wise and should provide interesting match-ups, something we can’t say for Groups A and B, where we have two pairs of clear favorites to advance (Real Madrid & PSG and Bayern & Tottenham). However, let’s leave this for now, because all the individual predictions and probabilities follow below.

club_elo_API.JPG

Anyway, now that we know how Elo works, let’s do something much more interesting. Getting the Elo data is quite easy, thanks to clubelo.com. All you have to do is call their REST API using the Python Requests library, which makes sending HTTP requests pretty straightforward. The content of the response is basically a CSV file, which Pandas easily reads into a DataFrame. Now that we have Elo ratings for all CL clubs we can calculate match outcome probabilities, and since we know the groups and the schedule, all it takes is a little bit of coding to implement a Monte Carlo simulation. Using the Elo ratings as of 1 September 2019 and simulating each group 10,000 times, we get the following results. We’ll start with Group C (the one with the Croatian representative) and later provide predictions for the other groups as well.
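The download step can be sketched like this. The URL pattern (one CSV of rankings per date) and the helper names are my assumptions; check the API notes on clubelo.com before relying on them. Requests and Pandas are imported inside the functions only so that the URL helper works even without them installed.

```python
import io

# Assumption: ClubElo serves the rankings for a given day as CSV at this address.
API_URL = "http://api.clubelo.com/{date}"


def ratings_url(date: str) -> str:
    """Build the request URL for a date in YYYY-MM-DD format."""
    return API_URL.format(date=date)


def ratings_from_csv(csv_text: str):
    """Parse the CSV body of a response into a pandas DataFrame."""
    import pandas as pd  # third-party

    return pd.read_csv(io.StringIO(csv_text))


def fetch_ratings(date: str):
    """Download the Elo ratings of all clubs for one date."""
    import requests  # third-party

    response = requests.get(ratings_url(date), timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return ratings_from_csv(response.text)
```

Calling `fetch_ratings("2019-09-01")` should then give one row per club. If the CSV has `Club` and `Elo` columns, as I believe it does, `ratings.set_index("Club")["Elo"]` yields a handy name-to-rating lookup.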

groupC

Columns P1, P2, P3 and P4 give the probabilities of a club finishing the group stage in that position (P1 = first place in the group, P2 = second place, etc.). As we can see, Manchester City is the clear favorite, with a 91% chance of winning Group C. Shakhtar is the second favorite to advance to the next stage of the competition, with a 43.5% chance (P1+P2, the probability of finishing first or second). Based on our calculations, the Croatian champion has an 18% chance of reaching the next stage and a 30% chance of finishing third, which would see Dinamo move to the Europa League. In total, Dinamo has a bit less than a 50% chance of finishing in the top three places (P1+P2+P3), which would keep them in European competition in 2020 (Champions League or Europa League). It’s interesting to note that bookmakers’ odds imply less than a 10% chance of Dinamo advancing to the next stage, while our model is more optimistic (~18%).
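A table like this can be produced with a simulation along the following lines. It is a sketch under several assumptions of mine: the draw model and constants are hypothetical, ties on points are broken at random rather than by the real UEFA head-to-head rules, and the ratings are placeholders instead of the actual Group C numbers.

```python
import random
from collections import Counter
from itertools import permutations

HOME_ADVANTAGE = 60  # assumed Elo bonus for the home side
DRAW_WEIGHT = 0.27   # assumed draw probability between equal sides


def match_probabilities(home_elo, away_elo):
    """Return (P(home win), P(draw), P(away win)); the draw model is hypothetical."""
    diff = home_elo + HOME_ADVANTAGE - away_elo
    expected_home = 1.0 / (1.0 + 10 ** (-diff / 400.0))
    p_draw = 4 * DRAW_WEIGHT * expected_home * (1 - expected_home)
    return expected_home - p_draw / 2, p_draw, 1 - expected_home - p_draw / 2


def simulate_group(ratings, n_sims=10_000, seed=0):
    """Estimate each club's probability of finishing 1st, 2nd, 3rd and 4th."""
    rng = random.Random(seed)
    clubs = list(ratings)
    fixtures = list(permutations(clubs, 2))  # every club hosts every other once
    position_counts = {club: Counter() for club in clubs}
    for _ in range(n_sims):
        points = dict.fromkeys(clubs, 0)
        for home, away in fixtures:
            p_home, p_draw, _ = match_probabilities(ratings[home], ratings[away])
            u = rng.random()
            if u < p_home:
                points[home] += 3
            elif u < p_home + p_draw:
                points[home] += 1
                points[away] += 1
            else:
                points[away] += 3
        # Assumption: clubs level on points are ordered at random.
        table = sorted(clubs, key=lambda c: (points[c], rng.random()), reverse=True)
        for position, club in enumerate(table, start=1):
            position_counts[club][position] += 1
    return {club: [position_counts[club][pos] / n_sims
                   for pos in range(1, len(clubs) + 1)]
            for club in clubs}


# Placeholder ratings, not the real Group C numbers.
example_ratings = {"Club A": 1950, "Club B": 1750, "Club C": 1720, "Club D": 1690}
for club, probs in simulate_group(example_ratings).items():
    print(club, " ".join(f"{p:6.1%}" for p in probs))
```

Each printed row is one club’s P1 to P4, so P1+P2 is its chance of advancing. More simulations reduce the noise in the estimates at the cost of run time.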

For Dinamo Zagreb, a lot will depend on their first group game, which they play at home against Atalanta. There’s a huge probability “swing” in the final group placement depending on the outcome of this match. With a victory against Atalanta, Dinamo’s chances of advancing to the next stage rise to a solid 34.7%, while with a defeat they fall to a measly 7.6%. With a draw, Dinamo’s probability of advancing would be 16.5%, less than the initial 18% that Dinamo has now, before the game is played. This makes sense, because it will be very hard to win any points against Man. City, so Dinamo must rely on causing an upset against Atalanta and/or Shakhtar.

DinamoP12.JPG

The same can be said about Dinamo’s chances of finishing in the top three spots, which would guarantee participation in European competitions in 2020 as well (the KO phase of the CL or the Europa League). With a win against Atalanta, this probability rises to a very good 75.1%, while with a loss it falls to 30.8%. As we can see, Wednesday’s game could very well decide Dinamo’s fortunes: opening the group stage with a win could be a significant boost for the Croatian representative.

DinamoP123
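The “swing” figures above are conditional probabilities, and one way to estimate them is to run the group simulation once and split the runs by the result of the match of interest. Again a sketch only: the draw model, constants, random tie-break and placeholder ratings are all my assumptions.

```python
import random

HOME_ADVANTAGE = 60  # assumed Elo bonus for the home side
DRAW_WEIGHT = 0.27   # assumed draw probability between equal sides


def match_probabilities(home_elo, away_elo):
    """Return (P(home win), P(draw), P(away win)); the draw model is hypothetical."""
    diff = home_elo + HOME_ADVANTAGE - away_elo
    expected_home = 1.0 / (1.0 + 10 ** (-diff / 400.0))
    p_draw = 4 * DRAW_WEIGHT * expected_home * (1 - expected_home)
    return expected_home - p_draw / 2, p_draw, 1 - expected_home - p_draw / 2


def play(rng, home_elo, away_elo):
    """Sample one result: 'H' (home win), 'D' (draw) or 'A' (away win)."""
    p_home, p_draw, _ = match_probabilities(home_elo, away_elo)
    u = rng.random()
    return "H" if u < p_home else "D" if u < p_home + p_draw else "A"


def top_n_given_result(ratings, club, opponent, top_n=2, n_sims=10_000, seed=0):
    """P(club finishes in the top_n | result of its home game against opponent)."""
    rng = random.Random(seed)
    clubs = list(ratings)
    fixtures = [(h, a) for h in clubs for a in clubs if h != a]
    seen = {"H": 0, "D": 0, "A": 0}
    made_it = {"H": 0, "D": 0, "A": 0}
    for _ in range(n_sims):
        points = dict.fromkeys(clubs, 0)
        key_result = None
        for home, away in fixtures:
            result = play(rng, ratings[home], ratings[away])
            if result == "H":
                points[home] += 3
            elif result == "A":
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
            if (home, away) == (club, opponent):
                key_result = result  # remember the match we condition on
        # Assumption: clubs level on points are ordered at random.
        table = sorted(clubs, key=lambda c: (points[c], rng.random()), reverse=True)
        seen[key_result] += 1
        made_it[key_result] += club in table[:top_n]
    return {r: made_it[r] / seen[r] for r in seen if seen[r]}


# Placeholder ratings, not the real Group C numbers.
example_ratings = {"Club A": 1950, "Club B": 1750, "Club C": 1720, "Club D": 1690}
swing = top_n_given_result(example_ratings, "Club D", "Club C")
print({result: f"{p:.1%}" for result, p in swing.items()})
```

Passing `top_n=3` gives the same breakdown for a top-three finish.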

Now that we’re done with Group C and Dinamo Zagreb, let’s look at the predictions for the other groups as well. In Group A, as expected, Real and PSG are the main favorites to advance, with Brugge lurking behind them and waiting for a slip. Galatasaray will try to steal the 3rd spot from the Belgians, but will more likely finish last. In Group B, Bayern and Tottenham are clear favorites to advance, Olympiakos looks set for the 3rd spot, and it will be a surprise if Crvena Zvezda avoids the bottom of the table.

groupAB

Man. City is the extreme favorite in Group C; it would be a huge surprise if the “Citizens” don’t finish at the top. Shakhtar and Atalanta will likely battle for the second spot while Dinamo will try to complicate matters. Group D could make for interesting viewing, with Atletico, Juventus and Bayer battling for the top two spots and the Germans sitting behind the favored duo. Lokomotiv is very unlikely to avoid finishing last.

groupCD

In Group E, Klopp’s boys are the main contenders for first place, while Napoli and RB Salzburg will most probably battle for the second spot. It will be hard for Genk to avoid the bottom of the table. Barcelona should win Group F, while Borussia Dortmund is the main contender for the second spot, something Inter is eyeing as well. Slavia will try to catch the 3rd spot, the ticket for the EL train, but more likely they’ll finish last in this very hard group.

groupEF

As we already mentioned, Groups G and H are the most balanced strength-wise and the hardest to predict. In Group G, Leipzig and Benfica are slight favorites, while both Lyon and Zenit are able to disturb their plans. In Group H, Chelsea and Ajax will chase the first two spots, while Valencia will try to complicate matters. Lille will try to catch the EL spot, but more likely they’ll end up fourth and say goodbye to European competitions.

groupGH

If we order the clubs by their probability of advancing to the knockout stage of the CL, we get the following graph. City and Liverpool are almost sure to go through, while Barcelona and Bayern are the only other clubs with more than a 90% probability of advancing. It is very unlikely that any of these four slips this early in the competition. Tottenham and Real both have more than an 80% chance of accomplishing the same feat, while Atletico, PSG and Juve are close behind. On the other side of the spectrum, Crvena Zvezda, Lokomotiv Moscow, Galatasaray, Genk and Slavia Prague each have less than a 10% chance of advancing to the KO stage. It would be a surprise to see any of these clubs in the next stage of the competition.

KOphase_prob

There’s a drawback to this approach that I should mention. We are making predictions for later stages of the current season while using ratings that mostly reflect the previous season’s form. In football there are a lot of changes during the summer, from player transfers to management changes (e.g. Inter, Chelsea, Valencia, Juventus,…), so the clubs’ form (and their ratings) will definitely change and we will need to update our predictions. By the end of September we should be much smarter: not only will one round of the CL group stage have been played, but clubs will also have played an additional 4 league games, which will tell us more about their form in the 2019/2020 season.