Football Dataset Visualization and Analysis
While I may not be a football fan, one question has always intrigued me — Messi or Ronaldo? Or maybe Neymar? Like Professor McGonagall once asked the golden trio, I ask myself, why is it always these three? Which clubs are the most popular? This article is my attempt to answer these questions.
In this article, I will walk you through what I, as a complete novice in football, learnt from the dataset. I used the software Tableau to perform this analysis.
I have attempted to draw insights on the following topics —
- The Effect of Age
- Relationship between Value and Wage
- Top clubs and countries
- How various factors affect Overall
- Who is the best football player?
I will then use the insights I have drawn to put an end to the everlasting Messi vs Ronaldo battle.
Here is a link to the Dataset I used (provided by DataByte).
When I started with this dataset, there were many terms I was unable to understand. So, before I dive deep into my findings, here are a few definitions.
Value / Market Value — A player’s market value is an estimate of the amount for which a team can sell the player’s contract to another team.
Overall — It is an aggregation of various factors like acceleration, aggression, ball control, etc., similar to our college CGPAs, but out of 100.
Potential — Potential decides what overall a player will be able to reach after a few seasons.
1. The Effect of Age
The average age of the top 100 players is 28.41, which is much higher than what it was decades ago, thanks to modern technology and sophisticated training. Even so, I hypothesize that certain attributes like Stamina, Acceleration and Agility are bound to decrease with age, while attributes like Composure increase.
If we look at a plot of Potential vs Age, Average Potential (the average potential of players in that age group) and Maximum Potential (the maximum or largest potential of players in that age group) both decrease with an increase in age.
Similarly, in the above plot, we can see that Average Strength, Shot Power, Sprint Speed, Aggression, Acceleration and Stamina also decrease as age increases.
Another way to look at the effect of age is to find the average age for different factors and observe the pattern.
In the above plot, the average age for a Potential of 90 is much lower than the average age for a Potential of 45. In other words, players with a higher Potential tend to be younger. Acceleration follows a similar trend.
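The plots in this section were built in Tableau, but the grouping behind them can be sketched in a few lines of Python. The snippet below is only an illustration: the rows are made up, the helper name `summarize` is hypothetical, and the column names `Age` and `Potential` are assumed to match the dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; they are not taken from the dataset.
players = [
    {"Age": 19, "Potential": 88},
    {"Age": 19, "Potential": 80},
    {"Age": 27, "Potential": 84},
    {"Age": 27, "Potential": 78},
    {"Age": 34, "Potential": 79},
    {"Age": 34, "Potential": 71},
]


def summarize(rows, group_col, value_col):
    """Return {group value: (average, maximum)} of value_col, sorted by group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {key: (mean(vals), max(vals)) for key, vals in sorted(groups.items())}


# Average and maximum Potential for each age group.
potential_by_age = summarize(players, "Age", "Potential")
for age, (average, maximum) in potential_by_age.items():
    print(age, average, maximum)

# Swapping the columns gives the average Age for each Potential value.
age_by_potential = summarize(players, "Potential", "Age")
```

Passing a different attribute name (for example `Acceleration` or `Stamina`, if those columns are present) would give the corresponding per-age summary.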
On the other hand, as seen below, Average and Maximum Overall seem to increase between the ages of 26 and 40.
On further analysis, I found that Composure, Penalties and Reactions also increase with age (considering players only up to the age of 36, since that is the average retirement age). As older players have greater experience, they seem better able to maintain their composure and reactions than younger players. The average number of penalty shots taken also increases with age. This might be because older players have had more experience taking penalties than youngsters.
To answer the big question — Why does overall increase with age? This is because older players are able to compensate for the decrease in stamina and acceleration by improving skills that require critical thinking, strategy, gameplay, etc., which come with experience, hence managing to improve their overall score (as shown in the graph below).
2. Relationship between Value and Wage
The above graph shows how Wages and Value vary for players with an Overall greater than 85. As shown in fig 8, both Normalized Value and Normalized Wages decrease with a decrease in Overall.
From fig 9, most of the players with an Overall greater than 83 have Normalized Wages > Normalized Value.
This can be explained fairly easily. Players with an Overall above 85 (the better players) are rare. Other clubs will be willing to buy them for high values, so a club that wishes to retain its good players has to offer wages that are high relative to their market value. Hence they end up with Normalized Wages > Normalized Value.
However, as we can see from the above graph, as the Overall reduces to below 83, the Wages slowly start becoming less than the Value.
As the Overall keeps decreasing, the Normalized Wages of players fall further below their Normalized Value. This is the opposite of what was observed for players with an Overall > 83. Players with a lower Overall are plentiful, and hence clubs don’t pay them a lot.
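The article does not say how Value and Wage were normalized, so the sketch below assumes min-max scaling, which maps each column to the range 0 to 1 so the two can be compared directly. The numbers are made up and in arbitrary units, and the helper `min_max` is a hypothetical name.

```python
# Made-up numbers in arbitrary units, for illustration only.
players = [
    {"Name": "P1", "Overall": 92, "Value": 100.0, "Wage": 500.0},
    {"Name": "P2", "Overall": 88, "Value": 70.0, "Wage": 400.0},
    {"Name": "P3", "Overall": 80, "Value": 40.0, "Wage": 100.0},
    {"Name": "P4", "Overall": 70, "Value": 0.0, "Wage": 0.0},
]


def min_max(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:  # avoid dividing by zero when all values are equal
        return [0.0] * len(values)
    return [(v - low) / (high - low) for v in values]


norm_value = min_max([p["Value"] for p in players])
norm_wage = min_max([p["Wage"] for p in players])

# Positive gap: normalized wage exceeds normalized value; negative: the reverse.
gaps = {
    p["Name"]: wage - value
    for p, value, wage in zip(players, norm_value, norm_wage)
}
for p in players:
    print(p["Name"], p["Overall"], round(gaps[p["Name"]], 2))
```

In this toy data the high-Overall player P2 has a positive gap and the lower-Overall player P3 a negative one, which is the pattern described above. A different normalization (for example dividing by the mean) could shift where the crossover happens.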
The plots below show the relationship between Wage and Value.
In the above plot, we can see that players with the same Overall are paid different wages. Some are paid more than the Average Wage for that Overall, whereas others are paid less. What we observe is that, within a group of players with the same Overall, players with a higher Potential have an above-average Wage and players with a lower Potential have a below-average Wage.
Remember the definition of Potential? It decides what the player’s Overall will be in a few seasons. If a player is expected to have a higher Overall in a few seasons (i.e. become better), his club would want to retain him and hence will pay him more than players who might not improve over the next few seasons.
This is also the reason why players with the same Overall have different Values. If a player is expected to improve over the next few seasons, clubs will be willing to pay more to buy him than they would for players who might not improve.
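One way to check this pattern outside Tableau is to compare each player's Wage with the average Wage for his Overall, and then look at the average Potential on either side of that line. This is a rough sketch on made-up rows, with column names assumed to match the dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only.
players = [
    {"Name": "Q1", "Overall": 80, "Potential": 88, "Wage": 60},
    {"Name": "Q2", "Overall": 80, "Potential": 80, "Wage": 40},
    {"Name": "Q3", "Overall": 75, "Potential": 85, "Wage": 30},
    {"Name": "Q4", "Overall": 75, "Potential": 75, "Wage": 10},
]

# Average wage for each Overall rating.
wages_by_overall = defaultdict(list)
for p in players:
    wages_by_overall[p["Overall"]].append(p["Wage"])
average_wage = {overall: mean(w) for overall, w in wages_by_overall.items()}

# Split players by whether they earn more than the average for their Overall.
above = [p["Potential"] for p in players if p["Wage"] > average_wage[p["Overall"]]]
at_or_below = [p["Potential"] for p in players if p["Wage"] <= average_wage[p["Overall"]]]

potential_above = mean(above)
potential_at_or_below = mean(at_or_below)
print("Average Potential, above-average wage:", potential_above)
print("Average Potential, at-or-below-average wage:", potential_at_or_below)
```

If the observation holds on the real data, the first average should come out higher than the second. Replacing `Wage` with `Value` would test the same idea for market value.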
3. Top Clubs and Countries
The above graphs show the top 15 clubs in terms of Value and Wages. While Paris Saint-Germain has the highest valued player, Real Madrid and FC Barcelona have higher Average Values. Real Madrid is the most valued team, with FC Barcelona a close second. While Real Madrid and FC Barcelona have the same Maximum Wage, the former beats the latter in terms of Average Wages.
In terms of overall, the best club is FC Barcelona followed by Juventus.
The club with the highest average age is Clube Atletico Paranaense.
In terms of total Value, Spain has the highest ranking, as shown in the graph above.
The above two graphs show the top countries in terms of Average Overall and Maximum Overall. While Oman has the highest Average, it is not included in the list of countries with Maximum Overall. Similarly, while Portugal tops the countries in terms of Maximum Overall, it does not have the highest Average Overall.
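The club and country rankings in this section all come down to grouping players and ranking the groups by some statistic. The sketch below shows how ranking by average and ranking by maximum can produce different leaders. The countries and ratings are made up, the helper `rank_groups` is a hypothetical name, and the column names `Nationality` and `Overall` are assumed.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only.
players = [
    {"Nationality": "A", "Overall": 94},
    {"Nationality": "A", "Overall": 60},
    {"Nationality": "A", "Overall": 62},
    {"Nationality": "B", "Overall": 75},
    {"Nationality": "B", "Overall": 77},
    {"Nationality": "C", "Overall": 70},
    {"Nationality": "C", "Overall": 90},
]


def rank_groups(rows, group_col, value_col, stat):
    """Return [(group, stat of value_col)] sorted from highest to lowest."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    scores = {group: stat(vals) for group, vals in groups.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


by_average = rank_groups(players, "Nationality", "Overall", mean)
by_maximum = rank_groups(players, "Nationality", "Overall", max)

print("Top by average:", by_average[0])
print("Top by maximum:", by_maximum[0])
```

Here country A has the single best player but the lowest average, so it tops one list and not the other. The same function could be called with `"Club"` and `"Value"`, or with `sum` as the statistic for a total, assuming those columns exist in the dataset.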
4. How various factors affect Overall
As we can see in the above graphs, which show how Overall varies with other factors, beyond a certain point Overall doesn’t appear to change much even as an individual factor decreases. This suggests that players who are weaker in one skill can make up for it with other skills.
5. Who is the best football player?
Football fans are often at war over this one question — Messi or Ronaldo. I would like to bring another player into this picture — Neymar. In this final topic, I will try to figure out who the Dumbledore of Football is.
While the above graphs suggest that the best player is Ronaldo, with Messi a close second, we must not forget that Ronaldo is 35 years old and Messi is 33 years old. On the other hand, Neymar (in third place) is only 28 years old. The average retirement age in football is 36 years. In my opinion, regardless of whether or not Ronaldo and Messi retire, factoring in the effect of age, Neymar could claim the crown in two or three years.
To conclude, while Ronaldo is currently the best player, he would soon be dethroned by Neymar (making him our future Dumbledore :p).
If you’ve made it to the end of this article, congratulations :D. I would like to mention that this article covers only a small fraction of the insights that can be drawn from the dataset. It is a summary of everything I learnt and understood. I hope this article was informative, and of course, has turned you into a football fanatic!
You can view all the graphs by clicking on the link below — https://public.tableau.com/profile/nandika.ramadurai#!/vizhome/DataByte-Football/Slidingandstandingtackle
Done by — Nandika Ramadurai, 2nd year, EEE, NIT Trichy.