Football Dataset Visualization and Analysis

Nandika Ramadurai
8 min read · Jan 25, 2021


While I may not be a football fan, one question has always intrigued me — Messi or Ronaldo? Or maybe Neymar? Like Professor McGonagall once asked the golden trio, I ask myself, why is it always these three? Which clubs are the most popular? This article is my attempt to answer these questions.

In this article, I will walk you through what I, as a complete novice in football, learnt from the dataset. I used the software Tableau to perform this analysis.

I have attempted to draw insights on the following topics —

  1. The Effect of Age
  2. Relationship between Value and Wage
  3. Top clubs and countries
  4. How various factors affect Overall
  5. Who is the best football player?

I will then use the insights I have drawn to put an end to the everlasting Messi vs Ronaldo battle.

Here is a link to the Dataset I used (provided by DataByte).

When I started with this dataset, there were many terms I was unable to understand. So, before I dive deep into my findings, here are a few definitions.

Value / Market Value — A player’s market value is an estimate of the amount for which a team can sell the player’s contract to another team.

Overall — It is an aggregation of various factors like acceleration, aggression, ball control, etc., similar to our college CGPAs, but out of 100.

Potential — Potential decides what overall a player will be able to reach after a few seasons.

1. The Effect of Age

The average age of the top 100 players is 28.41, which is much higher than it was decades ago, thanks to modern technology and sophisticated training. Even so, I hypothesize that certain factors like Stamina, Acceleration, and Agility are bound to decrease with age, while factors like Composure increase.

Fig 1. Overall vs Age and Potential vs Age

If we look at a plot of Potential vs Age, Average Potential (the average potential of players in that age group) and Maximum Potential (the maximum or largest potential of players in that age group) both decrease with an increase in age.
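The analysis in this article was done in Tableau. For readers who prefer code, the grouping behind this kind of plot might look roughly like the Python sketch below. The field names (`age`, `potential`) and the sample rows are assumptions invented for illustration; they are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Toy rows invented purely for illustration; NOT values from the dataset.
players = [
    {"age": 20, "potential": 88},
    {"age": 20, "potential": 80},
    {"age": 30, "potential": 82},
    {"age": 30, "potential": 74},
]

def potential_by_age(rows):
    """Return {age: (average potential, maximum potential)} for each age group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["age"]].append(row["potential"])
    # Sort by age so the result reads like the x-axis of the plot.
    return {age: (mean(vals), max(vals)) for age, vals in sorted(groups.items())}

summary = potential_by_age(players)
for age, (avg_potential, max_potential) in summary.items():
    print(age, avg_potential, max_potential)
```

The same grouping, with a different attribute in place of `potential`, would cover the other "attribute vs Age" plots in this section.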

Fig 2. Avg Acceleration, Aggression, Sprint Speed, Stamina, Strength, Shot Power vs Age

Similarly, in the above plot, we can see that Average Strength, Shot power, Sprint Speed, Aggression, Acceleration, and Stamina also decrease with an increase in age.

Another way to look at the effect of age is to find the average age for different factors and observe the pattern.

Fig 3. Average age for different acceleration and potential values

In the above plot, the average age for a potential of 90 is much lower than the average age for a potential of 45. In other words, as potential increases, the average age decreases. Acceleration follows a similar trend.

On the other hand, as seen below, Average and Maximum Overall seem to increase between the ages of 26 and 40.

Fig 4. Overall vs Age
Fig 5. Avg Composure, Penalties, Reactions vs Age

On further analysis, I found that Composure, Penalties and Reactions also increase with age (considering only ages up to 36, since that is the average retirement age). Because older players have more experience, they seem to maintain their composure and reactions better than younger players. The average Penalties rating also increases with age. This might be because older players have had more experience taking penalties than youngsters.

Now for the big question: why does Overall increase with age? Older players seem to compensate for the decrease in stamina and acceleration by improving skills that depend on critical thinking, strategy and gameplay, which come with experience. This lets them improve their overall score (as shown in the graph below).

Fig 6. Avg age for different overall values

2. Relationship between Value and Wage

Fig 8. Trend followed by Normalized Value and Normalized Wages

The above graph shows how Wages and Value vary for players with an Overall greater than 85. As shown in fig 8, both Normalized Value and Normalized Wages decrease with a decrease in Overall.

Fig 9. Normalized Value vs Normalized Wage (Overall>83)

Fig 9 shows that most of the players with an Overall greater than 83 have Normalized Wages > Normalized Value.

This can be explained pretty easily. Players with an Overall above 85 (the better players) are rare. Other clubs will be willing to buy them for high values, so if a club wishes to retain its good players, it has to pay them wages that are high relative to their market value. Hence they end up with Normalized Wages > Normalized Value.
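The article does not state how Value and Wage were normalized. As a sketch, assuming simple min-max scaling (an assumption, not necessarily what was done in Tableau), the comparison between Normalized Wages and Normalized Value could be reproduced like this. The numbers are invented for illustration only.

```python
def min_max(values):
    """Scale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Toy figures invented for illustration; NOT values from the dataset.
values = [20.0, 60.0, 100.0]   # hypothetical market values
wages = [50.0, 200.0, 250.0]   # hypothetical wages

norm_value = min_max(values)
norm_wage = min_max(wages)

# True where a player's normalized wage exceeds their normalized value.
wage_exceeds_value = [w > v for v, w in zip(norm_value, norm_wage)]
print(wage_exceeds_value)
```

Min-max scaling puts both columns on a 0 to 1 range, which is what makes a direct "wage greater than value" comparison meaningful even though the raw units differ.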

Fig 10. Normalized Value vs Normalized wage (Overall<83)

However, as we can see from the above graph, as the Overall reduces to below 83, the Wages slowly start becoming less than the Value.

As the Overall keeps decreasing, the normalized wages of players fall further below their normalized value. This is the opposite of what was observed for players with an Overall > 83. Players with a lower Overall are plentiful, so clubs don’t pay them a lot.

The plots below show the relationship between Wage and Value.

Fig 11. Value vs Wage for top 15 players and Relationship between Wage and Value
Fig 12. Plot showing Average Value and Average Wage against the Value and Wages of different players for different values of overall

In the above plot, we can see that for the same Overall, different players are given different wages. Some players are paid more than the Average Wage for that Overall, whereas others are paid less. What we observe is that, within a group of players with the same Overall rating, players with a higher Potential have a higher Wage (above average) and players with a lower Potential have a lower Wage (below average).

Remember the definition of Potential? It decides what the player’s Overall will be in a few seasons. If a player is expected to have a higher Overall in a few seasons (i.e. become better), his club will want to retain him and will pay him more than players who might not improve over the next few seasons. This is also the reason why players with the same Overall have different Values. If a player is expected to improve over the next few seasons, clubs will be willing to pay more to buy him than they would for players who might not improve.
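One way to check this pattern in code is to compare each player's wage with the average wage of players who share the same Overall, then look at the Potential of those above and below that average. The sketch below uses assumed field names (`overall`, `potential`, `wage`) and invented rows; it illustrates the idea and is not the Tableau workflow used for the plots.

```python
from collections import defaultdict
from statistics import mean

# Toy rows invented purely for illustration; NOT values from the dataset.
players = [
    {"overall": 80, "potential": 88, "wage": 90},
    {"overall": 80, "potential": 80, "wage": 50},
    {"overall": 75, "potential": 84, "wage": 40},
    {"overall": 75, "potential": 75, "wage": 20},
]

def wage_gap_by_overall(rows):
    """Add wage_gap: the player's wage minus the average wage for the same overall."""
    wages = defaultdict(list)
    for row in rows:
        wages[row["overall"]].append(row["wage"])
    avg_wage = {overall: mean(w) for overall, w in wages.items()}
    return [dict(row, wage_gap=row["wage"] - avg_wage[row["overall"]]) for row in rows]

with_gap = wage_gap_by_overall(players)

# Potential of players paid above vs below the average for their overall.
above = [r["potential"] for r in with_gap if r["wage_gap"] > 0]
below = [r["potential"] for r in with_gap if r["wage_gap"] < 0]
print(mean(above), mean(below))
```

If the observation holds on the real data, the mean Potential of the above-average group should come out higher than that of the below-average group.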

3. Top Clubs and Countries

Fig 13. Top 15 clubs in terms of player Wages
Fig 14. Top 15 clubs in terms of player Value

The above graphs show the top 15 clubs in terms of Value and Wages. While Paris Saint-Germain has the highest-valued player, Real Madrid and FC Barcelona have higher Average Values. Real Madrid is the most valued team, with FC Barcelona a close second. While Real Madrid and FC Barcelona have the same Maximum Wage, the former beats the latter in terms of Average Wages.

Top clubs in terms of Overall

In terms of overall, the best club is FC Barcelona followed by Juventus.

Fig 15. Clubs ranked by the Average Age of players

The club with the highest average age is Clube Atletico Paranaense.

Fig 16. Nation vs Value

In terms of total Value, Spain has the highest ranking, as shown in the graph above.

Fig 15. Countries ranked based on Average Overall and Maximum Overall

The above two graphs show the top countries in terms of Average Overall and Maximum Overall. While Oman has the highest Average, it is not included in the list of countries with Maximum Overall. Similarly, while Portugal tops the countries in terms of Maximum Overall, it does not have the highest Average Overall.
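The difference between ranking by Average Overall and ranking by Maximum Overall can be sketched in a few lines of Python. The field names (`nation`, `overall`), the placeholder nation labels and the rows are all invented for illustration. The `count` column is an addition of this sketch, included because the number of players per country is a useful thing to check when reading an average.

```python
from collections import defaultdict
from statistics import mean

# Toy rows invented purely for illustration; NOT values from the dataset.
players = [
    {"nation": "A", "overall": 72},
    {"nation": "B", "overall": 94},
    {"nation": "B", "overall": 60},
    {"nation": "B", "overall": 56},
]

def overall_by_nation(rows):
    """Return {nation: {"avg": ..., "max": ..., "count": ...}} for the overall rating."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["nation"]].append(row["overall"])
    return {
        nation: {"avg": mean(vals), "max": max(vals), "count": len(vals)}
        for nation, vals in groups.items()
    }

stats = overall_by_nation(players)
top_by_avg = max(stats, key=lambda n: stats[n]["avg"])
top_by_max = max(stats, key=lambda n: stats[n]["max"])
print(top_by_avg, top_by_max)
```

In this toy data the two rankings disagree, which mirrors the situation in the two graphs: the country with the best single player is not necessarily the one with the best average.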

4. How various factors affect Overall

Fig 16. How Overall varies with Aggression and Interception
Fig 17. How Overall varies with Heading accuracy and Strength
Fig 18. How Overall varies with Reactions, Short Passing and Ball control
Fig 19. How Overall varies with Sliding and Standing tackle

As we can see in the above graphs, which show how Overall varies with other factors, beyond a certain point, even though the factors start to reduce, Overall doesn’t appear to change much. This shows that even if players lack in one skill they can make up for it with other skills.

5. Who is the best football player?

Football fans are often at war over this one question — Messi or Ronaldo. I would like to bring another player into this picture — Neymar. In this final topic, I will try to figure out who the Dumbledore of Football is.

Fig 20. Top 10 football players
Fig 21. Top players in each country

While the above graphs suggest that the best player is Ronaldo, with Messi a close second, we must not forget that Ronaldo is 35 years old and Messi is 33 years old. Neymar, in third place, is only 28 years old. The average retirement age in football is 36 years. In my opinion, regardless of whether or not Ronaldo and Messi retire, factoring in the effect of age, Neymar could take the crown in two or three years.

To conclude, while Ronaldo is currently the best player, he would soon be dethroned by Neymar (making him our future Dumbledore :p).

If you’ve made it to the end of this article, congratulations :D. I would like to mention that this article contains only a small fraction of the insights that can be drawn from the dataset. It is a summary of everything I learnt and understood. I hope this article was informative and, of course, has turned you into a football fanatic!

You can view all the graphs by clicking on the link below — https://public.tableau.com/profile/nandika.ramadurai#!/vizhome/DataByte-Football/Slidingandstandingtackle

Done by — Nandika Ramadurai, 2nd year, EEE, NIT Trichy.
