In August 2014 ITV4’s Clarke Carlisle reflected, during a Spurs Europa League game, that in his experience “smaller players concede less fouls than bigger players”. He stated this with the typical smugness of someone who thinks they know better than most and knows that what they say is unlikely to be checked. Now, I’m not saying that I have a personal vendetta against the ex-Burnley defender, but I do seem to be taking it upon myself to hold these commentators more accountable for their so-called expert opinions.
So I started thinking about the logic of smaller players conceding fewer fouls than bigger players. On the face of it this seems reasonable; the referee might be more likely to interpret the strength of bigger players as an infringement of the rules and assume smaller players do not have the power to outmuscle and impede bigger players. But what do the numbers say?
There are many ways to explore this question but I chose to keep this simple and use Pearson product-moment correlation coefficients. For those unfamiliar with these, they are basically a measure of the linear association between two variables. Correlation does not imply causation; it only tells us that the two variables are related in some way. The strength of the association is represented by a score between -1 and +1, where 0 means no association, +1 means a perfect positive association and -1 means a perfect negative association. There is no set way to interpret the scores in between; however, I tend to subscribe to what my stats book says, which is that scores between 0.1 and 0.3 are ‘weak’, 0.4 – 0.6 are ‘moderate’ and 0.7+ are ‘strong’ associations. A positive relationship means that as one variable increases so does the other, whereas a negative relationship means that as one variable increases the other decreases. So if Clarke Carlisle is correct we should expect a moderate-to-strong positive correlation between Size and Fouls.
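For anyone who wants to see the mechanics, the coefficient described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool I used: the function names `pearson_r` and `strength` are made up for this post, and the way `strength` treats values that fall between the book's bands (for example 0.35 counts as ‘weak’) is my own assumption.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of the products of each pair's deviations from the means.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the two sums of squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return co_dev / sqrt(ss_x * ss_y)

def strength(r):
    """Label the size of r using the rough bands quoted above.

    Assumption: a value that falls in a gap between bands takes the lower label.
    """
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "negligible"

# Toy numbers only (not player data): a perfectly straight upward line.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(strength(0.2), strength(0.62), strength(0.78))
```

The sign of the result gives the direction of the relationship and the absolute value gives its strength, which is why `strength` ignores the sign.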
First, I needed some data to explore so I decided to use Tottenham Hotspur’s 2013/14 Premier League season as a case study. All the data were collected from the excellent people at WhoScored.com. Then, I needed to set a few conditions:
- I needed to decrease the chances of outliers affecting the results, so I only included players with over 10 Premier League appearances
- I needed to define Size so that I actually had something to measure, so I decided that Size would be the product of Height x Weight
- Goalkeepers were deemed to be different from outfield players due to their height and decreased chances of conceding fouls, so they were not included
With that all set I simply looked up the Total Fouls conceded for the season and calculated the new variable Size for all qualifying players (see Table 1 for results).
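The three conditions above can be sketched as a small filter plus a derived variable. Everything specific here is an assumption made for illustration: the `Player` fields, the position codes, the units (centimetres and kilograms) and the four placeholder rows are invented and are not the WhoScored.com data.

```python
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    position: str      # hypothetical codes: "GK", "DF", "MF", "FW"
    appearances: int   # Premier League appearances in the season
    height_cm: float   # assumed unit
    weight_kg: float   # assumed unit
    fouls: int         # Total Fouls conceded

MIN_APPEARANCES = 10   # "over 10" appearances, so strictly greater than 10

def qualifies(p):
    """Outfield players with more than MIN_APPEARANCES appearances."""
    return p.position != "GK" and p.appearances > MIN_APPEARANCES

def size(p):
    """Size as defined above: Height x Weight."""
    return p.height_cm * p.weight_kg

# Placeholder rows, invented purely to exercise the code.
squad = [
    Player("Keeper A", "GK", 30, 190, 85, 0),
    Player("Defender B", "DF", 25, 185, 80, 20),
    Player("Winger C", "FW", 8, 170, 65, 5),
    Player("Midfielder D", "MF", 12, 175, 70, 15),
]

rows = [(p.name, size(p), p.fouls) for p in squad if qualifies(p)]
for row in rows:
    print(row)
```

The choice of units does not matter for the correlation itself, since Pearson's coefficient is unchanged when a variable is multiplied by a positive constant.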
Taking the variables Size and Total Fouls I then produced a scatterplot to illustrate the relationship (see below). As you can see the association looks a bit ‘messy’. Perfect positive relationships look like a sharp and tight upward slope moving from left to right whereas this looks much looser and flatter. There is a marginal positive slope but players of a similar Size are clearly getting a vastly different number of Total Fouls over the season.
It is not unexpected, therefore, to find that the Pearson product-moment correlation coefficient between Size and Total Fouls is only 0.2. This is interpreted as a ‘Weak’ association, so sorry Clarke, but you are wrong: smaller players do not concede fewer fouls, at least based on the 2013/14 Spurs season data anyway.
So this got me thinking: if Size is only weakly correlated with conceding Fouls, then what is more strongly correlated? Players typically concede a foul when they interact with other players (hand ball being the exception; infringements like dissent and time wasting are classed as ‘misconduct’ rather than fouls). Players usually interact during tackles, aerial battles and take-ons, so including these data points should be meaningful. As certain playing positions could make it more or less likely to perform any one of these interactions, I decided it would make more sense to create a new variable called Total Actions. Also, as it was more about how often players attempted these actions, not how successful they were, I collated data on Tackles Attempted, Dribbles Attempted (a take-on) and Aerial Duels Attempted (both defensive and offensive duels). These were summed to derive the new variable Total Actions. The results of this can be seen in Table 2.
Taking the variables Total Actions and Total Fouls I then produced another scatterplot to illustrate the relationship. This can be seen below.
Hopefully you can instantly tell the difference between this scatterplot and the previous one. The line of best fit (linear) has a steeper gradient and, more importantly for the strength of the relationship, the points sit more tightly around it as they slope upwards from left to right. This all suggests that Total Actions and Total Fouls are more correlated than Size and Total Fouls, and true enough the correlation coefficient is 0.62 (‘Moderate’ strength). In addition to this, if we square the coefficient and convert it into a percentage (0.62 x 0.62 ≈ 0.38) we can state that 38% of the variability in Total Fouls per player can be accounted for solely by how many Actions they attempted. When you think about all the variables in play in a game of football (e.g. whether the referee is having a good or bad day, whether the referee is lenient or harsh, whether players are up for it or not) that’s quite a lot!
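The squaring step is easy to check, and running the same sum on the Size coefficient makes the gap clearer. A minimal sketch follows; the function name `variance_explained` is just a label I have chosen, and because both coefficients are quoted after rounding, the percentages are approximate.

```python
def variance_explained(r):
    """Share of variability accounted for by a simple linear fit: r squared."""
    return r * r

# The two coefficients reported in this post.
for label, r in [("Size", 0.2), ("Total Actions", 0.62)]:
    print(f"{label}: r = {r}, r squared = {variance_explained(r):.0%}")
# Size: r = 0.2, r squared = 4%
# Total Actions: r = 0.62, r squared = 38%
```

So on these numbers Size accounts for roughly 4% of the variability in Total Fouls, against roughly 38% for Total Actions.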
There are two players I have highlighted in the scatterplot who can be considered to be outliers (if I remove them the correlation increases to a ‘Strong’ 0.78). Sandro is on one end of the scale with only 77 Actions and a whopping 30 fouls. All Spurs fans can well believe this and have keen memories of him marauding in front of the back four, chopping down players when he actually decides to try the tackle. At the other end of the scale is Andros Townsend, a player who tries a huge number of Dribbles and does very little else. If we reduced his Dribbles to the team average (28) he’d have 77 Actions and he’d fit snugly where he should be towards the bottom end.
I also thought about other variables but none demonstrated a relationship as strong as Total Actions. Table 3 shows the correlation coefficients for some serious and not so serious variables.
Interestingly what this shows is that how often the Spurs players lost possession explained more of the variability in Total Fouls than Size did (0.28 > 0.2). So better luck next time Clarke Carlisle; smaller players don’t tend to concede fouls less often unless coincidentally they also tend to have fewer touches of the ball and attempt fewer player-on-player actions.
As a final point, I also gathered data on Total Cards (red and yellow counting equally as a card) and found that this correlated with moderate strength with Total Fouls (coefficient of 0.65). This stands to reason, as the more you foul the more chance you have of getting a card. Total Actions did not correlate as strongly with Total Cards (0.46) as it did with Total Fouls, but nevertheless if you wanted to bet on who was more likely to get a card in this weekend’s Hull v Spurs game you could do worse than looking at the top 3 for Total Actions so far this season. For Hull this is Ahmed Elmohamady (106), Mohamed Diame (101) and Curtis Davies (97). For Spurs this is Erik Lamela (87), Younès Kaboul (74) and Danny Rose (68). Place your bets now!
So what do you think? Should expert commentators be more accountable for their opinions?