Firstly, I hope everyone reading is safe and well. It’s been a crazy time personally. Working with various clubs has its time challenges at the best of times, but add in the fact that staff are predominantly just watching games and looking at recruitment and, frankly, I’ve never been busier outside of a transfer window! That said, I’ve been meaning to fire up the blog again, and last night a perfect piece caught my eye to do just that.
Here’s the blog in question: https://hbheadcoach.wordpress.com/2020/05/04/the-eye-test/amp/?__twitter_impression=true
Harry Brooks is a professional academy coach making his way in the game. My aim here isn’t to agree or disagree with Harry, nor is it to go into the detail of his player assessments. My aim is to discuss the whole concept of data vs the eye test, which, as I will outline, is not and should not be the question.
Data-driven scouting and recruitment is without a doubt becoming more commonplace in clubs. Many clubs have used some form of data in recruitment for a while now. Whether that be on-the-ball event stats, physical stats or even an InStat or SciSports index/score, people are using numbers (rightly or wrongly in some cases) to aid their talent identification and their assessments of players.
There has also been a rise in data usage and visibility on social media. With ever-increasing access to a particular platform’s data set, the volume on those social networks has surged, and the world has now become covered in scatter plots, bar graphs and radars. Whilst this isn’t directly the professional world, the point stands: football is now, more than ever, analysed to the nth degree through data.
For some time now there has been a debate about where the balance should lie. How much focus on data? How much do you trust the data? How much video scouting? How and when should we live scout? Is there too much focus on data?
I don’t feel that any of those are the right question. The question, in my view, is whether we are using data, video and live scouting to ask the right questions about the players we are assessing. In short, are we checking each stage of our process, confident that the information we have is an accurate representation of the player, and working towards a list of targets who, through all forms of analysis, both subjective and objective, meet our profile of the player? Anyone who has followed me for some time knows that my focus, on social media and now professionally in clubs, is data based. However, I have also often shared video analysis, and I watch every single player or team I talk about. In my view, data has always been better at asking questions than at answering them. Let’s give an example of what I mean, based on Harry’s piece.
Harry’s piece uses some pretty basic statistics. Having spoken to Harry, I know he is aware that these stats are without context and often baseless; they were simply an example. However, the point is that looking at something from a purely data perspective is wrong. Harry uses the following example:
“For example, Davinson Sanchez has averaged 1.9 tackles per game in the league this season. Virgil Van Dijk has averaged 0.8.”
Anyone with even a remote interest in football statistics knows that defenders and tackles are a sore subject, where more does not necessarily equal better. Harry makes that point himself here:
“Those figures are largely misconstrued due to a number of factors. Sanchez has far more tackles to make due to Spurs being far more open than Liverpool. Van Dijk is also a master in the art of jockeying; shepherding opponents away from danger to avoid needing to make the tackle.”
This is fair, and an assessment I agree with. It’s also a great example of where, at the moment, video can give an edge that data doesn’t. Sure, there are models out there that look to assess things like shot suppression (basically looking at the activity in a defensive channel, in VVD’s case his LCB side, and seeing who is limiting shots from that area), there are obviously possession-adjusted defensive metrics, and there are also some who try to look at defensive actions per attack faced. However, all of the above are based on the defender actually affecting the ball. Data isn’t yet strong enough to give us this type of assessment on all the players we might not have an intimate knowledge of.
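To make the possession-adjusting idea concrete, here is a minimal sketch. It assumes the simplest possible adjustment, scaling a per-game count to what it would be if the opponent had exactly 50% of the ball; real providers use their own, more sophisticated formulas. The tackle figures are the ones quoted from Harry’s piece, while the possession shares and the function name are made-up placeholders for illustration, not real Spurs or Liverpool numbers.

```python
def possession_adjusted(actions_per_game: float, opponent_possession_pct: float) -> float:
    """Scale a defensive count to a notional 50% opponent possession share.

    This linear scaling is an assumption for illustration, not any
    data provider's actual method.
    """
    if not 0 < opponent_possession_pct <= 100:
        raise ValueError("opponent possession must be in (0, 100]")
    return actions_per_game * 50.0 / opponent_possession_pct


# Tackles per game come from the quote above; the possession shares are
# hypothetical placeholders, NOT real club data.
defenders = {
    "Sanchez": {"tackles_per_game": 1.9, "opponent_possession_pct": 50.0},
    "Van Dijk": {"tackles_per_game": 0.8, "opponent_possession_pct": 40.0},
}

adjusted = {
    name: round(
        possession_adjusted(d["tackles_per_game"], d["opponent_possession_pct"]), 2
    )
    for name, d in defenders.items()
}

for name, value in adjusted.items():
    print(f"{name}: {value} possession-adjusted tackles per game")
```

With these placeholder shares the gap narrows but does not disappear, and the adjusted number still only counts on-the-ball actions. Jockeying an opponent away from danger never enters it.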
Harry’s next point, on tackle win % and the difference in somatotype between Harry Maguire and Davinson Sanchez, is one where we will have to agree to disagree. As stated before, I’m no fan of tackles as a measure for CBs, or even of tackle success as such, but to me this is where data can be used to remove bias or pre-conceived, “traditional” stereotyping. What Harry describes physically isn’t unfair, but it is his “assessment”. Physicality, size, mobility and frame are all important, vital aspects of football scouting. However, I’d want that subjective assessment of how the defender’s size or frame could affect them to be shown in the data, because if it’s going to have an effect, then it should be having an effect already. If it isn’t, perhaps it’s fair to concede that there may well be limitations in the player’s aesthetic but that they have actually found a way around them. There are countless examples of players who are not aesthetically pleasing to a scout’s eye but have the same output as someone who is. Remember, data isn’t made up; it records actions and outputs that actually happen on the pitch.
As stated, this isn’t going to be a piece that highlights a player or a stat. It’s more about process. Harry highlights points about Ibrahim Sangare’s passing style being very dynamic and progressive, and he’s right. In this instance, I do think data can do the job better than the eye test example Harry provides. With the right level of data it is very simple to show the difference in a midfielder’s passing style. However, much finer detail, such as how a player plays under pressure (yes, I hear you, StatsBomb) or how a player receives the ball, is something that only video or live scouting can show you. I think Harry’s point on aura is interesting. There’s a general feeling in the analytics world that personality traits do show up in the data, especially if you have data over a long period of time. However, character, aggressiveness and physical attributes such as speed, acceleration and, of course, the big engine to get up and down the pitch are not aspects that are going to be caught by data. This is exactly why you need to ensure you’re using data in the right way. Looking to ID some talent? Find the names in the top right of your scatter graph, then watch them. I work with clubs where we obviously use a much more detailed process than the example I’ve given here, but the basic point stands. Use the data to find players that fit the profile, then watch them, both to confirm the data is accurate and to see what the data isn’t showing you.
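As a rough illustration of the “top right of your scatter graph” step, the sketch below flags players who sit above the median on both axes. Everything in it is hypothetical: the player names, the two metrics and their values, and the choice of the median as the cut-off are assumptions for the example, not a club’s actual process.

```python
from statistics import median

# Hypothetical midfielders with two made-up metrics (not real data).
players = [
    {"name": "Player A", "progressive_passes_p90": 8.1, "pass_completion_pct": 88.0},
    {"name": "Player B", "progressive_passes_p90": 3.2, "pass_completion_pct": 91.0},
    {"name": "Player C", "progressive_passes_p90": 7.4, "pass_completion_pct": 79.0},
    {"name": "Player D", "progressive_passes_p90": 6.9, "pass_completion_pct": 86.0},
    {"name": "Player E", "progressive_passes_p90": 2.5, "pass_completion_pct": 74.0},
    {"name": "Player F", "progressive_passes_p90": 5.0, "pass_completion_pct": 83.0},
]


def top_right(rows, x_key, y_key):
    """Return names of players above the median on both metrics,
    i.e. the top-right quadrant of a scatter plot split at the medians."""
    x_cut = median(r[x_key] for r in rows)
    y_cut = median(r[y_key] for r in rows)
    return [r["name"] for r in rows if r[x_key] > x_cut and r[y_key] > y_cut]


# The output is a list of names to go and watch, not a verdict on the players.
watchlist = top_right(players, "progressive_passes_p90", "pass_completion_pct")
print(watchlist)
```

The filter only narrows down who to watch. Whether those players receive well, cope under pressure or have the engine for the role is still a question for video and live scouting.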
The point is ultimately this. Is data more prevalent? Yes. Do I think it should be even more prevalent? Yes. Do I think data is the “full” answer to IDing players? No. The answer lies somewhere in between. As an example, I’ve recently spotted 2 players. These players were not the ones I was looking at. My guy had great data, and I had found a game where he had been fouled a fair amount, to see if he had the character to continue to perform whilst being targeted. The 2 guys I spotted had average data: decent players, but no standouts. Yet in this 1 game they were fantastic, so I flagged them, looked at their data profiles and watched clips over numerous games. I liked what I saw and continue to monitor them now. There is a lesson there: data will not always show you good players. Unless you have access to advanced models, players on teams that are not performing well may also not shine in the data, yet they could be performing exceptionally well given the environment they are in.
Another category where data needs questioning is transfers where the player hasn’t performed well after the move, especially if his data previously was very strong. Is this new, below-average data actually reflective of the player? Has he changed system at his new club, with a big swing in style of play? Is there a very different managerial style? Has he not adapted to his environment, especially if in a new country? (See Joelinton!)
You can see my point from the above. Data can be such a great tool. It can highlight players performing at a high level who should be looked into, it can filter the noise from agent recommendations to ensure any players being recommended are fit for purpose, and it can also separate the value from the overspending. However, as with anything, on its own it’s baseless. The context that watching players, teams and games gives is invaluable, as is the internal knowledge of a player’s current environment.
Here’s my takeaway: coaches and managers should 100% have at least a basic understanding of how data can be used in football, its strengths and weaknesses, and how to apply it. I have grown quite bitter about the sentence “stats don’t win games”. a) Yes they do: a scoreline of 2-1 is 100% a statistic. b) No they don’t, but good players do, and good players tend to have good data, because all data does is show what happens on the pitch. Anyone disregarding data will be left behind, because there’s only 1 direction the game is headed and that’s clear.
Data analysts should also watch games and see how a coded match actually plays out against a real one. Does the data pass the eye test? Can you home in on an area where you feel the data is useful as an indicator but needs more information? Or is it even the case that, in isolation, that data point doesn’t mean much at all and is more noise than information?
The data and eye test relationship is a complex one, but by using each correctly, by questioning the results or the bias in the process, and by understanding the strengths and weaknesses of both, you can have a rounded recruitment process that can go some way to raising you above your competition. The question to ask isn’t whether we are using too much of one approach. The question to ask is whether we are asking the right questions and are able to provide the answers both subjectively and objectively.
Here’s one quote I use all the time:
“Data raises more questions than it answers.” That’s a very valuable thing indeed.