Big Data in Football: When Too Much Data Becomes Misleading
Over the last 10 years, there has been a common trend across all sports to gather as much data as possible in an attempt to gain an advantage over opponents. This trend has been driven primarily by one belief:
“The more we know, the better prepared we will be.”
Data capture comes in many forms, but the most commonly used methods include heart rate monitors and GPS tracking and, at very high-end clubs, VO2 max testing and daily urine/blood samples.
Unfortunately, acquiring more data about your players also means you need an accurate method for interpreting that data. Based on those interpretations, a coach must then decide what actions to take in order to positively influence the team. But as Malcolm Gladwell states, having more data doesn’t mean that we are making the correct decision. We might be more confident in that decision, yet still be using the data in the wrong context.
In order to make decisions that will benefit the player and team, the key question becomes not how much data we can collect, but rather which data is relevant and how heavily we should rely on it.
As a coach, I always try to make an observation with my eyes first, and then look to see if the data supports this. I also try to ensure that I am not getting bogged down with data that may not be relevant. For example, when using GPS Tracking that indicates a player ran 6 miles in a game, one might believe that this player “worked harder” than his teammates. One could also deduce that all footballers need to be able to run 6 miles.
Both of those deductions would be incorrect, because we are observing a football match, not a marathon. Running in football is much less about the distances covered, and more about the why, when, and how. A player who is smart with his movement and decision-making might cover a lot less distance than his teammates, and actually have a much more positive influence on the game.
When I built SoccerPulse, there were two subjective measures that I wanted from my players when preparing for a match or planning a training session:
How do my players think they are feeling?
How intense did they perceive the training session to be?
When preparing to run a football conditioning session with my players, I always use the game itself as the starting point. Then, I look at how my players said they were feeling on that day, and use it as the backdrop for the session.
Figure: Change from Week 1 to Week 2 with overload games of 7v7 (5.5 min work, 2 min rest, x4)
By using the game as the starting point, I want to know at what moment the tempo begins to slow, and then use that zero-point to eventually push my players safely beyond that threshold. When players are feeling fatigued, that zero-point typically comes sooner than it would if they were feeling fresh. When reflecting back on the session, I want to know how much the players felt they had to exert themselves to maintain that target tempo.
If I am doing my job properly (trying to develop my players to be as fit as possible), then as the season progresses my players should be able to play at a higher tempo, for longer, while feeling as if they are exerting themselves less. As we are able to hit those tempo thresholds safely, I can overload them more the following week by increasing the amount of work or decreasing the amount of rest, depending on my objective for the session.
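As a rough illustration of that week-to-week progression, here is a minimal Python sketch. It is not how SoccerPulse works; the class and function names, the half-minute increase, and the 30-second floor on rest are all assumptions made for the example. The only numbers taken from the session above are the four 7v7 games of 5.5 minutes of work with 2 minutes of rest.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IntervalBlock:
    """One conditioning block, e.g. 7v7 games played as reps x (work, rest)."""
    reps: int
    work_min: float   # minutes of play per game
    rest_min: float   # minutes of rest between games

    @property
    def total_work_min(self) -> float:
        return self.reps * self.work_min

    @property
    def work_to_rest(self) -> float:
        return self.work_min / self.rest_min


def overload(block, tempo_held, add_work_min=0.0, cut_rest_min=0.0, min_rest_min=0.5):
    """Return next week's block.

    The block only progresses if the players held the target tempo this week.
    min_rest_min is an arbitrary floor chosen for this sketch.
    """
    if not tempo_held:
        return block  # repeat the same block rather than pushing further
    return replace(
        block,
        work_min=block.work_min + add_work_min,
        rest_min=max(block.rest_min - cut_rest_min, min_rest_min),
    )


# Hypothetical example: the squad held the tempo, so add 30 seconds per game.
this_week = IntervalBlock(reps=4, work_min=5.5, rest_min=2.0)
next_week = overload(this_week, tempo_held=True, add_work_min=0.5)
print(next_week.total_work_min, next_week.work_to_rest)
```

Whether to add work or cut rest stays a coaching decision based on the objective for the session; the function simply applies whichever change is passed in.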
The good news about daily questionnaires and RPE (rating of perceived exertion) is that they have been found to be very accurate without costing a fortune. I’ve found the greatest benefit comes when players indicate they haven’t been sleeping well. This tells me that their bodies are likely taking longer to recover than those of players who are sleeping well, so I need to be careful how hard I push them. I then speak to the player about their sleeping habits, and we find ways to improve their recovery.
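To make the idea concrete, here is a small Python sketch of how daily questionnaire answers could be scanned for players who report poor sleep. The field names, the 1 to 5 and 1 to 10 scales, the threshold, and the player labels are all hypothetical and chosen only for illustration; they are not taken from SoccerPulse.

```python
from dataclasses import dataclass


@dataclass
class DailyCheckIn:
    player: str
    feeling: int      # hypothetical scale: 1 (very poor) to 5 (very good)
    sleep: int        # hypothetical scale: 1 (very poor) to 5 (very good)
    session_rpe: int  # perceived session intensity, hypothetical 1 to 10 scale


def poor_sleepers(entries, sleep_threshold=2):
    """Return the players whose sleep rating is at or below the threshold."""
    return [e.player for e in entries if e.sleep <= sleep_threshold]


# Made-up answers for three players.
today = [
    DailyCheckIn("Player A", feeling=4, sleep=4, session_rpe=6),
    DailyCheckIn("Player B", feeling=2, sleep=1, session_rpe=8),
    DailyCheckIn("Player C", feeling=3, sleep=3, session_rpe=7),
]

flagged = poor_sleepers(today)
print(flagged)
```

The list is only a prompt for a conversation with the player and a reason to be careful with that player's load; it does not decide anything by itself.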
I think the most important message when using data is that decisions should never be made based simply on a number. All these methods of data capture are tools for coaches to use. It’s important that the game itself always remains at the center of all decision making, and that any data captured around it acts only as an influence, not as the be-all and end-all.