Researchers are pioneering new statistical methods of analysing soccer
By Ken Early, Irish Times, Monday October 13, 2014
The age of data analysis in football began in Swindon on the 18th of March, 1950, some time around a quarter to four in the afternoon. In the crowd at Swindon v Bristol Rovers that day was an RAF accountant named Charles Reep, who was perplexed by a goalless, shapeless first half and decided to take notes on the second, to see if he could make some sort of sense of what was going on out on the field.
Reep logged 147 Swindon attacks and one Swindon goal in that second half, which enabled him to draw the first of many sweeping conclusions: more than 99 per cent of attacks in football came to nothing.
Over the following decades, Reep would assemble a statistical library of more than 2,000 matches. His method involved taking shorthand notes on a match and then spending up to 80 hours turning the notes into a match analysis. His report of the 1958 World Cup final took him months to write up and covered the back of a roll of wallpaper.
Sometimes an idea is too far ahead of its time. Reep understood that football could be represented numerically and he knew that those numbers could tell him something valuable about the game, but he lacked the informational tools to do the job. A football match contains so many measurable variables, so many numbers, that it’s impossible for a single obsessive to locate the signal in the noise. Even assuming you could somehow record a large amount of data, the numbers won’t be much use to anyone written on a roll of wallpaper.
Improvement by numbers
Reep was wrong in many of the conclusions he drew about how football should be played, but he was right that in the future, teams would use numbers to try to improve their play.
You could argue that the wider sporting world woke up to the potential of statistical analysis with the 2003 publication of Michael Lewis’s Moneyball, which documented how the Oakland A’s had used their superior understanding of statistics to outperform better-resourced rivals. The A’s decided to base their recruitment on certain statistical indicators which had been overlooked in baseball’s conventional wisdom.
The numbers in question were on-base percentage and slugging percentage. By recruiting players who scored highly in these areas, the A’s were able to gain an edge on their rivals, at least until the rivals woke up and started to copy what they were doing.
For years, people have wondered if data analysis of football can unearth similar shortcuts to success. Discussion of football is increasingly driven by statistics, but that doesn’t mean people understand which statistics are important and which are not.
The lack of understanding goes all the way to the top. In Rio Ferdinand’s recent autobiography, he claims that David Moyes told the Manchester United players: “I want us to have 600 passes today. Last week it was only 400.” For professional analysts, the story is something of a jaw-dropper.
Cologne research centre
I am sitting in a conference room at the German Sports University in Cologne with Dr Daniel Memmert, who is the head of the Institute of Cognitive and Team Sport Research.
When I tell him the Moyes story, he smiles. “It’s a problem in Germany also. There was a match in the first round of the Bundesliga. At the press conference, the coach, he lost 2-0, and he was like this,” Memmert picks up some papers and pores over them feigning bafflement. “‘I can’t understand this today. We had more ball possession . . . more free kicks . . . more corners . . . but we lost? I can’t understand this.’ And I thought, yeah, I can understand it. It had nothing to do with the result!”
If there is a vanguard in the science of game analysis, you can find it here in Cologne. Since 2005, scientists and researchers at Germany’s only sports university have been working directly with the German national team as “Team Köln”, informing them across areas such as scouting, tactical preparation, training science and so on. Germany has a strong tradition of research universities that co-operate closely with industry, ensuring that real-world practice is informed by the latest academic thinking. Team Köln is an example of the same phenomenon in the field of sport.
“We’re using a new kind of model called neural networks,” Memmert says. “It’s a very intelligent algorithm – a pattern-recognition tool. The basis of the method is position data.”
Depth and detail
At every match in Germany’s first and second divisions, stadium cameras record the movement of every player and the ball and instantly turn the match into a data set of a depth and detail that would have thrilled Reep.
“We’re talking about big data – about three million data points per soccer match,” Memmert says. “We can analyse this data set and try to detect tactical features, tactical patterns. It’s very, very quick. The traditional game analysis would take a team of students working for four to six hours. We can now make such an analysis in three seconds. We press the button, and then we have the tactical patterns. It makes the role of the human beings more fun. The pure analysis and detection, we can give that job to a PC. Human beings can concentrate on interpreting the tactical patterns.”
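A rough sanity check shows where a figure of that size could come from. The sketch below assumes the cameras log one position for each of the 22 players and the ball 25 times a second over 90 minutes. The 25-per-second sampling rate is an assumption for illustration only; the article does not state the rate the stadium systems use.

```python
# Back-of-envelope estimate of position samples in one match.
# ASSUMPTION: 25 samples per second per tracked object (not stated in the article).
TRACKED_OBJECTS = 22 + 1      # 22 players plus the ball
SAMPLES_PER_SECOND = 25       # hypothetical sampling rate
MATCH_SECONDS = 90 * 60       # regulation time only, no stoppage time


def position_samples(objects: int = TRACKED_OBJECTS,
                     rate: int = SAMPLES_PER_SECOND,
                     seconds: int = MATCH_SECONDS) -> int:
    """Return the number of position samples recorded for one match."""
    return objects * rate * seconds


if __name__ == "__main__":
    print(position_samples())  # 3105000
```

Under those assumptions the total lands at roughly three million, in line with the number Memmert quotes.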
Data analysis in football has so far failed to produce a “magic bullet” along the lines of the on-base percentage/slugging percentage productivity index made famous in Moneyball. What we have learned is that much of what we thought was important is in fact meaningless.
“You could generally say that statistics like ball possession, corners, fouls, and number of passes have no influence on the result,” Memmert explains. “Everybody talks about ball possession. Colleagues found in one paper that, if you discount the biggest teams like Barcelona or Bayern Munich, possession has maybe a negative correlation – that teams with more possession were less likely to win.”
The only simple statistical indicator so far that has proved to be correlated with the result is shots on goal: the more you shoot, the more likely you are to win. Memmert: “So we can say to coaches, you must maximise the number of times your players shoot. But of course, this is not a very surprising finding. The coach will say yes, that makes sense . . . but how?”
The problem with relying on simple indicators was illustrated a couple of years ago by Liverpool. After enduring a season in which they struggled to create chances, they went out and signed three players who had ranked highly on “chance creation”. Once the new players were in the Liverpool team, they stopped creating chances. The players’ ability to create chances was highly dependent on the special conditions of their previous teams, which were not replicated in the new environment at Liverpool.
Fluid sport
The reality is that football is such a fluid sport that you’re not going to find magic bullet numbers that hold the secret of whether a particular player is good or bad. Instead, game analysis is looking beyond simple quantitative measures towards a deeper qualitative understanding of how these numbers combine into patterns. “Patterns” is a favoured word of Memmert’s; he mentions it 11 times during our conversation.
“We have to look to other criteria, to qualitative criteria, to patterns. Patterns have a key role in detecting trends: what is important and what is not important to win soccer games. Our neural network can search for those patterns. It gives you situations, patterns which are not obvious. You can say to the neural network, come on, show me all the actions that start on the left wing, or the right wing, or through the centre, and end in the box. Or all short passes to midfield players when the opposite team acts in specific kind of pressing scenarios. Give me all the situations where the ball goes in the box, show me what happens to the defenders there, and so on.”
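The kind of query Memmert describes, "show me all the actions that start on the left wing and end in the box", can be sketched in a few lines. This is a hand-written rule filter and not the neural network the Cologne researchers use. The `Sequence` class, the 105m x 68m pitch, the coordinate convention and the sample data are all hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# ASSUMPTIONS (not from the article): a 105m x 68m pitch, the attacking team
# playing towards x = 105, with y = 0 on its right touchline and y = 68 on its left.
PITCH_LENGTH = 105.0
PITCH_WIDTH = 68.0
BOX_DEPTH = 16.5    # penalty area depth in metres
BOX_WIDTH = 40.32   # penalty area width in metres

Point = Tuple[float, float]


@dataclass
class Sequence:
    """Hypothetical record of one attack: a label and a non-empty ball path."""
    label: str
    ball_path: List[Point]


def start_channel(seq: Sequence) -> str:
    """Classify where the attack starts by splitting the pitch width into thirds."""
    y = seq.ball_path[0][1]
    if y < PITCH_WIDTH / 3:
        return "right"
    if y > 2 * PITCH_WIDTH / 3:
        return "left"
    return "centre"


def ends_in_box(seq: Sequence) -> bool:
    """True if the last ball position lies inside the opposition penalty area."""
    x, y = seq.ball_path[-1]
    deep_enough = x >= PITCH_LENGTH - BOX_DEPTH
    central_enough = abs(y - PITCH_WIDTH / 2) <= BOX_WIDTH / 2
    return deep_enough and central_enough


def find_attacks(sequences: List[Sequence], channel: str) -> List[Sequence]:
    """Return the attacks that start in the given channel and end in the box."""
    return [s for s in sequences if start_channel(s) == channel and ends_in_box(s)]


# Made-up example data, not real match data.
SAMPLE = [
    Sequence("A", [(40, 60), (70, 55), (95, 36)]),   # left wing, ends in box
    Sequence("B", [(50, 30), (80, 34), (100, 30)]),  # centre, ends in box
    Sequence("C", [(45, 62), (75, 64), (90, 66)]),   # left wing, ends wide of box
    Sequence("D", [(30, 5), (60, 10), (92, 40)]),    # right wing, ends in box
]

if __name__ == "__main__":
    print([s.label for s in find_attacks(SAMPLE, "left")])  # ['A']
```

A filter like this only finds patterns someone has already thought to describe. The appeal of the pattern-recognition approach, as Memmert presents it, is that it can also surface situations that are not obvious in advance.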
Understanding these patterns can help a team devise ways of playing to maximise their success. Memmert calls these qualitative strategies (others might call them tactics). “For instance, at the moment a very popular idea, which comes from Barcelona and moves through Hoffenheim, Dortmund, and now it’s going to Leverkusen – is to try to attack very early and make very quick turnovers.”
Two weeks ago, in the Champions League match against Benfica, Leverkusen gave an impressive display of the style Memmert is talking about, pressing defenders into mistakes in their own half and flooding forward in numbers to score on the counter-attack. One of their goals came when Stefan Kiessling followed up a shot that was spilled by Benfica’s keeper, Julio Cesar – he had done the same thing to allow Miroslav Klose to score Germany’s second goal in the 7-1 semi-final victory over Brazil.
Of course, it doesn’t matter how much information game analysts provide if the players don’t take the time to absorb it. You hear lots of stories about scouting DVDs going unwatched by players who think they have better things to do: this was supposedly a problem in the Ireland set-up during the Brian Kerr era.
Opposition videos
The Germans have taken a different approach. “I can tell you a story from the national team,” Memmert says. “In the middle of the national team’s campus in Brazil, there was a very large screen. And in this screen, the players know that there are all the videos of the next matches. It was not: ‘Here’s a CD, there are 30 sequences, have a look at that, then you know something of your opponent.’ That’s not the way. It’s the other way around. If the players are interested, they have to go to the screen. They look up their name – Lahm – press that, and then they have hundreds of sequences where they can learn something about their opposing player. And they used it all the time.”
“So that’s a kind of way round this – it’s the player’s job, it’s his responsibility to get the information. It’s not for the coach to give it to them, it’s the other way around.”
Access to cutting-edge game analysis is not the only way in which German footballers (and athletes in other team sports) are benefiting from the latest academic research. In a game that is speeding up all the time – the average time a player has possession of the ball is now scarcely more than a second – there is a new focus on how players can improve their cognitive skills: decision-making, breadth of attention and so on.
“We’re trying to develop new practice tools they can use to improve cognitive skills,” Memmert says. “For instance, decision-making tasks: they’re watching videos on a screen, then the video stops and they have to say what’s going to happen next. You can do this particularly well for goalies – where will the shot come next? It’s very famous in the area of badminton – you click where you think the other guy is going to hit the shuttle, where it’s going to land.”
Memmert points out that players can do this kind of cognitive training in the afternoons after training. “You can actually rest by doing this. I know clubs now where at 3pm after training, the player comes, the analysts have four to six sequences, and they’re talking about the sequences. That’s the first step. Then they do exercises to improve cognitive and decision-making skills.
“The players now are very open to this kind of work. They come to us and want to get better in their head – not from a psychological point of view, they are not ill! – but from a cognitive point of view. They’re very good with what they’re eating, they have their own fitness coaches, they do everything to be better than the next guy, so they want to improve their cognitive skills too. I’m very happy about this trend.”