Friday 23 May 2014

Get More From Data - Event Review

Wednesday 21st May saw the first of our 'Get More From Data' events, held at Browns Courtrooms near Leicester Square.

The aim of the event was to get beyond the buzzwords often associated with the analytics industry and look at some practical applications of data and some of the tools that will be enabling this.

First up was Richard Lewis from Model Citizens, talking about the 'Statistical challenges of Big Data'.

Richard described how ever-increasing data volumes require a different approach from the standard method of passing through the whole dataset multiple times to calculate the statistics needed for modelling, and how it is possible to analyse and model data without the whole dataset needing to sit in the same location.
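To give a flavour of the idea, here is a minimal sketch of single-pass, mergeable summary statistics. This is not necessarily the approach Richard presented: it assumes the well-known Welford/Chan updating formulas, and the class name `RunningStats`, the helper `summarise` and the two small "sites" of data are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Count, mean and sum of squared deviations, updated one value at a time."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def add(self, x: float) -> None:
        # Welford's update: each value is seen once, no second pass needed
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Chan et al.'s formula: combine two summaries without the raw data
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        # Sample variance (n - 1 denominator)
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def summarise(chunk):
    """Summarise one chunk of data in a single pass."""
    stats = RunningStats()
    for x in chunk:
        stats.add(x)
    return stats


# Two chunks that never sit in the same place (made-up numbers)
site_a = summarise([2.0, 4.0, 4.0, 4.0])
site_b = summarise([5.0, 5.0, 7.0, 9.0])
combined = site_a.merge(site_b)
print(combined.n, combined.mean, combined.variance)
```

Only the three summary numbers per site need to travel, which is what makes this kind of calculation workable when the data is too big, or too spread out, to pass over repeatedly.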

The next speaker was Simon Field from Revolution Analytics, whose presentation followed on neatly from Richard's.  It described the continuing rise of R and how the fact that it is Open Source means that new techniques and set routines (packages) are created at a far greater rate than you'd get with a traditional software vendor.

The extra layer that Revolution Analytics brings is the scalability that is not available with standard R, with the ability to run using multiple cores along with enhanced versions of R packages.

After the break was a presentation by Dan Barnett from Analysis Marketing/Analysis Recruitment on the mindset required of a modern-day analyst.  The presentation focused more on the attitudes required than on skill-set.

Dan talked about how it was important to have an inquiring mind, to be able to think like a customer as it's people's behaviour that ultimately you're trying to understand and it's not just numbers on a page.  He also stressed the need to be questioning of any figures you get, to try and break them to see if any issues exist.

Dan also gave a few practical examples of how, with the increasing availability of Open Source tools (R, Tableau Public etc.) along with Open Data such as that available from Census/Government sources, it is easier than ever for an analyst to prove they truly understand data and analytics rather than just sticking some keywords on a CV.

Next up was Lee Witherell from Fuel UK (part of communications agency Engine).  Lee talked about the increasing number of touchpoints an organisation might have with a consumer, and how our lives are now dominated by 'looking at rectangles', from smartphones to tablets to TV.

Lee also discussed the trade-off consumers are faced with between getting a service (examples included an App created by BUPA) and providing some data in return, along with the rise of the 'Quantified Self' (see links at end of this blog).

The final presenter was Scott Lutz from Alteryx, whose presentation brought together a lot of the themes from the previous presentations.  Alteryx provides an easy-to-use workflow system that lets users integrate data, analyse it and then output to a range of destinations including Tableau/Qlikview, along with integration with R.  This means you're not left with a project split into silos where people have their own way of coding and documentation is often sketchy.

With the ever-increasing rise in Analytics, products like Alteryx are going to be vital in providing tools that people who are analytically minded, but not necessarily 'Data Scientists', can use.

We'd like to once again thank all those who attended and the speakers for coming along.  It was great to see how, in practice, tools like R and Alteryx are changing the analytics landscape.  If anyone is interested in seeing what Alteryx can do, they can click here to download a 14-Day trial.

Lee mentioned a range of sources in his presentation around visualisation and the 'Quantified Self' and has kindly provided us with those below:











Thursday 10 April 2014

Away Fans - How would you spend £200k?

I was thinking about writing a piece about away fan activity in the Premier League anyway but then earlier this week Southampton announced they were drastically reducing prices for their final away game of the season at Swansea.
Southampton announcing the discount on Twitter earlier in the week
Much is made on their site of it being a chance to ‘give back to fans’ but I’m a cynical guy and also know about the Premier League’s Away Fan Initiative.  It’s often referred to as the Away Fan’s Fund (see BBC article) although that’s slightly misleading in that 'Fund' suggests some sort of ‘charitable’ donation from the Premier League, when in reality it’s a plan for all 20 Premier League sides to commit £200k each per year of their revenue for this and the next 2 seasons (so £12m in total) on the Away Fan experience.

There are a couple of reasons for my hunch that Southampton's activity is an 'Oh Shit, we've got to spend this money' move rather than a planned approach.  The first is that coach travel goes on sale 6 days after the tickets, which seems odd.  The second is this forum piece I found, which details a response from the club earlier in the month to a Southampton fan:

I can advise that the away supporters initiative was created for clubs to use to improve the experience offered to visiting fans. This could be used either for our supporters travelling to other clubs, or supporters visiting us. 

The £200k was not a donation by the Premier League to all clubs in the league, this is a top end figure that they suggested should be put aside to improve this experience. The £200k is paid for entirely by the club. 

Currently SFC have put money into improving the away fans concourse areas and provided supporters with a family fun day as part of the Fulham match day experience. 
We are still looking into further options for this, when decisions are made they will be announced on our website. 

Knocking £30 off the ticket price, heavily subsidising coaches and offering a free meal means that this will cost Southampton around £60k in tickets alone for 2,000 fans and possibly closer to £100k overall depending on the cost of coaches.  If this gesture is above and beyond the £200k fund then I'm happy to stand corrected and not be so cynical in future.
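For anyone who wants to check my sums, the figures above can be laid out as a quick back-of-envelope calculation. The £100k overall figure is my own rough estimate rather than anything the club has published, so the per-fan cost of coaches and meals implied by it is an assumption too.

```python
# League-wide commitment: 20 clubs, £200k each, for three seasons
CLUBS = 20
SPEND_PER_CLUB = 200_000
SEASONS = 3
total_commitment = CLUBS * SPEND_PER_CLUB * SEASONS  # 12,000,000

# Southampton's Swansea offer: £30 off each of 2,000 tickets
TICKET_DISCOUNT = 30
TRAVELLING_FANS = 2_000
ticket_cost = TICKET_DISCOUNT * TRAVELLING_FANS  # 60,000

# Assumption: my rough £100k overall estimate including coaches and meals
OVERALL_ESTIMATE = 100_000
extras_per_fan = (OVERALL_ESTIMATE - ticket_cost) / TRAVELLING_FANS  # 20.0
share_of_annual_budget = OVERALL_ESTIMATE / SPEND_PER_CLUB  # 0.5

print(f"League total over three seasons: £{total_commitment:,}")
print(f"Ticket discount cost: £{ticket_cost:,}")
print(f"Implied coach/meal cost per fan: £{extras_per_fan:.2f}")
print(f"Share of one season's £200k: {share_of_annual_budget:.0%}")
```

On those numbers a single away game would account for about half of one season's allocation.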

The ambiguity of ‘Away Fan Experience’ has probably in part led to a wide variation in approaches from clubs: some have kept it simple with regular ticket discounts, others have taken a more proactive approach, and others seem to have sat on their hands.

The Football Supporters' Federation have played a large part in getting this initiative set up and have details on their website of what some clubs have said they will do, but I've also found a few examples from Fans Forums/Supporter Groups that highlight the vastly different attitudes to the initiative.  It's not meant to be a comprehensive list or an exercise in bashing certain teams, but it highlights that even something as seemingly straightforward as this ends up with a dozen different solutions.

Manchester United at their Fans Forum had this:

MB outlined the away fans’ initiative. Each club has set aside £200k. The Club could spend this money on its own fans travelling away, fans visiting Old Trafford or a combination of both. 
...
Kiosk vouchers for away grounds were discussed, as was the availability of transport for the disabled. But the most popular idea was a £5 discount on the cost of away tickets. The Club agreed to implement this and further investigate funding the remaining league matches that may not be covered (ie a £5 discount might just stretch to 16 games, assuming current allocations). 

This straightforward reduction in ticket price is the approach a number of clubs have taken, although some have given deeper discounts than others.  The picture below is from a letter from Arsenal to the 'Spirit of Shankly' supporters group:

Stoke have given free coach travel for the whole season and other clubs have gone for bigger discounts (or free/discounted travel) for specific matches.  As a Swansea fan, I know the club have made a lot of effort with regard to the initiative but were probably slow to trumpet their work: I saw far more coverage of Newcastle and Aston Villa offering reciprocal price deals, even though Swansea were the common link.

Aside from reciprocal deals, Swansea have tried to have some sort of 'Thank You' at every game, ranging from free food to a free programme to discounted travel and free scarves.  I appreciate the gestures and the effort made, but personally I'd rather the £3.50 (if full price is charged) that is being given to another club in return for a programme was used elsewhere as I'll discuss later.

Everton's approach to the initiative is quite different to a lot of other clubs (from a Shareholder's meeting with Robert Elstone, Everton's CEO):
We spoke for a while about this initiative and the different approaches that clubs had taken to investing the £200k per year that has been set aside by the clubs to enhance the away match experience. Mr Elstone reiterated what he had said in previous meetings that this is supposed to be about filling the away ends of grounds. He noted that the ticket price subsidy the Club had announced for our own fans accounted for about 25% of the total spend.
Additionally the Club would soon announce a new role of ‘Away Fan Ambassador’ who would be available 8am through 8pm on match days to support the needs of away fans including providing live updates (presumably through social media platforms) of things such as traffic and weather updates.
He is though really frustrated by the actions of what he called the ‘less proactive clubs’ who’d used the whole £200k to simply knock a few pounds off tickets for their own fans, especially those clubs who have a 100% away following anyway and so their actions are very unlikely to increase attendances.
He has (or will) asked the Premier League to be more specific about their expectation for the use of these funds as the scheme continues in 2014/15 and 2015/16.
This from a Newcastle Fans Forum in late Feb was also interesting:
TC(Fans Rep): "What is the Away Fans Fund being spent on?"

LM(Club Rep) explained that the Club has to disclose its spending in this area to the PL and will be doing so shortly. LM will also be attending a meeting with the PL and club supporter liaison representatives next week where this subject will be discussed and ideas shared.

The board stated that it does not agree with the concept of subsidising away match tickets as this simply means it has to hand money over the home club, which doesn't discourage it from setting fair prices. Instead, the Club has pursued reciprocal pricing deals with other clubs but that this still represents a loss of revenue for those participating, which is offset against the Away Fans Fund.

TC agreed with the principal of reciprocal pricing and thought it was a good idea.

The Club disclosed that a significant amount had already been spent on the visitors section at St. James' Park this season, with the Away Fans Fund designed to be spent on the clubs' own fans who travel and the designated away end in their stadium.

The Club is funding away travel for the Newcastle United Disabled Supporters Association (NUDSA) to Hull City next weekend, with tickets, travel, food and stewarding all provided.

Gareth Beard explained that NUDSA was unable to travel in large numbers to other PL away fixtures due to the lack of available space for disabled supporters.

The Club asked supporters to continue sending in ideas. The Club also noted guidance from Fans Forum members that discounted travel was not universally popular due to supporters who travel from different areas and on other forms of transport not standing to benefit.

Since the above, Newcastle have set up £10 discounts for trips to Stoke and Arsenal; possibly, like Southampton, they have realised they need to get the £200k spent.

Approaches by club therefore range from basic discounts to 'fan experience' activities to using some of the money to refurbish a club's away end.  Fans who support Everton away on a regular basis might argue that they'd rather have had the extra £50-£60 back in reduced ticket costs, but at least Everton appear to have a coherent strategy for what they are doing with the money rather than blowing it all in one fell swoop.

The moral of the story for me is that if you give clubs too much room to interpret the reasoning behind the initiative, some of them will look to tweak it to their own advantage.  Personally I'd recommend a simple plan where an extra category of tickets is set up, with prices for 16-21 year olds set half way between Adult and Child.  So for Swansea, for example, where Adult is £35 (for most games) and Child £17.50, introduce a £26.25 bracket.

Obviously not everyone under 21 is poor and everyone over 21 rich, but this seems to me to be a simple way of encouraging the kind of people you want to keep coming to matches for the good of the game.

Other Posts:
Match Predictions: Are you Smarter than Lawro 
--
Dan Barnett

Director of Analytics
Analysis Marketing Ltd


Twitter: @analysismktg 

Friday 4 April 2014

Football: Big Data and Small Data problems

Last week I attended the Sports Analytics Innovation Summit held at the Emirates which had a range of speakers talking about the use of data in a variety of sports and different areas such as performance, psychology and fitness. 

In his review Sky Sports' Adam Bate pretty much hits the nail on the head in that you need to actually apply the data not just collect it for the sake of it. There's no doubt that the use (or at least collection) of data in football is becoming increasingly prevalent but it's the 'So What' factor that's critical. 

I liked the story from a few months ago of Forest Green, a Conference club, ditching Prozone, not least for the use of the word 'malarkey' in the local newspaper headline (I imagine them outside the ground with a 'Down with this sort of thing' placard against the use of modern technology in sport).
Top 2 results for 'Forest Green Prozone' on Google: a 'professional step' in Feb '13, but binned by the new boss in December after manager Dave Hockaday left 'by mutual consent' in October.
Some of the quotes by the new manager Ady Pennock make him look like quite a traditionalist:

"I am a great believer in what I see and my eyes don’t lie, so I don’t need a bit of paper...The most important stat is the scoreline and I don’t want Prozone for the sake of having it"
It might seem a bit backward but I'd rather someone had the courage of their convictions than spent money on a product that gathers dust just because 'it's what the elite clubs do'.

Clubs are at the stage now where they are facing both 'Big Data' and 'Small Data' problems and it's how they deal with these that will determine the level of advantage they get over their peers.  

Analytics is just another enabler like better training facilities, diet, sleep patterns etc.  There's no magic solution, but small gains can make big differences to final outcomes.

Big Data Problems
As a Data Analyst, I probably hear or read the phrase 'Big Data' a dozen times a day (almost as many times as I have to watch a presentation that has a YouTube clip of Moneyball included) and a lot of the time it's used like its predecessor CRM (Customer Relationship Management) as a buzzword to try and sell you something you don't really need or to make something that's relatively mundane sound a bit more interesting.

Ultimately it comes down to the fact that it costs very little to store information and it has become increasingly easy to capture it, so there is the desire to capture as many things as possible, as frequently as possible, regardless of whether they have any real value.

The most obvious example of 'Big Data' in football would be Prozone, where each player (and the ball, I'm assuming) is tracked 10 times a second (some systems in sports such as the NBA track 25 times a second), so a 95-minute match including injury time gives around 1.3m records.  It's not small, but it's nothing compared to what a web company may store.
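The 1.3m figure is easy to reproduce. The sketch below assumes 22 players plus the ball (23 tracked objects) and one record per object per sample, which is my guess at how the data is structured rather than anything confirmed by Prozone.

```python
# Assumption: 22 players plus the ball, one record per object per sample
TRACKED_OBJECTS = 22 + 1
MATCH_MINUTES = 95  # 90 minutes plus injury time


def records_per_match(samples_per_second: int) -> int:
    """Number of tracking records generated in one match."""
    seconds = MATCH_MINUTES * 60
    return TRACKED_OBJECTS * samples_per_second * seconds


football_records = records_per_match(10)  # 1,311,000
nba_rate_records = records_per_match(25)  # 3,277,500, same match length at 25 samples/sec

print(f"At 10 samples/sec: {football_records:,} records")
print(f"At 25 samples/sec: {nba_rate_records:,} records")
```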

Even the most data-savvy manager is not going to want to wade through that much data, so it'll be the job of Performance Analysts (and Prozone themselves) to try and gain insight from it.  Naturally the first stop will be top-level metrics like top speed, distance run, number of sprints etc., but it's the ability to go beyond this and interrogate the data in more detail that'll make the difference.  That is probably where the new Forest Green Rovers manager is coming from: if you haven't got the resource to even scratch the surface of what the data could tell you, what's the point in having it?

Similarly for training data, you may have GPS data, heart rate, saliva, sleep diaries but you need to be able to go from a bunch of data to something that can change what you do with players.

Small Data Problems
Football also suffers from 'Small Data' problems in terms of small sample sizes both in terms of number of matches and number of players involved.  If one player scores 10 goals in a season and another 13 which one is the better one?  Even if you factor in things like expected goals (chance of any shot being scored so an effort from 6 yards is different to one from 40 yards), you're still going to be left with a fair amount of doubt as to which will perform better next season even if any estimate is far better than just guesswork.
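To put a rough number on that doubt, here is a minimal sketch of one standard way to compare two counts. It assumes goals behave like Poisson counts and that both players had the same playing time and chances, which is a big simplification. Under those assumptions, if the two players were equally good, each of the 23 goals would be equally likely to belong to either of them, so the higher scorer's tally follows a Binomial(23, 0.5) distribution. The function name is my own.

```python
from math import comb


def prob_at_least(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


goals_a, goals_b = 10, 13
total = goals_a + goals_b

# Chance that one named player gets 13 or more of the 23 goals
# if both are really equally good
one_sided = prob_at_least(goals_b, total)  # about 0.34

# Chance of a split at least as lopsided as 13-10 in either direction
two_sided = min(1.0, 2 * one_sided)  # about 0.68

print(f"One-sided: {one_sided:.3f}, two-sided: {two_sided:.3f}")
```

In other words a 13-10 split is entirely unremarkable between two equally good players, which is why a single season's goal tally tells you so little on its own.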

Where there is doubt, there is the overwhelming temptation to not even try and be scientific and just go on 'gut feel' which comes back to the '..my eyes don't lie' comment, even though it's incredibly difficult to be 100% objective.

One obvious issue is that of confirmation bias.  As a Swansea fan, a good example for me is Dwight Tiendalli, but more well-known examples would be Tom Cleverley or Martin Demichelis, where they are expected to be terrible, so every bad pass or missed tackle is seen as confirmation of this and any good play is conveniently ignored.  This isn't to say that people's opinions are necessarily wrong overall, just that in any given match, the presumption of failure is already present before kick-off.


There's been plenty of talk recently about over-playing players and injury and I had a look a couple of weeks ago at the link between playing time and hamstring injuries for a few high profile players, one of the charts looks at Mesut Ozil's playing minutes and injuries:
Ozil's playing minutes over the previous 7/14/21 days along with injury activity, was overwork after return from injury in Feb responsible for injury in March?
The problem here is that there is generally too little data (especially publicly) to have any real knowledge as to cause and effect (there's always the risk you're looking backwards for possible factors once someone is injured, ignoring the time they or others exhibit the same activity but didn't get injured). 

You may have a small pool of players who have started playing again after a relatively minor injury, but how many of them then follow the same playing schedule as Ozil, with a similar playing style in terms of distance run, sprints etc., and a similar physique?  It was interesting to hear some of the doctors presenting at the conference talk about pooling data (anonymously), which would improve the situation; this is already taking place at some levels within UEFA.
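For reference, the 7/14/21-day playing-minute totals used in the Ozil chart are simple rolling sums. The sketch below shows one way to calculate them; the appearance log is invented for illustration and is not any real player's record, and the function name is my own.

```python
from datetime import date, timedelta


def minutes_in_window(appearances: dict, as_of: date, days: int) -> int:
    """Total minutes played in the `days` days up to and including `as_of`."""
    start = as_of - timedelta(days=days - 1)
    return sum(mins for played_on, mins in appearances.items()
               if start <= played_on <= as_of)


# Invented appearance log: match date -> minutes played
appearances = {
    date(2014, 3, 1): 90,
    date(2014, 3, 4): 70,
    date(2014, 3, 8): 90,
    date(2014, 3, 15): 45,
}

as_of = date(2014, 3, 15)
workload = {days: minutes_in_window(appearances, as_of, days)
            for days in (7, 14, 21)}
print(workload)  # {7: 45, 14: 205, 21: 295}
```

Calculating the workload is the easy part; the hard part, as above, is having enough comparable cases to say what level of workload actually raises the risk.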

There's also the issue of short-termism.  It may well be that a particular strategy/approach is the best over a longer period of time (e.g., limiting a player's match time), but a lot of the time anything more than a fortnight away might be classified as 'long-term planning'.

If you imagine Wenger deciding whether or not to start Ozil against Bayern on the Tuesday after playing him for 90 minutes against Everton the previous Saturday: at what level of likelihood of injury would he decide not to play him? 5%, 10%, 20%, 50%? And when he does get injured, is that particular instance bad luck or bad planning?

Just buying a piece of software won't solve your problems, but just as surely it's impossible for any one person to be able to collate, retain, process and analyse all information that may be useful in creating better performance.

Analytics is taken most seriously at the elite clubs but I'd argue that the greatest incremental benefit would be for Championship (or ambitious League 1) clubs: there's enough money there for it to be worthwhile and you're also more likely to be doing something different to your peers.

Overall, clubs are faced with the dual issues of having 'too much' data in some instances and 'not enough' in others. This is where the skill of an analyst comes in who can process the data, find the insights and present back in a way that actions are actually taken off the back of the data.  They'll also know the difference between what's interesting, what's actually important and what is just noise.

For me a performance analyst and a statistician are needed to work together to combine technical/programming/statistical skills needed to work the data, in tandem with someone who will be more closely involved in the application of any findings (if you can find someone who can do both, fantastic, make sure they never leave).

I'm naturally biased given my background, but the obvious solution for me is for clubs to bring in Data Analysts for the off-field work (Season Ticket Analysis, Club Shop, Social Media Analysis) but to free up some of their time to look at on-field data.  This way you have someone 'paying their way' even before they get to the football data.  I'd argue they'd probably contribute a greater amount to the bottom line if working fully on football data but that may be too much of a leap of faith for some clubs, at least for now.  
