Showing posts with label Guardian. Show all posts

Wednesday, 23 January 2013

Comparison of Opta stats providers

It may just be self-selection based on the kind of things I read, but there seems to be an ever-growing interest in data in football. The subject appears to be moving from the niche into the mainstream, with increasing mentions in the press such as a recent article in The Guardian.

This is partly due to more and more sites making use of football data, in particular from Opta. In this post I'll look at a number of sites/apps that use Opta data and their comparative strengths and weaknesses.

When Swansea City won promotion to the Premier League in May 2011, I decided to set up the blog www.wearepremierleague.com to combine my interest in stats with that of the Swans. Generally speaking there is a paucity of (publicly available) data around activity in lower leagues, although credit must go to Ben Mayhew for his attempt to rectify this at Experimental 361. The level of detail publicly available for the top leagues in Europe, however, is still far beyond that in the Championship and below.

Guardian Chalkboards
When I started out, this was one of the few resources about and had the advantage of being free and web (not app) based. I won't go into too much detail about it as it's sadly no more (possibly ahead of its time?) but the thing I liked most about it was being able to visualise where on the pitch the activity took place.

The image below shows a Swansea goal against Blackburn where every Swansea player touched the ball in the move.
The addition of squad numbers to activity gives a level of detail not available anywhere else I've looked
Stats Zone
The demise of Guardian Chalkboards a couple of months into the season was the nudge I needed to get an iPod Touch to be able to use the Stats Zone app.

Stats Zone is great both for looking at the top level stats (e.g., Shots per Team) and for delving into the detail of a particular match (e.g., Long passes by a particular player).
Example of a Stats Zone Screen shot, in this case comparing the Aerial Duel activity of Peter Crouch with the Stoke team as a whole
Stats Zone is produced in conjunction with FourFourTwo magazine and their website includes blogs produced by Opta and Zonal Marking and others.

I combined my interest in football with that of data analysis in the creation of a Premier League Review dashboard, which is an interactive presentation taking a number of images from Stats Zone.

WhoScored.com
Whenever Swansea are linked with a particular player (usually from La Liga), WhoScored is the first site I go to as it has in-depth details for any player across the major European leagues:
WhoScored has details both on overall activity for that season as well as the ability to drill down into activity for a particular game
WhoScored also has a fairly comprehensive list of stats for any particular match, with the ability to order ascending/descending on these metrics for each player within a team (Long Balls, Chances Created etc.), and also blogs from a number of respected writers.

Squawka Sports
Squawka.com is to some extent a cross between Stats Zone and WhoScored in that you can look at activity of individual players across the season as a whole, but also look at specific types of actions graphically for a specific player in a particular match e.g., Canas' passes vs. Malaga
Squawka goes for a dashboard approach for presenting a lot of its data
EPLIndex.com
The level of detail available in the sites/app mentioned above will be enough for the majority of people but for those wanting even more, there is the pay-for site EPLIndex.com (£3.95 a month/£40 a year) which has even more detail.

Where WhoScored for example might have total passes and pass accuracy, EPL Index will break this down even further e.g., Passes/Accurate passes in Own Half/Attacking Half/Final Third:
EPL Index Screenshot - huge amount of data across numerous tabs
The level of detail of this data is pretty much the same as the release from Opta/Manchester City of the summary stats for the 2011/12 season, just not in a single spreadsheet.

One of the other advantages of EPL Index is that it has data for multiple seasons making comparisons such as one I did recently comparing Danny Graham and Kenwyne Jones possible:
Example of the kind of thing it's possible to collate using data supplied by EPL Index
As well as the option of subscribing to stats, for those who just want to read about stats and football the site has an ever-growing number of authors who use the data to write and publish their own analysis, arguably to a depth rarely seen anywhere else.

Relative Strengths and Weaknesses of each source

Stats Zone - Strengths:
  • Ability to visualise activity e.g., location of Shots/Interceptions etc.
  • Includes simple top level summaries e.g., total tackles made ordered by all players not split by team as is the case in the other sources
  • Ability to drill into data within the game e.g., compare first 62 minutes with last 28
  • Ability to create bespoke comparisons across matches/teams e.g., Chances made by John Walters in first 30 minutes vs. Aston Villa compared to Chances made by Stoke vs. West Brom  
Stats Zone - Weaknesses:
  • Apple devices only - no Android or Web version
  • Lacks ability to see multiple stats simultaneously e.g., Tackles/Passes/Shots per player
  • Doesn't have stats collated across a season

WhoScored - Strengths:
  • Includes data on all major European Leagues and Champions League
  • Easiest site to navigate around between stats for Team/Player/Match
  • Best for comparing statistics across teams, form/shots per game
WhoScored - Weaknesses:
  • Little visualisation of data - there is a nice image of shot areas but not the chalkboards such as those from Stats Zone/Squawka
  • No ability to analyse activity within a game e.g., compare 1st and 2nd half stats

Squawka - Strengths:
  • Has ability to easily track metrics for a team or player for a single match or across season
  • Includes heat maps of activity by player/team
  • Ability to drill down within part of the game (currently 5 minute intervals)
  • Lots of charts as well as raw data, multiple options for visualising the same data
Squawka - Weaknesses:
  • Doesn't have the same level of detail of stats readily available as other sites although only likely to bother the really in-depth user
  • Good to have charts but some could be better e.g., if a player has played in 15 of 22 league games, only stats for those 15 are shown. Personally I would like to see the blanks to know where over the season that player hasn't featured
  • Stats Zone plots chalkboards from the point of view of the team you're analysing, attacking from left to right; Squawka plots them with the Home team playing from left to right, which can be annoying when trying to compare areas of attack/passing

EPL Index - Strengths:
  • Most in-depth of any of the data sources
  • Has league data going back to 2008/9 season
  • Top-Stats feature gives ability to find best players across a range of metrics with ability to filter by those playing at least x minutes in a game or total minutes across a season (e.g., avoids problem of someone coming top in pass completion % with 1 pass from 1 attempt)
EPL Index - Weaknesses:
  • Pay-for site
  • No ability to analyse activity within a game
  • Generally best thought of as a source of data from which you create something yourself 
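
The minimum-minutes filter mentioned under Top-Stats is easy to sketch for anyone building their own tables from the data. The snippet below is only an illustration: the `PlayerPassing` class, the `top_by_completion` function and all the figures are made up for the example and are not EPL Index's actual fields or method.

```python
from dataclasses import dataclass


@dataclass
class PlayerPassing:
    # Hypothetical record; the field names are assumptions for this sketch
    name: str
    minutes: int
    passes_attempted: int
    passes_completed: int

    @property
    def completion_pct(self) -> float:
        # Guard against division by zero for players with no passes
        if self.passes_attempted == 0:
            return 0.0
        return 100.0 * self.passes_completed / self.passes_attempted


def top_by_completion(players, min_minutes=0):
    """Rank players by pass completion %, ignoring those below min_minutes."""
    eligible = [p for p in players if p.minutes >= min_minutes]
    return sorted(eligible, key=lambda p: p.completion_pct, reverse=True)


# Made-up figures, purely for illustration
players = [
    PlayerPassing("Late substitute", 2, 1, 1),    # 1 pass from 1 attempt
    PlayerPassing("Midfielder A", 900, 600, 540),
    PlayerPassing("Defender B", 1200, 500, 430),
]

unfiltered = top_by_completion(players)
filtered = top_by_completion(players, min_minutes=450)

print(unfiltered[0].name)  # the one-pass substitute tops the unfiltered list
print(filtered[0].name)    # with a minutes threshold a regular player leads
```

Without the threshold the player with 1 pass from 1 attempt comes top on 100%; with it, the ranking only reflects players with a meaningful sample.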

Turning Data into Insight
Although each of these companies is taking the same (or at least similar) data from Opta, it can be seen that they have each used it in different ways and are all still improving as time goes on. Eventually I'd imagine that one of these sites (or a newer entrant such as Sky) will bring all these parts together, possibly also including video for a complete experience.

As an example, a lot has been made recently about David de Gea pushing balls back into dangerous areas when he makes a save. The raw data will only tell you so much, but being able to view all his saves, or just the saves where there is a goal in the subsequent 10 seconds, would give an even more detailed picture.
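
If timestamped event data were available, picking out those saves would be a simple query. The sketch below assumes a hypothetical list of events with a time in seconds and a type label; the `Event` class, the `"save"` and `"goal_conceded"` labels and the sample timings are all invented for illustration and do not reflect Opta's real feed format.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical event record; not Opta's actual schema
    second: int  # seconds since kick-off
    kind: str    # "save" or "goal_conceded" (assumed labels)


def saves_followed_by_goal(events, window=10):
    """Return saves that are followed by a goal conceded within `window` seconds."""
    goal_times = [e.second for e in events if e.kind == "goal_conceded"]
    return [
        e for e in events
        if e.kind == "save"
        and any(0 < g - e.second <= window for g in goal_times)
    ]


# Invented timings for illustration only
events = [
    Event(300, "save"),
    Event(306, "goal_conceded"),   # 6 seconds after the first save
    Event(1500, "save"),           # no goal follows
    Event(2000, "save"),
    Event(2030, "goal_conceded"),  # 30 seconds later, outside the window
]

risky = saves_followed_by_goal(events)
print([e.second for e in risky])
```

The same list of timestamps is what a highlights tool would need in order to cut the matching clips.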

TV rights are far too precious to be given away but the ability to create your own highlights package (e.g., All chances created by Pablo Hernandez, with approx 10 seconds of footage per chance created) could take interactive entertainment to a new level.

Other Posts: Man City and Twitter, Twitter and Bookies - A Case Study, Premier League Weekly Review

Tuesday, 23 February 2010

Supermarket Pricing - Beware of Averages

On 13th Feb, the Guardian had a front page ‘special report’ on how Tesco and Asda had undertaken "...cynical and aggressive" price rises in the week before Christmas. The figures provided showed that for Tesco over 1,500 items had price increases between 9th and 22nd December with an average increase of 32p.

Tesco countered this by saying that they dropped the prices of 2,638 products with an average decrease of 54p over the same period (i.e., more products with prices dropped and a higher average price reduction).

Although both statements are factually correct they are both pretty much pointless, as an average (mean) is only really applicable if the data is not skewed. For the Tesco price rise data, the price rises are between 1p and £15.17, with a handful of items skewing the average:
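
To see how a handful of items can drag the mean away from the typical value, here is a minimal sketch using Python's standard `statistics` module. The list of price rises is invented for illustration; only the 1p minimum and £15.17 maximum come from the figures above, and the real distribution will differ.

```python
from statistics import mean, median

# Hypothetical price rises in pence. Only the smallest (1p) and largest
# (1517p, i.e. £15.17) values are taken from the post; the rest are made up.
rises_pence = [1, 2, 2, 3, 5, 5, 10, 20, 500, 1517]

mean_rise = mean(rises_pence)      # pulled upwards by the two large rises
median_rise = median(rises_pence)  # the "typical" rise in this sample

print(f"mean rise:   {mean_rise}p")
print(f"median rise: {median_rise}p")
```

In this made-up sample most rises are a few pence, yet the mean comes out at over £2 because of two expensive items, which is why the median (or the full distribution) is a fairer summary for skewed data.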



Also, looking at the top 10 price rises it can be seen that 3 of them are for the same product (FIFA 10) on 3 different formats which, it could be argued, gives an unfair distortion of the figures.

If you were looking to show a lower level of average price rise you could just focus on Groceries rather than including non-foods where the absolute rise will be higher due to the generally higher item price.

It’s also likely, however, that Tesco’s price cut figures have a similar level of skew.

The other point to make is that the average in this case would only be valid in terms of the impact to the customer if all the items were sold in the same quantities. A penny on a pint of milk is likely to yield more profit than would be lost by knocking a pound off an obscure item.

The Guardian yesterday (22nd Feb) issued some new data around the supermarkets’ use of 1p discounts to promote a feeling of a ‘price war’ between supermarkets. The tone is that supermarkets are being ‘sneaky’ as the majority of price cuts (70% for Tesco) in the period 16-23 Dec 2009 were for just 1p when the typical price rise is higher.

The Guardian implies that when Tesco (and Asda) cut prices they cut them by a little amount and when they put them up they put them up by a greater amount so the consumer pays more overall. This implication is only valid if the overall comparative volume of sales from discounted and increased items results in a higher overall basket.

If customers buy 10 times the volume of an item discounted by 1p as an item increased by 5p then overall customers are better off than they were before the price changes.
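
The arithmetic behind that example can be written out as a short sketch. The function name `net_change_pence` and the basket are hypothetical; the 1p cut, 5p rise and 10:1 volume ratio are the figures from the example above.

```python
def net_change_pence(changes):
    """Sum price change x units bought over a list of (change_pence, units) pairs."""
    return sum(change * units for change, units in changes)


# 10 units of an item cut by 1p, 1 unit of an item raised by 5p
basket = [(-1, 10), (5, 1)]

simple_average = sum(change for change, _ in basket) / len(basket)
net = net_change_pence(basket)

print(f"unweighted average change: {simple_average:+}p per item")
print(f"net effect on the basket:  {net:+}p")
```

The unweighted average suggests prices went up by 2p per item, while the volume-weighted total shows the customer is 5p better off, which is the gap between the headline figures and the real impact.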

With the huge range of products available it is easy for any supermarket to cherry-pick which prices it manipulates to look good in comparison to the competition. This should be taken for what it is, headline grabbing marketing, rather than each retailer cutting prices to the bone to give you, as the customer, the best deal possible.

All the noise in adverts about which retailer is the best probably cancels itself out, so you are left with the same impression of the main supermarkets that you had before. But none of the major supermarkets can risk not running similar campaigns for fear of customers believing the version of events its competitors run.

It would be interesting to see what this flurry of price cutting adverts has done to the extremes of the market – for example Aldi or Lidl at the lower end, Waitrose or M&S at the higher end.

Has the advertising of the big players made the low end seem irrelevant and the high end seem even more expensive by comparison? Or has the ‘price war’ lumped the main supermarkets in with the low cost brands and therefore created a distinction in quality for Waitrose and M&S?

The moral of the story is to make sure that you have access to the raw data behind any figures as, depending on which side of the fence you sit, you can ‘prove’ pretty much anything you want by defining the terms of the analysis and the metrics used.

Dan Barnett
Director of Analytics

blog@analysismarketing.com

LinkedIn: http://www.linkedin.com/in/danjbarnett