Top 10 Challenges Facing Digital Analytics

Digital Analytics is becoming more and more core to a business’ strategy. Without a good Digital Analytics team and implementation in place, a business cannot measure its performance, set KPIs, test new functionality and spot opportunities for new product development.

After working with different analytics platforms and, more recently, running the Digital Analytics team at IPC Media, I want to highlight some of the challenges that companies and teams face on a daily basis, to help you make more informed decisions.

Finding the right analytics tool for the job

How do you decide which analytics platform to choose? If you want a traditional analytics product for free, then Google Analytics is the first port of call for most people – but there are limitations.

The free version is very fully featured and initially very simple to implement: just paste the code into your site and away you go. The main issues are that you only get 5 custom variables and that you are limited to 10 million hits per month. Anything above 10 million may not be collected.

This is generally fine for a lot of smaller web sites, but isn’t really scalable for larger organisations. For that, Google has a premium version of Google Analytics that massively increases the number of custom variables and hits.

Some limitations that I find with Google Analytics are:

  • Data sampling – when looking at segmented data and longer time frames, Google Analytics samples the data to keep the interface fast. This causes issues because the advertising and business teams want to see the raw numbers.
  • Unsampled data – getting the raw data is cumbersome. Also, the GA APIs can return sampled data as well. There is an integration with Google BigQuery, but again, this is not very straightforward.
  • Limited dashboards – the dashboards can be effective if kept simple, but most of the time you cannot do what you want to do. No customisation is possible and you can only add 12 reportlets to the dashboard.
  • Custom metrics – you can only use the metrics deemed important by Google. You cannot create your own metrics such as pages per user, time spent per user, etc.

I currently use Google Analytics Premium at work. We used to use Adobe Analytics, which is actually a superior product but is also considerably more expensive. It does not sample data, allows for more customisation and lets you create your own metrics. Google Analytics is definitely getting there with Universal Analytics.

I have not yet upgraded our sites to Universal Analytics, so I cannot really comment on it just yet.

There are also a number of new tools on the block such as MixPanel, ChartBeat and Parse.ly. These offer a different take on analytics – looking at tracking events rather than traditional page views (MixPanel) and real-time analytics (ChartBeat and Parse.ly).

In addition, whatever tool you decide to use, a good Tag Management system is really beneficial. A Tag Management system allows you to paste a single container tag into your templates and then serve any JavaScript (such as web analytics tags) through it via an interface.

It means that whenever you want to make a change you do not need to change code on the site. A lot of these Tag Management systems also allow you to listen out for certain events to fire before tracking is collected – so a click on a button, a specific page load, signing up to a newsletter etc.

Examples of tag management systems are Google Tag Manager, Tealium and Satellite.
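As a rough illustration of the "listen for an event" idea: with Google Tag Manager, the page pushes a named event onto a `dataLayer` array and a rule configured in the interface decides which tags fire. The sketch below uses a plain array as a stand-in for the browser's `window.dataLayer`; the helper name `trackEvent` and the event names are made up for this example.

```javascript
// Stand-in for the global array a tag manager container reads from.
// In a browser this would normally be window.dataLayer.
const dataLayer = [];

// Hypothetical helper: record a named event plus any extra detail.
function trackEvent(name, details = {}) {
  const entry = { event: name, ...details };
  dataLayer.push(entry);
  return entry;
}

// Example events (names are illustrative only).
trackEvent("newsletter_signup", { formLocation: "footer" });
trackEvent("button_click", { buttonId: "subscribe" });

console.log(dataLayer);
```

The point of the design is that the site code only announces what happened; which analytics tags react to it is decided in the tag manager, not in the templates.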

Hiring good people

Having the right tools is one thing. The next most important thing is to ensure that you have the right people in place to be able to help set a sound analytics strategy, implement the code, analyse the data and make recommendations.

Ideally, you would also ensure that the culture within the business focuses on this data and uses it to help direct decision making. What’s the point in having great data if it’s not being used?

I am finding that you get a real mix of people through job sites like LinkedIn and TotalJobs. I would recommend trying to build a network of analysts through meet ups like Web Analytics Wednesdays and MeasureCamp.

Hiring a good analyst is key here. If you are just starting out on your analytics journey, I cannot recommend strongly enough that you hire a good analyst with at least 3-5 years’ experience. I would also suggest that you get someone who has implementation experience, as this will really help you to have the right tracking in place for the metrics that are important to your business.

I would suggest either trying to headhunt or use an agency for this initial hire.

There are a lot of very good contractors out there, but they can be relatively expensive. In my experience, good permanent analysts are not actively searching for roles on LinkedIn and other job sites. They work with agencies and you need to work hard to find them.

The only other route would be to work with an analytics agency. I have only worked with two – Acceleration for an Adobe SiteCatalyst implementation project and Periscopix for our move to Google Analytics Premium and ongoing consultancy. I would recommend both to anyone asking.

Training

Having the right people in place is obviously important, but so too is keeping their skills in line with current trends and new technologies. It is just as important to train up your existing workforce to pull basic reports themselves – lightening the load on your Digital Analytics team so that they can focus on the bigger stuff.

It can be quite difficult to find decent training programmes and they can be quite expensive.

Google Analytics offers a well structured training programme which you can find here. There are also plenty of videos on the Google Analytics YouTube Channel which are extremely informative, although some of the voice-overs can be quite annoying!

In terms of paid for courses, we have recently completed a review and found the following:

  • http://www.marketmotive.com/ – Market Motive has some great courses to keep your team up to date in the industry
  • http://www.digitalanalyticsassociation.org/ – This is the official Digital Analytics Association, which runs courses with the University of British Columbia. These are longer-term courses and a great way to become a certified analyst

Tracking cookies vs tracking people

As most analytics platforms base their data on cookies, you need to understand that you are not tracking people. You are tracking a cookie on a device. This in itself leads to individual users being counted more than once. A user could visit your site on their laptop, their mobile phone and their tablet. This one person would be tracked with three separate cookies, and therefore as three unique users.

In addition, if a user deletes their cookies or uses Private Browsing then they will be tracked as a new user the next time they visit your site. This leads to further duplication and inflation of your user count. More on this in the next section.
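To make the inflation concrete, here is a small sketch with an invented hit log. The `personId` field is something an analytics tool never actually sees; it is included only to show the gap between what gets reported (distinct cookies) and reality (distinct people). All IDs are made up.

```javascript
// Hypothetical hit log. personId is NOT available to a real analytics tool.
const hits = [
  { personId: "anna", cookieId: "laptop-111" },
  { personId: "anna", cookieId: "phone-222" },
  { personId: "anna", cookieId: "tablet-333" },
  { personId: "ben", cookieId: "laptop-444" },
  { personId: "ben", cookieId: "laptop-555" }, // same laptop, after clearing cookies
];

// Count the distinct values of one field across all rows.
const countDistinct = (rows, key) => new Set(rows.map((row) => row[key])).size;

const reportedUsers = countDistinct(hits, "cookieId"); // what the tool reports
const actualPeople = countDistinct(hits, "personId"); // what really happened

console.log(`Reported unique users: ${reportedUsers}, actual people: ${actualPeople}`);
```

In this invented log two people show up as five unique users.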

Cookie deletion/Private Browsing

I have done some research and it’s hard to find any definitive numbers on how many users delete their cookies. I found this post from April 2011 which suggests that:

  • 30% of users delete cookies according to comments by Jules Polonetsky on a panel at the W3C Workshop on Web Tracking and Privacy
  • 40% of users delete or block according to OpenTracker.net
  • 36+% of Australian Internet users delete their third-party cookies in a month according to comScore report (2011)
  • 76% of Safari users block third party cookies according to Gibson Research. (Note: This is super-high because Safari is the only browser that blocks third-party cookies by default)

It’s hard to quantify, but there are a lot of people now using Private Browsing. This blog post from May 2012 suggests that 19% of survey respondents use private browsing. You can imagine that this number is now higher, two years on.

Private Browsing only stores cookies and browser history for as long as the window is open. As soon as the browser is closed, all cookies and data that would normally be stored by the browser are deleted.

One thing that Private Browsing doesn’t do is stop your internet activity being viewed by third parties – such as ISPs, governments or hackers – so be aware of this.

This means that any time a user comes to your web site in Private Browsing mode, they are tracked as a new user – and duplication occurs.

Campaign tracking

In terms of campaign tracking, I am referring to how you can use the UTM campaign parameters in Google Analytics to help to organise and define your marketing efforts. Google has a great tool called the URL Builder. This is really simple to use – you just paste the url you want to track, add the relevant parameters, click submit and it gives you the newly tracked url.

You can use this to track specific content that you are posting on social media, into email newsletters or for tracking PPC campaigns for example.

What you need to be very clear on is your strategy for naming conventions. I work for a very large publishing company and I see many different teams using the different campaign variables in different ways. Some people use capitalisation and others do not, and some mix up the source and the medium.

Also, sometimes they post the wrong tracking on the wrong web site – in other words, they post a link that they are attributing to Twitter but post it on Facebook.

This is bad because by using campaign tracking, you are overriding the defaults of the system.

This is the thing to remember. Let’s say that I wanted to promote a blog post from my site on LinkedIn. If I didn’t use campaign tracking, any traffic coming from the LinkedIn site would appear as a referral. I wouldn’t know if it was as a result of me posting the link or someone else. By using Campaign Tracking, I can attribute my posting to my efforts:

  • http://www.mike-dixon.com/?utm_source=linkedin&utm_medium=social&utm_campaign=blog-posting

You can see from the link above that I am declaring linkedin as the source, social as the medium and blog-posting as the campaign name.

If I were to post this same link on Facebook, Facebook would not be given attribution for the referral: the UTM parameters override the defaults and attribute the traffic to LinkedIn. So you need to be ever so careful when posting these links, and make sure that whatever values you use for the parameters follow a consistent naming convention.
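One way to keep naming consistent is to generate tagged links from a small helper rather than typing the parameters by hand. This is only a sketch: `buildCampaignUrl` and `normalise` are hypothetical names, and the convention chosen here (lower case, spaces turned into hyphens) is just one possible house style.

```javascript
// Normalise a campaign value: trim, lower-case, and turn spaces into hyphens.
function normalise(value) {
  return value.trim().toLowerCase().replace(/\s+/g, "-");
}

// Hypothetical helper that appends the three UTM parameters to a URL.
function buildCampaignUrl(baseUrl, { source, medium, campaign }) {
  const url = new URL(baseUrl);
  url.searchParams.set("utm_source", normalise(source));
  url.searchParams.set("utm_medium", normalise(medium));
  url.searchParams.set("utm_campaign", normalise(campaign));
  return url.toString();
}

const tagged = buildCampaignUrl("http://www.mike-dixon.com/", {
  source: "LinkedIn",
  medium: "Social",
  campaign: "Blog Posting",
});

console.log(tagged);
```

However the values are typed in, the helper always produces the same lower-case, hyphenated parameters, so two teams tagging the same campaign end up in the same row of the report.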

Also, one last thing on this point, do not use campaign tracking for internal links! Whenever a user clicks on a link with campaign tracking, a new visit is started, even if it is in the middle of an existing visit. Instead, look to use events to track internal marketing campaigns.

Tracking traffic from Apps

One of the issues with Web Analytics systems is how referrals work. I’ll try to keep it brief as it can get complicated. In short, web analytics systems look at header information sent by the web browser to work out whether there was a referrer and, if so, the url of that referral.

A referral occurs when a user clicks on a link from one web site to another in a web browser.

When a user clicks on a link from an app, the mobile device starts up a new browser session and sends the user to that web site. Your analytics tool will not see a referral in the headers because the user did not click a link in a web browser, they clicked the link in an app.

The only way around this is to use campaign tracking of any links that may appear on apps. The main situations where this could occur are social media and email apps.

As you have seen above, by using campaign tracking, you can override the defaults. When a link is clicked and there is no referrer, the default for analytics tools is to set the source as direct. If you used campaign tracking, this would be replaced with the attribution described in the parameters.
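The order of precedence can be sketched as a small function. This is a deliberately simplified model, not how any particular analytics tool is implemented: real tools apply further rules (search engines, ignored referrers and so on), and the function name `attributeVisit` and the returned labels are assumptions for illustration.

```javascript
// Simplified attribution: campaign parameters win, then the referrer,
// and with neither present the visit is treated as direct.
function attributeVisit(landingUrl, referrer) {
  const params = new URL(landingUrl).searchParams;
  if (params.has("utm_source")) {
    return {
      source: params.get("utm_source"),
      medium: params.get("utm_medium") || "(not set)",
    };
  }
  if (referrer) {
    return { source: new URL(referrer).hostname, medium: "referral" };
  }
  return { source: "(direct)", medium: "(none)" };
}

// A link tapped inside an app: no referrer, no campaign parameters.
const fromApp = attributeVisit("http://www.mike-dixon.com/", "");

// The same link with campaign tracking, still no referrer.
const taggedVisit = attributeVisit(
  "http://www.mike-dixon.com/?utm_source=twitter&utm_medium=social",
  ""
);

// A click from a web page in a browser, which does send a referrer.
const fromWeb = attributeVisit("http://www.mike-dixon.com/", "https://www.linkedin.com/feed/");

console.log(fromApp, taggedVisit, fromWeb);
```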

Obviously you cannot do this for content that is shared organically by your users, but as long as you do this for any content you share yourself, you at least know that you are attributing a large amount of traffic to the right source.

Implementation

IPC Media has well over 50 web sites, so having a standardised implementation across the board really helps with insight and analysis. I mentioned earlier in the post using a tag management system to serve your analytics tags. We have not yet done this, but we are working towards it – and it will make such a difference to us.

Just having the ability to centrally manage the implementation across all the web sites using an interface reduces the time to market of any changes that need to be made.

In addition, it takes a lot of pressure off the development team and removes the need to get work prioritised by the business in planning meetings. My analysts can make improvements and changes as and when required, and changes are easy to roll back.

Fortunately I was very clear on the requirements for the implementation of analytics at IPC Media, but I realise that others may not be. My recommendation here is to start with a minimum implementation and incrementally add to it.

Once I had my minimum page-level implementation in place, the plan was to improve it incrementally by adding new features and events such as newsletter sign-ups, competition entries etc.

Remember, your data is only as good as your implementation so I cannot stress enough how important it is to have someone who has experience of analytics implementation in your team looking at this on a regular basis.

Data Sampling in GA

If you didn’t know already, Google Analytics samples data for a number of reasons that you can read about in this article, How sampling works in Google Analytics.

Although Google Analytics Premium offers unsampled reporting, you have to specifically request each report, which takes time and is a manual process. The API also returns sampled data.

Google Analytics Premium integrates with Google BigQuery, which does allow you to look at hit-level data to get the unsampled data. The sign-up process is cumbersome, with approvals required from the Google Analytics Premium team at Google. In addition, all data is put into daily tables, so you need to create complex queries to get the data you need out of it.

What I would love to see is the regular Google Analytics API returning unsampled data if you are a Premium user. Or at least give me the choice of a slower interface to get the raw data!
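Part of the complexity with daily tables is simply knowing which tables a date range covers. The sketch below lists the table names for a range, assuming the export names its tables with a `ga_sessions_` prefix followed by the date as YYYYMMDD; check the naming in your own dataset, as the prefix is an assumption here and `dailyTableNames` is a hypothetical helper.

```javascript
// List one table name per day between two ISO dates (inclusive).
// The "ga_sessions_" prefix is an assumption about the export's naming.
function dailyTableNames(startDate, endDate, prefix = "ga_sessions_") {
  const names = [];
  const day = new Date(startDate + "T00:00:00Z");
  const last = new Date(endDate + "T00:00:00Z");
  while (day <= last) {
    // 2014-05-30T00:00:00.000Z -> 20140530
    const stamp = day.toISOString().slice(0, 10).replace(/-/g, "");
    names.push(prefix + stamp);
    day.setUTCDate(day.getUTCDate() + 1);
  }
  return names;
}

const tables = dailyTableNames("2014-05-30", "2014-06-02");
console.log(tables);
```

A query over a month of data has to reference every one of these tables, which is why the queries grow quickly.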

Big Data

Big Data is one of those terms that will either get you really excited or will daunt you. For me it is the latter. I really do think that there is a lot of value to implementing Big Data solutions but only if you need it. If you are well organised, then you can quite easily bring together multiple data sources relatively cheaply and without much difficulty.

Big Data works well when you are dealing with millions and millions of rows of raw data that is unstructured and that needs a lot of filtering. In essence, tools like Google Analytics are big data platforms – just specifically for web analytics.

I have seen many posts about how Big Data does not equal Big Insight – see this post from the FT entitled Big data: are we making a big mistake?

You can have all the systems in place, collecting every single scrap of data that your business is collecting, but without the right business questions and workers in place this data will just sit there and cost you money.

As with all these systems, I would suggest that you make sure you are extremely clear on what the KPIs are for your business and decide on the right analytics strategy. Don’t just collect data for data’s sake. Only collect the data you need to make the right decision.

Handy Regex Examples for Google Analytics

Regex, or Regular Expressions, can be a seriously powerful tool when creating custom segments, reports and filters. They allow you to go beyond the already powerful click-and-select functionality that comes as standard with Google Analytics.

There are plenty of resources available out there to help with this, but they can get quite technical. I hope to bring together the regex patterns that I find the most useful, and hopefully you will too.

If you have any use cases that you are having trouble with, please add them to the comments below and I will help you to find the best way to use regex to solve them.

I will be keeping this blog post up to date so please do bookmark and check back every once in a while.

Useful Regex

The Wildcard

If you want to see traffic to a specific section of your site using your url structure, then this regex is the one for you. This technique can be applied to anything where you want to match a word or phrase within a string (in this case a keyword in a url).

  • In your filter, ensure that you select Page for the URL, Regex for the condition and (.*)2013(.*) as the pattern

In this example, the filter will return any page that has “2013” in the url. You could use this for anything where you are trying to match a specific word within a string of letters and numbers.

The regex (.*) is essentially a wildcard that, when placed before and/or after a keyword allows you to say to Google Analytics, please give me everything where this word appears.
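You can try the pattern outside Google Analytics before using it in a filter. The sketch below uses JavaScript's regex engine as a stand-in (it is not the engine Google Analytics uses, although simple patterns like this behave the same way), with invented page paths.

```javascript
// Invented page paths to test against.
const pages = ["/news/2013/budget", "/archive-2013", "/news/2014/budget", "/about"];

const wildcard = /(.*)2013(.*)/; // the pattern from the filter
const bare = /2013/; // the keyword on its own

const withWildcard = pages.filter((page) => wildcard.test(page));
const withoutWildcard = pages.filter((page) => bare.test(page));

console.log(withWildcard);
console.log(withoutWildcard);
```

In JavaScript, at least, both patterns select the same two pages, because an unanchored regex already matches anywhere in the string; the wildcards just make the intent explicit.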

The OR Regex

There is a regex condition that allows you to find multiple keywords in the page title. Let’s say that you wanted to find all content that relates to Christmas. You could create a filter on a Page Title report that only pulls in data based on a certain set of words.

You do this by using a PIPE:

  • In your filter, ensure that you select Page Title, Regex as the condition and then your keywords – which should look like:
    • christmas|santa|noel|mince pie|holly|mistletoe

The filter will return any pages where the page title contains the keywords listed above. Remember not to add a PIPE at the end of your keyword list as this will result in returning all pages.
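The trailing PIPE problem is easy to demonstrate. In the sketch below (JavaScript's regex engine again standing in for Google Analytics, with invented page titles), the `i` flag makes the match case-insensitive, which is an assumption about how you would want the filter to behave.

```javascript
// Invented page titles.
const titles = [
  "Ten Christmas gift ideas",
  "How to bake a mince pie",
  "Summer holiday deals",
  "Where to buy mistletoe",
];

const festive = /christmas|santa|noel|mince pie|holly|mistletoe/i;
// Same list with a PIPE left on the end: the empty alternative matches anything.
const trailingPipe = /christmas|santa|noel|mince pie|holly|mistletoe|/i;

const festiveTitles = titles.filter((title) => festive.test(title));
const everything = titles.filter((title) => trailingPipe.test(title));

console.log(festiveTitles.length, everything.length);
```

The first pattern picks out the three festive titles; the second returns all four, because "nothing" after the last PIPE is a valid match for every title.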

The give me something specific Regex

Let’s say that you wanted to filter only traffic from Facebook and Twitter in your All Traffic Report in Google Analytics. To do this, you need to use Regex.

Go to the All Traffic report and select Source as your dimension rather than Source/Medium, then click on Advanced Filter. In the filter, you need to add the following to ensure that you only include traffic from Facebook and Twitter:

  • facebook|twitter|^t\.co$

By using the PIPE, you are ensuring that any of the keywords you added are included. The ^t\.co$ ensures that you are only including traffic from Twitter’s redirecting url, t.co. The backslash makes the dot match a literal full stop rather than any character.

If you hadn’t encased the keyword with the ^ and $ symbols, you would have seen sources that contained t.co – for example pinterest.com and other domains.

  • ^ means starting here
  • $ means ending here
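Here is the difference the anchors make, using invented source names and JavaScript's regex engine as a stand-in for Google Analytics. The dot in t.co is escaped with a backslash in this sketch so that it only matches a literal full stop.

```javascript
// Invented traffic sources.
const sources = ["facebook.com", "m.facebook.com", "twitter.com", "t.co", "pinterest.com", "google"];

const anchored = /facebook|twitter|^t\.co$/; // t.co must be the whole source
const unanchored = /facebook|twitter|t\.co/; // t.co may appear anywhere

const withAnchors = sources.filter((source) => anchored.test(source));
const withoutAnchors = sources.filter((source) => unanchored.test(source));

console.log(withAnchors);
console.log(withoutAnchors);
```

Without the anchors, pinterest.com sneaks in because "t.co" appears in the middle of it.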

What is Cohort Analytics?

Cohort Analytics in this context is the measurement of how often a user returns to a website over a given period of time.  The better you understand how well you retain your users, the better you will understand how to monetise them – which is what we are all chasing in digital publishing.

This is not the same as the standard vanity stats that you may find in Google Analytics or Adobe’s SiteCatalyst for return visits or visitor retention. It will give you a more detailed understanding.

Also, if done correctly, you can use cohort analysis to measure not only general users of your website but also registered users, logged-in users, and users that purchase or convert.

Cohort analytics is quite a new concept for digital publishing.  In the past, CPMs have been high enough for publishers to only have to worry about unique users, visits and pages. But with the advent of Google AdSense and Facebook Ads, advertisers can now target audiences directly, so publishers need to focus on other ways to measure and monetise audiences.

How to measure retention

Cohort Analytics is not something that is available out of the box with most standard web analytics tools.  Unfortunately it takes a bit of hacking to get it to work with Google Analytics – even then, it is quite limited as you can only measure over five units of time.  This will all become apparent shortly.

In addition, this explanation will only give you a basic overview for general tracking.

Google Analytics allows you to configure custom variables – of which there are five – and these variables can persist across a number of visits.

See Google’s documentation on custom variables here.

A custom variable should be set for each time period that you wish to track.  In this instance, with one variable per month, you can track user retention over a rolling 5 month period.

Here is some example code for month one using the first of five custom variables:

*** CODE ***

// Assumes pageTracker was created earlier by the standard ga.js tracking snippet.
pageTracker._setCustomVar(
      1,                   // Slot #1 is used for the first month.
      "Month",             // The name of the custom variable.  Required parameter.
      "January 2011",      // The value of "Month" for visits in this month.  Required parameter.
      1                    // Sets the scope to visitor level.
);
pageTracker._trackPageview();

*** END CODE ***

Once this code is on your site, it will need to change each month. The two parameters that change are the first and third arguments: the slot number increments and the month value moves on when month 2 begins.

*** CODE ***

pageTracker._setCustomVar(
      2,                   // Slot #2 is used for the second month.
      "Month",             // The name of the custom variable.  Required parameter.
      "February 2011",     // The value of "Month" for visits in this month.  Required parameter.
      1                    // Sets the scope to visitor level.
);
pageTracker._trackPageview();

*** END CODE ***
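Rather than editing the snippet by hand every month, the slot and value could be worked out from the date. This is only a sketch under my own assumptions: `monthlyCustomVar` is a hypothetical helper, `cohortStart` is the first month you begin tracking, the date passed in is never earlier than that, and slots are reused in a cycle once all five have been filled.

```javascript
const MONTH_NAMES = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Hypothetical helper: work out which slot and value to use for a given date.
function monthlyCustomVar(cohortStart, today) {
  const monthsElapsed =
    (today.getFullYear() - cohortStart.getFullYear()) * 12 +
    (today.getMonth() - cohortStart.getMonth());
  return {
    slot: (monthsElapsed % 5) + 1, // cycles through slots 1 to 5
    value: `${MONTH_NAMES[today.getMonth()]} ${today.getFullYear()}`,
  };
}

const start = new Date(2011, 0, 1); // January 2011 (months are zero-based)
const feb = monthlyCustomVar(start, new Date(2011, 1, 15));
const june = monthlyCustomVar(start, new Date(2011, 5, 15));

console.log(feb, june);
// On the page this would feed the tracking call, for example:
// pageTracker._setCustomVar(feb.slot, "Month", feb.value, 1);
```

February lands in slot 2, matching the hand-written example, and June wraps back round to slot 1.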

Once this has been implemented correctly and has gathered enough data, you will hopefully see some nice results. In addition, if you are clever with your naming and strategy you will be able to measure much more than this.

The results

Cohort Analysis

Hopefully you will see something like the image above after a period of time.  What the above table shows is how many of the users return in the following months from their first visit.

The reason for the Month 1 statistics all being 100% is that all users in Month 1 are new. In Month 2 it shows how many users from Month 1 returned in Month 2. Month 3 highlights how many users returned to the site in Month 3.
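The arithmetic behind one row of the table can be sketched as follows. The user IDs are invented, and in practice the counts would come out of your analytics tool rather than sets of IDs; `retention` is a hypothetical helper name.

```javascript
// Invented data: the set of user IDs seen in each month.
const visitorsByMonth = [
  new Set(["u1", "u2", "u3", "u4"]), // Month 1: the cohort
  new Set(["u1", "u2", "u5"]), // Month 2
  new Set(["u1", "u6"]), // Month 3
];

// Percentage of the Month 1 cohort seen again in each month.
function retention(months) {
  const cohort = months[0];
  return months.map((month) => {
    const returned = [...cohort].filter((id) => month.has(id)).length;
    return Math.round((returned / cohort.size) * 100);
  });
}

const percentages = retention(visitorsByMonth);
console.log(percentages);
```

Month 1 is always 100% because the cohort is compared with itself; here half of the cohort returns in Month 2 and a quarter in Month 3. Users u5 and u6 are new in later months, so they do not count towards this cohort.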

By fully understanding how long your users keep coming back to your site, you can really start to focus on some new metrics.

You will be able to work out the Lifetime Value (LTV) of your users which would help you to work out how much you may want to spend on marketing. By understanding this, you can ensure that your marketing stays profitable.
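One simple way to turn retention into a rough LTV figure is to multiply the revenue per active user per month by the sum of the retention rates. To be clear, this formula is my own simplifying assumption for illustration, and every number below is invented.

```javascript
// Invented inputs. Retention is the share of a cohort active in each month.
const retentionRates = [1, 0.5, 0.25];
const revenuePerActiveUserPence = 40; // revenue per active user per month
const costPerAcquisitionPence = 50; // marketing spend to gain one user

// Expected number of active months per user, then expected revenue per user.
const expectedActiveMonths = retentionRates.reduce((sum, rate) => sum + rate, 0);
const lifetimeValuePence = revenuePerActiveUserPence * expectedActiveMonths;

const profitable = lifetimeValuePence > costPerAcquisitionPence;

console.log(`LTV: ${lifetimeValuePence}p, profitable: ${profitable}`);
```

With these made-up figures each user is worth 70p over three months, so spending 50p to acquire one still leaves a margin.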

You can also start to focus your development attention on lengthening the lifetime value of your users.