Avoid Google Analytics Data Sampling

This blog post describes an easy and free technique to avoid Google Analytics data sampling without having to pay for Google Analytics 360 (aka Premium). If you have Premium, it might save you from the two-step process described here.

Sampled data is wrong data. There’s a lot at stake if you are publishing wrong data!

Click the button that appears on your ribbon bar (in Windows) to get more detail. Sampling can happen even if you paid for Google Analytics 360 (aka Premium).

Avoid Google Analytics Data Sampling – Click by Click Instructions

The goal is to break large queries down into smaller ones, so that each one stays under the 500,000-session threshold.

NEXT Analytics does this internally on your behalf. All you need to do is request it by clicking the box.

To get to this box, follow these steps:

  • Click Custom (on the ribbon bar).
  • Click Google Analytics if it is a new query, or Edit if it is an existing query.
  • Click Avoid Data Sampling.

This option breaks the query up into day-by-day sections and then combines all the days in your range once it has all of them. In the business intelligence world, this is called “stitching” query results together.

If your query covers fewer than 500,000 sessions, it won’t get sampled. Because each query now covers a single day, the 500,000 limit applies to that one day rather than to the whole range. This avoids data sampling in many situations. Without this feature, you are querying a whole date range at once, and the sessions across that range can add up to more than 500,000, which is why you get sampling.
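To make the stitching idea concrete, here is a minimal Python sketch. It is not NEXT Analytics’ internal code, and it does not call the Google Analytics API: the `fake_query` function and its numbers are made up for illustration, standing in for whatever actually runs one query per day.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_ranges(start, end):
    """Yield one (day, day) range for every day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day, day
        day += timedelta(days=1)


def stitch(daily_results):
    """Combine per-day rows by summing the metric for each dimension value."""
    totals = defaultdict(int)
    for rows in daily_results:
        for dimension_value, sessions in rows:
            totals[dimension_value] += sessions
    return dict(totals)


def fake_query(start, end):
    # Hypothetical stand-in for a real one-day query; the numbers are invented.
    return [("organic", 100), ("direct", 50)]


# Three one-day queries instead of one three-day query.
ranges = list(daily_ranges(date(2017, 1, 30), date(2017, 2, 1)))
stitched = stitch(fake_query(start, end) for start, end in ranges)
print(stitched)
```

Each small query only has to stay under the threshold for its own day, and the stitched totals are built from those unsampled pieces.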

This technique has some limitations that we’ve seen people encounter.

  • If your dynamic segment has to evaluate 500,000+ sessions, it will still cause sampling, even if your finished report has only a few rows. This feature won’t work in such situations. Instead of using GA segments and filters, we advise that you download the data and use NEXT Analytics’ own post-download filters.
  • Some metrics are easily summed, so stitching them works fine.
  • Some metrics are calculated, such as percent-of. These should not be summed, hence there is a drop-down list of other aggregations. You could, for example, choose to compute an average of each day’s percent-of. That will get you pretty close to what Google would have returned for the entire date range if you weren’t querying individual days.
  • Because of this limitation, you should keep metrics that are alike in the same query, and move metrics that can’t use the same aggregation into another query.
  • The metric Sessions will likely vary from Google’s unless your original query was already a daily query. This is because of the way session terminations are defined. Since you are now querying on a daily basis, some sessions that occur at the end of the day will be considered to be completed at midnight. If your original query did not have a daily dimension, the session might not have been terminated until the next day. This means the number will vary slightly, since Google Analytics has access to the half hour past midnight to see whether a session ended. The difference is much the same as the one between running a query that has individual days in it and one that does not.
  • Finally, some dimensions are numbers. Sadly, these get aggregated too. Our advice is to avoid these dimensions for this type of query. This may get resolved in a future release, so it is best to check with the support team if this is an issue for you.
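The difference between metrics you can sum and metrics you should average can be shown with a small Python sketch. The daily numbers and the metric names (`sessions`, `bounces`) are invented for illustration; this is not how NEXT Analytics computes its aggregations, only the arithmetic behind the advice above.

```python
# Two days of made-up data for one query.
daily = [
    {"sessions": 1000, "bounces": 400},
    {"sessions": 800, "bounces": 360},
]

# Counts are safe to sum across days.
total_sessions = sum(day["sessions"] for day in daily)
total_bounces = sum(day["bounces"] for day in daily)

# A calculated metric (a percent-of) must not be summed: 0.40 + 0.45 is meaningless.
daily_rates = [day["bounces"] / day["sessions"] for day in daily]

# Averaging the daily rates gives an approximation of the whole-range rate.
average_of_daily_rates = sum(daily_rates) / len(daily_rates)

# The whole-range rate, computed from the summed counts, for comparison.
range_rate = total_bounces / total_sessions

print(f"average of daily rates: {average_of_daily_rates:.4f}")
print(f"rate over whole range:  {range_rate:.4f}")
```

With these numbers the two rates land close to each other but are not identical, because the plain average treats a quiet day and a busy day as equally important. That is why the averaged figure is “pretty close” rather than exact.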

In many cases, the sampling is avoided, and the data can be used in a dashboard or report. The only cost is that it can be a little bit slower than before, and you might need to split the metrics that can’t be summed into a separate query and choose average.

How to work with the un-sampled downloaded data

Once you have the data on your computer, you can do many things with it to make your job easier when you want to build a dashboard or prepare a report.

  1. Excel. You can build a dashboard in Excel, using NEXT Analytics’ Import File feature.
  2. Save the data in Google Spreadsheet and then use Google Data Studio to connect to the spreadsheet.
  3. Save the data in Google Cloud SQL.
  4. Save the data in a Microsoft Azure database and use it with Microsoft Power BI.
  5. Save the data in a local database.
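As an illustration of the last option, here is a minimal Python sketch that loads stitched daily rows into a local SQLite database and queries them. It is not NEXT Analytics’ export feature; the table name `ga_daily`, its columns, and the sample rows are all assumptions made up for this example.

```python
import sqlite3

# Made-up rows in the shape (day, channel, sessions).
rows = [
    ("2017-01-30", "organic", 100),
    ("2017-01-30", "direct", 40),
    ("2017-01-31", "organic", 120),
    ("2017-01-31", "direct", 50),
]

# ":memory:" keeps the example self-contained; use a file path to keep the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ga_daily (day TEXT, channel TEXT, sessions INTEGER)")
conn.executemany("INSERT INTO ga_daily VALUES (?, ?, ?)", rows)
conn.commit()

# Once the daily rows are stored, a dashboard tool can aggregate them as needed.
totals = conn.execute(
    "SELECT channel, SUM(sessions) FROM ga_daily GROUP BY channel ORDER BY channel"
).fetchall()
conn.close()

print(totals)
```

Keeping one row per day in the database means you can re-aggregate by week, month, or channel later without going back to Google Analytics.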

Added Value with Embedded Analytics

Once you are comfortable with the data flow, you should consider experimenting with NEXT Analytics’ value-add embedded analytics engine. This engine modifies the data and makes it much easier and quicker to work with in dashboarding tools.

For more information, visit the Macros and Commands that you see after you’ve created your Custom Queries.