Find 20% of missing site traffic with Plausible Analytics and some proxying

Published on August 15, 2022 - Tagged: #engineering


Google Analytics (GA) has been a force in website metrics since 2005. The metrics have always been incredibly useful, but it’s a “free” product, so you pay for it by providing all your site data to Google for tracking and advertising.

With Google Analytics, your metrics are tightly coupled with tracking and advertising, so when ad-blockers kick in to block tracking they also block your metrics!

The good news is that this is all fixable!

In this article I’ll describe the issues with Google Analytics and show how to have privacy-first site metrics without all the tracking and advertising.

All of your users will be reporting visits anonymously AND you will get back the missing 20% of data.

Google is replacing Universal Analytics with GA4

The current version of Google’s product, called Universal Analytics (UA), was released 10 years ago in October 2012! I’ve used UA on this site since 2016 to motivate myself to write. Seeing reader sessions slowly rise year over year is the best motivator to keep writing.

Google’s newer analytics system is called Google Analytics 4 (GA4). GA4 is a big change from Universal Analytics: the two products are incompatible and Google is forcing users to upgrade. Universal Analytics will be retired in July 2023.

Part of the reason for this upgrade is to introduce new privacy features to the platform, such as:

  • Anonymous IPs - GA4 doesn’t store IP addresses by default
  • User data deletion - A user can request that their specific data is removed from Google
  • A reduced data retention period - You can change the retention period to 2 months or 14 months

Result

You will have to upgrade your site statistics platform because Universal Analytics data collection will stop working once it is retired in July 2023.

GDPR and privacy laws

It’s 2022 and the internet is far more integrated into our lives than it was in 2005 or even 2012. This adoption has forced a change in the way people and governments treat data privacy on the web. Legislators are introducing web privacy laws like the GDPR.

One issue is that Google tracks your users across time and devices. Google stores this information and uses it for advertising later. You can’t easily delete a specific user’s data.

Another major legal issue is that Google transfers data to the US for processing. This places European citizens’ protected data within reach of the USA’s surveillance apparatus.

Data protection authorities in Austria, France and Italy have ruled against the use of Google Analytics, and there are continuing investigations in Norway and the Netherlands. This trend is only going in one direction unless a new agreement with the USA is negotiated.

Result

Google Universal Analytics does not meet GDPR requirements in many countries and as of August 2022 GA4 is also non-compliant.

Source: https://www.datenschutz-notizen.de/the-dutch-data-protection-authority-has-published-brief-guidance-on-the-pressing-issue-of-google-analytics-2433548/

Ad blockers on the rise

Ad-blockers are on the rise all over the world.

Depending on the report or survey, between 27% and 40% of internet users in the USA use an ad-blocker. Ad-blocking is specifically detected on around 18% of sessions on sites that track ad blocking. There are similar statistics for the UK.

If you break the data down by age, ad-blocker usage is even higher among younger demographics, so total ad-blocker usage will likely grow over time.

If you run a tech-focused site, it’s likely that ad-blocker usage is higher among your users than among internet users in general.

Ad-blockers typically prevent the Google Analytics script from downloading and running.

Result

You’re missing out on anywhere from 20% to 40% of visitor statistics just by using a well-known tracking script like Google Analytics.

Source: https://backlinko.com/ad-blockers-users
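
To put rough numbers on that gap, here is a minimal sketch of the arithmetic. It assumes blocked visitors are simply absent from your reports, so reported sessions equal actual sessions multiplied by (1 - blocked share). The function name and the 8,000 session figure are made up for illustration.

```javascript
// Estimate actual sessions from what a blockable analytics script reports.
// blockedShare is the assumed fraction of visitors whose ad-blocker
// stops the script from loading (0.2 means 20%).
function estimateActualSessions(reportedSessions, blockedShare) {
  if (blockedShare < 0 || blockedShare >= 1) {
    throw new RangeError("blockedShare must be at least 0 and less than 1");
  }
  // reported = actual * (1 - blockedShare), so solve for actual
  return Math.round(reportedSessions / (1 - blockedShare));
}

const reported = 8000; // hypothetical monthly sessions shown in your dashboard
for (const percent of [20, 40]) {
  const actual = estimateActualSessions(reported, percent / 100);
  console.log(
    `${percent}% blocked: about ${actual} actual sessions, ${actual - reported} missing`
  );
}
```

Note that recovering a 20% blocked share shows up as a 25% increase over the reported number, because the increase is measured against the smaller, reported figure.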

Solution - use a different product!

To avoid all of these issues you need to find an alternative site statistics tracker. There are a few out there and you could roll your own, but I use Plausible Analytics.

Plausible Analytics is a privacy-focused tool with no advertising connections.

With Plausible you get:

  • A platform hosted in the EU, so no GDPR data transfer issues
  • A 24h time-limited, cookie-less user tracking system
  • No privacy cookie pop-ups required
  • A tool that is much easier to use than GA because it only provides useful statistics and features

Visit: https://plausible.io/

How to configure script proxying

Plausible is just a script you add to your <head> section. There are plug-ins for most popular platforms that will do this for you.

The real power with Plausible is proxying the script through your own domain so that ad-blockers will not block it.

This might sound dodgy, but it’s perfectly fine to do this with a privacy-focused statistics tool that does not track users. I wouldn’t try to do this with Google Analytics.

Proxying plausible script

I use Gatsby and Netlify so I’ll describe that specific setup and some of the quirks I noticed.

1. Add the Gatsby plugin

yarn add gatsby-plugin-plausible

# or

npm install --save gatsby-plugin-plausible

2. Gatsby configuration

Add the following to your Gatsby configuration. Note that the customDomain config property in the Gatsby plugin for Plausible differs from what Plausible’s own script documentation describes.

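// gatsby-config.js
module.exports = {
  plugins: [
    {
      resolve: `gatsby-plugin-plausible`,
      options: {
        domain: `www.darraghoriordan.com`, // the site as registered in Plausible
        customDomain: `www.darraghoriordan.com`, // the domain the script is loaded from
      },
    },
  ],
};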

This plugin creates a Plausible script tag on your page that allows for proxying through your website’s domain at /js/index.js. Note: You can’t change this path if you use the Gatsby plugin for Plausible.

<!-- This is just for reference here - you don't need to add this manually if using a plugin  -->
<script
  async=""
  defer=""
  data-domain="www.darraghoriordan.com"
  src="https://www.darraghoriordan.com/js/index.js"
></script>

3. Configure Netlify proxy

You have to tell Netlify to proxy https://www.darraghoriordan.com/js/index.js to Plausible’s script at https://plausible.io/js/script.js.

The following lines should be in the _redirects file in the static folder of your Gatsby site. You can also use the netlify.toml configuration. There is more information in Netlify’s documentation on how to configure redirects if you can’t use the example here.

/js/index.js https://plausible.io/js/script.js 200
/api/event https://plausible.io/api/event 200

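The first line serves Plausible’s script from your own domain. The second line proxies calls to Plausible’s event API. The 200 status tells Netlify to rewrite the request and proxy it, rather than redirect the browser to plausible.io.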
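
To make the shape of those two rules concrete, here is a small sketch that reads them the way I understand the format: a source path, a target, and a status code. This is not Netlify’s actual parser. The function names are hypothetical and it only handles plain "from to status" lines, blank lines and # comments.

```javascript
// Parse "_redirects" style text into rule objects.
// Simplified illustration only: no query params, conditions or "!" force flags.
function parseRedirects(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#"))
    .map((line) => {
      // Netlify treats a rule without a status as a 301 redirect
      const [from, to, status = "301"] = line.split(/\s+/);
      return { from, to, status: Number(status) };
    });
}

// A 200 rule that targets another host is served as a proxy,
// so the browser only ever talks to your own domain.
function isProxyRule(rule) {
  return rule.status === 200 && /^https?:\/\//.test(rule.to);
}

const redirects = `
/js/index.js https://plausible.io/js/script.js 200
/api/event https://plausible.io/api/event 200
`;

for (const rule of parseRedirects(redirects)) {
  console.log(`${rule.from} -> ${rule.to} (proxied: ${isProxyRule(rule)})`);
}
```

If either rule used a 301 or 302 instead, the browser would be sent to plausible.io directly and an ad-blocker could block the request again.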

And that’s it!

Plausible have recipes for most popular platforms here: https://plausible.io/docs/integration-guides

Conclusion

Google is forcing everyone to update their site statistics solution. This is the best time since 2005 to look around at the other site statistics options!

In 2022 you can get more comprehensive statistics, better usability and better privacy for your users with alternative tools like Plausible Analytics.

The best thing is you’ll get true statistics that aren’t ad-blocked, and you’ll likely increase recorded views and sessions by as much as 20%, like I did!

Darragh ORiordan

Hi! I'm Darragh ORiordan.

I live and work in Sydney, Australia, building and supporting happy teams that create high quality software for the web.


