April 2023

Why Google Analytics is not the epitome of your data

Oliver Kampmeier

Cybersecurity Content Specialist

Our mission at fraud0 is to prevent ad and click fraud caused by bots and to make every company aware of this issue.

To this end, in addition to our easy-to-implement software, we also provide companies with tips and tricks on how to assess the extent of their own bot problem using existing data and metrics from analytics tools.

But as useful as it is to perform your own analysis of the data, one thing must not be forgotten: Analytics tools like Google Analytics only reflect a fraction of reality.

They are a good first indication to get a feel for whether and to what extent your website / app is affected by bot traffic. However, the numbers do not answer the open questions of how much bot traffic actually exists and what kind of bots they are.

This article gives you an overview of why analytics tools only reflect part of the reality, using Google Analytics 4 (GA4) as an example, and why you should use fraud0 data to complement your analytics data.

100% reported traffic is not 100% total traffic

To start with, let’s talk about what is probably the biggest problem with Google Analytics: its privacy-friendly use on a website / in an app.

If you want to use GA4 in a GDPR-compliant way, you have to obtain the consent of your users. Once you ask for consent, you will typically receive only 30-70% of your user data.

Opt-in rates vary by country and industry. While people in countries like Poland (64%) and Spain (63%) are more likely to click the “Accept All” button in a consent banner, the figure is much lower for other countries like Germany (44%) and the US (32%) (Source).

A similar picture emerges across industries: while the average opt-in rate is only 55% in the healthcare sector, it is almost 65% in the finance & insurance and telecommunications sectors. The average across all industries is only 60.14%.

In other words, you lose the data of almost 40% of your website visitors!

A very similar picture can be seen not only on websites, but also in apps. With the introduction of App Tracking Transparency (ATT) in iOS 14.5, Apple required all app developers to ask users for their permission to use personal data for tracking.

Again, the picture is the same: The majority of users (>50%) refuse tracking if the option is available.


A case study from analytics expert Brian Clifton shows the impact opt-in requests can have on your analytics numbers. After implementing a Consent Management Platform (CMP) on a client’s website and linking it to Google Analytics, the reported numbers plummeted by 70%.


Google Analytics will thus never show you 100% of your real baseline data if you are aiming for a privacy-compliant implementation.

Your CMP provider may provide you with the opt-in / opt-out rates for your website (if not, be sure to request them). However, you can use these rates to extrapolate only certain metrics, such as visits. For other metrics, such as conversions, your data will always remain a rough estimate, as you may be combining multiple extrapolated values.
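The extrapolation described here can be sketched in a few lines of Python. This is only a minimal illustration, not a fraud0 or Google feature: the function name `extrapolate_total` and all numbers are hypothetical. The calculation assumes that visitors who decline consent behave like those who accept, which is rarely exactly true. The conversion range further assumes that converting users may consent at a different rate than the average visitor.

```python
def extrapolate_total(consented_count: float, opt_in_rate: float) -> float:
    """Scale a consented-only count up to an estimated total.

    Assumes non-consenting visitors behave like consenting ones.
    """
    if not 0 < opt_in_rate <= 1:
        raise ValueError("opt_in_rate must be greater than 0 and at most 1")
    return consented_count / opt_in_rate


# Hypothetical example values, not real measurements.
visits_in_ga4 = 6_000       # visits GA4 recorded (consented users only)
conversions_in_ga4 = 120    # conversions GA4 recorded (consented users only)
opt_in_rate = 0.60          # share of visitors who accepted, per the CMP

estimated_visits = extrapolate_total(visits_in_ga4, opt_in_rate)

# The opt-in rate among converting users is usually unknown, so the
# conversion estimate depends on the assumed rate and is a range at best.
conversions_low = extrapolate_total(conversions_in_ga4, 0.70)
conversions_high = extrapolate_total(conversions_in_ga4, 0.50)

print(f"Estimated visits: {estimated_visits:.0f}")
print(f"Estimated conversions: {conversions_low:.0f} to {conversions_high:.0f}")
```

The visit estimate rests on a single measured rate, while the conversion estimate rests on a guessed one. Every further metric derived from both inherits the uncertainty of both.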

fraud0 – GDPR-compliant view of 100% of traffic without consent

Now, the question for you: would you like to work with as little as 30% of your total data and make important decisions based on it?

Neither would we. That’s why we provide you with basic analytics data in our dashboard, in addition to metrics on bots and invalid traffic.

This way, you know with certainty how many total visitors have visited your website / app and how many of them were real people and how many were bots. Additional metrics such as referrer, campaign, and device type allow you to perform further, more in-depth evaluations.

The best thing about it? fraud0 presents you with 100% of your traffic, runs in a fully privacy-compliant way, and may be used without consent. The use of our software is based on Art. 6(1)(f) GDPR (legitimate interest).

It is in the interest of website operators to classify the users of their website as valid or invalid traffic. First and foremost, fraud0 thus prevents fraud (recital 47 of the GDPR), but it can also correct the website statistics by removing invalid traffic from them. With these valuable insights, website operators and advertisers can use their online marketing budget more efficiently.

Non-transparent bot detection

The second reason why Google Analytics only reflects a fraction of reality is the non-transparent bot detection.

Google discloses very little information about how and which bots are detected and filtered from the data. On the one hand, the public IAB Spiders and Bot List is mentioned and on the other hand, reference is made to internal research.

However, the IAB list is only a simple overview of various bot names that appear in the user agent string. Fraudsters can simply give their bot a different name in the user agent information, and it will no longer be identified as a bot. This method of ad fraud detection, which relies on the User-Agent HTTP header, is thus completely unreliable, which is why fraud0 uses much more sophisticated mechanisms.
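To illustrate why name matching on the user agent string is so easy to evade, here is a minimal Python sketch. It is an assumption-laden toy, not how Google Analytics or fraud0 actually works: `BOT_NAMES` is a hypothetical, heavily shortened stand-in for a bot name list, and `looks_like_bot` is a made-up helper.

```python
import re

# Hypothetical, shortened stand-in for a list of known bot names.
BOT_NAMES = ["googlebot", "bingbot", "ahrefsbot"]
BOT_REGEX = re.compile("|".join(re.escape(name) for name in BOT_NAMES), re.IGNORECASE)


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a known bot name."""
    return BOT_REGEX.search(user_agent) is not None


# A bot that identifies itself honestly is caught by the list.
honest_bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# A bot that simply sends a browser-like User-Agent is not.
spoofed_bot = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
)

print(looks_like_bot(honest_bot))   # True
print(looks_like_bot(spoofed_bot))  # False
```

The second request could come from the very same automation software as the first. Only the self-reported header differs, and the client controls it entirely.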

Nothing is known about other methods of bot detection in Google Analytics. However, it is also understandable that Google itself is not very interested in revealing too much information about this. This would only make it easier for fraudsters to circumvent the detection.

In addition, Google, like all other advertising platforms, has a certain conflict of interest, since it earns money on clicks even when they do not come from humans.

While it was still possible under Universal Analytics to manually turn bot detection on or off for a data view, bots are now automatically filtered from App + Web Properties by default under GA4.

In addition to the non-transparent bot detection, you cannot see how much bot traffic was excluded from Google Analytics.

In GA4, there is no way to view the detected bots in the data. Was only 3% excluded or was it more than 20%? This makes a valid conclusion about quality as well as quantity of the filtered out “invalid traffic” impossible, and the displayed session data might be highly distorted.

Transparent numbers on bots with the help of fraud0

A different picture emerges when you use fraud0. In our dashboard, we not only show you the share of invalid and low-quality traffic in the total traffic, but also provide you with more information about the bots and the reasons for blocking them.

This allows you to see at a glance what kind of bots (e.g. botnets, user agent spoofing, etc.) you are affected by.

Discrepancies in Analytics data between UA and GA4

The third and final issue we want to look at in terms of the dataset in GA4 is the changed data model between Universal Analytics and Google Analytics 4.


Probably the most significant difference is the different hierarchy of the data model. In Universal Analytics, users were assigned sessions that consisted of various hits (e.g. page view, custom event, etc.).

In the case of GA4, it’s a flat, event-based data model that assigns a series of events to a user.
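The difference between the two hierarchies can be sketched with a few Python dataclasses. This is a simplified illustration, not Google's actual schema: all class and field names are hypothetical, and the example visit is invented. The event names `session_start` and `page_view` follow GA4's naming style.

```python
from dataclasses import dataclass, field


# Universal Analytics style: user -> sessions -> hits
@dataclass
class Hit:
    hit_type: str  # e.g. "pageview" or "event"
    page: str = ""


@dataclass
class Session:
    hits: list[Hit] = field(default_factory=list)


@dataclass
class UAUser:
    client_id: str
    sessions: list[Session] = field(default_factory=list)


# GA4 style: user -> flat list of events, with no session container
@dataclass
class Event:
    name: str  # e.g. "session_start" or "page_view"
    params: dict[str, str] = field(default_factory=dict)


@dataclass
class GA4User:
    user_id: str
    events: list[Event] = field(default_factory=list)


# The same hypothetical visit (two page views) expressed in both models.
ua_user = UAUser("u1", [Session([Hit("pageview", "/"), Hit("pageview", "/pricing")])])

ga4_user = GA4User("u1", [
    Event("session_start"),
    Event("page_view", {"page_location": "/"}),
    Event("page_view", {"page_location": "/pricing"}),
])

# In the nested model, sessions can be counted directly.
ua_sessions = len(ua_user.sessions)
# In the flat model, sessions have to be derived from the events.
ga4_sessions = sum(1 for e in ga4_user.events if e.name == "session_start")

print(ua_sessions, ga4_sessions)
```

In the flat model, a session is no longer a stored container but something derived from events. This is one reason why session-based numbers do not line up one-to-one between the two tools.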


A complete comparison of the changes between UA and GA4 is beyond the scope of this article. We recommend this article if you are interested in a more in-depth analysis of the changes.


The important thing to know is that the numbers from UA and GA4 are barely comparable and may well differ by anywhere between 5% and 60%.

Therefore, many people wonder to what extent the automatic filtering of bots is responsible for this.

However, since Google Analytics is neither transparent about bot detection, nor does it show the filtered bots in the data, the influence of automatic bot detection or its changed mode of operation between UA and GA4 cannot be determined.

As a result, some companies run both tools in parallel and end up looking at two entirely different truths. For one of our customers, the difference in the data is comparatively low at about 6%.

Get insight into 100% of your data

Do we want to sell you fraud0 as an alternative to Google Analytics 4 at this point? No. Our strength lies in the detection of invalid traffic, and that will not change.

Rather, we want to give you a sense of your current data foundation and why it doesn’t reflect reality as fully as you might like. Google Analytics takes a piece of reality and presents it to you as the truth in your dashboard. However, you have no way to look at the big picture because the data needed for that is missing in GA4.

At this point, we would like to recommend using fraud0 in addition to Google Analytics 4, which brings the following benefits:

Insight into 100% of your website / app traffic through a GDPR-compliant implementation that fires without prior consent from your users.
Transparent detection of bots and insight into the specific bot types on your website / app.

If you want to see the comparison between GA4 and our data for yourself, sign up now for a free 7-day trial of fraud0. All features will be fully available to you during this time, with no further obligations. You don’t even need a credit card and can start right away for free.
