Adblockers can still (and do) protect against server-side GTM, as the requests a...

gigel82 · on July 27, 2022

Once server-side analytics get implemented widely, we've lost. We'll keep chasing each other with tricks like renaming the API endpoints, randomizing the JavaScript hash, etc. for a while, but if we end up having to run an ML model in the browser to detect when our data is being stolen, then we lost a long time ago.

Might be better to shame any website caught using it with some crowd-sourced list of some kind - then at least we'd know who the bad actors are and force their content through an isolated container / proxy / VPN, or simply stop using them altogether.

closewith · on July 27, 2022

If that's the case, then the war is already lost.

But happily, in the EU - the market I operate in - server-side analytics is seen as an avenue towards compliance.

Obviously server-side GTM will be abused in the absence of regulation, but that was also true of the existing technologies. Strong and consistent enforcement can bring companies into compliance, and is already doing so.

trasz · on July 28, 2022

Unless we start actively jamming analytics by injecting lots of fake data. Which we'll eventually have to do anyway.

closewith · on July 28, 2022

Some competitors in hyper-competitive markets already do this, and analytics services are already pretty good at filtering it out.

trasz · on July 28, 2022

In the long run analytics will be on the losing side though - because it's possible to jam with patterns generated by other humans, just mashed together, and there is no way to reliably detect that without also making it possible for humans to use it to masquerade as generated traffic.

closewith · on July 28, 2022

The ecosystem is much too advanced for that kind of tactic. Fraud in ad networks has been an issue for decades and the fraud detection systems will identify that immediately.

However, if you just want to avoid tracking, uBlock Origin will do the job, or you can simply reject the tracking on the cookie pop-up. Reputable companies in the EU almost always respect those choices because no-one wants to face the wrath of a DPA (at least not without plausible deniability - hence all the dark patterns).

trasz · on July 28, 2022

That tactic has already worked a couple of times, and the usual problem was that it was used for fraud, i.e. against the law, not that it was defeated by technical means. Besides... how exactly would it be detected? Keep in mind that you can replay actual traffic gathered from humans.

uBlock can only defeat client-side tracking - which has so far been _the_ tracking, sure, but I believe it is in the process of being replaced by server-side, which can't be defeated this way.

closewith · on July 29, 2022

> Keep in mind that you can replay actual traffic gathered from humans.

Do you mean by replaying the HTTP requests of other humans? You can only do this against the most naive analytics tools. Most modern analytics will use nonces and unique event IDs to deduplicate/trash any junk. Already in competitive markets/industries (looking at you, travel) it's common for 95%+ of analytics data to be junk/fraud/poison.
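To make the nonce and event ID point concrete, here is a minimal sketch of how a collector might discard replayed hits. It is an illustration only: `EventDeduplicator`, `issue_nonce` and `accept` are hypothetical names, not any vendor's API, and a real system would also expire nonces and persist its state.

```python
from dataclasses import dataclass, field


@dataclass
class EventDeduplicator:
    """Accepts each server-issued nonce and each event ID at most once (hypothetical)."""

    issued_nonces: set = field(default_factory=set)   # nonces handed out with served pages
    seen_event_ids: set = field(default_factory=set)  # event IDs already recorded

    def issue_nonce(self, nonce: str) -> None:
        # Called when a page is served; the nonce is embedded in that page.
        self.issued_nonces.add(nonce)

    def accept(self, nonce: str, event_id: str) -> bool:
        # Unknown or already-spent nonce: the hit did not come from a page we served.
        if nonce not in self.issued_nonces:
            return False
        # Same event ID seen before: a duplicate or a replayed request.
        if event_id in self.seen_event_ids:
            return False
        self.issued_nonces.discard(nonce)  # a nonce is single-use
        self.seen_event_ids.add(event_id)
        return True
```

A verbatim replay of a captured request carries a spent nonce and a known event ID, so it is rejected on both counts.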

I can assure you that, except in the very rare cases of people stupid enough to launch poisoning attacks from Western countries, the law has not stopped or slowed junk analytics data. It is purely a technical defence, and it works very well.

> uBlock can only defeat client-side tracking - which has so far been _the_ tracking, sure, but I believe it is in the process of being replaced by server-side, which can't be defeated this way.

For the most part, the current form of server-side analytics just means relaying data through a proxy you control so that you can control exactly what the downstream services get and they never see the user's IP address, user-agent, etc. The most popular service by far, Google Tag Manager, still uses a very obvious and blockable client-side Google Analytics tag (that you serve via the same proxy) to actually collect the data in the browser.
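As a rough sketch of what "control exactly what the downstream services get" can look like, the function below filters a hit before the proxy relays it. The header list, the `allowed_fields` parameter and the function name are assumptions for illustration; this is not how server-side GTM itself is implemented.

```python
# Headers that would reveal the visitor to the downstream service (illustrative list).
IDENTIFYING_HEADERS = {"user-agent", "x-forwarded-for", "x-real-ip", "cookie", "referer"}


def build_downstream_request(headers: dict, payload: dict, allowed_fields: set) -> tuple:
    """Return the (headers, payload) pair the proxy would forward downstream."""
    # Drop identifying headers; header names are compared case-insensitively.
    clean_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    # Forward only the payload fields the site operator has explicitly allowed.
    clean_payload = {key: value for key, value in payload.items() if key in allowed_fields}
    return clean_headers, clean_payload
```

Because the proxy makes the outgoing connection itself, the downstream service sees the proxy's IP address rather than the visitor's.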

trasz · on July 29, 2022

> Do you mean by replaying the HTTP requests of other humans?

No, that would be too obvious. I’m thinking more of replaying human interactions with a browser within a VM, or perhaps replaying them using some sort of hidden tabs, so the user session cookies stay as they were, but the behavior that can be associated with them would be jammed.

As for server analytics - it’s not very relevant now, because there’s no need, but once Google is prevented from exploiting it they will inevitably switch to tracking all the client details on the (Google’s) server side.

gorhill · on July 27, 2022

> as the requests are not obfuscated in any way

How do you know for sure that the requests are "not obfuscated in any way"?

closewith · on July 27, 2022

Right now, because the requests are identical to the same requests sent to Google Analytics but with a different hostname. It's trivial to identify and block them, and current ad blockers already do.
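A hostname-independent check along these lines might look like the sketch below. It assumes a Google Analytics hit has a path ending in `/collect` plus `v` and `tid` query parameters, with `tid` shaped like `UA-...` or `G-...`; the function name is hypothetical, and real blockers use filter lists rather than this exact logic.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed shape of a property ID: UA-<digits>-<digits> or G-<alphanumerics>.
TRACKING_ID = re.compile(r"^(UA-\d+-\d+|G-[A-Z0-9]+)$")


def looks_like_ga_hit(url: str) -> bool:
    """Classify a request by path and parameters, ignoring the hostname entirely."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    tid = params.get("tid", [""])[0]
    return (
        parts.path.endswith("/collect")
        and "v" in params
        and bool(TRACKING_ID.match(tid))
    )
```

For example, `looks_like_ga_hit("https://metrics.example.com/g/collect?v=2&tid=G-ABC123")` is true even though the hostname is first-party.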

gorhill · on July 27, 2022

> same requests sent to Google Analytics but with a different hostname

There are instructions out there to also modify the path of the requests[1]. Consider this paragraph in the Summary section:

> Cynics could say that this is an improved way to circumvent ad blockers. And they’d be right! This does make it easier to circumvent ad blockers, as their heuristics target not just the googletagmanager.com domain but also the gtm.js file and the GTM-... container ID.

* * *

[1] https://www.simoahava.com/analytics/custom-gtm-loader-server...
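The three heuristics named in the quoted paragraph (domain, `gtm.js` file name, `GTM-...` container ID) can be sketched as independent signals. The function name and the assumption that the container ID arrives in an `id` query parameter are illustrative only; a custom loader that changes both the path and the parameter would return no signals here, which is the circumvention the paragraph describes.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed shape of a container ID: GTM-<alphanumerics>.
CONTAINER_ID = re.compile(r"^GTM-[A-Z0-9]+$")


def gtm_loader_signals(url: str) -> list:
    """Return which of the three heuristics match the given script URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    signals = []
    if host == "googletagmanager.com" or host.endswith(".googletagmanager.com"):
        signals.append("domain")
    # Compare only the last path segment, so a proxied /anything/gtm.js still matches.
    if parts.path.rsplit("/", 1)[-1] == "gtm.js":
        signals.append("filename")
    if any(CONTAINER_ID.match(value) for value in parse_qs(parts.query).get("id", [])):
        signals.append("container_id")
    return signals
```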

closewith · on July 27, 2022

You can do that, and you can also proxy encoded requests, which obfuscates all data, but you could also do that with the previous version of Google Analytics via the Measurement Protocol.

In practice - in the EU, at least - I haven't seen any examples of this, and it would be unlawful without consent anyway, thanks to the GDPR.

It's also still fairly easy to classify requests (if you have access to the unencrypted request in the browser) based on heuristics. That's partly what the company I work for does.

Separately, thank you for your contribution to the Internet - it's as big and important as all the behemoths, but unfortunately will never be rewarded in the same way.

pieterhg · on July 27, 2022