Show HN: TrackingTheTrackers – Is a website disguising its 3rd-party trackers?

munchbunny · on Nov 27, 2019

From the site:

> While our analysis tool were not able to confirm that session cookies were sent as well, a long list of leaking cookies could mean that they would be. Anyone in possession of those cookies can impersonate you on that website — i.e., access your account.

I hadn't considered that before, but they're right, it's extremely easy to accidentally leak session cookies through first party subdomains. I look forward to the inevitable conference talks that will be discussing this vulnerability.
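To illustrate why a cookie scoped to the parent domain also reaches a tracker sitting on a first-party subdomain, here is a simplified sketch of browser-style cookie domain matching. The host names and cookie names are hypothetical, and Path, Secure, HttpOnly, and SameSite handling is deliberately ignored.

```python
def domain_matches(request_host, cookie_domain):
    """Simplified domain match: exact host, or a dot-separated suffix of it."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def cookies_sent_to(request_host, cookies):
    """Return the names of cookies that would be attached to a request."""
    sent = []
    for cookie in cookies:
        if cookie["domain"] is None:
            # Host-only cookie: sent only to the exact host that set it.
            if request_host == cookie["host"]:
                sent.append(cookie["name"])
        elif domain_matches(request_host, cookie["domain"]):
            # Domain cookie: sent to the domain and every subdomain.
            sent.append(cookie["name"])
    return sent


# Hypothetical cookie jar for illustration only.
cookies = [
    {"name": "session", "host": "www.example.com", "domain": "example.com"},
    {"name": "prefs", "host": "www.example.com", "domain": None},
]

# metrics.example.com stands in for a subdomain CNAMEd to a third party.
print(cookies_sent_to("metrics.example.com", cookies))
```

In this model the `session` cookie leaks to the subdomain because it was set with a Domain attribute, while the host-only `prefs` cookie does not.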

teamjimmyy · on Nov 27, 2019

We now have trackers acting like first-party properties, but where do you draw the line between first-party and third-party? What I mean is, if I build my own in-house analytics app that does a lot of what Adobe's product does, should that be blocked too?

I specifically mean for site analytics, like Experience Cloud or GA, not serving ads. Ad block is different IMO.

If this is hosted on a first party subdomain you're already blocking the ability of it to set third-party cookies and track you across sites. So, in practical terms, what's the difference between this CNAME trick and building the same thing in-house to run your own analytics?

munchbunny · on Nov 27, 2019

It depends on why you care about the distinction. If you're talking from a purely privacy perspective, the important questions are (1) are you taking analytics? (2) what are you doing with them? (3) is that in line with users' expectations?

Third party, or third party disguised as first party, is only problematic because of an implied "there's very little keeping the third party from using your data for things that aren't just analytics for the first party." It's the red flag for "this site may not be using your data the way you would want it to."

Third party ad trackers disguised as first party cookies specifically violate the general assumption that first party data stays first party, because those third parties have specific mechanisms to track you across multiple first parties.

x0x0 · on Nov 28, 2019

> violate the general assumption that first party data stays first party

How? Because they can't do that with cookies.

I think the most prominent objection to 3rd party cookies is they allow systematic tracking. This seems like they just help eg apple or whomever understand what you're doing on that same site.

munchbunny · on Nov 28, 2019

> How? Because they can't do that with cookies.

My point is exactly that: they can and are doing it (using first party cookies through CNAME redirects to correlate your identity across multiple independent sites in order to sell a data product to more than just the first party). For an example, see Criteo.

> I think the most prominent objection to 3rd party cookies is they allow systematic tracking.

And now first party cookies are starting to be used for that too. It's not that important how you specifically want to phrase the general ickiness of third party cookies.

zonidjan · on Nov 28, 2019

There are plenty of other ways to fingerprint a user. https://amiunique.org/

x0x0 · on Nov 28, 2019

That really doesn't answer the question. Munchberry claims these analytics tools have cross domain tracking and I'm asking how, precisely. In part because of professional interest, and in part because I don't actually think it's true.

munchbunny · on Nov 28, 2019

Thanks for getting my username right. ;)

You specifically got one detail wrong: it's not just for analytics tools. It's the adtech industry in general using this technique, and Adobe offers its analytics as part of its marketing software suite.

From their own site: "What is Adobe Experience Cloud? It's a collection of best-in-class solutions for marketing, analytics, advertising, and commerce."

x0x0 · on Nov 28, 2019

Apologies Munch bunny, gonna blame that on a need for new glasses.

fwiw, Adobe Experience Cloud is generally not the sort of adtech that attempts to sell information.

munchbunny · on Dec 2, 2019

You're right, Adobe Experience Cloud doesn't sell information, so how problematic you find the product depends on where you draw the line on privacy.

Specifically, Adobe Experience Cloud definitely offers retargeting capabilities (ads following you around the internet) and the ability to get statistics on the effectiveness of that advertising. If they're at parity with competing marketing suites, then they also have attribution capabilities to track you with per-user, per-interaction granularity.

A site that serves Adobe Experience Cloud cookies in the third-party-disguised-as-first-party way is likely enabling this capability for all marketers that are going through Adobe Experience Cloud. So the interesting question would be whether you, a visitor to Fox.com, consider being watched by marketers who aren't Fox.com to be a privacy problem.

x0x0 · on Dec 3, 2019

All of the above is more private than eg google analytics because of the lack of cross domain tracking... I'd consider it a big improvement vis-a-vis google's product suite.

3xblah · on Nov 27, 2019

"So, in practical terms, what's the difference between this CNAME trick and building the same thing in-house to run your own analytics?"

In practical terms, based on what we have seen up to now, few websites will run their own analytics.

Thus, focusing on the third party services is effective against most tracking.

The "CNAME trick" might be viewed as a good thing because it is putting evolutionary pressure on users to utilise DNS to "block" ads instead of using popular graphical web browsers to do the work. These browsers are written by companies or organisations that derive substantial revenue from online advertising.

icebraining · on Nov 28, 2019

> it is putting evolutionary pressure on users to utilise DNS to "block" ads instead of using popular graphical web browsers to do the work

How so? Browser and browser addons are perfectly capable of blocking these subdomains.

agarren · on Nov 28, 2019

It's not that browser add-ons aren't capable of blocking them; it's that a) browser extensions generally don't have access to DNS resolution, b) even if they did, CNAME tracker resolution would increase overall latency due to multiple requests to identify trackers /outside/ of normal resolution, and c) it's trivial for a host to create a new CNAME for a tracker, so automating this makes a lot of sense from their perspective in order to avoid blocklists.

Extensions now have to identify which CNAMEs are a front for which trackers, block the new tracker, and somehow manage to stay below the max rules-per-extension that the browser allows (30k for Chrome[0]). See original nextdns.io post for details [1].

[0] https://www.xda-developers.com/google-chrome-manifest-v3-ad-...

[1] https://medium.com/nextdns/cname-cloaking-the-dangerous-disg... ( https://outline.com/Y6PKr3 )
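A minimal sketch of what "identifying which CNAMEs are a front for which trackers" could look like. This is an assumption about the general technique, not NextDNS's or any extension's actual implementation. The CNAME records are passed in as a plain dict so the example runs without network access, and every host name here is a placeholder.

```python
def cname_chain(host, cname_records, max_depth=10):
    """Follow CNAME records from host, returning every name visited."""
    chain = [host]
    current = host
    for _ in range(max_depth):
        target = cname_records.get(current)
        if target is None or target in chain:
            break  # end of chain, or a loop
        chain.append(target)
        current = target
    return chain


def is_blocked(name, blocklist):
    """True if name or any parent domain of it is on the blocklist."""
    labels = name.split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def is_cloaked_tracker(host, cname_records, blocklist):
    """True if a first-party-looking host resolves via CNAME to a blocked name."""
    chain = cname_chain(host, cname_records)
    return any(is_blocked(name, blocklist) for name in chain[1:])


# Hypothetical records and blocklist for illustration only.
cname_records = {"metrics.example.com": "example.com.tracker.example.net"}
blocklist = {"tracker.example.net"}

print(is_cloaked_tracker("metrics.example.com", cname_records, blocklist))
```

The point of checking the chain rather than the requested name is that the blocklist only needs the tracker's own domain, not every new alias a site creates.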

bscphil · on Nov 28, 2019

> it's trivial for a host to create a new cname for a tracker, so automating this makes a lot of sense from their perspective in order to avoid blocklists.

This is the part I'm skeptical of. If your site is HTTPS, and a user forbids mixed content, all your ads have to be served over HTTPS. That means whenever you create a new CNAME, the ad server (not run by you) has to generate a new certificate, or else you have to give them a cert for *.yourdomain.com. Both are pretty big asks. Even if the ad server can automatically generate new certs, they're going to show up in a transparency log and can be automatically added to blocklists based on that.

So I'm not sure it's going to make much of a difference which blocking method people use. All the Easylist (or whoever) folks have to do is add new subdomains to their blocklist - the exact same thing an ad blocking DNS provider is doing.

Edit: the process for obtaining a certificate from Adobe for a new tracking CNAME looks absolutely excruciating, which is good evidence for my point. https://docs.adobe.com/content/help/en/core-services/interfa...

3xblah · on Nov 28, 2019

"... the exact same thing an ad blocking DNS provider is doing."

Just to be clear, I was not advocating using third party DNS.

Using djbdns I can just "block" all non-www subdomains for a domain with a single line in a zone file (if using tinydns) or a single byte file in a directory (if using dnscache) and then add entries in the zone file for any specific subdomains I want to "allow".

It is only an opinion, but I think the ability to wildcard all subdomains makes DNS-based methods of blocking trackers easier to manage and allows it to scale better than having to list every tracker subdomain in a blocklist.
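As a rough model of the "wildcard everything, then allow specific subdomains" idea, the sketch below shows the lookup precedence only: explicit records win over a wildcard. It is not djbdns's data format or code, and the addresses and names are placeholders.

```python
SINKHOLE = "127.0.0.1"  # assumption: where "blocked" names are pointed


def resolve(name, explicit, wildcards):
    """Answer from explicit records first, then the closest wildcard."""
    name = name.lower().rstrip(".")
    if name in explicit:
        return explicit[name]
    labels = name.split(".")
    # Walk up the parent domains, most specific wildcard first.
    for i in range(1, len(labels)):
        parent = ".".join(labels[i:])
        if parent in wildcards:
            return wildcards[parent]
    return None  # not covered locally; a real setup would look elsewhere


# Hypothetical zone: allow www, sinkhole every other subdomain.
explicit = {"www.example.org": "192.0.2.10"}
wildcards = {"example.org": SINKHOLE}

print(resolve("www.example.org", explicit, wildcards))
print(resolve("metrics.example.org", explicit, wildcards))
```

One wildcard entry covers any subdomain a site invents later, which is the scaling argument made above.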

bscphil · on Nov 28, 2019

> Just to be clear, I was not advocating using third party DNS.

Well, you weren't, but the OP is a Show HN by a company offering such a service.

> It is only an opinion, but I think the ability to wildcard all subdomains makes DNS-based methods of blocking trackers easier to manage

I dunno, I think for most people uBlock rules are going to be easier to handle than setting up their own DNS server. Sure, I have my own resolver too (Unbound), but since you need an ad blocker on top of that anyway, I just keep my rules in uBlock. The following is all you need to block all subdomains but allow the bare domain:

  ||example.org^
  @@*/example.org/*

Or to allow www:

  @@www.example.org^

3xblah · on Nov 28, 2019

With uBlock, is it possible to block all subdomains but allow a specific subdomain?

A resolver (e.g., unbound) is only one half of the DNS method I use. The other is an authoritative nameserver (e.g., nsd). For my own purposes, the resolver is optional.

bscphil · on Nov 29, 2019

Yes, that's just this:

  ||example.org^
  @@www.example.org^

> The other is an authoritative nameserver (e.g., nsd). For my own purposes, the resolver is optional.

True, although I imagine for most people the nameserver part of it is the more optional. DNS ad blocking software tends to be a recursive resolver that returns 0.0.0.0 results for some unwanted domains. Unbound has the ability to do that (for the few domains I'm filtering entirely), and so I've stuck with that.

3xblah · on Nov 30, 2019

It is no wonder that uBlock is so popular.

Not sure I understand returning 0.0.0.0. What if the user has some other servers listening?

I return the address of some server I control that is bound to a local address, e.g., an authoritative nameserver.

Compared to the available solutions this is way too much work for "most people", however from a purist perspective a self-managed DNS approach is not under the ultimate control of a browser-authoring, extension/app-approving company/organisation or some third party DNS provider.

Whether that even matters is debatable.

As long as these easy solutions keep working, there's no incentive to try a different approach.

egdod · on Nov 28, 2019

All analytics should be blocked.

nextdns · on Nov 27, 2019

A few websites that are disguising third-party trackers:

Fox News: https://trackingthetrackers.com/site/foxnews.com

CNN: https://trackingthetrackers.com/site/cnn.com

BBC: https://trackingthetrackers.com/site/bbc.co.uk

WebMD: https://trackingthetrackers.com/site/webmd.com

ESPN: https://trackingthetrackers.com/site/espn.com

Ars Technica: https://trackingthetrackers.com/site/arstechnica.com

Go.com (Disney): https://trackingthetrackers.com/site/go.com

Washington Post: https://trackingthetrackers.com/site/washingtonpost.com

Walmart: https://trackingthetrackers.com/site/walmart.com

Weather.com: https://trackingthetrackers.com/site/weather.com

Apple: https://trackingthetrackers.com/site/apple.com

NFL: https://trackingthetrackers.com/site/nfl.com

T-Mobile: https://trackingthetrackers.com/site/t-mobile.com

State Farm: https://trackingthetrackers.com/site/statefarm.com

temp112719 · on Nov 27, 2019

Seems like most of these use Adobe Experience Cloud. I'm guessing they offer some kind of HOWTO to set it up as disguised.

Good on Adobe for keeping it up. Flash wasn't enough to fuck the internet up for years.

nextdns · on Nov 27, 2019

Yes, for Adobe Experience Cloud, see: https://docs.adobe.com/content/help/en/core-services/interfa...

VWWHFSfQ · on Nov 27, 2019

> In order to circumvent tracking limitations imposed by browsers and programs, you can implement first-party cookies.

Well, at least they're forthcoming about knowing they're intentionally circumventing users' privacy settings.

3xblah · on Nov 27, 2019

If we request each of those pages, e.g., foxnews.com, cnn.com, bbc.co.uk, etc. with a simple HTTP client (no JavaScript engine) and we look at the subdomains that appear in each page, we see that in each case the tracker links are absent.

The requests to the trackers rely on JavaScript being enabled. What happens when JavaScript is disabled?

Further, we can easily list the subdomains that do appear in the page and, assuming they are being used to host content that we want, we can "whitelist" them in our zone file.
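The "list the subdomains that appear in the page" step can be sketched with the standard library alone. This assumes the HTML has already been fetched without running any JavaScript (for instance with `urllib.request.urlopen`); the sample markup and host names below are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HostCollector(HTMLParser):
    """Collect host names found in src and href attributes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname  # None for relative URLs
                if host:
                    self.hosts.add(host)


def hosts_in_page(html, site_domain):
    """Return the site's own domain and subdomains referenced in static HTML."""
    collector = HostCollector()
    collector.feed(html)
    return sorted(
        h for h in collector.hosts
        if h == site_domain or h.endswith("." + site_domain)
    )


# Hypothetical page source for illustration only.
sample = """
<html><head>
<script src="https://static.example.com/app.js"></script>
<link rel="stylesheet" href="//cdn.example.com/site.css">
</head><body>
<a href="/about">About</a>
<img src="https://images.example.net/logo.png">
</body></html>
"""

print(hosts_in_page(sample, "example.com"))
```

The resulting list is the candidate set of subdomains to allow; anything injected later by scripts never shows up in it.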

pmoriarty · on Nov 27, 2019

So, other than simply not using the site, is there anything a user can do to avoid third-party tracking at sites like these?

bscphil · on Nov 28, 2019

For now, you can simply block these subdomains in the ad blocker of your choice. The fundamental risk to ad blockers that these pose is that there will be too many subdomains to block in a list of reasonable size. But really that's not fundamentally different than other techniques like serving ads and real content from the same server, which has been happening for years.

(In theory, a site might generate new subdomains and DNS responses for them on the fly, making this approach unworkable. In practice however, most sites have moved entirely to HTTPS, which means that unless you give your ad host a certificate for *.yourdomain.com, all the subdomains have to be known in advance and show up in certificate transparency logs, making them easy to block.)

dspillett · on Nov 27, 2019

> anything a user can do to avoid third-party tracking at sites like these?

Not really, other than as you say simply going elsewhere. The trouble is most users don't know to go elsewhere as they don't know about the matter without digging.

You could start treating changes of sub-domain the same way cross-domain references are handled by tools that block 3rd party cookies, but there are plenty of sites that use multiple sub-domains which have genuine uses for shared cookies (single sign-on for instance) that might be broken by this, so you'll have an initial inconvenience of white-listing them. Also, if a previously white-listed site goes rogue, detecting that could be difficult, or at least fraught with false positives.

mirimir · on Nov 27, 2019

Use nested VPN chains and/or Tor.

Tor exit IPs trigger lots of CAPTCHAs and other abusive behavior. But if it's just that you want to prevent tracking, it's enough to run a VM that connects through a VPN service. Or a nested chain of them.

I mean, whoever can track the whatever about Mirimir, and I couldn't care less. Or any of the other personas that I use.

poitrus · on Nov 27, 2019

NextDNS blocks those trackers, see: https://medium.com/nextdns/nextdns-added-cname-uncloaking-su...

beagle3 · on Nov 27, 2019

That's like a tiny bandaid; in the next iteration they'll copy the A/AAAA records instead of CNAMEing them; that would make CNAME uncloaking useless _and_ save one DNS roundtrip reducing browser latency.

3xblah · on Nov 27, 2019

Without using CNAMEs the third party tracker IP addresses would be less dynamic making them easier to block with a firewall.
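A small sketch of the address-based check a firewall-style blocker would make once a tracker's addresses are stable. The network ranges are reserved documentation prefixes used as stand-ins, not real tracker ranges.

```python
import ipaddress


def in_blocked_networks(ip, blocked_networks):
    """True if ip falls inside any blocked network of the same IP version."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr in net
        for net in blocked_networks
        if net.version == addr.version
    )


# Placeholder ranges (documentation prefixes), for illustration only.
blocked = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

print(in_blocked_networks("203.0.113.7", blocked))
print(in_blocked_networks("198.51.100.1", blocked))
```

This only works as long as the tracker's addresses are not shared with content the user wants, which is the same caveat that applies to blocking by host name.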

xwvvvvwx · on Nov 27, 2019

Disable JavaScript.

drdaeman · on Nov 28, 2019

JavaScript, cookies and browser cache.

xwvvvvwx · on Nov 28, 2019

You can also use temporary containers [1] to present a clean and isolated cache & cookie store for every new website visit. This is both less fingerprintable and more usable than disabling cache and cookies.

It is also worth noting that caches other than the browser cache can also be abused to track users: http://dnscookie.com/

[1]: https://medium.com/@stoically/enhance-your-privacy-in-firefo...

lstamour · on Nov 28, 2019

Back when I was working for a site on their first-party in-house analytics, the only blocker to catch it was https://apps.apple.com/ca/app/better-blocker/id1080964978 (Better blocker) and since then, I’ve been running it on my phone. Incidentally, it was not a fun job to have and I left shortly after.

Better blocker is by https://ind.ie/ and the rules are online at https://better.fyi/

That said, it’s not comprehensive, so I run it alongside another blocker that uses more traditional rule sources (1Blocker). I find that a diversity of rule sources and sometimes simply building rules manually for individual sites is required. I do think this kind of lookup/service is very useful though in advancing the state of tracker blocking. Next we’ll probably have to do behavioural analysis until they move the trackers into site code or binary data flows of the rest of the site... there is a point at which you have to decide if it’s worth putting up with tracking to use a site or service... as much as I dislike it. If it’s integrated, at least it will be faster than third-party, I suppose.

throwawaymath · on Nov 27, 2019

This tool is limited. It cannot be used to assess subdomains. For example, checking news.ycombinator.com redirects to checking ycombinator.com.

nextdns · on Nov 27, 2019

Fixed, not sure why we did that in the first place.

https://trackingthetrackers.com/site/news.ycombinator.com

throwawaymath · on Nov 27, 2019

Cool, fast update!

ogre_codes · on Nov 27, 2019

What used to be the web is now a toxic wasteland filled with increasingly obnoxious advertising backed by increasingly creepy/invasive tracking. This makes using tools like Apple's News App greatly more appealing to me personally.

m463 · on Nov 27, 2019

It has been known for quite some time that CDNs like Akamai are basically 3rd party tracking mechanisms.

What's funny is they have basically gone 180 degrees from their original design.

Originally they were meant as caching mechanisms.

Now they are cache-busting tracking mechanisms.

hanniabu · on Nov 27, 2019

What if they have a tracking tracker buster?

nextdns · on Nov 27, 2019

This is a v1 of our analyzer, but it's already doing a lot of things to mimic a real user (like using a real browser, moving the mouse, scrolling, etc.).

hanniabu · on Nov 27, 2019

Lol sorry, didn't think this comment would be taken seriously, it's a reference from The Big Hit

https://youtu.be/Iw3G80bplTg