Show HN: TrackingTheTrackers – Is a website disguising its 3rd-party trackers? (trackingthetrackers.com)
124 points by nextdns on Nov 27, 2019 | 45 comments



From the site:

> While our analysis tool were not able to confirm that session cookies were sent as well, a long list of leaking cookies could mean that they would be. Anyone in possession of those cookies can impersonate you on that website — i.e., access your account.

I hadn't considered that before, but they're right: it's extremely easy to accidentally leak session cookies through first-party subdomains. I look forward to the inevitable conference talks discussing this vulnerability.


We now have trackers acting like first-party properties, but where do you draw the line between first-party and third-party? What I mean is, if I build my own in-house analytics app that does a lot of what Adobe's product does, should that be blocked too?

I specifically mean for site analytics, like Experience Cloud or GA, not serving ads. Ad block is different IMO.

If this is hosted on a first party subdomain you're already blocking the ability of it to set third-party cookies and track you across sites. So, in practical terms, what's the difference between this CNAME trick and building the same thing in-house to run your own analytics?


It depends on why you care about the distinction. If you're talking from a purely privacy perspective, the important questions are (1) are you taking analytics? (2) what are you doing with them? (3) is that in line with users' expectations?

Third party, or third party disguised as first party, is only problematic because of an implied "there's very little keeping the third party from using your data for things that aren't just analytics for the first party." It's the red flag for "this site may not be using your data the way you would want it to."

Third party ad trackers disguised as first party cookies specifically violate the general assumption that first party data stays first party, because those third parties have specific mechanisms to track you across multiple first parties.


> violate the general assumption that first party data stays first party

How? Because they can't do that with cookies.

I think the most prominent objection to 3rd party cookies is that they allow systematic tracking. These seem like they just help e.g. Apple or whoever understand what you're doing on that same site.


> How? Because they can't do that with cookies.

My point is exactly that: they can and are doing it (using first party cookies through CNAME redirects to correlate your identity across multiple independent sites in order to sell a data product to more than just the first party). For an example, see Criteo.

> I think the most prominent objection to 3rd party cookies is they allow systematic tracking.

And now first party cookies are starting to be used for that too. It's not that important how you specifically want to phrase the general ickiness of third party cookies.


There are plenty of other ways to fingerprint a user. https://amiunique.org/


That really doesn't answer the question. Munchberry claims these analytics tools have cross domain tracking and I'm asking how, precisely. In part because of professional interest, and in part because I don't actually think it's true.


Thanks for getting my username right. ;)

You specifically got one detail wrong: it's not just for analytics tools. It's the adtech industry in general using this technique, and Adobe offers its analytics as part of its marketing software suite.

From their own site: "What is Adobe Experience Cloud? It's a collection of best-in-class solutions for marketing, analytics, advertising, and commerce."


Apologies Munch bunny, gonna blame that on a need for new glasses.

fwiw, Adobe Experience Cloud is generally not the sort of adtech that attempts to sell information.


You're right, Adobe Experience Cloud doesn't sell information, so how problematic you find the product depends on where you draw the line on privacy.

Specifically, Adobe Experience Cloud definitely offers retargeting capabilities (ads following you around the internet) and the ability to get statistics on the effectiveness of that advertising. If they're at parity with competing marketing suites, then they also have attribution capabilities to track you with per-user, per-interaction granularity.

A site that serves Adobe Experience Cloud cookies in the third-party-disguised-as-first-party way is likely enabling this capability for all marketers that are going through Adobe Experience Cloud. So the interesting question would be whether you, a visitor to Fox.com, consider being watched by marketers who aren't Fox.com to be a privacy problem.


All of the above is more private than e.g. Google Analytics because of the lack of cross-domain tracking... I'd consider it a big improvement vis-a-vis Google's product suite.


"So, in practical terms, what's the difference between this CNAME trick and building the same thing in-house to run your own analytics?"

In practical terms, based on what we have seen up to now, few websites will run their own analytics.

Thus, focusing on the third party services is effective against most tracking.

The "CNAME trick" might be viewed as a good thing because it is putting evolutionary pressure on users to utilise DNS to "block" ads instead of using popular graphical web browsers to do the work. These browsers are written by companies or organisations that derive substantial revenue from online advertising.


> it is putting evolutionary pressure on users to utilise DNS to "block" ads instead of using popular graphical web browsers to do the work

How so? Browser and browser addons are perfectly capable of blocking these subdomains.


It's not that browser add-ons aren't capable of blocking them, it's that a) browser extensions generally don't have access to DNS resolution, b) even if they did, CNAME tracker resolution would increase overall latency due to multiple requests to identify trackers /outside/ of normal resolution, c) it's trivial for a host to create a new CNAME for a tracker, so automating this makes a lot of sense from their perspective in order to avoid blocklists.

Extensions now have to identify which CNAMEs are a front for which trackers, block the new tracker, and somehow manage to stay below the max rules-per-extension that the browser allows (30k for Chrome[0]). See the original nextdns.io post for details [1].

[0] https://www.xda-developers.com/google-chrome-manifest-v3-ad-...

[1] https://medium.com/nextdns/cname-cloaking-the-dangerous-disg... ( https://outline.com/Y6PKr3 )
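To make the "identify which CNAMEs are a front for which trackers" step concrete, here is a minimal Python sketch. It assumes the CNAME chain for a hostname has already been resolved by some other means (no DNS lookup happens here), and the suffix list and all hostnames are made-up placeholders, not real tracker domains.

```python
# Hypothetical tracker suffixes; a real blocklist would be much longer.
TRACKER_SUFFIXES = ("tracker.example", "ads.example")


def matches_suffix(host, suffix):
    """True if host equals suffix or is a subdomain of it."""
    host = host.rstrip(".").lower()
    suffix = suffix.lower()
    return host == suffix or host.endswith("." + suffix)


def cloaked_tracker(cname_chain, tracker_suffixes=TRACKER_SUFFIXES):
    """Return the first name in an already-resolved CNAME chain that
    matches a tracker suffix, or None if nothing matches."""
    for name in cname_chain:
        for suffix in tracker_suffixes:
            if matches_suffix(name, suffix):
                return name
    return None


# metrics.news.example looks first-party, but its CNAME target is not.
chain = ["metrics.news.example", "news-example.tracker.example"]
print(cloaked_tracker(chain))
```

The catch described above still applies: something has to produce that chain, and a browser extension generally can't.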


> it's trivial for a host to create a new cname for a tracker, so automating this makes a lot of sense from their perspective in order to avoid blocklists.

This is the part I'm skeptical of. If your site is HTTPS, and a user forbids mixed content, all your ads have to be served over HTTPS. That means whenever you create a new CNAME, the ad server (not run by you) has to generate a new certificate, or else you have to give them a cert for *.yourdomain.com. Both are pretty big asks. Even if the ad server can automatically generate new certs, they're going to show up in a transparency log and can be automatically added to blocklists based on that.

So I'm not sure it's going to make much of a difference which blocking method people use. All the Easylist (or whoever) folks have to do is add new subdomains to their blocklist - the exact same thing an ad blocking DNS provider is doing.

Edit: the process for obtaining a certificate from Adobe for a new tracking CNAME looks absolutely excruciating, which is good evidence for my point. https://docs.adobe.com/content/help/en/core-services/interfa...


"... the exact same thing an ad blocking DNS provider is doing."

Just to be clear, I was not advocating using third party DNS.

Using djbdns I can just "block" all non-www subdomains for a domain with a single line in a zone file (if using tinydns) or a single byte file in a directory (if using dnscache) and then add entries in the zone file for any specific subdomains I want to "allow".

It is only an opinion, but I think the ability to wildcard all subdomains makes DNS-based methods of blocking trackers easier to manage and allows it to scale better than having to list every tracker subdomain in a blocklist.
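As a rough illustration of the policy (not of djbdns itself), the "block every subdomain, then allow a few" rule can be sketched in Python. The function name and the choice to let the bare domain through are assumptions made for this example; the actual zone-file entries are not shown here.

```python
def is_allowed(host, domain, allowed_subdomains=("www",)):
    """Wildcard policy for one domain: every subdomain is blocked
    unless its label is explicitly allowed.

    Assumption for this sketch: the bare domain itself is allowed,
    and hosts outside the domain are not covered by the policy.
    """
    host = host.rstrip(".").lower()
    domain = domain.lower()
    if host == domain:
        return True
    if not host.endswith("." + domain):
        return True  # different domain, policy does not apply
    label = host[: -len(domain) - 1]  # part to the left of ".domain"
    return label in allowed_subdomains


for h in ("www.example.org", "metrics.example.org", "example.org"):
    print(h, is_allowed(h, "example.org"))
```

The point is that a newly minted tracker subdomain is blocked by default without anyone having to list it.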


> Just to be clear, I was not advocating using third party DNS.

Well, you weren't, but the OP is a Show HN by a company offering such a service.

> It is only an opinion, but I think the ability to wildcard all subdomains makes DNS-based methods of blocking trackers easier to manage

I dunno, I think for most people uBlock rules are going to be easier to handle than setting up their own DNS server. Sure, I have my own resolver too (Unbound), but since you need an ad blocker on top of that anyway, I just keep my rules in uBlock. The following is all you need to block all subdomains but allow the bare domain:

  ||example.org^
  @@*/example.org/*

Or to allow www:

  @@www.example.org^


With uBlock, is it possible to block all subdomains but allow a specific subdomain?

A resolver (e.g., unbound) is only one half of the DNS method I use. The other is an authoritative nameserver (e.g., nsd). For my own purposes, the resolver is optional.


Yes, that's just this:

  ||example.org^
  @@www.example.org^

> The other is an authoritative nameserver (e.g., nsd). For my own purposes, the resolver is optional.

True, although I imagine for most people the nameserver part of it is the more optional. DNS ad blocking software tends to be a recursive resolver that returns 0.0.0.0 results for some unwanted domains. Unbound has the ability to do that (for the few domains I'm filtering entirely), and so I've stuck with that.


It is no wonder that uBlock is so popular.

Not sure I understand returning 0.0.0.0. What if the user has some other servers listening?

I return the address of some server I control that is bound to a local address, e.g., an authoritative nameserver.

Compared to the available solutions this is way too much work for "most people", however from a purist perspective a self-managed DNS approach is not under the ultimate control of a browser-authoring, extension/app-approving company/organisation or some third party DNS provider.

Whether that even matters is debatable.

As long as these easy solutions keep working, there's no incentive to try a different approach.


All analytics should be blocked.



Seems like most of these use Adobe Experience Cloud. I'm guessing they offer some kind of HOWTO to set it up as disguised.

Good on Adobe for keeping it up. Flash wasn't enough to fuck the internet up for years.



> In order to circumvent tracking limitations imposed by browsers and programs, you can implement first-party cookies.

Well, at least they're forthcoming about knowing they're intentionally circumventing users' privacy settings.


If we request each of those pages, e.g., foxnews.com, cnn.com, bbc.co.uk, etc. with a simple HTTP client (no JavaScript engine) and we look at the subdomains that appear in each page, we see that in each case the tracker links are absent.

The requests to the trackers rely on JavaScript being enabled. What happens when JavaScript is disabled?

Further, we can easily list the subdomains that do appear in the page and, assuming they are being used to host content that we want, we can "whitelist" them in our zone file.
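For anyone who wants to try the "list the subdomains that appear in the page" step, here is a small standard-library Python sketch. It only parses HTML that has already been fetched (no JavaScript is run, which is the whole point), and the sample markup and hostnames are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HostCollector(HTMLParser):
    """Collects hostnames found in href/src attributes of static HTML."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                host = urlparse(value).hostname  # None for relative URLs
                if host:
                    self.hosts.add(host)


def hosts_in_html(html):
    """Return the sorted hostnames referenced by the given HTML text."""
    collector = HostCollector()
    collector.feed(html)
    return sorted(collector.hosts)


page = (
    '<a href="https://www.example.org/a">x</a>'
    '<script src="//static.example.org/app.js"></script>'
    '<img src="/logo.png">'
)
print(hosts_in_html(page))
```

The resulting list is a starting point for the whitelist; anything that only shows up after JavaScript runs never makes it in.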


So, other than simply not using the site, is there anything a user can do to avoid third-party tracking at sites like these?


For now, you can simply block these subdomains in the ad blocker of your choice. The fundamental risk to ad blockers that these pose is that there will be too many subdomains to block in a list of reasonable size. But really that's not fundamentally different than other techniques like serving ads and real content from the same server, which has been happening for years.

(In theory, a site might generate new subdomains and DNS responses for them on the fly, making this approach unworkable. In practice however, most sites have moved entirely to HTTPS, which means that unless you give your ad host a certificate for *.yourdomain.com, all the subdomains have to be known in advance and show up in certificate transparency logs, making them easy to block.)
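A minimal Python sketch of the "watch the transparency logs" idea, assuming you already have a list of DNS names pulled from logged certificates (how you fetch them is left out, and every name below is a placeholder):

```python
def new_subdomains(ct_names, domain, known):
    """Return subdomains of `domain` that appear in certificate names
    but are not in the `known` set, sorted alphabetically."""
    domain = domain.lower()
    found = set()
    for name in ct_names:
        name = name.strip().lower().rstrip(".")
        if name.startswith("*."):
            continue  # a wildcard cert does not reveal individual hosts
        if name.endswith("." + domain) and name not in known:
            found.add(name)
    return sorted(found)


names = ["www.example.org", "smetrics.example.org", "*.example.org", "other.example.net"]
print(new_subdomains(names, "example.org", known={"www.example.org"}))
```

Note the wildcard case is skipped on purpose: that is exactly the situation where this approach tells you nothing.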


> anything a user can do to avoid third-party tracking at sites like these?

Not really, other than as you say simply going elsewhere. The trouble is most users don't know to go elsewhere as they don't know about the matter without digging.

You could start treating changes of sub-domain the same way cross-domain references are handled by tools that block 3rd party cookies. But plenty of sites use multiple sub-domains with genuine uses for shared cookies (single sign-on, for instance) that this might break, so you'll have the initial inconvenience of white-listing them. Also, if a previously white-listed site goes rogue, detecting that could be difficult, or at least fraught with false positives.


Use nested VPN chains and/or Tor.

Tor exit IPs trigger lots of CAPTCHAs and other abusive behavior. But if it's just that you want to prevent tracking, it's enough to run a VM that connects through a VPN service. Or a nested chain of them.

I mean, whoever can track the whatever about Mirimir, and I couldn't care less. Or any of the other personas that I use.



That's like a tiny bandaid; in the next iteration they'll copy the A/AAAA records instead of CNAMEing them; that would make CNAME uncloaking useless _and_ save one DNS roundtrip reducing browser latency.


Without using CNAMEs the third party tracker IP addresses would be less dynamic making them easier to block with a firewall.
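If trackers did move to fixed A/AAAA records, the check could be as simple as the Python sketch below. It assumes the addresses have already been resolved and that you maintain your own list of tracker networks; the ranges used here are reserved documentation ranges, not real tracker addresses.

```python
import ipaddress


def in_blocked_networks(addresses, blocked_cidrs):
    """Return the addresses that fall inside any blocked network."""
    networks = [ipaddress.ip_network(cidr) for cidr in blocked_cidrs]
    hits = []
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        # Compare only against networks of the same IP version.
        if any(ip.version == net.version and ip in net for net in networks):
            hits.append(addr)
    return hits


blocked = ["192.0.2.0/24", "2001:db8::/32"]  # placeholder ranges
resolved = ["192.0.2.10", "198.51.100.5", "2001:db8::1"]
print(in_blocked_networks(resolved, blocked))
```

Whether this works in practice depends on the tracker not sharing addresses with content you actually want.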


Disable JavaScript.


JavaScript, cookies and browser cache.


You can also use temporary containers [1] to present a clean and isolated cache & cookie store for every new website visit. This is both less fingerprintable and more usable than disabling cache and cookies.

It is also worth noting that caches other than the browser cache can also be abused to track users: http://dnscookie.com/

[1]: https://medium.com/@stoically/enhance-your-privacy-in-firefo...


Back when I was working for a site on their first-party in-house analytics, the only blocker to catch it was https://apps.apple.com/ca/app/better-blocker/id1080964978 (Better blocker) and since then, I’ve been running it on my phone. Incidentally, it was not a fun job to have and I left shortly after.

Better blocker is by https://ind.ie/ and the rules are online at https://better.fyi/

That said, it’s not comprehensive, so I run it alongside another blocker that uses more traditional rule sources (1Blocker). I find that a diversity of rule sources and sometimes simply building rules manually for individual sites is required. I do think this kind of lookup/service is very useful though in advancing the state of tracker blocking. Next we’ll probably have to do behavioural analysis until they move the trackers into site code or binary data flows of the rest of the site... there is a point at which you have to decide if it’s worth putting up with tracking to use a site or service... as much as I dislike it. If it’s integrated, at least it will be faster than third-party, I suppose.


This tool is limited. It cannot be used to assess subdomains. For example, checking news.ycombinator.com redirects to checking ycombinator.com.


Fixed, not sure why we did that in the first place.

https://trackingthetrackers.com/site/news.ycombinator.com


Cool, fast update!


What used to be the web is now a toxic wasteland filled with increasingly obnoxious advertising backed by increasingly creepy/ invasive tracking. This makes using tools like Apple's News App greatly more appealing to me personally.


It has been known for quite some time that CDNs like Akamai are basically 3rd party tracking mechanisms.

What's funny is they have basically gone 180 degrees from their original design.

Originally they were meant as caching mechanisms.

Now they are cache-busting tracking mechanisms.


What if they have a tracking tracker buster?


This is a v1 of our analyzer, but it's already doing a lot of things to mimic a real user (like using a real browser, moving the mouse, scrolling, etc.).


Lol sorry, didn't think this comment would be taken seriously, it's a reference from The Big Hit

https://youtu.be/Iw3G80bplTg



