Preventing Tracking Prevention Tracking (webkit.org)
212 points by om2 on Dec 10, 2019 | 74 comments


Some of the people I've talked with over the years study things like nuclear weapons arms control or cyberwarfare. The most paranoid of the bunch have resorted to having Virtual Private Servers screenshot websites with headless browsers once they load and pipe the images back to their research machine. I can't remember if it's a table of PNGs or just one big one, but either way it's sent back over an SSH tunnel, and when you click, the server knows what you're trying to click on and performs the action for you, and will randomly forward the click to a new VPS.

It's not perfect because the IP blocks make it obvious that it comes from DigitalOcean, AWS, etc, but it's sure better than loading untrusted PDFs or JS locally. Still vulnerable to a network attack, though.
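
A rough sketch of the render-on-the-VPS half of that idea, assuming Python with Selenium and geckodriver installed on the remote box (the function name and paths are illustrative, not their actual setup):

    from selenium import webdriver

    def render(url, out_path="page.png"):
        opts = webdriver.FirefoxOptions()
        opts.add_argument("--headless")
        driver = webdriver.Firefox(options=opts)
        try:
            # The untrusted page loads and runs on the VPS, not locally.
            driver.get(url)
            # This PNG is what gets shipped back over the SSH tunnel.
            driver.save_screenshot(out_path)
        finally:
            driver.quit()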


Sounds like Stallman

>I generally do not connect to web sites from my own machine, aside from a few sites I have some special relationship with. I usually fetch web pages from other sites by sending mail to a program (see https://git.savannah.gnu.org/git/womb/hacks.git) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it

https://stallman.org/stallman-computing.html


How does this stop something as simple as user-unique URLs for each link? A new VPS that fetches a unique URL is trivial to tie to the same user.


open multiple browser sessions for the user, and randomly choose one of them as the 'result' (but still click on all of them, even if the resultant page isn't viewed).

Or, just don't use the website if they do this.


I keep thinking someone will reboot Opera's mini web browser for this purpose. (Their intermediate server renders the target website to an image.)

I also anticipate someone will do smart diffing on target websites to better auto nuke ads, trackers, etc.


Isn't that worse, a big brother in the middle watching everything and even doing TLS termination? Unless it's running on a Tor-like distributed system?


Much belated response, sorry.

I just don't know. I've stopped using VPNs for this very reason.


This type of tracking seems to assume the user is not bothering to send a fake Referer, e.g. she can just use the URL she is requesting, or just omit the header. One could argue such users are "low-hanging fruit".

Very few websites will vary the response if there is no Referer. Sending it really offers little benefit to the user.

Setting up a "headless" browser also seems like overkill. Firefox 57 and later has a -screenshot command line option which saves a PNG. No need to launch X11 for this to work.


Payment flows often require a specific referrer.


Solution: Send a Referer when making payments, i.e., when using the web for commerce.

No need to send one when using the web for recreation.


So they're taking screenshots via the VM console? Why not just directly interact with the VM console, then?


If they’re forwarding each click to a different VM to avoid persistent tracking then that wouldn’t work.



Why don't they use isolated laptops with only 4G access or dedicated external line?


If you want this in Firefox you need to tweak an about:config setting. I really hope it becomes the default at some point.

    # Only send the origin cross-domain.
    network.http.referer.XOriginTrimmingPolicy = 2
This alone is a pretty liberal policy. People in this crowd probably want even more which can be found here: https://wiki.mozilla.org/Security/Referrer
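
To make the effect concrete: with that setting, a cross-origin request made from a page like https://example.com/private/page?id=42 should carry only the origin rather than the full path and query, i.e. something like

    Referer: https://example.com/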


Why does this header need to exist in the first place? Seems like a huge privacy breach. Why can't 0 be the default setting?


I can't speak to why it was originally defined, but since the Referer [sic] header has existed for decades, many sites depend on it to function. The Smart Referer extension whitelist[1] and bug tracker[2] have several examples.

1. https://gitlab.com/smart-referer/smart-referer/blob/gh-pages...

2. https://gitlab.com/smart-referer/smart-referer/issues?scope=...


> I can't speak to why it was originally defined, but since the Referer [sic] header has existed for decades

I can remember my Dad getting an email from someone he linked to who was about to move his website and who politely contacted his neighbors on the internet so they could update their links.

Very useful at that time.


It can still be useful for that kind of thing. When I notice an unexpected spike of traffic on one of our sites I'll often look at our analytics to see where it came from and then potentially drop in there to answer comments and such. Not to say that's worth the privacy trade-off though, unfortunately.


Believe it or not, there actually exist websites that rely on the Referer header for navigation. The last time I bumped into this was a few years ago, but a local government site refused to work unless my browser sent that header.

Granted, this is probably rare enough that it's safe to disable the header for the vast majority of websites, but it's something to keep in mind.


Atlassian requires it for Jira (& other bits of their crap) logins to work.


I'm honestly not surprised.

Judging solely by the UI, I actually kinda like Atlassian's tools, but they're a huge pain in the ass to get working with privacy extensions installed (uMatrix, uBlock, etc.). They make cross-site requests all over the place (to weird servers like "some-huge-name-that-obscures-the-host-name.atl-pass.net", and even some third party servers!), tons of JavaScript and CSS for basic features, etc. Using dubious features like referer headers seems right up their alley.

It's one of the main reasons I only use them at work, and won't use them for my personal projects. I'd rather pay for GitHub and Sourcehut so I don't feel like I'm opening my browser up to a bunch of security problems.

In the past they've also made some really brain-dead (IMO) decisions, like going out of their way to break middle-click paste on Linux.


>They make cross-site requests all over the place (to weird servers like "some-huge-name-that-obscures-the-host-name.atl-pass.net", and even some third party servers!), tons of Javascript and css for basic features, etc

If you like this, you should try Microsoft. They combine this crap with endless redirects. Usually, I give up after 5 minutes of whitelisting and redirects.


Beyond what other people mentioned, some sites and frameworks also rely on the Referer header as part of CSRF protection. It's not truly necessary to check, but it's an OWASP recommendation so it seems like a decent number of places implemented it by default.
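
As a rough illustration, here's a generic version of that kind of origin check in Python (not Pyramid's actual implementation; the expected origin is a placeholder):

    from urllib.parse import urlparse

    def referer_matches_origin(referer, expected_origin="https://example.com"):
        # Strict variant: a missing Referer is rejected outright, which is
        # exactly what trips up users who strip the header.
        if not referer:
            return False
        parsed = urlparse(referer)
        return "{}://{}".format(parsed.scheme, parsed.netloc) == expected_origin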

I recently got the Pyramid Python framework to make it possible to disable Referer-checking for the built-in CSRF protection, but they're still going to keep requiring the header by default: https://github.com/Pylons/pyramid/issues/3508

More discussion about it in these pull requests too:

https://github.com/Pylons/pyramid/pull/3512

https://github.com/Pylons/pyramid/pull/3518

The new version with it being optional hasn't been released yet, so as of right now almost everyone using Pyramid will still require users to send a Referer header to get past any CSRF checks.


I had an old website hosted under www. When it was decided to build a new website, to preserve the old content, the new site was built without a leading subdomain.

The problem was that Chrome cached www as the default for anyone who'd visited the old site, and had started hiding www from the address bar.

I used Caddy to redirect all requests to the subdomain-free site unless the request came with a referrer from that site, which fixed the caching issue and allowed free navigation between and within both the old and new sites.


> Origin-Only Referrer For All Third-Party Requests

This is going to break a lot of things. Things that probably should be broken, but it will cause headaches nonetheless.


Luckily if a big browser makes this the default, these things will probably be fixed.


Conversely, if a big browser makes a new default that ends up being the wrong decision, that default might spread to other browsers and things will definitely be broken.

The css value `100vh` meant the height of the viewport of the browser, until it didn't.


> The css value `100vh` meant the height of the viewport of the browser, until it didn't.

Huh, what's it mean now? Is there some subtle difference, like it doesn't include the horizontal scroll bar or something?


Mobile devices interpret it differently because of the hide/show browser UI they often have.


I’m pretty sure that’s only true for Chrome at this point.


I hope there is a light at the end of the tunnel for all of this. It seems like there will always be a cat-and-mouse effort to stay just one step ahead of the other. Like how many websites have those popups now where they ask you to turn off ad-blocking. Intrusive ads and website tracking should both be treated as a problem by default. I guess not all ads are a problem, but I'm not sure the same can be said about tracking...


We're willing to play the cat and mouse game indefinitely, if that's what it takes. Widely deployed trackers are limited in how fast they can try new tricks. And in practice, we know that ITP is working pretty well to block cross-site tracking: https://daringfireball.net/linked/2019/12/09/the-information...


> Widely deployed trackers are limited in how fast they can try new tricks.

How so? Tracking scripts are often included by a script tag that points at a website. Can’t the code be updated, “deployed” to websites immediately, and take advantage of the relatively slower release cycle of Safari?


Maybe I should have said that some tricks are slow to deploy.

Sometimes the publisher only embeds an image from the tracker (the famed "tracking pixel"). Getting lots of sites to change that to a script is a pain. Sometimes they need to deploy new server-side tech for a workaround. For the recent CNAME cloaking trick, they have to get sites to modify their DNS and change the URL they embed the script from.


You're doing good work, thank you.


2021: "Preventing Tracking Prevention Tracking Prevention Tracking"


[2019/12/12] [Hotfix] Pre-Emptive Tracking of Track-Preventative Tracking Users by Home Address


Whether it’s a light or not, the end of the tunnel is in sight: it’s the ads becoming the content.


This is so prevalent already. Brands disguised as users posting "content" that is mostly just an advert for their brand.

It has got to the point where any time someone posts something that shows a brand name a bit too clearly or speaks a bit too highly of a product, I suspect it's the PR people at work and I downvote it.


The old way: tracking you as you look for snowboarding videos on the Web and advertising you a snowmobile wherever you go.

The new way: making sure that 95% of the snowboarding videos you see are subliminally designed to sell you a snowboard (the guy riding the competitor’s snowboard goes slower and crashes... the guy riding your company’s snowboard wins the race and his girlfriend looks like a supermodel)

I think eventually we will pine for the old way. Already you can’t get useful reviews anymore because all of the “comparison” searches are run by manufacturer mouthpieces.


>I think eventually we will pine for the old way. Already you can’t get useful reviews anymore because all of the “comparison” searches are run by manufacturer mouthpieces.

Absolutely. A lot of reviews these days in Google results read like they were written by someone who has only ever read the feature list on the marketing page. There is a bit of a search engine hack where you just put "reddit" after any search and it brings up fairly real results, for now.


I find reviews useful anyway. I simply ignore the "good" reviews and always look at the worst ones. There are three kinds of bad reviews:

(I) people who had random bad stuff happen (postal service broke it) that is irrelevant and think everyone should know;

(II) people with some sort of vendetta (possibly disgruntled employees, or competitors, or crazy customers);

(III) people who actually had a bad experience that might be characteristic of the product's quality or design.

If the third category can be used to construct a narrative about something that is a deal-breaker, then that's the information I'm looking for. Of course, it has to be taken in context of the competitors.

My expectation is that the best products have some type (I) and type (II) bad reviews, but no type (III). Almost as good is something with type (III) that are about something that either doesn't matter to me or is actually a positive from my perspective.


I'm pretty good at finding decent reviews. I'd never post my process on a public forum, but I apparently don't have as much trouble avoiding sponcon as a lot of others.


If people are doing that, it’s fairly certain that there are marketers maintaining “humanoid” Reddit accounts which then chime in with opinions on Bluetooth headphones.


No doubt. So far the only defense against crap products is buying them from a physical store so you can test and easily return them and having a good warranty.


There’s a marketing term for it, too. It’s called “native advertising.”


"Native advertising" was originally ads that were served by the site itself, first-party. It's advertisers who have co-opted it (yet again), but that doesn't mean you have to go along with their preferences. Typing isn't hard and "sponcon" (sponsored content) is already perfectly cromulent jargon, as well as being shorter.


I've said it before, and I'll say it again: much of the content I want to consume is basically advertising, but the way today's internet works, I have to view ads for stuff I don't want in order to see the ads I want to see. And for some reason, people call avoiding this "stealing".


And how much real functionality will be sacrificed to this war?


>> Like how many websites have those popups now where they ask you to turn off ad-blocking

Handle them the same way as websites with a cookie or GDPR warning that blocks everything else: vote with your wallet by leaving the site and finding another site instead.


> ITP now downgrades all cross-site request referrer headers to just the page’s origin

What is meant by cross-site here? Does it mean a different eTLD+1, or a different origin (as used by CORS)?

Specifically, if I make a request from https://www.example.com/path?query to https://api.example.com will the referer header contain the "/path?query"? or will that get blocked as well?



So what's next? Tracking the Prevention of Tracking Prevention?

Honestly, this shit gets confusing, can someone please ML us out of it? Or maybe we just design a sane and understandable First-Party only policy?


It's impossible to build a perfect system; even ML could have a bias towards a certain solution, or the bad guys could ML a way to track us again.


It's funny how our brains have a kind of built-in ad blocker called banner blindness. There have been a few times I was unable to understand a UI because the important part was a rectangle and too prominent, so I ignored it entirely without realizing it.


Why do you think advertisers moved to moving ads, ads that fade in over the page once you scroll a little and can be assumed to be focused on the page, reading? Autoplay video that moves down to the picture-in-picture corner? The more annoyingly distracting the ad is, the better. Or so advertisers think.


You might need a sarcasm detector.


An ML based one of course.


cat and mouse game because no software is perfect, yet


Intelligent Tracking Prevention uses a machine learning classifier.


Why can't a browser solve this (except for IP) by simply having an option to not leak any data? Make audio and GL calls constant time, and don't persist anything past the tab / window / site? No fonts or cache reuse beyond the host? No referrers etc.

What's the hard problem here that prevents major browsers from having an option like this?


Firefox has an about:config preference called "privacy.resistFingerprinting" that enables some of Tor Browser's mitigations against fingerprinting. Tor Browser is based on Firefox, and Mozilla merges some of Tor Browser's changes back into Firefox to make updating easier for the Tor team.
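
For reference, it's a single boolean in about:config, off by default:

    privacy.resistFingerprinting = true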

More details in this ghacks article:

https://www.ghacks.net/2018/03/01/a-history-of-fingerprintin...


The difficult bit is that your browser is programmable and browsers are different across vendors, devices and releases. This means whatever a bad guy can think of as a test can be sent back over the wire, and you can't realistically block sites from sending data back to servers. So long as there are different browsers etc, there will be tests that can differentiate between them.

Currently canvas fingerprinting [1] is a popular option, but there are quite a lot of candidates for the next thing you could use. Even generic code execution time could be used to an extent. Realistically there is hope at the end of the tunnel, but it's a very long way off given just how complex a corner we've painted ourselves into with modern web standards.

While it wouldn't stop purely malicious actors, I personally think it might be easier to address the whole situation on the legislative side rather than with technology. Imagine GDPR, except tracking would be illegal altogether: there will always be actors who work to bypass it, but the majority would do their best to conform, lest they go bust from fines.

[1] https://en.wikipedia.org/wiki/Canvas_fingerprinting


Of course, Google suggests modifications that would hinder their competitors, but not themselves. I wonder what percentage of browsers have a first-party cookie from Google?


If Google had any motive besides research and responsible disclosure, it would more likely be to persuade us that ITP is not viable. But I think their issue was fair and submitted in good faith.


> We’d like to thank Google for sending us a report in which they explore both the ability to detect when web content is treated differently by tracking prevention and the bad things that are possible with such detection.

It's interesting that Google, being an ad-tech company, is doing something against their own interest.


They have an app store they earn money from. They have an ads system. For them, websites are competition: people spend their time elsewhere than in apps, and they click on ads other than theirs.

How should anyone believe these actions are for privacy? And not against competition? Against the Internet?

Have you seen any consideration of how it will impact website owners? I haven't. It seems they really don't care, and that is very dangerous.

It looks like the path to break the Internet.


Apple doesn't have an ads system, unless you mean the App Store ads that only show up in App Store search (and thus are completely unrelated to Safari). They had an ads system at one point but it was shut down over 3 years ago, and was for in-app ads anyway, not browser ads.


Apple also runs ads in Apple News.


True, I was thinking about Apple and Google at the same time.


Those are not mutually exclusive, both can be true


Why not both? The market is a brilliant system. Privacy is objectively good for users and so is competition as it can lead to lower prices and better products for users.


The word "can" is doing a lot of heavy lifting there.



