Some of the people I've talked with over the years study things like nuclear arms control or cyberwarfare. The most paranoid of the bunch have resorted to having Virtual Private Servers load websites in headless browsers, screenshot each page once it loads, and pipe the image back to their research machine. I can't remember if it's a grid of PNG tiles or just one big one, but either way it's sent back over an SSH tunnel, and when you click, the server knows what you're trying to click on and performs the action for you, randomly forwarding the click to a new VPS.
It's not perfect, because the IP blocks make it obvious that the traffic comes from DigitalOcean, AWS, etc., but it's still a lot better than loading untrusted PDFs or JS locally. Still vulnerable to a network attack, though.
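For anyone curious, the rendering half of a setup like that can be sketched in a few lines. This is a drastic simplification that uses Playwright as the headless browser purely as an example (no idea what they actually run); it would live on the VPS, with the PNG shipped back over the SSH tunnel and the click forwarding handled separately:

    # Drastically simplified sketch of the render-on-a-VPS step.
    # Playwright is an example choice here, not necessarily what they use.
    from playwright.sync_api import sync_playwright

    def render(url, out="/tmp/page.png"):
        with sync_playwright() as p:
            browser = p.chromium.launch()                 # headless by default
            page = browser.new_page()
            page.goto(url, wait_until="networkidle")      # let the page settle
            page.screenshot(path=out, full_page=True)     # one big PNG rather than tiles
            browser.close()
        return out

    if __name__ == "__main__":
        print(render("https://example.com"))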
>I generally do not connect to web sites from my own machine, aside from a few sites I have some special relationship with. I usually fetch web pages from other sites by sending mail to a program (see https://git.savannah.gnu.org/git/womb/hacks.git) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it
Open multiple browser sessions for the user, and randomly choose one of them as the 'result' (but still click on all of them, even if the resultant page isn't viewed).
Isn't that worse, a big brother in the middle watching everything and even doing TLS termination? Unless it's running on a Tor-like distributed system?
This type of tracking seems to assume the user isn't bothering to send a fake Referer; e.g. she could just set it to the URL she is requesting, or omit the header entirely. One could argue such users are "low-hanging fruit".
Very few websites will vary the response if there is no Referer. Sending it really offers little benefit to the user.
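For illustration, both options are trivial with something like Python's requests library (the URL here is just a placeholder):

    # requests sends no Referer by default; "faking" one is just another header.
    import requests

    url = "https://example.com/article"

    r_omitted = requests.get(url)                           # no Referer at all
    r_faked = requests.get(url, headers={"Referer": url})   # Referer set to the requested URL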
Setting up a "headless" browser also seems like overkill. Firefox 57 and later has a -screenshot command line option which saves a PNG. No need to launch X11 for this to work.
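A rough sketch of that, wrapped in Python for scripting convenience; the output path and URL are placeholders:

    # Headless Firefox writes a PNG of the rendered page; no X server needed.
    import subprocess

    subprocess.run(
        ["firefox", "--headless", "--screenshot", "shot.png", "https://example.com"],
        check=True,
    )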
I can't speak to why it was originally defined, but since the Referer [sic] header has existed for decades, many sites depend on it to function. The Smart Referer extension whitelist[1] and bug tracker[2] have several examples.
> I can't speak to why it was originally defined, but since the Referer [sic] header has existed for decades
I can remember my Dad getting an email from someone he had linked to who was about to move his website and was politely contacting his neighbors on the internet so they could update their links.
It can still be useful for that kind of thing. When I notice an unexpected spike of traffic on one of our sites I'll often look at our analytics to see where it came from and then potentially drop in there to answer comments and such. Not to say that's worth the privacy trade-off though, unfortunately.
Believe it or not, there actually exist websites that rely on the Referer header for navigation. The last time I bumped into this was a few years ago, but a local government site refused to work unless my browser sent that header.
Granted, this is probably rare enough that it's safe to disable the header for the vast majority of websites, but it's something to keep in mind.
Judging solely by the UI, I actually kinda like Atlassian's tools, but they're a huge pain in the ass to get working with privacy extensions installed (uMatrix, uBlock, etc.). They make cross-site requests all over the place (to weird servers like "some-huge-name-that-obscures-the-host-name.atl-pass.net", and even some third-party servers!), tons of JavaScript and CSS for basic features, etc. Using dubious features like Referer headers seems right up their alley.
It's one of the main reasons I only use them at work, and won't use them for my personal projects. I'd rather pay for GitHub and Sourcehut so I don't feel like I'm opening my browser up to a bunch of security problems.
In the past they've also made some really brain-dead (IMO) decisions, like going out of their way to break middle-click paste on Linux.
>They make cross-site requests all over the place (to weird servers like "some-huge-name-that-obscures-the-host-name.atl-pass.net", and even some third-party servers!), tons of JavaScript and CSS for basic features, etc
If you like this, you should try Microsoft. They combine this crap with endless redirects. Usually I give up after five minutes of whitelisting and chasing redirects.
Beyond what other people mentioned, some sites and frameworks also rely on the Referer header as part of CSRF protection. It's not truly necessary to check, but it's an OWASP recommendation so it seems like a decent number of places implemented it by default.
I recently got the Pyramid Python framework to make it possible to disable Referer-checking for the built-in CSRF protection, but they're still going to keep requiring the header by default: https://github.com/Pylons/pyramid/issues/3508
More discussion about it in these pull requests too:
The version that makes it optional hasn't been released yet, so as of right now almost everyone using Pyramid still requires users to send a Referer header to get past any CSRF checks.
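For anyone who hasn't run into it, the check itself is conceptually tiny. A rough illustration of the idea (not Pyramid's actual implementation; the trusted origin is a placeholder):

    # Rough illustration of Referer-based CSRF checking, not Pyramid's actual code.
    from urllib.parse import urlsplit

    TRUSTED = ("https", "example.com")   # this site's scheme and host (placeholder)

    def referer_check_passes(method, referer):
        if method in ("GET", "HEAD", "OPTIONS"):
            return True      # only state-changing requests are guarded
        if not referer:
            return False     # rejecting a missing header is what trips up privacy-conscious users
        parts = urlsplit(referer)
        return (parts.scheme, parts.hostname) == TRUSTED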
I had an old website hosted under www. When it was decided to build a new website, the new site was built without a leading subdomain so the old content could be preserved.
The problem was that Chrome had cached www as the default for anyone who'd visited the old site, and had started hiding www from the address bar.
I used Caddy to redirect all requests to the subdomain-free site unless the request came with a referrer from that site, which fixed the caching issue and allowed free navigation between and within both the old and new sites.
Conversely, if a big browser makes a new default that ends up being the wrong decision, that default might spread to other browsers and things will definitely be broken.
The CSS value `100vh` meant the height of the browser's viewport, until it didn't.
I hope there is a light at the end of the tunnel for all of this. It seems like there will always be a cat-and-mouse effort to stay one step ahead of the other side. Like how many websites have those popups now where they ask you to turn off ad-blocking. Intrusive ads and website tracking should both be treated as problems by default. I guess not all ads are a problem, but I'm not sure the same can be said about tracking...
We're willing to play the cat and mouse game indefinitely, if that's what it takes. Widely deployed trackers are limited in how fast they can try new tricks. And in practice, we know that ITP is working pretty well to block cross-site tracking: https://daringfireball.net/linked/2019/12/09/the-information...
> Widely deployed trackers are limited in how fast they can try new tricks.
How so? Tracking scripts are often included via a script tag that points at the tracker's site. Can't the code be updated there, "deployed" to websites immediately, and take advantage of Safari's relatively slower release cycle?
Maybe I should have said that some tricks are slow to deploy.
Sometimes the publisher only embeds an image from the tracker (the famed "tracking pixel"). Getting lots of sites to change that to a script tag is a pain. Sometimes they need to deploy new server-side tech for a workaround. For the recent CNAME cloaking trick, they have to get sites to modify their DNS and change which URL they embed the script from.
This is so prevalent already. Brands disguised as users posting "content" that is mostly just an advert for their brand.
It has gotten to the point where any time someone posts something that shows a brand name too clearly or speaks too highly of a product, I suspect it's the PR people at work and I downvote it.
The old way: tracking you as you look for snowboarding videos on the Web and advertising you a snowmobile wherever you go.
The new way: making sure that 95% of the snowboarding videos you see are subliminally designed to sell you a snowboard (the guy riding the competitor's snowboard goes slower and crashes... the guy riding your company's snowboard wins the race and his girlfriend looks like a supermodel).
I think eventually we will pine for the old way. Already you can’t get useful reviews anymore because all of the “comparison” searches are run by manufacturer mouthpieces.
>I think eventually we will pine for the old way. Already you can’t get useful reviews anymore because all of the “comparison” searches are run by manufacturer mouthpieces.
Absolutely. A lot of reviews in Google results these days read like they were written by someone who has only ever seen the feature list on the marketing page. There's a bit of a search-engine hack where you just put "reddit" after any search, and for now it brings up fairly genuine results.
I find reviews useful anyway. I simply ignore the "good" reviews and always look at the worst ones. There are three kinds of bad reviews: (I) people who had some random, irrelevant mishap (the postal service broke it) and think everyone should know; (II) people with some sort of vendetta (possibly disgruntled employees, competitors, or crazy customers); and (III) people who actually had a bad experience that might be characteristic of the product's quality or design.
If the third category can be used to construct a narrative about something that is a deal-breaker, then that's the information I'm looking for. Of course, it has to be taken in context of the competitors.
My expectation is that the best products have some type (I) and type (II) bad reviews, but no type (III). Almost as good is something whose type (III) reviews are about something that either doesn't matter to me or is actually a positive from my perspective.
I'm pretty good at finding decent reviews. I'd never post my process on a public forum, but I apparently don't have as much trouble avoiding sponcon as a lot of others.
If people are doing that, it’s fairly certain that there are marketers maintaining “humanoid” Reddit accounts which then chime in with opinions on Bluetooth headphones.
No doubt. So far the only defenses against crap products are buying them from a physical store, so you can test and easily return them, and having a good warranty.
"Native advertising" was originally ads that were served by the site itself, first-party. It's advertisers who have co-opted it (yet again), but that doesn't mean you have to go along with their preferences. Typing isn't hard and "sponcon" (sponsored content) is already perfectly cromulent jargon, as well as being shorter.
I've said it before, and I'll say it again: much of the content I want to consume is basically advertising, but the way today's internet works, I have to view ads for stuff I don't want in order to see the ads I do want to see. And for some reason, people call avoiding this "stealing".
>> Like how many websites have those popups now where they ask you to turn off ad-blocking
Handle them the same way as websites with a cookie or GDPR warning that blocks everything else: vote with your wallet by leaving the site and finding another one instead.
It's funny how our brains have a kind of built-in ad blocker called banner blindness. There have been a few times I was unable to understand a UI because the important part was a big rectangle and too prominent, so I ignored it entirely without realizing it.
Why do you think advertisers moved to moving ads, ads that fade in over the page once you've scrolled a little and can be assumed to be focused on the page, reading? Autoplay video that moves down into a picture-in-picture corner? The more annoyingly distracting the ad is, the better. Or so advertisers think.
Why can't a browser solve this (except for IP) by simply having an option to not leak any data? Make audio and GL calls constant time, and don't persist anything past the tab / window / site? No fonts or cache reuse beyond the host? No referrers etc.
What's the hard problem here that prevents major browsers from having an option like this?
Firefox has an about:config preference called "privacy.resistFingerprinting" to enable some of Tor's mitigations against fingerprinting. Tor is based on Firefox code and Mozilla merges some of Tor's code changes into Firefox to make updating easier for the Tor team.
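If you'd rather pin the setting than flip it in about:config each time, the usual way is a user.js line in your Firefox profile directory:

    user_pref("privacy.resistFingerprinting", true);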
The difficult bit is that your browser is programmable and browsers differ across vendors, devices, and releases. That means whatever test a bad guy can think of can be run and its result sent back over the wire, and you can't realistically block sites from sending data back to their servers. As long as there are different browsers, there will be tests that can differentiate between them.
Currently canvas fingerprinting [1] is a popular option, but there are plenty of other options to reach for next. Even generic code execution time could be used to an extent. Realistically there is hope at the end of the tunnel, but it's a very long way off given just how complex a corner we've painted ourselves into with modern web standards.
While it wouldn't stop purely malicious actors, I personally think it might be easier to address the whole situation through legislation rather than technology. Imagine GDPR, except with tracking illegal altogether: there would always be actors working to bypass it, but the majority would do their best to comply, lest they go bust from fines.
Of course, Google suggests modifications that would hinder their competitors, but not themselves. I wonder what percentage of browsers have a first-party cookie from Google?
If Google had any motive besides research and responsible disclosure, it would more likely be to persuade us that ITP is not viable. But I think their issue was fair and submitted in good faith.
> We’d like to thank Google for sending us a report in which they explore both the ability to detect when web content is treated differently by tracking prevention and the bad things that are possible with such detection.
It's interesting that Google, being an ad-tech company, is doing something against their own interest.
They have an app store they earn money from. They have an ads system. For them, websites are competition, because people spend their time somewhere other than in apps and click on ads other than theirs.
How is anyone supposed to believe these actions are about privacy, and not against competition? Against the Internet?
Have you seen any consideration of how this will impact website owners? I haven't. It seems they really don't care, and that is very dangerous.
Apple doesn't have an ads system, unless you mean the App Store ads that only show up in App Store search (and thus are completely unrelated to Safari). They had an ads system at one point but it was shut down over 3 years ago, and was for in-app ads anyway, not browser ads.
Why not both? The market is a brilliant system. Privacy is objectively good for users, and so is competition, since it can lead to lower prices and better products.