> what URL they have used to arrive there This is yet another example of why sen...

commoner · on July 18, 2019

If you're using Firefox, the Smart Referer add-on strips out the HTTP Referer and the value of document.referrer (in JavaScript) from cross-domain requests. It includes a default whitelist, and is customizable.

https://addons.mozilla.org/en-US/firefox/addon/smart-referer

It's also open source.

https://gitlab.com/smart-referer/smart-referer

This extension is most effective if you also use an ad blocker (like uBlock Origin) and Firefox's first-party isolation feature, although Smart Referer will still help prevent tracking even if you don't.

https://www.ghacks.net/2017/11/22/how-to-enable-first-party-...

jakejarvis · on July 19, 2019

You can also set this directly in about:config under network.http.sendRefererHeader:

  0 = never send the header
  1 = send the header only when clicking on links and similar elements
  2 = (default) send on all requests (e.g. images, links, etc.)

If you want more granular control (like sending referrers but only the root of the domain) all of the various network.http.referer flags for Firefox are listed here:

https://wiki.mozilla.org/Security/Referrer

Doesn't have a few of the features that your extension has, but it's done the trick for me!

commoner · on July 19, 2019

I'm not the developer of the extension (just a user), but thanks for the about:config tip!

copperx · on July 19, 2019

Has the 0 setting broken anything in your experience?

spiderfarmer · on July 18, 2019

"something designed to violate user privacy"

I fail to see why sending the referer is a privacy concern. Following that logic, every datapoint is a privacy concern. From screen resolution to mouse movement, everything can be abused to build profiles. Referer headers have a host of valid use cases, but if you are opposed to any data being shared you'll probably dismiss all of them.

JohnFen · on July 18, 2019

> Following that logic, every datapoint is a privacy concern.

Yes, every data point is a privacy concern.

oehpr · on July 18, 2019

Even the DNT header is a privacy concern.

oytis · on July 18, 2019

> Following that logic, every datapoint is a privacy concern

I think it's reasonable logic when we're talking about data points sent implicitly. In my ideal world the website just gets the resource identifier, and the browser takes care of the rest.

pdkl95 · on July 18, 2019

> I fail to see why sending the referer is a privacy concern.

From the article:

>> The Zeus platform monitors contextual data such as ... what URL they have used to arrive there ... The publisher will then match that data to its existing audience data pools ... to create assumptions on what that news user’s consumption intent will be. The technology uses machine learning to decipher the patterns.

They are explicitly stating they use Referrer-like data to track users.

snowwrestler · on July 18, 2019

> They are explicitly stating they use Referrer-like data to track users.

To me, "track user" means persistently ID one person. This sounds more like inferring anonymous interest, like inferring that someone arriving from ESPN.com might be interested in your sports section.

If that person comes back tomorrow from The Financial Times, you might infer that they are interested in the economy.

But without cookies, I don't see how you would recognize that visit as the same person as yesterday and integrate the sports and economy interests into a persistent profile. Each visit would be self-contained, which doesn't fit my definition of "tracking."

JaRail · on July 18, 2019

They didn't say no cookies. They said no third-party cookies.

JohnFen · on July 18, 2019

> But without cookies, I don't see how you would recognize that visit as the same person as yesterday and integrate the sports and economy interests into a persistent profile.

If you gather enough of that "anonymous" data, and particularly if you combine it with other data sets (as they claim they are intending to do), then it's not that hard to recognize individuals based on their usage patterns and metadata.

spiderfarmer · on July 18, 2019

But does it matter if you don’t know who they are?

mr_crankypants · on July 18, 2019

Going with that premise for the sake of argument: Nah, probably not.

But that premise is well-known to be in conflict with established reality. Identifying specific individuals from these sorts of data points is famously, disturbingly easy to do. People even do it just for fun, almost like it were an Advent of Code challenge. That's the reason why there will never be another Netflix Prize.

JohnFen · on July 18, 2019

It does to me.

eclipxe · on July 18, 2019

spiderfarmer · on July 18, 2019

So that probably makes me part of a group composed of several hundreds of other visitors who exhibited the same behavior. I fail to see how that violates my privacy. You'd probably learn a lot more about me by watching me walk from my office to the parking lot.

Barrin92 · on July 18, 2019

>You'd probably learn a lot more about me by watching me walk from my office to the parking lot.

well I don't really expect the newspaper that I read to watch me walk from my place of work to the parking lot. In fact I don't want them to watch me at all because I don't expect newspapers to be in the surveillance business.

When I buy a newspaper at a store the guy behind the counter doesn't follow me three blocks to figure out how I drink my coffee at the coffeeshop, yet curiously enough this is how the internet works, everywhere.

spiderfarmer · on July 18, 2019

> When I buy a newspaper at a store the guy behind the counter doesn't follow me three blocks to figure out how I drink my coffee at the coffeeshop, yet curiously enough this is how the internet works, everywhere

But if you keep returning to the same news stand, he'll probably reach for the newspaper you like when he sees you coming. This is the equivalent of what the Washington Post does now. Not following you to the coffeeshop like the ad tech of today.

Barrin92 · on July 18, 2019

I don't object to a business or individual I interact with getting to know my preferences better, that's inevitable and a good thing. What he doesn't do however is commodify my personal information and sell it to third parties and advertisement agencies so that they in turn can try to manipulate me and show me stuff I don't want, and I also suspect no newspaper vendor runs a high tech operation in the basement that, without my explicit knowledge, runs some sort of panopticon like experiment on my personal data.

Do you know what I'd really like to see? A sort of frame in frame of what the algorithm sees that tracks me while reading a Wapo article, directly shown to the reader. It'd be interesting to see how people would react if they were aware of how exactly they're being followed around and analysed.

spiderfarmer · on July 18, 2019

>What he doesn't do however is commodify my personal information and sell it to third parties and advertisement agencies so that they in turn can try to manipulate me and show me stuff I don't want, and I also suspect no newspaper vendor runs a high tech operation in the basement that, without my explicit knowledge runs some sort of panopticon like experiment on my personal data.

Where in the article do you see that WaPo does this? I was under the impression that this is WaPo-only data, collected by WaPo and used by WaPo. To sell advertising space, yes. But that's because you're not paying them directly.

>Do you know what I'd really like to see? A sort of frame in frame of what the algorithm sees that tracks me while reading a Wapo article, directly shown to the reader. It'd be interesting to see how people would react if they were aware of how exactly they're being followed around and analysed.

I would love that too, but as long as it doesn't explicitly mention them by name I guess people don't care. Look at Facebook, where people never had any problem sharing really private information in exchange for free information and entertainment.

acheron · on July 18, 2019

> But that's because you're not paying them directly.

Neither ads nor tracking are disabled for subscribers. Paying them directly makes no difference.

spiderfarmer · on July 18, 2019

That's utterly stupid.

JohnFen · on July 18, 2019

> as long as it doesn't explicitly mention them by name I guess people don't care.

I should not be subjected to spying just because most of my neighbors don't mind being spied on.

dredmorbius · on July 18, 2019

> every datapoint is a privacy concern

Yes, this.

33 bits.

majewsky · on July 18, 2019

> 33 bits.

Context: There are about 2^33 people on earth, so it takes roughly 33 bits of information to identify a single person.

(In practice, it's probably slightly more bits because not all bits carry unique information.)
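
To make the arithmetic concrete, here is a minimal Python sketch. The population figure is an assumed round number for 2019, and the function names are made up for illustration.

```python
import math

# ASSUMPTION: a round world-population figure for 2019.
WORLD_POPULATION = 7_700_000_000


def bits_to_identify(population: int) -> int:
    """Smallest number of bits whose 2**bits values can label everyone."""
    return (population - 1).bit_length()


def bits_revealed(fraction_sharing_value: float) -> float:
    """Information (in bits) revealed by an attribute value that only
    this fraction of people share."""
    return -math.log2(fraction_sharing_value)


if __name__ == "__main__":
    print(bits_to_identify(WORLD_POPULATION))
    # A value shared by 1 in 1024 people narrows the crowd by 10 bits.
    print(bits_revealed(1 / 1024))
```

Each independent attribute adds its bits to the total, which is why individually harmless data points can add up to an identifier.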

spiderfarmer · on July 18, 2019

33 bits is not nearly enough: https://www.innovationfiles.org/33-bits-of-nonsense/

>"Anonymity in real life is much different than anonymity in the lab, and most people are content to be “one in a million” even if they cannot be “one in 6.7 billion.” In any data set, highly unique individuals (i.e. the outliers) may stand out, much like today’s celebrities do not enjoy the same level of anonymity as the average citizen. However, the fact that some individuals may be identified in a particular data set does not mean that any (or all) individuals may be identified in the data set."

nerdponx · on July 18, 2019

> From screen resolution to mouse movement, everything can be abused to build profiles

That's correct. I want none of these available without my explicit consent.

rayiner · on July 18, 2019

What’s a valid use case?

spiderfarmer · on July 18, 2019

If my site is getting hammered by visitors I would like to be able to easily discern if it's because I'm featured on HN's frontpage or if I'm the victim of a DDoS attack.

y0ghur7_xxx · on July 18, 2019

The referrer header is in no way a tool to differentiate real users from a ddos attack.

spiderfarmer · on July 18, 2019

I disagree. A fake referer is easily checked: Is my link really on the frontpage? If so: all good. If not: it's getting suspicious.
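
A rough Python sketch of that check, assuming you have already fetched the HTML of the page named in the Referer header. The class and function names are hypothetical, and this only shows that the link exists, not that the traffic is genuine.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_links_to(html: str, my_host: str) -> bool:
    """True if any absolute link in `html` points at `my_host`."""
    collector = LinkCollector()
    collector.feed(html)
    return any(urlsplit(href).hostname == my_host for href in collector.hrefs)
```

If the claimed referring page contains no link to your host, the referer was probably forged or is stale.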

close04 · on July 18, 2019

A similar line of argumentation has been historically used to push every outrageous thing on innocent people since forever. You sell the "abuse" as defense for a shocking crime. Ok, you only said DDoS when the usual is terrorism and child abuse. But the bottom line is the same: I need to take something private from you to defend myself.

What would you think if all stores took every measurement they could about you without disclosing it and eventually justified it by saying "how else would I know you're not a thief"?

rubinelli · on July 18, 2019

A referrer header is not an outrageous amount of information. It's the store-equivalent of asking "Where did you learn about us?" Taking it away would hurt smaller sites and do nothing against large companies and ad networks.

JohnFen · on July 18, 2019

> A referrer header is not an outrageous amount of information.

But it does reveal information that is none of the website's business.

> It's the store-equivalent of asking "Where did you learn about us?"

No, it's not. Actually asking that question would be the equivalent. What this is is surveillance.

close04 · on July 18, 2019

The store is asking, the site is not. And 99% of people are trained to click "Accept" after years of dark pattern abuse and they have very little understanding of what happens in the background. I hope you understand that my point isn't to bash a webmaster but rather to bring the principle of the whole thing into the discussion. It seems that everybody draws the line for what is acceptable in such a way that it perfectly covers their own needs.

I've seen people that insist that using facial recognition is not different from what humans are doing naturally, now done also with electronics. We can agree the implications are different.

spiderfarmer · on July 18, 2019

  You sell the "abuse" as defense for a shocking crime.

This works the other way around too. You use the abuse of non-personally identifiable information (by combining it with other data points, illegal without consent in the EU) to take useful data away from innocent webmasters.

JohnFen · on July 18, 2019

> to take useful data away from innocent webmasters.

Webmasters who are collecting data about me or my machines (excluding the data about my direct use of their site) without my permission are not "innocent webmasters".

close04 · on July 19, 2019

I'm surprised that in 2019 people (especially on HN) still believe/claim that users trying to hang on to their personal data "abuse" this to "take useful data away from innocent webmasters".

There are dozens of real life situations where covertly collecting such data would be considered completely unacceptable and yet my comment arguing this was still substantially downvoted.

But I guess my point is being in a technically literate community makes no difference when it comes to making a buck. Once one agrees to take a "not an outrageous amount" of private data for a bit of money, they'll agree to take an outrageous amount for outrageous money. And I think this is a perfectly accurate explanation for what FB, Google, [you name it] are doing.

close04 · on July 18, 2019

Doesn't your argument work against encryption just the same? With such an argument aren't you actually punishing 99.9% of the internet population for what the 0.1% is doing?

jimktrains2 · on July 18, 2019

But in general it's the only way to understand who's linking to you. Sure, not essential, but useful to see in general, especially when search engines could send it and you could see what keywords people used to find your site. If it were gone, as it is in many cases now due to https, people will adjust.

spiderfarmer · on July 18, 2019

  "as it is in many cases now due to https"

That's not exactly true. The referrer is only hidden if that's explicitly requested by using a meta tag:

  <meta name="referrer" content="no-referrer" />

Or by using Referrer-Policy:

  Referrer-Policy: no-referrer

The default behavior is no-referrer-when-downgrade. This means that referrers from https to http are hidden. But https > https is still visible. And with https adoption reaching saturation, referer headers are usually still sent.
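
The two policies mentioned above can be modelled in a few lines of Python. This is a simplified sketch, not browser code: the function name is made up, only these two policies are handled, and real browsers also strip any username/password from the URL.

```python
from typing import Optional
from urllib.parse import urlsplit


def referer_for(policy: str, source_url: str, target_url: str) -> Optional[str]:
    """Return the Referer value that would be sent, or None if withheld."""
    src, dst = urlsplit(source_url), urlsplit(target_url)
    if policy == "no-referrer":
        return None
    if policy == "no-referrer-when-downgrade":
        # Withhold only when going from https to a non-https target.
        if src.scheme == "https" and dst.scheme != "https":
            return None
        # The fragment is never included in a referrer.
        return src._replace(fragment="").geturl()
    raise ValueError(f"policy not modelled in this sketch: {policy}")
```

So an https page linking to another https site still leaks its full URL, query string included, under no-referrer-when-downgrade.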

ABeeSea · on July 18, 2019

Google has used encrypted search terms in the referrals for years now.

amarshall · on July 18, 2019

Cross-origin sending of the Referer header can be disabled in Firefox with network.http.referer.XOriginPolicy, along with a variety of other Referer-related options [1]. I have it set to 1 (and XOriginTrimmingPolicy to 2) and haven’t experienced (m)any issues.

[1] https://wiki.mozilla.org/Security/Referrer
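
For anyone wondering what those two values do in combination, here is a rough Python model based on my reading of the wiki page above: XOriginPolicy 0 = always send, 1 = send only when base domains match, 2 = send only when hosts match; XOriginTrimmingPolicy 0 = full URL, 1 = URL without query string, 2 = scheme, host and port only. The base-domain helper is a deliberate simplification (Firefox uses the Public Suffix List), and all names are made up.

```python
from typing import Optional
from urllib.parse import urlsplit


def naive_base_domain(host: str) -> str:
    # ASSUMPTION: last two labels. Wrong for suffixes such as co.uk.
    return ".".join(host.split(".")[-2:])


def cross_origin_referer(
    source_url: str,
    target_url: str,
    x_origin_policy: int = 1,
    x_origin_trimming_policy: int = 2,
) -> Optional[str]:
    """Return the Referer this sketch would send, or None if withheld."""
    src, dst = urlsplit(source_url), urlsplit(target_url)
    same_host = src.hostname == dst.hostname
    same_base = naive_base_domain(src.hostname) == naive_base_domain(dst.hostname)

    if x_origin_policy == 1 and not same_base:
        return None
    if x_origin_policy == 2 and not same_host:
        return None

    same_origin = (src.scheme, src.hostname, src.port) == (
        dst.scheme, dst.hostname, dst.port)
    if same_origin or x_origin_trimming_policy == 0:
        return src._replace(fragment="").geturl()
    if x_origin_trimming_policy == 1:
        return f"{src.scheme}://{src.netloc}{src.path}"
    return f"{src.scheme}://{src.netloc}/"
```

With the settings above (1 and 2), unrelated sites get nothing and sibling subdomains only see the origin.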

rubinelli · on July 18, 2019

There are very good reasons for the Referrer header to be used. If you see a lot of traffic going to a URL with a typo, you will want to know where that typo is. If someone hotlinks to a large file in your domain, you will want to know who it is and block it. Any alternative would be much more intrusive.

pdkl95 · on July 18, 2019

> you will want to know where that typo is ... you will want to know who it

I know you want to know those things. Find another way to handle those issues.

To be a "good reason", you need to show why your reason is worth paying the high price of betraying every user's browsing path to every server. Worrying about hotlinks and typos... "ain't the same fuckin' ballpark, it ain't the same league, it ain't even the same fuckin' sport".

> Any alternative would be much more intrusive.

Did you consider serving that "large file" only when the request is accompanied by a proper session cookie created when they loaded the HTML file? There are many solutions to those problems, including some that are server-side-only.
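
A minimal sketch of the cookie idea in Python, framework-free. Everything here (function names, cookie format, in-memory key) is an assumption for illustration. A real deployment would persist the key and add an expiry.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# ASSUMPTION: a per-process secret; a real server would load a stable key.
SECRET_KEY = secrets.token_bytes(32)


def _sign(session_id: str) -> str:
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def issue_download_cookie() -> str:
    """Call when serving the HTML page; set the result as a cookie."""
    session_id = secrets.token_hex(16)
    return f"{session_id}.{_sign(session_id)}"


def may_serve_large_file(cookie_value: Optional[str]) -> bool:
    """Call before serving the file; hotlinked requests carry no valid cookie."""
    if not cookie_value or "." not in cookie_value:
        return False
    session_id, signature = cookie_value.rsplit(".", 1)
    # Constant-time comparison to avoid leaking signature prefixes.
    return hmac.compare_digest(signature.encode(), _sign(session_id).encode())
```

The cookie carries a random ID and nothing about where the visitor came from, so the server never needs to see a referrer.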

spiderfarmer · on July 18, 2019

  Find another way to handle those issues.

If there's another way, it would lead to the same privacy concerns.

  why your reason is worth paying the high price of betraying every user's browsing path to every server

First explain why it's a) betrayal b) a high price.

   a proper session cookie

This again could lead to privacy concerns.

dvfjsdhgfv · on July 18, 2019

I understand your concerns very well, but I have a different perspective. I don't modify my Referrer header. I want to let the websites I'm using know where I came from. A referrer by itself is innocuous - only when you combine it with other nefarious techniques does it wreak havoc on users' privacy. But on its own, in an anonymous browser environment that I tend to use, it's actually quite useful.

turbinerneiter · on July 18, 2019

Sadly, the industry chose to abuse it to the detriment of the users. Enough reason to take it away.

dehrmann · on July 18, 2019

It's not really "design"--the header name is even misspelled. This one always felt like, at the time in the early days of the web, it'd be interesting data to pass along. Since then, things like image hotlinking started to depend on it, and Google got better about hiding referrer data, so there wasn't the same motivation to fix it as implementing same-origin policy. If the web were invented today, yes, I doubt that this would be a thing.

stubish · on July 18, 2019

It was how you did sessions before Cookies and JavaScript existed, and existed because it was a problem that needed solving. Converting forms to wizards and the first Internet shopping carts.

mosselman · on July 18, 2019

I agree. There are some add-ons that spoof/disable this header for you, but as you said, this breaks some sites. I agree, as a consumer, that websites that rely on the header are out of luck with regards to my business, but at work I don't always have a choice with regards to which online tools we use. But whitelisting the things that break is a fine solution in that case.

pessimizer · on July 18, 2019

I forge the referer as the root of the site, except in the case of news sites that allow referers from google news to bypass the paywall, in which case I always forge that. This very rarely breaks anything (one out of a million sites expect an external or specific referer.)
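
Outside the browser the same trick is a couple of lines. A small sketch using only the Python standard library, with hypothetical helper names; it builds the request but does not send it.

```python
from urllib.parse import urlsplit
from urllib.request import Request


def root_referer(url: str) -> str:
    """Return the root of the target site, e.g. https://example.com/."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def build_request(url: str) -> Request:
    """Build a request whose Referer claims to come from the site's own root."""
    return Request(url, headers={"Referer": root_referer(url)})
```

Pass the result to urllib.request.urlopen to actually fetch the page. The target site only ever sees its own root as the referring page.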

ggg3 · on July 19, 2019

and I wanted to recall that google employees repeatedly removed chromium's project code to restrict or disable referrer headers.

I personally was involved in 3 distinct times. And after that gave up chromium and the lie of google-independence completely.

and so should you. If they tweak things to reach their profit goals, they will also do the same when any agency "asks" them to. it's a slippery slope, and they already crossed