Twitter Direct Message Caching and Firefox (hacks.mozilla.org)
287 points by feross on April 3, 2020 | 69 comments



So basically Twitter wasn't setting its Cache-Control directive correctly? Why is this behavior different from Safari and Chrome?

It looks like "no-store" is in this IETF RFC: https://tools.ietf.org/html/rfc7234#section-5.2.2

EDIT - Ah, I see, it's in the original post:

> Testing from Twitter showed that the request was not being cached in other browsers. This is because some other browsers disable heuristic caching if an unrelated HTTP header, Content-Disposition, is present. Content-Disposition is a feature that allows sites to identify content for download and to suggest a name for the file to save that content to.

> In comparison, Firefox legitimately treats Content-Disposition as unrelated and so does not disable heuristic caching when it is present.

It looks like Twitter only tested this in Safari and Chrome and called it a day. Their wording implies that it's Firefox's fault, which is misleading.


They absolutely should have added a no-store cache directive. It's crazy that they didn't, and instead just checked whether browsers cached it or not. Relying on undocumented behavior when there is a specific, documented way to do what you want is just bizarre.
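
For illustration, a minimal sketch of what the fix looks like server-side, using Node's built-in http module; the route and payload here are hypothetical, not Twitter's actual code:

    // Sketch: mark a sensitive API response as uncacheable.
    import { createServer } from "http";

    createServer((_req, res) => {
      res.writeHead(200, {
        "Content-Type": "application/json",
        // RFC 7234 §5.2.2.3: a cache MUST NOT store any part of this response.
        "Cache-Control": "no-store",
      });
      res.end(JSON.stringify({ messages: [] })); // hypothetical payload
    }).listen(8080);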


I wouldn't even be so sure that they tested caching behavior in any browser.


Sure, but now look at HTTP APIs in general and see how many set any kind of Cache header. I wouldn't be surprised if _many_ APIs used by phone apps and web apps do not, and might leak private data into the cache, maybe even secret keys or one-time tokens like recovery codes.


I'm fuming over the whole world deciding Google will set the standard for browsers. I wonder if the EFF and similar orgs are doing anything legally; perhaps the EU can arm-twist again?


https://tools.ietf.org/html/rfc7234#section-4.2.2

   Since origin servers do not always provide explicit expiration times,
   a cache MAY assign a heuristic expiration time when an explicit time
   is not specified, employing algorithms that use other header field
   values (such as the Last-Modified time) to estimate a plausible
   expiration time.  This specification does not provide specific
   algorithms, but does impose worst-case constraints on their results.
The standard does not say what heuristics should be used, so Firefox's heuristic is no more "legitimate" than Chrome/Safari's heuristic of expiring immediately when Content-Disposition is present.

“Firefox legitimately treats...” is highly misleading, making it sound like Firefox is more standards-compliant, but it's not; it's just different.
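
For what it's worth, a note in that same RFC section mentions a typical heuristic of 10% of the interval since Last-Modified. A sketch of that calculation (the 10% fraction is a convention, not a requirement):

    // The "10% of time since Last-Modified" heuristic that RFC 7234
    // §4.2.2 mentions as typical; the spec mandates no particular
    // algorithm, so browsers are free to differ.
    function heuristicFreshnessMs(lastModified: Date, now = new Date()): number {
      return Math.max(0, (now.getTime() - lastModified.getTime()) * 0.1);
    }

    // A resource last modified 10 days ago could be served from cache
    // for roughly a day without any explicit Cache-Control header.
    const tenDaysAgo = new Date(Date.now() - 10 * 24 * 3600 * 1000);
    console.log(heuristicFreshnessMs(tenDaysAgo) / 3600000, "hours"); // ~24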


I am fairly confident that the article is not trying to imply that Chrome/Safari are NOT standards compliant. The author seems to be trying to heavily emphasize that this is not a Firefox bug, and that Firefox can legitimately do this while being standards compliant.

I feel like the author was anticipating a lot of people blaming Firefox for this incident, and is attempting to shut down those responses before they get made.

I think that writing to the responses you expect from your audience unfortunately ends up producing these kinds of miscommunications of intention more often than not.


Twitter explicitly blamed Firefox for the behavior, so the anticipation is warranted.


And it's allowed to treat it differently than the other browsers. That's what "legitimately" means. And that needs to be pointed out, because Twitter was very careful to avoid saying it wasn't a Firefox bug.


Please don't use code blocks for quotes. It makes it very hard to read text on mobile, narrow viewports or via screen readers.


RFCs are published as plain text with line breaks, which I literally copied without even changing the indentation, so you're barking up the wrong tree here.


I'm not. If copied without formatting as code blocks, the text would be a lot more readable and wouldn't require horizontal scrolling on narrow viewports.


Since this handling of Content-Disposition is also in Safari it likely predates Google's involvement, and might even go all the way back to KHTML.

(Disclosure: I work for Google)


Mozilla dev here, I’ve been told that Gecko’s behaviour matches IE’s and Netscape’s.


What would the legal issue be, exactly? Twitter is the one deciding to code against Chrome's features, not Google.


Regulating browser standards so that a commercial org can't publish browsers that don't comply. Much like how cars are regulated to be "road legal".


So now we want the government to define what counts as a browser? And what is this government agency that has the technical wherewithal to “regulate browsers”?

What’s going to happen when someone decides that browsers must have content restrictions?

It amazes me how willingly some people on HN give up their freedom to the government.


>So what is this government agency that has the technical wherewithal to “regulate browsers”?

NIST for one.


You really trust NIST to know the intricacies of creating a browser? Besides, haven't the last few years shown you what happens when you have government organizations led by people who aren't experts but who are appointed for political reasons?

Even when you do have competent people in the government like in the CDC, they are still hamstrung by the ideology of the executive branch.


Yes. Have you ever read a NIST standard?


So before Apple, Google, and Mozilla release a browser they should have to go through an approval process?


My suggestion, though, is different: if NIST had such an approval process, Apple/Google/Mozilla/anyone else wouldn't have to go through it, but those who did could put the label "Approved by NIST as a compliant web browser" on their product.

However, I am not so sure that such a thing is even necessary at all. I think use of the web browser should be reduced so that this isn't necessary.


Yeah because being approved by NIST will carry as much weight to consumers as being W3C compliant...


Do you believe that consumers even know that the W3C exists?


That’s kind of my point....


By an organization for use by the public at large? Yes!!

Devs can still make browsers explicitly for other devs or for non-public use. If society depends on your product to function, then your product must be regulated. Imagine Google auto-updates Chrome to stop supporting HTTP or something; imagine the economic chaos. The whims of Googlers are not something the public should rely on. Everything from their standards compliance to their change control should be regulated, and the same goes for Mozilla.


And in this case, both Firefox and Chrome behaved according to spec. What would regulating browsers have helped?


Perhaps standardizing the QA process would have prevented this? The problem I was addressing is this issue falling in line with a pattern set by Google. They wouldn't test only against Google if Google's whimsical divergences didn't mean it costs too much to support anyone else in your QA.


Maybe I'm underestimating Twitter's developers, but in this case I honestly doubt Twitter had a QA process testing whether things remained in the browser cache at all, until someone pointed out that there was a problem. If they had thought about it as a problem, adding the cache header would have been easier to implement and test for than browser testing.


Oh please, give me a break here. Browsers are essential to modern life. Anyone from a young student to an old person about to retire needs them to survive these days, so yeah I want them to regulate this.

> What’s going to happen when someone decides that browsers must have content restrictions?

You're joking, right? You think the government can't do that already? OK, let's see the logic here: let's say that's the case. What happens when Google decides to have content restrictions? Nothing! It's not a regulated activity, so it falls under Google's freedom of speech; they can restrict any content they want. However b.s. it might be, in theory at least you have some control over your government. Google does what is in their best business interest: you are the product, advertisers are their customers.

Your illusion of freedom is to have your own government as far away as possible. Would you be comfortable if General Motors or Ford decided the safety standards or road-readiness of cars? Surely you can't have the government take away your freedoms by telling manufacturers what is safe and acceptable for the general public to use? What if the government decides to make them restrict cars from driving to certain places!


> Oh please, give me a break here. Browsers are essential to modern life. Anyone from a young student to an old person about to retire needs them to survive these days, so yeah I want them to regulate this.

You really have no idea how many seniors don’t have internet access. I don’t think a browser not properly supporting the CSS box model will kill anyone....

> What happens when Google decides to have content restrictions? Nothing! It's not a regulated activity, so it falls under Google's freedom of speech; they can restrict any content they want.

How can Google restrict the websites I go to?

> However b.s. it might be, in theory at least you have some control over your government. Google does what is in their best business interest: you are the product, advertisers are their customers.

As if government officials don't do what's in their own best interest. Because of the way that both the electoral college and the Senate are designed, if you live in a more populous state, you will always have less voting power than someone in “Middle America.” Not to mention gerrymandering.

So exactly how does a browser not adhering to standards affect my survival?


I don't know about seniors, but I routinely need a browser to fill out government forms. I routinely need a browser to apply for employment or unemployment, to purchase essential goods, to operate a viable business. Your logic is that people can walk, so cars should not be regulated. My logic is that even if people can walk, cars are used so much and are critical for so many, and their lack of safety could be so catastrophic, that it cannot be left to the manufacturer's good faith to know what is best for the public. I don't really care about your survival as an individual, but I do care about the stability of society and the economy, and about not having to depend on corporations' goodwill for things I need.

Your logic can apply to healthcare as well. Plenty of healthy people. Plenty of people have good jobs and insurance. So does that mean healthcare affordability for everyone should not be regulated?

Oh, and the whole electoral college b.s.: so you're saying we should disband the FTC, FDA, EPA, and all other regulatory agencies as well? Come on!

> How can Google restrict the websites I go to?

Easy: they can block it outright or simply deprecate support for the site. A close-enough example is uBlock Origin and how Google basically crippled its ability to block ads. Were there regulations, it would not be up to Google. The first thing that should be regulated is their ability to deprecate random things on a whim.


So now before Google can make any changes to their browser, they should have to wait for a regulatory committee?

This is the same government whose ancient COBOL systems can't handle the influx of unemployment claims, and there is a submission currently on the front page of HN about one agency forcing people to fax a claim in.


Significant changes would have to go through a browser committee, much like the CAB Forum for PKI CAs. We wouldn't need this if Google and friends didn't act in bad faith and abuse the power they have as a monopoly. Unregulated monopolies that people depend on are always a bad thing.


Instead of requiring that, a better alternative is: you can still use software that doesn't comply, and commercial organizations can write software that doesn't comply, but then they can't call it a compliant web browser program. (That way, the user is properly informed of what it is.) This won't stop anyone's freedom to do something, especially that of the user (who might or might not also be a programmer).

But I think WWW is too messy and complicated. Better is to not require a web browser at all, if it can be avoided. (There is also gopher, telnet, postal mail, etc, depending what you need. The web pages can still be provided as an option, of course.)


Both Chrome and Firefox behave correctly according to the relevant standard.


How are you ever supposed to add any experimental features to a browser, then?


Google hasn't decided a standard here. It just happened that their heuristic avoided the problem, so Twitter's bug only caused a potential problem for Firefox.


I don't think GP meant that Google decided a standard, but that the world (and in this case, Twitter) decided that whatever Google does is considered the standard.

In other words, rather than checking with the standard how to indicate that content should not be cached, they simply checked whether Chrome did not cache it, and since it didn't, decided that whatever they were doing was the way to prevent caching.


Yes, that is what I meant.


I suspect they didn't test for it at all, noticed later or got a report, and then checked which browsers are affected.


> This is because some other browsers disable heuristic caching if an unrelated HTTP header, Content-Disposition, is present.


The thing that boggles my mind is that the only affected users are:

(a) - Firefox users &&

(b) - who downloaded their messaging history on a buried menu option in the account page &&

(c) - in the last 7 days prior to disclosure &&

(d) - who did this on a computer where someone else has access

The number of affected people is presumably very small, and the only metric that twitter can't know here is (d). How on earth does it make sense to alert every Firefox user with a scary wall of text? Don't they have logs to cross-reference (a), (b) and (c) and e-mail these users?

I'd believe that if there's only one API endpoint that would be crucial to log to protect against major leaks, it would be this one to download all your history at once...


Re (b): Twitter says users "took actions like downloading your Twitter data archive or sending or receiving media via Direct Message", so it isn't just downloading the archive, and to me "actions like" suggests there might be more than the ones named here explicitly. And they might not be able to tell who did it for all of these things.

I also didn't see any notice, but I don't know if I missed it, if my adblocker ate it, or if they actually did only inform users who did one of the at-risk things and I happened not to have done so.


Lovely how Twitter's blog post does nothing to explain that it's not Firefox's fault.


You don't say.

Pasting a URL into a tweet entry form in their web client has resulted in "Something went wrong" in 50% of cases for several months now. Attaching a GIF to a tweet randomly fails with no explanation given, again, for months.

I don't know what Twitter's development priorities are, but they sure as hell don't include proper web client testing.


Never got any of those. Perhaps you're on a slow/unreliable network and hitting timeouts? Or some privacy tools are interfering? I'm using web Twitter on Firefox.

One thing I did repro, though, was the text field behaving erratically on Firefox Mobile or when using the installed PWA from the home screen. Had to give up and use Chrome directly.


Twitter is a very mediocre tech company that thinks it's stellar and brilliant. Their product is consistently subpar, riddled with countless bugs, and they neither acknowledge nor fix it.

So yeah, no surprise that they would blame others for their failings.


The sad part, then: why is no one coming up with a better alternative?


But Twitter is a terrible idea so anything better than it doesn’t resemble it at all.


Then why are people using it? (I do not)


It gives them the feeling that someone cares about their lives.


They are deeply entrenched. It's hard to move people to a different platform.


In fact, it seems to imply the opposite.


Mozilla Hacks continues to be the best webdev blog


Why would Twitter send a Content-Disposition header with their API responses?

    content-disposition: attachment; filename=json.json


I believe this is for additional XSS protection - if a user clicks a suspicious API link, the worst that happens is the browser downloads a json.json file instead of rendering a potentially malicious payload.


How would rendering the file perform a malicious action?


If the content-type header failed to get sent, or if a browser (lookin' at you, IE) chooses to ignore it in favor of what it thinks it probably should have been, then the result can get rendered as HTML. If you have some json that has angle-brackets in it (which, being JSON, would obviously not be HTML-escaped) and the result is rendered as HTML, this can result in your browser executing attacker-defined javascript in Twitter's origin.

Sending it as a "file download" has no effect on what happens when the endpoint is called via AJAX, but in the event that a browser navigates to it directly, it ensures that even the dumbest of browsers do not render it as HTML.
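
As a sketch, that defense-in-depth might look like this on the server (hypothetical endpoint, not Twitter's actual code; nosniff is a separate but related guard against content-type guessing):

    // Headers that keep a JSON endpoint from ever being rendered as
    // HTML on direct navigation.
    import { createServer } from "http";

    createServer((_req, res) => {
      res.writeHead(200, {
        "Content-Type": "application/json; charset=utf-8",
        // Direct navigation triggers a download instead of rendering.
        "Content-Disposition": "attachment; filename=json.json",
        // Forbid browsers from second-guessing the declared type.
        "X-Content-Type-Options": "nosniff",
      });
      // Angle brackets are legal, unescaped JSON; if this were ever
      // rendered as HTML, the script would run in the site's origin.
      res.end(JSON.stringify({ bio: "<script>alert(1)</script>" }));
    }).listen(8081);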


Through a bug in the renderer that the file exploits.


Because you can legitimately download your direct message history.


A small amount of discussion on the previous post from the Twitter Privacy team: https://news.ycombinator.com/item?id=22762467



I think a similar issue exists with Twitter and Safari (maybe related to web workers or page caching). Sometimes I need to restart Safari; otherwise Twitter fails to load. When I type twitter.com, the site occasionally hangs in that browser. Just posting this to see if others are affected.


Expect to see more of these as fewer and fewer people use Firefox and web developers simply don't try their websites on this browser. And as more websites stop working correctly on Firefox, they will lose more market share. Some kind of spiral, you know?

I was thinking that a rendering bug (which is what one expects from a browser compatibility bug) is not that bad, but when we are talking privacy and security, now that's a problem.


Clearly a bug on Twitter's side, but the behavior of disabling caching when Content-Disposition is present seems sensible.


Why? It’s not in the spec; how am I, the developer, supposed to know that the browser is being non-compliant?


It's in the spec that the browser is allowed to pick a duration if you don't tell it one, so you know that you don't know how long it is cached if you do not set a header.

https://news.ycombinator.com/item?id=22775479


Both browsers are compliant. However, there is generally no reason to cache a resource with Content-Disposition, so it seems sensible for Firefox to adopt this behavior.



