So. The question I'm asking myself now is how to fix this. Giving *.domain.com a shared quota would allow one Tumblr or GitHub Pages user to monopolize all storage, effectively removing local storage for this kind of scenario (also removing it for the host, which is even more annoying).
A maybe workable solution would be to only allow creation of new keys for the first-party origin. What I mean is that whatever.example.com has full access if that's what the user is currently viewing directly in their browser.
*.example.com embedded via iframes could either get read-only access, or read-write access for existing keys. Also maybe limited to, let's say, 4 KB.
This sounds like a really complicated solution though. Any better ideas?
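To make that concrete, here's a rough TypeScript-ish sketch of the decision such a policy implies (all names and numbers are hypothetical, not any engine's actual internals):

    // Hypothetical quota decision for a localStorage write under the proposal above:
    // first-party pages get normal access; frames embedded from the same wildcard
    // domain may only update keys that already exist, within a tiny budget.
    type WriteRequest = {
      origin: string;          // e.g. "https://whatever.example.com"
      topLevelOrigin: string;  // the origin shown in the address bar
      keyExists: boolean;      // is this an update to an existing key?
      newValueBytes: number;
    };

    const IFRAME_WRITE_BUDGET = 4 * 1024; // the ~4 KB limit suggested above

    function mayWrite(req: WriteRequest): boolean {
      if (req.origin === req.topLevelOrigin) {
        return true; // first-party: subject only to the normal per-origin quota
      }
      // Embedded subdomain frame: no new keys, and only tiny values.
      return req.keyExists && req.newValueBytes <= IFRAME_WRITE_BUDGET;
    }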
Any kind of DOM storage (cookies, localStorage, IndexedDB, etc.) is ephemeral. The browser needs to decide the maximum amount of disk space that it wants to consume, and then when it hits that limit, it needs to start throwing away (garbage collecting) some of the data based on some policy like LRU.
If the web app really needs permanent storage then that permanence of storage needs to be granted explicitly by the user.
> An application can request temporary or persistent storage space. Temporary storage may be easier to get, at the UA's discretion [looser quota restrictions, available without prompting the user], but the data stored there may be deleted at the UA's convenience, e.g. to deal with a shortage of disk space.
> Conversely, once persistent storage has been granted, data stored there by the application should not be deleted by the UA without user intervention. The application may of course delete it at will. The UA should require permission from the user before granting persistent storage space to the application.
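For what it's worth, the LRU eviction idea above can be sketched in a few lines; the global cap and the bookkeeping structures here are invented purely for illustration:

    // Evict least-recently-used origins once total DOM storage exceeds a global cap.
    const GLOBAL_CAP_BYTES = 1024 * 1024 * 1024; // assumed 1 GB browser-wide budget

    type OriginUsage = { bytes: number; lastUsed: number };
    const usage = new Map<string, OriginUsage>(); // origin -> bookkeeping

    function recordWrite(origin: string, bytes: number): void {
      const entry = usage.get(origin) ?? { bytes: 0, lastUsed: 0 };
      entry.bytes += bytes;
      entry.lastUsed = Date.now();
      usage.set(origin, entry);
      evictIfNeeded();
    }

    function evictIfNeeded(): void {
      let total = [...usage.values()].reduce((sum, u) => sum + u.bytes, 0);
      // Throw away whole origins, least recently used first, until under the cap.
      const byAge = [...usage.entries()].sort((a, b) => a[1].lastUsed - b[1].lastUsed);
      for (const [origin, u] of byAge) {
        if (total <= GLOBAL_CAP_BYTES) break;
        usage.delete(origin); // a real browser would also drop the stored data itself
        total -= u.bytes;
      }
    }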
If you do that (like Firefox admittedly does), then my issue with subdomain users using up all local storage comes into play.
It would really suck for github (for example) not to be able to use local storage in their UI because pilif.github.com used up all available storage for the whole domain.
This sounds like a mostly theoretical problem. Sure, it could be annoying, but if the quota logic is simply a little smarter (e.g. allowing a few filled subdomains rather than just one, and/or always allowing the root domain to fill up, and/or evicting subdomains from localStorage on an LRU basis), then the chances of this happening non-maliciously are very small. And when push comes to shove, there's no guarantee localStorage will persist anyhow, so every site needs to be robust in the face of data loss.
Does the 100 MB per domain include the subdomains' storage? Then it doesn't solve what the parent comment described: a few subdomains could use up all the storage. And if it doesn't include it and it's separate, then it's exactly like it is now.
This really isn't that hard of a problem – allow X MB per domain and Y per sub-domain (user configurable or set by the browser developer to some sane limit – the actual number doesn't matter too much within certain ranges, in reality).
After either of those limits is hit, prompt the user to grant another increment of allowed storage (or possibly a customized amount), and disclose how much each domain and subdomain is using under a detailed view option. In the end, this puts the power back in the hands of the user and prevents malicious usage while not allowing one subdomain to effectively deny storage to others, etc.
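Something along these lines (a sketch only; the numbers and the grant bookkeeping are placeholders, not any browser's behaviour):

    // Two default limits plus whatever the user has explicitly granted on prompt.
    const PER_SUBDOMAIN_MB = 5;  // Y: default per sub-domain (illustrative)
    const PER_DOMAIN_MB = 25;    // X: default per registrable domain (illustrative)

    const used = new Map<string, number>();     // "pilif.github.com" -> MB in use
    const granted = new Map<string, number>();  // extra MB the user has approved

    function domainUsage(domain: string): number {
      let total = 0;
      for (const [host, mb] of used) {
        if (host === domain || host.endsWith("." + domain)) total += mb;
      }
      return total;
    }

    // Returns true when a write should trigger the "grant more storage?" prompt.
    function needsPrompt(host: string, domain: string, writeMb: number): boolean {
      const subLimit = PER_SUBDOMAIN_MB + (granted.get(host) ?? 0);
      const domLimit = PER_DOMAIN_MB + (granted.get(domain) ?? 0);
      return (used.get(host) ?? 0) + writeMb > subLimit
          || domainUsage(domain) + writeMb > domLimit;
    }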
Limit the total local storage space at a browser level. E.g. 1GB. Just like you might limit the total size of temporary internet files/cache.
Beyond that, just provide a good (simple) UI for deleting stuff. Which could suggest candidates for deletion based on heuristics like you suggest. E.g. iframes shouldn't need so much. Hopefully less visited sites would be suggested for deletion too.
Limit how much can be placed in localStorage, regardless of site, based on time. (Or perhaps prompt whenever that limit is reached, in case there is a legit reason for it.)
This isn't perfect, in that your localStorage could still be filled up slowly if you leave a page open in the background, but I think this solution is robust to many different techniques.
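As a sketch, the time-based limit could be as simple as a global write budget per time window (numbers invented for illustration):

    // Browser-wide, time-based write budget: regardless of which sites are writing,
    // only so many bytes of DOM storage may be committed per minute before the
    // browser blocks further writes or prompts the user.
    const BYTES_PER_MINUTE = 10 * 1024 * 1024; // assumed 10 MB/minute budget

    let windowStart = Date.now();
    let bytesThisWindow = 0;

    function allowWrite(bytes: number): boolean {
      const now = Date.now();
      if (now - windowStart >= 60_000) {
        windowStart = now;    // start a fresh one-minute window
        bytesThisWindow = 0;
      }
      bytesThisWindow += bytes;
      return bytesThisWindow <= BYTES_PER_MINUTE; // false -> block or prompt
    }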
Opera has a prompt message when the limit is reached (it was actually triggered on the demo page). But it only works at the subdomain level; there is no global per-domain limit.
Prompt is a horrible solution from a UX perspective. Essentially you're asking the user a question you, as a developer, couldn't or didn't want to answer. But the user has no idea either. Heck, she doesn't even know that there are limits in place or what DOM local storage even is.
I don't think it's a problem of the developer not knowing; they generally do know how much storage they'll need (or can at least put an upper limit on it). It's a problem of trust.
Maybe the dev thinks they'll need 1 GB, but I'm not ready to give them that. It's the same way an app asks for certain privileges when you install it on, say, Android.
I wonder why anybody thought it was a good idea to let any web page store some random stuff on my computer. Cookies were bad enough already.
You know, you can say what you want about Flash, but at least I could block it. These days I can't browse half the web if I disable JavaScript. One of these days I'll just run my browser under a separate UID so I can keep it under control. Or better yet, a VM, since it appears people want to turn the browser into an OS anyway.
I'd argue that a user generally has an idea that there is a persistent storage ("disk") locally held in their computer and that they have a good idea whether they want a website to use that space or not.
The developer lacks knowledge of the user's requirements; that is why they can't answer the question. For "power users", the user is far better placed than the developer to answer the question of how much local storage space should be used.
For a naive user the question comes down to "this website wants to put stuff on your computer; do you think your interaction with the website warrants them doing this?" That's more a question of the website's value to the user than it is a technical question.
> and that they have a good idea whether they want a website to use that space or not.
"What's a website? I just double-clicked on my e-mail google and now Foxfire wants to fill up my disks. Is this going to put a virus on my Microsoft? Why don't they put it up in the clouds?"
Most users would just answer yes, thinking bad things might happen if they run out of 'disk space'. So you'd still need some kind of eviction strategy for people that never said no.
Prompting should be a configurable setting for users aware of the restrictions. By default, I would have it evict based on timing interval for normal users, unless prompting was enabled.
That's a bit like saying passwords and PINs are bad from a UX perspective. In a way, you're right, because any user flow gets simpler and smoother if you remove a password prompt, but it's pretty obvious why these things still need to exist.
Well, they get in the way of the user doing what she wants, of course. But they're a necessary evil, quite firmly entrenched by now (OpenID is subjectively worse even though it might be technically superior), and users have come to expect them these days. Just as you (usually) tell others your name when you call them on the phone, user names and passwords are kind of expected as a way of telling a web site who you are.
However, I'd say that prompts which may pop up regardless of what you're currently doing ask for things most users cannot make an informed decision about. Eric Lippert once nicely summarised the problems in [1]. And while browsers' confirmation dialogs are usually no longer modal, the problem persists. In the vast majority of cases the wanted result is »increase storage limits«. That this might pose a denial-of-service risk is something they are often not aware of. And if you try telling them up-front, they either won't read it or are needlessly scared. It's a hard problem, actually, especially given user habits concerning message boxes, confirmations and stuff.
There really isn't an easy way to avoid this problem even if you follow the standard policy of a fixed quota per domain where subdomains don't count separately. You could just embed iframes to diskeater1.net, diskeater2, etc and fill up the disk that way.
In the end, the problem is that one page can refer to other domains/subdomains in its document, and those can execute and use localStorage. They have to, though, so you can embed an HTML5 game in your blog from some other site that you liked. It comes with the territory.
Sadly, it seems like the best answer is the horribly UX'd prompt ("do you want to allow x.y.z to store local content on your computer?"), the same way you have to verify downloads and know exactly what you are running locally.
> You could just embed iframes to diskeater1.net, diskeater2, etc and fill up the disk that way.
Thankfully, in this case, domain registrations are expensive. Filling a 16 GB iPad with this technique would cost around $10,000 in registrar fees. A 128 GB SSD could be filled for under $100,000.
...So I wanted to come in here and say "cost prohibitive!" but... maybe not, given that most devices will be at least partially filled already.
You could prompt for domains that use more than a small amount - say, 25-100 KB.
Once they hit that point, show a prompt below the toolbar that shows how much data is being used by the whole domain, in real time and allow it to keep on filling up with data until the user says stop or always allow.
You could continue to allow 5 MB for each subdomain, but wait until, say, 25 MB for sum(*.domain.tld) before prompting. For example, this would allow {www,mail,docs,plus,???}.google.com to continue as they are now until google.com managed to rack up 25 MB worth of data, after which point the user might want to know they're holding on to a fair bit of data for one particular site.
Then again, prompting is really annoying, and most people just click "okay" without comprehending.
Give a standard amount to each registrable domain (like tumblr.com), but keep track of it per subdomain (so you know how it's divided up, even though it all adds up to the same quota).
Then, whenever space is used up, ask the user if they're willing to authorize extra space for the specific subdomain being added to. If they say yes, then it's authorized for that single subdomain (but not other ones).
I don't think there's any "automatic" way to do this, without asking the user. And I think most users would prefer to be asked.
What about this: writes to a.mydomain.com from a page with www.mydomain.com in the address bar count towards the quota for both a.mydomain.com and www.mydomain.com.
You'd have to store the other domains your page has written to in its own local storage area, but it doesn't seem to me like the bookkeeping would be that complicated.
You could use a coarse rule of all data in a.mydomain.com counts, and use a larger quota of n * per-domain-limit.
You could visit as many legit-site.tumblr.com addresses as you want with this rule.
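Roughly, the double-counting rule would look like this (illustrative only; a flat 5 MB quota stands in for whatever per-origin or n-times limit a browser picks):

    // Charge every write to both the frame's own origin and the top-level origin,
    // so an embedding page can't multiply its quota through subdomain iframes.
    const PER_ORIGIN_QUOTA = 5 * 1024 * 1024; // assumed 5 MB per origin

    const charged = new Map<string, number>(); // origin -> bytes charged so far

    function tryWrite(frameOrigin: string, topOrigin: string, bytes: number): boolean {
      const frameTotal = (charged.get(frameOrigin) ?? 0) + bytes;
      const topTotal = frameOrigin === topOrigin
        ? frameTotal
        : (charged.get(topOrigin) ?? 0) + bytes;

      // The write must fit under the quota of every origin it is charged to.
      if (frameTotal > PER_ORIGIN_QUOTA || topTotal > PER_ORIGIN_QUOTA) return false;

      charged.set(frameOrigin, frameTotal);
      if (frameOrigin !== topOrigin) charged.set(topOrigin, topTotal);
      return true;
    }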
It's too complicated. I don't think this is a real problem. Better to just evict a random subdomain entirely (or on an LRU scheme). After all, just like cookies, there's no guarantee localStorage will stick around, so any normal site needs to deal with this anyhow.
Just check if more than three subdomains are trying to allocate storage, and if they request more than 4 KB each, require explicit permission to continue.
Reminds me how filing bugs with Microsoft is such a pain!
There's no category for Windows 8 on MS Connect, so when we found a bug in Windows 8 RTM, I found the name of an MS employee working on the feature in question on the MSDN forums, then through Google found his LinkedIn profile, where he luckily published his email address.
Microsoft should be ashamed.
It seems like the best way to file a bug these days is to create a blog post and publish it to HN, Reddit or so...
I'm actually aware of a very significant user impacting bug in Windows 8 (hint: It can cause every process in your startup/autorun list to not start consistently). But have no way to report it to them...
Has impacted my system multiple times since upgrading to Windows 8. Fortunately I know a work-around (eject the optical disc in the optical drive) but still -- annoying that I cannot even report it.
The bug we found is affecting a lot of Swiss customers (I admit Switzerland isn't so big) and it took a month until I got a useful reply.
Now we have a bug number and were told that the issue should be fixed "early this year" and changes would be checked in in March. Whatever that means...
At least their employees were kind enough to reply to my email. But this company should really improve its error reporting.
Please sir, may I have some more? This is awesome. I got to 935 MB of space used before Google Chrome on Windows 7 64-bit crashed dramatically. However, good news, when I restored the tabs it resumed right where it left off filling up my hard drive.
Did it literally crash, or just hang? One joyful misfeature of LevelDB (which forms the basis of Chrome's implementation) is that when a particular write rate is exceeded for a sustained period, LevelDB will fall behind merging old segments and block hard, often for minutes on spinning rust, until it catches up.
But hey, who cares about by-design black swan latency characteristics for real world use cases when the published benchmarks look so great.
I got to over 2 GB and then Chrome crashed (not dramatically, just a tab). The fun fact is that I was not running the "exploit" in Chrome. I was testing it in Opera, and I suspect it was cheating and allocating more and more memory for all the local storage instead of dumping it out to the hard drive. So the system ran out of memory and something had to go... I still find it funny that it was Chrome :)
Haha yeah, that is a bit strange. When Chrome crashed for me it was every process, not just the one tab, which I rarely see happen these days. Let's hope the Chrome team implements a fix for this ASAP; I'm not holding my breath on an IE fix for at least a year or so, given their prior history of patching things.
(Mildly OT) Well, that's interesting. The link has been changed since it was posted - it originally pointed directly at the demo which began to fill your disk. Is this a mod thing, or can the original submitter do it?
Either way, it was a good call. Automatically playing music and filling my hard drive with no warning is a terrible idea.
This kind of thing is a good reason why a monoculture would be bad for the Web. It's entirely possible that, if WebKit had a monopoly, Web sites would rely on subdomains' space not being counted toward the parent domain's space, and it'd be impossible to fix this bug without breaking the Web. But because we don't have a monoculture and some browsers implemented subdomain limits, Web sites haven't been able to rely on WebKit's behavior. So WebKit will be able to fix the bug without breaking the Web—which is better for WebKit, and better for the Web.
I was really annoyed by the fact that the "disk filling" started as soon as I clicked the link. However, the point really hit home. Is there a solution for this browser-side?
EDIT: The link has been changed to the blog post describing the phenomenon. Good riddance!
How does Firefox determine if something is a domain or a subdomain? Obviously the term subdomain is relative, so domain.com is already a subdomain of .com. But what about countries like the UK or South Africa where domains are commonly subdomains of .co.uk and .co.za?
Is there some generic way to know when a domain should be treated as a subdomain or do they basically hardcode the exceptions?
Example: do domain1.co.uk and domain2.co.uk share the same limit in Firefox? Probably not, but how does it know to treat them as separate?
There are already hardcoded lists for this that are used to limit the scope of cookies (so nobody can try to read all the cookies on *.uk).
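For illustration, the lookup boils down to "find the longest matching public suffix, then keep one more label"; the suffix set below is a tiny hand-picked sample, and the real list also has wildcard and exception rules:

    // Toy registrable-domain lookup against a (tiny, illustrative) public suffix set.
    // Browsers use the full list from publicsuffix.org.
    const PUBLIC_SUFFIXES = new Set(["com", "uk", "co.uk", "co.za"]);

    function registrableDomain(host: string): string {
      const labels = host.split(".");
      // Walk from the longest candidate suffix to the shortest; the first hit is
      // the longest matching public suffix, so keep exactly one label above it.
      for (let i = 0; i < labels.length - 1; i++) {
        const candidate = labels.slice(i + 1).join(".");
        if (PUBLIC_SUFFIXES.has(candidate)) return labels.slice(i).join(".");
      }
      return host;
    }

    // registrableDomain("domain1.co.uk")    -> "domain1.co.uk"
    // registrableDomain("a.domain2.co.uk")  -> "domain2.co.uk"
    // registrableDomain("www.example.com")  -> "example.com"

So domain1.co.uk and domain2.co.uk map to different registrable domains and get separate limits, while www.example.com and static.example.com would share one.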
I imagine these lists will become a real headache when the recent TLD auction is over. Is there any work being done on a more dynamic system (DNS TXT fields?)
You don't need to purchase any top-level domain, just a bunch of regular domains: dearleader000001.kp, dearleader000002.kp, ....
If you are, say, the North Korean government, or have a close relationship with some small island registrar, you can register any number of domains you like for peanuts.
Interesting question. I wonder if you could get onto this list (without nefarious purpose) if you provided some major hosting service? E.g. I see k12 schools in the US are on that list; it would make sense to allow someone providing shared hosting to get on the list (to avoid users setting cross-domain cookies). E.g. appspot.com and blogspot.* are on the list [1].
More information:
http://publicsuffix.org/submit/ (and the rest of the site, obviously)
For those interested in changing the amount of storage per domain in Firefox: about:config -> dom.storage.default_quota. There's also dom.storage.enabled to control whether you use local storage at all. I don't know if Chrome or IE offer those options.
It's nice that this exploit is presented openly as a proof of concept, and includes a button to undo the damage. Many people, upon finding this, would try to use it for shadier ends.
Well, this is frightening. You don't even need to create subdomains, since basically anyone with wildcard subdomains enabled can do this without breaking a sweat. All you need is a random number generator and a rewrite of x.domain.com to domain.com, and the browser is none the wiser.
Though I can't quite imagine why anyone would want to do this to some random stranger. Unless you knew the visitor or had some means of personally identifying him/her, there are more devastating ways of filling up a remote HD with just an IP and hostname (nmap and friends come to mind).
> Though I can't quite imagine why anyone would want to do this to some random stranger.
You must be new to the internet?
> Unless you knew the visitor or had some means of personally identifying him/her, there are more devastating ways of filling up a remote HD with just an IP and hostname (nmap and friends come to mind).
There's a few things:
This works by sending someone a link, so you can target people without knowing their IP. It's also so easy a kid could do it. Therefore, kids will do it, just to "fuck with each other's shit". Not to mention they'll do it to the school, the library, etc. There are also enough people doing things "for the lulz": spam this link to a thousand people, crash a few PCs, hur hur hur. Again, the fact that it's browser-based and not IP-based allows for different types of attacks. They can spam specific communities and fora they don't like or are at odds with.
By the way, when I ran that site in Opera, it asked me whether I wanted to grant the site extra storage space, which I declined. I didn't feel like testing it in Chrome and crashing my things right now, but am I correct in assuming Chrome would not ask for this extra storage space, but simply take it, without any kind of upper limit?
Well, I meant you can use these tools to gain access to the remote machine to do some real damage. Nmap and friends are usually for finding running services, list of open ports, knock on a few doors (run some queries?) etc... and if someone were to gain access to a machine this way, filling up their hard drive may not be on their list of priorities. Unless incrimination was the intention.
Really? I assumed you were talking about some nmap-based attack I hadn't heard of that maybe fills the target's HD with log files or something. I wondered whether that would work cross-platform on any device, like this attack; whether it could be pulled off by an idiot with a grudge, like this attack; or whether targeting someone by IP isn't in fact harder than being able to do it by getting someone to click on any link, anywhere.
But yes, indeed, if the machine's already vulnerable to something else, then that is possibly much worse.
I don't think this is as easy or as common as you think it is. For one, almost every computer is behind a firewall these days and remote vulnerabilities for common services aren't anywhere near as common as they used to be.
I.e., nobody. Why the hell is WebKit not following the standard here? They even implemented a permission dialog so you can allow an app to go over quota.
Wildcard subdomains are ridiculously common on the web (unless I've misread you?): Blogspot, Tumblr, Tripod, basically anywhere using HTTP/1.1 Host-based routing of requests.
So then this would really be handy in a mud-slinging campaign, maybe against a competitor. Any visitors would be treated to a massive drain on storage and other delights, but then the victim would still need to have multiple subdomains and/or wildcard subdomains enabled.
It sounds like they hate synchronous APIs. Well, the synchronous nature of it wouldn't be a problem if:
1. JS had language-level support for asynchrony.
2. The implementation of retrieval were performant enough, or allowed some way to control the granularity of reads from the code.
I really dislike the idea that the only simple API for local storage will be gutted for reasons quite tangential to what it does.
So synchronous APIs wouldn't be a problem if they were 1. Asynchronous or 2. Guaranteed to be really really fast? You do realize that the problem is that you can't guarantee that spinning rust will be fast, right?
I think he said that asynchronous APIs are a problem because they're hard to use well from JS, and that the performance of localStorage is a problem in part because the granularity of reads and writes is poorly specified.
2. This script writes a 2,500,000-character string to local storage, which should occupy at least 2.5 MB (probably much more). This matches the maximum storage per sub-domain.
3. This script then reloads the iframe on a different subdomain but the same script. GOTO 2.
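A reconstruction of that loop, based purely on the description above (not the demo's actual source; the URL pattern and key names are made up):

    // Each subdomain fills its own localStorage quota, then hands off to the next
    // one by navigating the iframe to a fresh random subdomain serving the same script.
    const CHUNK = "x".repeat(2_500_000); // ~2.5 million characters per write

    function fillOwnQuota(): void {
      try {
        for (let i = 0; ; i++) {
          localStorage.setItem("chunk" + i, CHUNK); // throws once this origin's quota is hit
        }
      } catch {
        // Quota for this subdomain is exhausted: move on to another one.
        const next = Math.random().toString(36).slice(2);
        window.location.href = "http://" + next + ".example.com/fill.html"; // hypothetical
      }
    }

    fillOwnQuota();

Because each random subdomain is treated as a separate origin with its own quota, the loop only stops when the browser caps the whole registrable domain (as Firefox does) or the disk fills up.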
So apparently this is where you file IE bugs: http://connect.microsoft.com/IE. I'm not sure if it's expected or ironic that it's broken. Great find, btw!
Session state is released as soon as the last window to reference that data is closed. However, users can clear storage areas at any time by selecting Delete Browsing History from the Tools menu in Internet Explorer, selecting the Cookies check box, and clicking OK. This clears session and local storage areas for all domains that are not in the Favorites folder and resets the storage quotas in the registry. Clear the Preserve Favorite Site Data check box to delete all storage areas, regardless of source.
That's just a bug in the exploit: it accidentally used more than 5 MB per subdomain. I got the same prompt, but when I tried to run it again in a private tab, the limit was not hit. It got to over 2 GB before something bad happened (Chrome's tab process crashed).
Just ask the user if it's okay, like with geolocation data, translating a web site, etc.
"Allow example.com to track your location?" [Yes] [No]
"Allow a1.example.com to store x MB of data locally?" [Yes] [No]
Also
> The HTML5 Web Storage standard was developed to allow sites to store larger amounts of data (like 5-10 MB) than was previously allowed by cookies (like 4KB).
Main difference is that cookies are uploaded to the server with each request, while localStorage is not.
That is a good way to never ever ever use a feature again. "Frightening Message: This website wants to do something scary. Do you want to allow some bad thing to happen to your computer?" That is how lay people, i.e. the people needed for mass adoption, read browser requests for geolocation, storage, and other permissions.
It would be better to have sane and safe defaults in the browser, rather than pester the user. Would cookies have worked if the browser asked for permission on every website?
Browsers like Chrome do it with geolocation for example. If it is required for the user to get a certain service they want, what's the "scary" part? You can say no and use the parts of the site that work with it, or yes and get the extra functionality. Like with geolocation.
This is what Opera does (you are prompted to raise the limit) and it does not prevent this kind of attack. You would still have to have a separate per-TLD limit.
A root domain like www.example.com could utilize up to 10 MB of storage, with sub-domains counting towards that limit. Any domain trying to use more would automatically result in a user prompt. An exemption could be made for domains/subdomains that present a valid SSL certificate; the whole idea is to prevent abuse.
This could be quite a problem for users of SSDs who lack TRIM support.
IIRC Apple was selling MacBook Airs with no TRIM support if the user didn't pay to upgrade OS X.
If a malicious user felt so inclined, they could, with just a few domains, create a write load that would quickly fragment the SSD and hurt its performance.
He explains here that most browsers (except Firefox) don't follow the standard closely enough and ignore the exception for subdomains, i.e. 1.filldisk.com, 2.filldisk.com, etc.
It's the one about cookies, and .co.uk (i.e. every commercial site in the UK) sites all sharing the same cookies, because they all look like subdomains. Or was it all .friendly-hosting-company.com sites?
The fundamental problem is, there's no easy way to distinguish domains and subdomains.
Because most of the browser vendors see "standards" as more like "suggestions".
I'm sure any web developer will tell you it's been a problem from the moment there was more than one browser. This is just a particularly hilarious example.
Ahhh.... I wrote a response wondering out loud why this doesn't work on my browser, then checked the source, and it's javascript. No wonder, I browse without it!
"Yeeaahhpp, with enough javascript one can blow up just about anything." ~Tyler Durden
Would this work on an Android device with Froyo/Gingerbread? Because, some of those devices will never be updated. Hence, this can be used to practically disable a device.
In Chrome, you can go to Settings->Advanced Settings->Content Settings->Cookies->All Cookies and Site Data, and it will list sites using your localStorage.
This is such a typical HN comment. The sound and automatic start of disk filling (before the mods changed the link) was intentional and meant to be surprising.
I don't like the tone of your response, especially as I don't spend that much time on HN. I explicitly tried to voice my opinion as my own without diminishing what you have accomplished.
By visiting a YouTube video, I am giving YouTube my consent to play me a video, including any sound it might contain. I completely expect YouTube videos to play sound and am prepared for it, unlike websites such as this.
No, that this is a double standard people have. They gripe about any non-YouTube site playing audio, but say nothing when they click a bit.ly link that redirects to a YouTube video page.
It doesn't. Flashblock is one of the most popular Firefox extensions[1], and variants are available in other browsers. Even though it offers some protection against hidden Flash elements and supercookies, it would still be popular just to prevent autoplaying flash presentations.
The author of Flashblock even specifically mentions YouTube on the extension page[2]:
Youtube videos not blocked: This is because they are now increasingly HTML5 Videos. I plan to add HTML5 blocking in the next version. Meanwhile you can try out a experimental version at:
http://flashblock.mozdev.org/installation1.html#unstable
This indicates there is a demand for the autoplay blocking feature alone, regardless of the medium.
Not that I care about unexpected noises in a situation like this, but I hope we all agree that unexpected and unwanted noises are genuinely annoying to many people.