So. The question I'm asking myself now is how to fix this. Giving <wildcard>.domain.com a shared quota will allow one tumblr or github pages user to monopolize all storage, effectively removing local storage for this kind of scenario (also removing it for the host which is even more annoying).
A maybe workable solution would be to only allow creation of new keys for the first-party origin. What I mean is that whatever.example.com has full access if that's what the user is currently viewing directly in their browser.
<wildcard>.example.com embedded via iframes could either get read-only access, or read-write access for existing keys. Also maybe limited to, let's say, 4K.
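Concretely, something like this - a rough sketch with made-up names, where the 4K cap is just the number from above and nothing here is how any browser actually implements it:

    // Hypothetical sketch, not any browser's real implementation.
    const EMBEDDED_WRITE_CAP = 4 * 1024; // the 4K ceiling suggested above

    interface WriteRequest {
      origin: string;          // origin issuing the localStorage.setItem call
      topLevelOrigin: string;  // origin currently shown in the address bar
      keyExists: boolean;      // true if the key was created on a first-party visit
      valueBytes: number;      // size of the value being written
    }

    // First-party pages get full write access; embedded (third-party) frames
    // may only update keys that already exist, and only with small values.
    function writeAllowed(req: WriteRequest): boolean {
      if (req.origin === req.topLevelOrigin) return true;           // first-party: full access
      return req.keyExists && req.valueBytes <= EMBEDDED_WRITE_CAP; // third-party: existing keys only, <= 4K
    }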
This sounds like a really complicated solution though. Any better ideas?
Any kind of DOM storage (cookies, localStorage, IndexedDB, etc.) is ephemeral. The browser needs to decide the maximum amount of disk space that it wants to consume, and then when it hits that limit, it needs to start throwing away (garbage collecting) some of the data based on some policy like LRU.
If the web app really needs permanent storage then that permanence of storage needs to be granted explicitly by the user.
> An application can request temporary or persistent storage space. Temporary storage may be easier to get, at the UA's discretion [looser quota restrictions, available without prompting the user], but the data stored there may be deleted at the UA's convenience, e.g. to deal with a shortage of disk space.
> Conversely, once persistent storage has been granted, data stored there by the application should not be deleted by the UA without user intervention. The application may of course delete it at will. The UA should require permission from the user before granting persistent storage space to the application.
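For what it's worth, modern browsers expose something along those lines via the Storage API: a page can ask how much it is using and request that its data be treated as persistent, and the UA is free to grant silently, prompt, or refuse. A minimal check, assuming a browser that implements navigator.storage:

    // Minimal sketch using the Storage API; older browsers won't have navigator.storage.
    async function checkStorage(): Promise<void> {
      if (!navigator.storage) return;

      // How much this origin is using and roughly how much it may use.
      const { usage, quota } = await navigator.storage.estimate();
      console.log(`using ${usage} of ~${quota} bytes`);

      // Ask the UA to treat this origin's data as persistent (not evicted under
      // storage pressure). The UA may grant this silently, prompt, or refuse.
      const persisted = await navigator.storage.persist();
      console.log(persisted ? "storage is persistent" : "storage is best-effort (evictable)");
    }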
If you do that (like Firefox admittedly does), then my issue with subdomain users using up all local storage comes into play.
It would really suck for github (for example) not to be able to use local storage in their UI because pilif.github.com used up all available storage for the whole domain.
This sounds like a mostly theoretical problem. Sure, it could be annoying, but if the quota logic is just a little smarter (e.g. allowing a few filled subdomains rather than just one, and/or always letting the root domain fill up, and/or evicting subdomains from localStorage on an LRU basis) then the chances of this happening non-maliciously are very small - and when push comes to shove, there's no guarantee localStorage will persist anyhow, so every site needs to be robust in the face of data loss regardless.
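A sketch of that LRU idea, assuming the browser keeps per-subdomain usage and last-access times grouped under the registrable domain (all names and numbers here are made up):

    // Hypothetical bookkeeping: bytes used and last access per subdomain,
    // grouped under the registrable domain (e.g. everything under tumblr.com).
    interface SubdomainUsage {
      bytes: number;
      lastAccess: number; // timestamp of last read or write
    }

    const SITE_BUDGET = 50 * 1024 * 1024; // shared budget for *.site.com; number is arbitrary

    function evictIfNeeded(
      usage: Map<string, SubdomainUsage>, // subdomain -> usage
      rootSubdomain: string,              // e.g. "tumblr.com" itself, never evicted
    ): string[] {
      const evicted: string[] = [];
      let total = [...usage.values()].reduce((sum, u) => sum + u.bytes, 0);

      while (total > SITE_BUDGET) {
        // Pick the least recently used subdomain, but never the root.
        const victim = [...usage.entries()]
          .filter(([name]) => name !== rootSubdomain)
          .sort(([, a], [, b]) => a.lastAccess - b.lastAccess)[0];
        if (!victim) break; // only the root is left; nothing more to evict

        total -= victim[1].bytes;
        usage.delete(victim[0]);
        evicted.push(victim[0]);
      }
      return evicted;
    }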
Does the 100MB per domain include subdomain storage? Then it doesn't solve what the parent comment described - a few subdomains could use up all the storage. If it doesn't include it and subdomains are counted separately, then it's exactly like it is now.
This really isn't that hard of a problem – allow X MB per domain and Y per sub-domain (user configurable or set by the browser developer to some sane limit – the actual number doesn't matter too much within certain ranges, in reality).
After either of those limits is hit, prompt the user to grant another increment of allowed storage, or possibly a customized amount (and disclose how much each domain and subdomain is using under a detailed view option). In the end, this puts the power back in the hands of the user and prevents any malicious usage, while not allowing one subdomain to effectively deny storage to others, etc.
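In code, that two-level check is only a few lines. A rough sketch - X, Y and the user-granted increments are as described above, everything else is made up, and state is assumed to be kept per registrable domain:

    const PER_SUBDOMAIN_LIMIT = 5 * 1024 * 1024;  // Y, placeholder number
    const PER_DOMAIN_LIMIT = 50 * 1024 * 1024;    // X, placeholder number

    interface QuotaState {
      bytesBySubdomain: Map<string, number>; // usage per subdomain of this one domain
      extraGranted: Map<string, number>;     // increments the user has approved, keyed by subdomain or domain
    }

    function canWrite(state: QuotaState, subdomain: string, domain: string, bytes: number): boolean {
      const subUsed = state.bytesBySubdomain.get(subdomain) ?? 0;
      const domainUsed = [...state.bytesBySubdomain.values()].reduce((a, b) => a + b, 0);

      const subLimit = PER_SUBDOMAIN_LIMIT + (state.extraGranted.get(subdomain) ?? 0);
      const domainLimit = PER_DOMAIN_LIMIT + (state.extraGranted.get(domain) ?? 0);

      return subUsed + bytes <= subLimit && domainUsed + bytes <= domainLimit;
    }

    // When canWrite() returns false, the browser would prompt the user to grant
    // another increment (recorded in extraGranted) or reject the write.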
Limit the total local storage space at a browser level. E.g. 1GB. Just like you might limit the total size of temporary internet files/cache.
Beyond that, just provide a good (simple) UI for deleting stuff, which could suggest candidates for deletion based on heuristics like you suggest. E.g. iframes shouldn't need so much. Hopefully less-visited sites would be suggested for deletion too.
Limit how much can be placed in localStorage over a given time period, regardless of site. (Or perhaps prompt whenever that limit is reached, in case there is a legit reason for it.)
This isn't perfect in that your localStorage could still be filled up slowly if you leave a page open in the background, but I think this solution is robust against many different techniques.
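A sketch of what that time-based limit could look like as a token bucket, refilling a per-origin byte budget over time (all numbers invented):

    // Hypothetical throttle: each origin may write at most BUDGET_BYTES per
    // WINDOW_MS, with the allowance refilled continuously over time.
    const BUDGET_BYTES = 1 * 1024 * 1024; // 1 MB...
    const WINDOW_MS = 60 * 60 * 1000;     // ...per hour

    interface Bucket {
      tokens: number;     // remaining writable bytes
      lastRefill: number; // timestamp of last refill
    }

    function tryWrite(bucket: Bucket, bytes: number, now: number): boolean {
      // Refill proportionally to elapsed time, capped at the full budget.
      const elapsed = now - bucket.lastRefill;
      bucket.tokens = Math.min(BUDGET_BYTES, bucket.tokens + (elapsed / WINDOW_MS) * BUDGET_BYTES);
      bucket.lastRefill = now;

      if (bytes > bucket.tokens) return false; // over the rate limit: block or prompt
      bucket.tokens -= bytes;
      return true;
    }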
Opera shows a prompt when the limit is reached (it was actually triggered on the demo page). But it only works at the subdomain level; there is no global per-domain limit.
Prompt is a horrible solution from a UX perspective. Essentially you're asking the user a question you, as a developer, couldn't or didn't want to answer. But the user has no idea either. Heck, she doesn't even know that there are limits in place or what DOM local storage even is.
I don't think it's a problem of the developer not knowing; they generally do know how much storage they'll need (or can at least put an upper limit on it). It's a problem of trust.
Maybe the dev thinks they'll need 1GB but I'm not ready to give them that. The same way an app asks for certain privileges when you install it on, say, Android.
I wonder why anybody thought it was a good idea to let any web page store some random stuff on my computer. Cookies were bad enough already.
You know, you can say what you want about Flash, but at least I could block it. These days I can't browse half the web if I disable javascript. One of these days I'll just run my browser under a separate UID just so I can keep it under control. Or better yet, a VM, since it appears people want to turn the browser into an OS anyway.
I'd argue that a user generally knows there is persistent storage (a "disk") in their computer and that they have a good idea whether they want a website to use that space or not.
The developer lacks knowledge of the user's requirements; that is why they can't answer the question. For "power users", the user is far better placed than the developer to answer the question of how much local storage space should be used.
For a naive user the question comes down to "this website wants to put stuff on your computer; do you think your interaction with the website warrants that?" - which is more a question of the website's value to the user than a technical one.
> and that they have a good idea whether they want a website to use that space or not.
"What's a website? I just double-clicked on my e-mail google and now Foxfire wants to fill up my disks. Is this going to put a virus on my Microsoft? Why don't they put it up in the clouds?"
Most users would just answer yes, thinking bad things might happen if they run out of 'disk space'. So you'd still need some kind of eviction strategy for people that never said no.
Prompting should be a configurable setting for users aware of the restrictions. By default, I would have it evict based on timing interval for normal users, unless prompting was enabled.
That's a bit like saying passwords and PINs are bad from a UX perspective. In a way, you're right, because any user flow gets simpler and smoother if you remove a password prompt, but it's pretty obvious why these things still need to exist.
Well, they get in the way of the user doing what she wants, of course. But they're a necessary evil, quite firmly entrenched by now (OpenID is subjectively worse even though it might be technically superior), and users have come to expect them these days. Just as you (usually) tell others your name when you call them on the phone, user names and passwords are the expected way of telling a web site who you are.
However, prompts that may pop up regardless of what you're currently doing ask for things most users cannot make an informed decision about. Eric Lippert once nicely summarised the problems in [1]. And while browsers' confirmation dialogs are usually no longer modal, the problem persists. In the vast majority of cases the wanted result is »increase storage limits«. That this might pose a denial-of-service risk is something users are often not aware of. And if you try telling them up-front they either won't read it or are needlessly scared. It's a hard problem, actually, especially given user habits concerning message boxes, confirmations and stuff.
There really isn't an easy way to avoid this problem even if you follow the standard policy of a fixed quota per domain where subdomains don't get their own separate quota. You could just embed iframes to diskeater1.net, diskeater2, etc. and fill up the disk that way.
In the end, the problem is that one page can refer to other domains/subdomains in its document, and those can execute and use localStorage. They have to, though, so you can embed an html5 game in your blog from some other site that you liked. It comes with the territory.
Sadly, it seems like the best answer is the prompt with its horrible UX - "do you want to allow x.y.z to store local content on your computer?" - the same way you have to verify downloads and know exactly what you are running locally.
> You could just embed iframes to diskeater1.net, diskeater2, etc and fill up the disk that way.
Thankfully, in this case, domain registrations are expensive. Filling a 16 GB iPad with this technique would cost around $10,000 in registrar fees. A 128 GB SSD could be filled for under $100,000.
...So I wanted to come in here and say "cost prohibitive!" but... maybe not, given that most devices will be at least partially filled already.
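Back of the envelope, assuming roughly 10 MB of storage per registered domain and ~$6 per cheap .com: 16 GB / 10 MB ≈ 1,600 domains, i.e. around $10,000 in registrations; with a 5 MB quota the cost roughly doubles.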
You could prompt for domains that use more than a small amount - say, 25-100k.
Once they hit that point, show a prompt below the toolbar that displays, in real time, how much data the whole domain is using, and let it keep filling up with data until the user says stop or always allow.
You could continue to allow 5mb for each subdomain, but wait until say, 25mb for the sum(*.domain.tld) before prompting. For example, this would allow {www,mail,docs,plus,???}.google.com to continue as they are now until google.com managed to rack up 25mb worth of data. After which point the user might want to know they're holding on to a fair bit of data for one particular site.
Then again, prompting is really annoying, and most people just click "okay" without comprehending.
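The aggregation step is just "sum usage over every subdomain of the same registrable domain". A rough sketch with the 5 MB / 25 MB numbers from above (the registrable-domain lookup here is naive; a real browser needs the Public Suffix List):

    const PER_SUBDOMAIN_ALLOWANCE = 5 * 1024 * 1024;
    const SITE_PROMPT_THRESHOLD = 25 * 1024 * 1024;

    // Naive grouping: take the last two labels. Good enough for github.com or
    // tumblr.com, wrong for example.co.uk - hence the Public Suffix List in practice.
    function getRegistrableDomain(host: string): string {
      return host.split(".").slice(-2).join(".");
    }

    // True when either the subdomain's usual allowance or the whole-site
    // threshold would be exceeded by this write, i.e. when to involve the user.
    function shouldPrompt(bytesByHost: Map<string, number>, host: string, newBytes: number): boolean {
      const site = getRegistrableDomain(host);
      let siteTotal = newBytes;
      for (const [h, bytes] of bytesByHost) {
        if (getRegistrableDomain(h) === site) siteTotal += bytes;
      }
      const hostTotal = (bytesByHost.get(host) ?? 0) + newBytes;
      return hostTotal > PER_SUBDOMAIN_ALLOWANCE || siteTotal > SITE_PROMPT_THRESHOLD;
    }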
Give a standard amount to top-level domains (like tumblr.com), but keep track of it per-subdomain (so you know how it's divided up, even though it all adds up to the same quota).
Then, whenever space is used up, ask the user if they're willing to authorize extra space for the specific subdomain being added to. If they say yes, then it's authorized for that single subdomain (but not other ones).
I don't think there's any "automatic" way to do this, without asking the user. And I think most users would prefer to be asked.
What about this: writes to a.mydomain.com from a page with www.mydomain.com in the address bar count towards the quota for both a.mydomain.com and www.mydomain.com.
You'd have to store the other domains your page has written to in its own local storage area, but it doesn't seem to me like the bookkeeping would be that complicated.
You could use a coarse rule where all data in a.mydomain.com counts, combined with a larger quota of n * the per-domain limit.
You could visit as many legit-site.tumblr.com addresses as you want with this rule.
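So the bookkeeping is essentially double-entry: a write from an embedded a.mydomain.com frame is charged to a.mydomain.com and to the top-level page that embedded it. A rough sketch (hypothetical names, with the n-times-larger quota from above for pages that embed others):

    const PER_DOMAIN_LIMIT = 5 * 1024 * 1024;
    const EMBEDDER_MULTIPLIER = 4; // the "n * per-domain-limit" for embedding pages

    function recordWrite(
      usage: Map<string, number>, // bytes charged per origin
      frameOrigin: string,        // origin doing the write, e.g. a.mydomain.com
      topLevelOrigin: string,     // origin in the address bar, e.g. www.mydomain.com
      bytes: number,
    ): boolean {
      const frameTotal = (usage.get(frameOrigin) ?? 0) + bytes;
      const topTotal = (usage.get(topLevelOrigin) ?? 0) + (frameOrigin === topLevelOrigin ? 0 : bytes);

      // Charge the write against both quotas; reject if either would be exceeded.
      if (frameTotal > PER_DOMAIN_LIMIT) return false;
      if (topTotal > PER_DOMAIN_LIMIT * EMBEDDER_MULTIPLIER) return false;

      usage.set(frameOrigin, frameTotal);
      if (frameOrigin !== topLevelOrigin) usage.set(topLevelOrigin, topTotal);
      return true;
    }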
It's too complicated. I don't think this is a real problem. Better to just evict a random subdomain entirely (or on an LRU scheme). After all, just like cookies, there's no guarantee localStorage will stick around, so any normal site needs to deal with this anyhow.
Just check whether more than 3 subdomains are trying to allocate storage, and if they request more than 4K each, ask for explicit permission to continue.