What if you accepted the cache hit when the checksum agrees and fetched your own copy when it doesn't? Maybe the application, rather than the browser, should get to declare a canonical URL for the js file? So something like
> I'm concerned about people like me who use noscript selectively. How easy is it to create a malicious file that matches the checksum of a known file?
SHA-256? Very, very, very, very hard. I don't believe there are any known collision attacks on SHA-256.
People make too big a deal of this collision stuff. A lot of the time these attacks are very theoretical and would require tremendous computation. Anyway, for this use case, even with md5, how likely is it really that someone could make a useful malicious file that collides with a particular known and widely used one? I dunno, seems pretty unlikely.
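For a rough sense of what the checksum is doing, here is a small Python sketch. The two script bodies are made up for illustration; only `hashlib` from the standard library is used. It shows that changing a single character gives a completely different digest, and how large the space is that a brute-force attacker would have to search to match one specific known file.

```python
import hashlib

# Two made-up script bodies that differ by a single character.
original = b"console.log('hello');"
tampered = b"console.log('hellp');"

d1 = hashlib.sha256(original).hexdigest()
d2 = hashlib.sha256(tampered).hexdigest()

print(d1)
print(d2)

# SHA-256 digests are 256 bits (64 hex characters).
digest_bits = hashlib.sha256().digest_size * 8
print("digest bits:", digest_bits)

# Matching the digest of one *specific* known file by brute force means
# searching a space on the order of 2**256 candidates.
print("brute-force search space:", 2 ** digest_bits)
```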
It would be interesting if browsers started implementing a content-addressable cache. So as well as caching resources by URI, they would also cache them by hash. Then SRI requests could be served even if the URL was different.
Of course this would need a proposal or something but it would be interesting to consider.
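To make the idea a bit more concrete, here is a toy sketch in Python of a cache with two indexes, one by URL and one by SHA-256 digest. This is not how any browser works today; the class name, method names and URLs are all hypothetical.

```python
import hashlib
from typing import Dict, Optional


class ContentAddressableCache:
    """Toy cache indexed by URL and by SHA-256 digest (hypothetical design)."""

    def __init__(self) -> None:
        self._by_url: Dict[str, str] = {}     # URL -> hex digest
        self._by_hash: Dict[str, bytes] = {}  # hex digest -> body

    def store(self, url: str, body: bytes) -> str:
        """Store a fetched body under both its URL and its digest."""
        digest = hashlib.sha256(body).hexdigest()
        self._by_url[url] = digest
        self._by_hash[digest] = body
        return digest

    def lookup(self, url: str, expected_sha256: Optional[str] = None) -> Optional[bytes]:
        """Return a cached body, or None on a miss."""
        if expected_sha256 is not None:
            # A hash was supplied: any stored body with that digest will do,
            # no matter which URL it was originally fetched from.
            return self._by_hash.get(expected_sha256.lower())
        digest = self._by_url.get(url)
        return None if digest is None else self._by_hash.get(digest)


cache = ContentAddressableCache()
digest = cache.store("https://cdn.example.com/lib.js", b"/* lib */")

# A different site asks for the same bytes under a different URL.
hit = cache.lookup("https://other.example.org/vendor/lib.js", digest)
miss = cache.lookup("https://other.example.org/vendor/lib.js")
```

With the hash supplied the second site gets a hit; without it, the unknown URL is an ordinary miss and the browser would fetch as usual.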
> How easy is it to create a malicious file that matches the checksum of a known file?
As others have pointed out, it's quite difficult. But here's another way to think about it: if hash collisions become easy in popular libraries, the whole internet will be broken and nobody will be thinking about this particular exploit.
Servers won't be able to reliably update. Keys won't be able to be checked against fingerprints. Trivial hash collisions will be chaos. Fortunately, we seem to have hit a stride of fairly sound hash methods in terms of collision freedom.
This vaguely reminds me of the Content Centric Networking developed by PARC. There's a 1.0 implementation of the protocol on GitHub (https://github.com/PARC/CCNx_Distillery). A CCNx-enabled browser could potentially get the script from a CCN by referring to its signature alone (whether that is a sha-256 checksum or something else).
> The first would seem preferable though, as loading from an external source would expose the user to cross-site tracking.
You're right that the first one you had, with just the sha-256, would be pretty much equivalent to what I had, especially given that HN readers have resoundingly supported the idea that it is non-trivial to create a malicious file with the same hash as our script file. I was simply trying to be cautious and retain some control for the web application (even if the extra sense of security is misplaced).
This is the use case I'm trying to protect by adding a new "canonical" reference that the web application decides. As others in this thread have said, it is very unlikely that someone will be able to craft a malicious script with the same hash as what I already have. The reason I still stand by including both is firstly compatibility (I hope browsers can simply ignore the sha-256 hash and the authorized cache links if they don't know what to do with them).
As a noscript user, I do not want to trust y0l0swagg3r cdn (just giving an example, please forgive me if this is your company name). NoScript blocks everything other than a select whitelist. If the CDN happens to be blocked, my website should still continue to function loading the script from my server.
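Roughly, the fallback I have in mind could be sketched like this in Python. Everything here is hypothetical: the function name, the URLs, and the `fetch` callback (which stands in for the browser's network layer and returns `None` when a host is blocked or unreachable). I'm also assuming the checksum is verified for whichever copy gets used, including my own.

```python
import hashlib
from typing import Callable, Optional

# Stand-in for the network layer: returns None if blocked or unreachable.
Fetch = Callable[[str], Optional[bytes]]


def load_script(src: str, provider: Optional[str],
                expected_sha256: str, fetch: Fetch) -> bytes:
    """Try the shared cache provider first, then the site's own copy."""
    for url in (provider, src):
        if url is None:
            continue
        body = fetch(url)
        # Only accept a copy whose digest matches what the page declared.
        if body is not None and hashlib.sha256(body).hexdigest() == expected_sha256:
            return body
    raise ValueError("no copy matched the expected checksum")


own_copy = b"/* jimaca */"
expected = hashlib.sha256(own_copy).hexdigest()
SRC = "https://mysite.example/jimaca.js"
CDN = "https://cdn.example/jimaca.js"


def fetch_with_cdn_blocked(url: str) -> Optional[bytes]:
    # The CDN is not on the whitelist, so only my own server answers.
    return {SRC: own_copy}.get(url)


def fetch_with_tampered_cdn(url: str) -> Optional[bytes]:
    # The CDN answers, but with bytes that do not match the checksum.
    return {SRC: own_copy, CDN: b"/* evil */"}.get(url)


blocked_result = load_script(SRC, CDN, expected, fetch_with_cdn_blocked)
tampered_result = load_script(SRC, CDN, expected, fetch_with_tampered_cdn)
```

In both cases the page ends up running the copy from my own server, which is the behaviour I care about for NoScript users.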
My motivation here was to allow perhaps even smaller companies to sort of pool their common files into their own CDN:
<script src="jimaca.js" authoritative-cache-provider="https://cdn.jimacajs.example.com/v1/12/34/jimaca.js"></scri...
I also want to avoid a situation where Microsoft can come to me and tell me that I can't name my js files microsoft.js or something. The chances of an accidental collision are apparently very close to zero, so I agree with you that there is room for improvement. (:
This is definitely not an RFC or anything formal. I am just a student and in no position to actually effect any change or even make a formal proposal.
<script src="jQuery-1.12.2.min.js" authoritative-cache-provider="https://ajax.googleapis.com/ajax/libs/jquery/1.12.2/jquery.m... sha-256="31be012d5df7152ae6495decff603040b3cfb949f1d5cf0bf5498e9fc117d546"></script>
Would this cause more problems than it would solve? I'm assuming disk access is faster than network access.
I'm concerned about people like me who use noscript selectively. How easy is it to create a malicious file that matches the checksum of a known file?