> I work on compilers for a living. …as I take it, for languages without undefin...

pizlonator · on April 1, 2020

There is a broad pattern of software that uses undefined behavior that gets compiled exactly as the authors of that software want. That kind of code isn’t going anywhere.

You’re kind of glossing over the fact that it’s rare for the compiler to perform an optimization that is correct under C semantics but not under structured assembly semantics, because under both sets of rules you have to assume that memory accesses have profound effects (stores have super weak may-alias rules) and calls clobber everything. Put those things together and it’s unusual that a programmer expecting proper structured assembly behavior from their illegally (according to spec) computed pointer would get anything other than correct structured assembly behavior. Like, except for obvious cases, the C compiler has no clue what a pointer points at. That’s great news for professional programmers who don’t have time for bullshit about abstract machines and just need to write structured assembly. You can do it because either way C has semantics that are not very amenable to analysis of the state of the heap.

Partly it’s also because if the optimizations went further, they would break too much code. So it’s just not going to happen.
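
A minimal sketch of the may-alias point above, not taken from any real codebase. The function name `store_then_reload` is made up for illustration. Since both pointers have the same type, the compiler generally cannot prove they refer to different objects, so it has to treat the second store as possibly changing `*p` and reload it, which is what a structured assembly reading would expect as well.

```c
#include <assert.h>

/* Hypothetical example: p and q have the same type, so the store through q
   may alias *p. The compiler cannot fold the return value to the constant 1
   unless it can prove p != q. */
int store_then_reload(int *p, int *q) {
    *p = 1;
    *q = 2;
    return *p; /* 2 when p == q, otherwise 1 */
}

/* Usage on well-defined inputs only. */
void alias_demo(void) {
    int a = 0, b = 0;
    assert(store_then_reload(&a, &b) == 1); /* distinct objects */
    assert(store_then_reload(&a, &a) == 2); /* same object */
}
```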

saagarjha · on April 1, 2020

You say that this doesn't happen, and yet we have patches like this in JavaScriptCore: https://trac.webkit.org/changeset/195906/webkit. Pointers are hard to reason about, but 1. undefined behavior extends to a lot of things that aren't pointers and 2. compilers keep getting better at this. For example, it used to be that you could "hide" code inside a function and the compiler would have no idea what you were doing, but today's compilers inline aggressively and are better at finding this sort of thing. And it isn't just WebKit: other large projects have struggled with this as well. The compiler discarded a NULL pointer check in the Linux kernel (which I can't find a good link to, so I'll let Chris Lattner paraphrase the issue for me): http://blog.llvm.org/2011/05/what-every-c-programmer-should-... ; here's one where it crippled sanitization of return addresses in NaCl: https://bugs.chromium.org/p/nativeclient/issues/detail?id=24...
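
For anyone who hasn't seen the NULL check case, here is a rough sketch of the general pattern, not the actual kernel code. `struct device` and both function names are made up for illustration. Dereferencing a null pointer is undefined behavior, so once the dereference has happened the compiler is allowed to assume the pointer was non-null and treat a later check as dead code. Whether a given compiler actually removes it depends on the version and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct device {
    int flags;
};

/* Dereference first, check second. Calling this with NULL is undefined
   behavior, so an optimizer may assume dev != NULL and delete the check. */
int read_flags_deref_first(struct device *dev) {
    int flags = dev->flags;
    if (dev == NULL) /* may be removed as unreachable */
        return -1;
    return flags;
}

/* Check first, dereference second. This is well defined for NULL. */
int read_flags_check_first(struct device *dev) {
    if (dev == NULL)
        return -1;
    return dev->flags;
}

/* Usage on well-defined inputs only: NULL is never passed to the
   dereference-first variant. */
void null_check_demo(void) {
    struct device d = { 7 };
    assert(read_flags_deref_first(&d) == 7);
    assert(read_flags_check_first(&d) == 7);
    assert(read_flags_check_first(NULL) == -1);
}
```

The two functions agree on every valid pointer. They only differ on NULL, which is exactly the input the first one has no defined behavior for.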