
Off-topic question, but can some experts tell me why it is safe for `strlen()` and friends to use vector instructions when they can technically read out of bounds?


Essentially because memory mappings, and the hardware's access checks on them, work at page granularity rather than byte granularity. If an in-bounds read from a page isn't going to fault, a read later in the same page isn't going to fault either (even if it is past the end of the particular object).

You can see this in glibc's implementation, which checks for crossing page boundaries: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86... (line ~68)
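
To make the idea concrete, here is a minimal sketch of that kind of check. It is not glibc's code (the real thing is hand-written assembly); the 4 KiB page size and the name `read_crosses_page` are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on typical x86-64 Linux. Real code would use
   a build-time constant or query sysconf(_SC_PAGESIZE). */
#define PAGE_SIZE_ASSUMED 4096u

/* Hypothetical helper: returns 1 if an n-byte read starting at p would
   touch the following page, 0 if it stays within p's page. */
int read_crosses_page(const void *p, size_t n)
{
    assert(n > 0 && n <= PAGE_SIZE_ASSUMED);
    /* Offset of p within its page (page size is a power of two). */
    uintptr_t offset = (uintptr_t)p & (PAGE_SIZE_ASSUMED - 1);
    return offset + n > PAGE_SIZE_ASSUMED;
}
```

When the check says the read stays in the page, a full-width vector load is fine even if the string ends earlier; when it would cross, the routine has to fall back to something more careful (for example an aligned load, or byte-at-a-time) for that first chunk.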


Ah, so that's why there is special code in Valgrind to handle glibc and friends!


I think capability-pointer machines like CHERI might need in-bounds-only variants of these functions, too.


Generally CHERI tracks things for 16-byte regions


Implementations using 32- or 64-byte (256- or 512-bit) vector extensions would run afoul of 16-byte granularity. While it is not common yet, ARM SVE allows vector sizes larger than 128 bits -- e.g., Graviton3 has 256-bit SVE and Fujitsu A64FX has 512-bit. (x86 has had 256- and 512-bit vector instructions for some time, but current CHERI development seems to be on ARM.)


I think you might be confusing the tracking of validity of capabilities themselves (which could indeed be at a 16 byte granularity for an otherwise 64-bit system) with the bounds of a capability, which can be as small as 1 byte.


Ah, I think you are correct.



