Off-topic question, but can some experts tell me why it is safe for `strlen()` a...

loeg · on July 24, 2023

Essentially because memory mappings and RAM work at page granularity rather than byte granularity. If an in-bounds read within a page isn't going to fault, a read later in the same page isn't going to fault either (even if it is past the end of the particular object).

You can see this in glibc's implementation, which checks for crossing page boundaries: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86... (line ~68)
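
A minimal sketch of that kind of page-crossing check, for illustration only; this is not glibc's code. The 4 KiB page size and the helper name `read_crosses_page` are assumptions made for the example, and real code would query the page size rather than hard-code it.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on typical x86-64 Linux. Real code would
   query the page size (e.g. with sysconf(_SC_PAGESIZE)). */
#define PAGE_SIZE_ASSUMED 4096u

/* Hypothetical helper: returns 1 if reading `width` bytes starting at `p`
   would touch more than one page, 0 otherwise. */
static int read_crosses_page(const void *p, size_t width)
{
    /* Offset of p within its page. */
    uintptr_t offset = (uintptr_t)p & (PAGE_SIZE_ASSUMED - 1);
    return offset + width > PAGE_SIZE_ASSUMED;
}
```

A wide-load `strlen()` can do the over-long vector read only when a check like this returns 0, and fall back to something more careful when the read would spill into the next page, which might not be mapped.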

gavinhoward · on July 24, 2023

Ah, so that's why there is special code in Valgrind to handle glibc and friends!

loeg · on July 24, 2023

I think capability-pointer machines like CHERI might need in-bounds-only variants of these functions, too.
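
For reference, a rough sketch of what an "in-bounds-only" `strlen()` could look like in plain C. The name `strlen_inbounds` is hypothetical, and this is not taken from any CHERI toolchain; it only shows the idea of never reading past the terminator.

```c
#include <stddef.h>

/* Hypothetical name. Reads one byte at a time and stops at the
   terminator, so it never touches memory beyond the string's NUL byte. */
static size_t strlen_inbounds(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

The trade-off is speed: this gives up the wide loads that make the over-reading versions fast.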

saagarjha · on July 24, 2023

Generally CHERI tracks things for 16-byte regions.

loeg · on July 24, 2023

Implementations using 32- or 64-byte (256- or 512-bit) vector extensions would run afoul of 16-byte granularity. While it is not common yet, ARM SVE allows vector sizes larger than 128 bits -- e.g., Graviton3 has 256-bit SVE and Fujitsu A64FX has 512-bit. (x86 has had 256- and 512-bit vector instructions for some time, but current CHERI development seems to be on ARM.)

Liquid_Fire · on July 24, 2023

I think you might be confusing the tracking of the validity of capabilities themselves (which could indeed be at a 16-byte granularity for an otherwise 64-bit system) with the bounds of a capability, which can be as small as 1 byte.

saagarjha · on July 29, 2023

Ah, I think you are correct.