Instead of dedicating two bits to flags, could the specific task in the video be accomplished by leveraging the sign bit? Define 0xFFFF to be zero, 0x0000 to be one, 0x0001 to be two, etc. Then for decrement, do an atomic `fetch_uint_sub(1)` and for increment, do an atomic `fetch_sint_add(1)`.
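Roughly what I have in mind, in C++ terms (the 16-bit width and the use of `std::atomic` here are just illustrative; `std::atomic` only has one flavor of `fetch_add`/`fetch_sub`):

```cpp
#include <atomic>
#include <cstdint>

// The counter is stored offset by one: the bit pattern 0xFFFF means a
// logical count of zero, 0x0000 means one, 0x0001 means two, and so on.
std::atomic<uint16_t> counter{0xFFFF};   // logical value: zero

// Increment the logical count (0xFFFF + 1 wraps to 0x0000, i.e. zero -> one).
uint16_t increment() {
    return counter.fetch_add(1, std::memory_order_acq_rel);
}

// Decrement the logical count.
uint16_t decrement() {
    return counter.fetch_sub(1, std::memory_order_acq_rel);
}
```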
Could this simplify reads by making the sign bit a one-way gate enforced by CPU semantics rather than by the application?
You'd need that fetch_sub operation to saturate at zero instead of wrapping to 0xFFFF, and I'm not aware of a processor that provides saturating atomic arithmetic operations.
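The closest you can get in software is a compare-exchange loop, which is no longer a single atomic instruction; a rough sketch (assuming the 16-bit counter from the sketch above):

```cpp
#include <atomic>
#include <cstdint>

// Emulate a saturating decrement with a CAS loop: never go below zero.
// Returns the previous value; a return of 0 means the decrement saturated.
uint16_t saturating_decrement(std::atomic<uint16_t>& counter) {
    uint16_t old = counter.load(std::memory_order_relaxed);
    while (old != 0 &&
           !counter.compare_exchange_weak(old, old - 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        // On failure, old is reloaded with the current value; retry.
    }
    return old;
}
```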
I don't understand what you mean. Decrementing from 0 gives you 0xFFFF, and incrementing from 0xFFFF gives you zero. (It doesn't matter whether you use an unsigned or a signed increment; they are the same operation, at least on a two's-complement processor, where the bit pattern 0xFFFF represents -1.)
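For example (nothing atomic here, just ordinary 16-bit arithmetic to show the bit patterns):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    uint16_t u = 0xFFFF;   // as unsigned: 65535
    int16_t  s = -1;       // same bit pattern, interpreted as signed

    // Incrementing either one produces the same bit pattern: 0x0000.
    uint16_t u_plus_one = static_cast<uint16_t>(u + 1);
    uint16_t s_plus_one = static_cast<uint16_t>(s + 1);
    std::printf("0x%04x 0x%04x\n", (unsigned)u_plus_one, (unsigned)s_plus_one);  // 0x0000 0x0000

    // And decrementing from 0 wraps back to 0xFFFF.
    uint16_t zero = 0;
    std::printf("0x%04x\n", (unsigned)(uint16_t)(zero - 1));  // 0xffff
}
```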
You could do this trick on a machine that uses signed-magnitude representation, since a signed increment of 0xFFFF (which is INT_MIN, i.e. -32767, in 16-bit signed-magnitude) would result in 0xFFFE (INT_MIN + 1). But all modern processors use two's complement.