FreeBSD has atomic.h, which provides a bunch of different atomic ops, e.g.: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/amd64/include/...
I believe FreeBSD also uses this style of buffer for dmesg, which sadly also results in a lot of interleaved output if two kernel threads try to write to the buffer at once.
GCC, for example, has a __sync_val_compare_and_swap(...) builtin that gives you access to the hardware compare-and-swap.
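
To make that concrete, here is a rough sketch of how a writer could use that builtin to claim a private slice of a shared log buffer before copying into it, so two writers never land in the same bytes. The names log_buf, log_head, log_reserve, log_write and LOG_SIZE are made up for illustration; this is not how FreeBSD's message buffer is actually implemented, and it ignores wrap-around and readers entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_SIZE 4096                 /* hypothetical buffer size */

static char log_buf[LOG_SIZE];        /* hypothetical shared log buffer */
static volatile size_t log_head;      /* next free offset, shared by all writers */

/* Atomically claim `len` bytes. Returns the start offset of the claimed
 * region, or (size_t)-1 if there is not enough room left. */
static size_t log_reserve(size_t len)
{
    size_t old, seen;
    do {
        old = log_head;
        if (len > LOG_SIZE - old)
            return (size_t)-1;
        /* Store old + len only if log_head still equals old.
         * The builtin returns the value it actually found there. */
        seen = __sync_val_compare_and_swap(&log_head, old, old + len);
    } while (seen != old);            /* another writer got in first: retry */
    return old;
}

/* Copy a message into its own reserved region. Returns 0 on success,
 * -1 if the buffer is full. */
static int log_write(const char *msg, size_t len)
{
    size_t off = log_reserve(len);
    if (off == (size_t)-1)
        return -1;
    assert(off + len <= LOG_SIZE);    /* sanity check on the reservation */
    memcpy(log_buf + off, msg, len);
    return 0;
}
```

Each caller of log_write() gets a disjoint range, so the bytes of one message cannot be mixed with another's. Newer GCC versions also offer the __atomic_* builtins, which take an explicit memory order.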