Is there a valid use case for doing pointer arithmetic on a void pointer? It jus...

DSMan195276 · on Dec 3, 2018

> Is there a valid use case for doing pointer arithmetic on a void pointer? It just seems so disgusting on the surface. Did someone just really not want to bother casting to uint8_t * ?

As a side note, there was a debate whether casting to `uint8_t *` is a strict-aliasing issue (as `uint8_t` and `char` are not guaranteed to be the same type, so the compiler isn't required to treat them as such).

That said, you're correct that you can always cast to `char *` in this case, as `char *` is allowed to alias anything. I do agree pointer arithmetic on `void *` is a bit messy. I think the intention was just convenience, but it can definitely be abused.
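To make the cast-instead-of-extension point concrete, here is a minimal sketch in standard C. The helper names `byte_offset` and `element_at` are made up for illustration and come from no library. It uses `unsigned char *` rather than `void *` arithmetic, which is a GNU extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a generic pointer by `offset` bytes.
   Arithmetic directly on void * is a compiler extension; casting to
   unsigned char * keeps this within standard C. */
static void *byte_offset(void *base, size_t offset)
{
    return (unsigned char *)base + offset;
}

/* Hypothetical helper: address of element `index` in an array whose
   element size is only known at run time (as in qsort-style APIs). */
static void *element_at(void *base, size_t elem_size, size_t index)
{
    return byte_offset(base, elem_size * index);
}
```

With `int values[4] = {10, 20, 30, 40};`, the call `element_at(values, sizeof values[0], 2)` yields a pointer to `values[2]`.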

> As for the array types, I think maybe the article is really trying to say that they'll all compile down to the same thing in the end (probably...compilers can be weird sometimes).

I dunno, I've read this page before and think the author genuinely doesn't know about pointers to arrays; if they do, they've done a good job of pretending they don't exist. They are not mentioned anywhere on the page (besides the part that says they are all the same), and near the top the author talks about parentheses in variable declarations and says:

> This is not useful for anything, except to declare function pointers (described later).

Which anybody who has used a pointer to an array will know is not true, as you have to use parentheses to differentiate a pointer to an array from an array of pointers.
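A small sketch of the difference the parentheses make. The function names are invented for this example. `int (*rows)[3]` is a pointer to an array of 3 ints, while `int *array_of_ptrs[3]` is an array of 3 pointers.

```c
#include <assert.h>
#include <stddef.h>

/* `rows` is a pointer to an array of 3 ints, so rows[r] steps over a
   whole row at a time. Without the parentheses, `int *rows[3]` would
   declare an array of 3 pointers to int instead. */
static int sum_rows(int (*rows)[3], size_t row_count)
{
    int total = 0;
    for (size_t r = 0; r < row_count; r++)
        for (size_t c = 0; c < 3; c++)
            total += rows[r][c];
    return total;
}

/* The object a pointer-to-array points at is the whole array. */
static size_t pointee_size_of_array_pointer(void)
{
    int (*ptr_to_array)[3] = NULL;
    return sizeof *ptr_to_array;   /* operand is not evaluated */
}

/* An array of pointers is sized by the pointers it holds. */
static size_t size_of_pointer_array(void)
{
    int *array_of_ptrs[3];
    return sizeof array_of_ptrs;
}
```

A two-dimensional array such as `int m[2][3]` decays to `int (*)[3]`, so it can be passed straight to `sum_rows(m, 2)`.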

quietbritishjim · on Dec 3, 2018

> `uint8_t` and `char` are not guaranteed to be the same type

Remember that `char`, `signed char` and `unsigned char` are three different types, even though `char` takes the same range of values as either `signed char` or `unsigned char`.

The typedef `uint8_t` is usually set to `unsigned char`, not `char`, even on systems where `char` is unsigned. Partly this reflects the fact that `char` is usually used to represent actual characters, while the other two types are usually used to represent integers that take the same amount of memory as `char`. The standard technically does not allow `uint8_t` to be `char` [1], although this requires an extremely pedantic reading.

Anyway, if you replace `char` with `unsigned char` in your comment then it's correct. I believe all current major implementations typedef `uint8_t` to `unsigned char`, but that's not guaranteed and even old implementations of GCC had a different type. `unsigned char` satisfies the same relaxed aliasing rule as `char` [2] but `uint8_t` may not.

[1] https://stackoverflow.com/a/16002781

[2] https://stackoverflow.com/a/40575162
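One way to check the "three distinct types" claim on a given compiler is C11 `_Generic`, which rejects duplicate compatible types in its association list. This is only a sketch: the macro `CHAR_KIND` and the function `kind_of_uint8` are hypothetical names, and what `kind_of_uint8` returns depends on the implementation, so no result is assumed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* _Generic selects on the static type of its operand. Listing all
   three character types as separate associations only compiles
   because they are distinct types. */
#define CHAR_KIND(x) _Generic((x),      \
    char:          "char",              \
    signed char:   "signed char",       \
    unsigned char: "unsigned char",     \
    default:       "other")

/* Reports which association uint8_t falls under on this
   implementation; the answer is not fixed by the C standard. */
static const char *kind_of_uint8(void)
{
    uint8_t v = 0;
    return CHAR_KIND(v);
}
```

If `kind_of_uint8()` reports `"other"`, then `uint8_t` is not a character type on that implementation, and the relaxed aliasing rule for character types would not necessarily apply to it.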

saagarjha · on Dec 3, 2018

> I believe all current major implementations typedef `uint8_t` to `unsigned char`

That can't be right, because CHAR_BIT is not always 8.

maccard · on Dec 3, 2018

Practically speaking, for most people these days it is. POSIX specifies 8 bits and Windows uses it. If you're working with a char wider than 8 bits, you're most likely well aware of it (offhand, some TI embedded chips are the only ones I know of, and even they're rare!)

quietbritishjim · on Dec 3, 2018

It depends on what you count as a "major implementation".

torstenvl · on Dec 3, 2018

Casting to uint8_t* is not portable either, so it's not particularly better than using compiler extensions that let you increment void*.

simias · on Dec 3, 2018

"char" is portable and by definition sizeof(char) == 1 and on top of that char pointers can alias other types.

If I could time travel and influence the design of C the first things I'd do is change switch to break by default. The 2nd thing I'd do is rename "char" into "byte" since that's effectively what it is (and it's seldom a character these days since we often use UTF-8 or other multi-byte encodings).
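The "char is really a byte that can alias anything" idea can be sketched like this. `same_bytes` is a hypothetical helper, not a standard function (`memcmp` does the real job). It reads two objects one byte at a time through `unsigned char *`, which the aliasing rules allow for any object type.

```c
#include <assert.h>
#include <stddef.h>

/* Compare the object representations of two objects byte by byte.
   Accessing any object through unsigned char * is permitted, so this
   works whatever the original types were. */
static int same_bytes(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    for (size_t i = 0; i < n; i++) {
        if (pa[i] != pb[i])
            return 0;
    }
    return 1;
}
```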

saagarjha · on Dec 3, 2018

> The 2nd thing I'd do is rename "char" into "byte" since that's effectively what it is

This isn't always true. See machines where CHAR_BIT > 8.

simias · on Dec 3, 2018

A byte is the smallest addressable memory unit, nothing more, nothing less. It's true that on most modern machines it's always equal to 8 bits (and it's mandated by POSIX AFAIK) but that's orthogonal.

In French when talking about storage capacity we use "octet" instead of "byte", I always thought that made more literal sense (if you have a 1 megabyte memory on a system where CHAR_BIT is 16, do you have 8 or 16 megabits?).

vardump · on Dec 3, 2018

There are bit-addressable CPUs and DSPs. Examples are many microcontrollers (part of the memory accessible in bit-addressable windows), and the famous TMS34010 (https://en.wikipedia.org/wiki/TMS34010).

4-bit CPUs (like the one in your toothbrush or thermometer) can also of course address 4-bit "words", "bytes" or nibbles.

So 8 bits might not be the smallest addressable memory unit.

Sean1708 · on Dec 3, 2018

I always thought that 1 char was exactly 1 byte, but that 1 byte was not necessarily 8 bits?

saagarjha · on Dec 3, 2018

I think from a C standards point of view, that's true; I was using the general convention of 1 byte=8 bits, but char being one or more of these bytes. Hence CHAR_BIT is the number of bits in a char, and sizeof gives the number of chars that would fit in the specified type.
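In code, the relationship between `sizeof` and `CHAR_BIT` looks roughly like this. The macro `BITS_IN` and the function `bits_in_int` are names invented for this sketch. The two static assertions state what the C standard guarantees; everything else varies by platform.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* sizeof counts in units of char, so sizeof(char) is 1 by definition,
   and the standard requires a char to have at least 8 bits. */
static_assert(sizeof(char) == 1, "sizeof counts in units of char");
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");

/* Storage width of a type in bits, including any padding bits. */
#define BITS_IN(type) (sizeof(type) * CHAR_BIT)

static size_t bits_in_int(void)
{
    return BITS_IN(int);
}
```

On a platform where `CHAR_BIT` is 16, `BITS_IN(char)` is 16 even though `sizeof(char)` is still 1.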