> None of those 'lies' have anything to do with assembler.
They do. Let's take a look at the very first one: "Array name is just a pointer".
An assembler "array" is the name of a label, so it's just a named address in memory. Alternatively, we can say that an assembler array is a constant pointer.
`sizeof' returns the size of an array in bytes? Hmm... maybe that's because C's main abstraction for memory is the assembler one: memory is a contiguous sequence of bytes? `sizeof' is meant to be used with functions like malloc or memcpy, not with operator `new'. When we use dynamic memory allocation in assembler we get a pointer to an untyped chunk of memory; compare with C:
void* malloc(size_t size);
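To make the `sizeof' point concrete, here is a minimal sketch of a byte count flowing from sizeof into malloc and memcpy. The function name clone_table and the four-element table are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a local int[4] into freshly allocated memory. */
int *clone_table(void)
{
    int table[4] = {1, 2, 3, 4};

    /* Applied to the array itself, sizeof yields the total size in bytes. */
    assert(sizeof table == 4 * sizeof(int));

    /* malloc knows nothing about int: it hands back an untyped chunk. */
    int *copy = malloc(sizeof table);
    if (copy != NULL)
        memcpy(copy, table, sizeof table); /* byte-wise copy, same count */
    return copy;                           /* caller must free() this */
}
```

The same byte count drives both the allocation and the copy, which is the "memory is a sequence of bytes" view described above.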
If you wish, I can show you how the other 'C lies' connect to assembler abstractions. I'm too lazy to write about all of them, but I could write about one more if you ask. Just pick the one you like most.
> now we are stuck with those bad decisions forever
Yes, you are right. We are stuck with that, and it is bad. But it doesn't make my point wrong. C is a cross-platform assembler, and these decisions look pretty good from the perspective of assembler. They give the programmer low-level control over the generated machine code while keeping the code portable, which is very useful in some cases, for example when you are developing an OS kernel.
From an assembler point of view, a structure of N elements of type T and an array T[N] have exactly the same layout and are accessed in exactly the same way [1], but in C they have wildly different semantics.
Sizeof behaves exactly the same way for structs and arrays, so it is one of the few things in C that treat arrays "correctly".
[1] although usually the offset is constant for a struct field access.
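The layout claim can be checked with a small sketch like the one below. The struct name triple and the function layout_matches are assumptions for illustration. Note that the C standard allows padding between struct members, so the check is expected to pass on mainstream ABIs but is not guaranteed by the language.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with three members of the same type. */
struct triple { int a, b, c; };

/* Returns 1 if, on this platform, struct triple is laid out like int[3]. */
int layout_matches(void)
{
    return sizeof(struct triple) == sizeof(int[3])
        && offsetof(struct triple, a) == 0 * sizeof(int)
        && offsetof(struct triple, b) == 1 * sizeof(int)
        && offsetof(struct triple, c) == 2 * sizeof(int);
}
```

sizeof is applied to the struct type and to the array type in exactly the same way, and offsetof gives the constant field offsets that footnote [1] refers to.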
In addition to being UB, the example doesn't illustrate the issue: arrays in C are not first class, as they can't be passed by value and can't be assigned. The decay-to-pointer rule that prevents this regularity has nothing to do with asm.
Yes, it's UB. But I'm not trying to persuade you to use this UB in real code: in real C code, use offsetof from stddef.h. The only thing I want to say is that this code would work everywhere (if you pay attention to alignment). And that's no coincidence: C mimics asm, because C needs to be 100% predictable to the coder. Asm uses the simplest and most obvious abstractions, with predictable runtime costs, and C goes the same way. So it's inevitable that my code works. With some precautions, but it would work everywhere.
> the example doesn't illustrate the issue...
Yes, I suggested as much, and I asked you for an illustrative example because I can't follow your reasoning from "arrays are not first class" to "nothing to do with asm". I see it the other way: "arrays are not first class" is "asm mode".
> this code would work everywhere
It does not; it will be miscompiled by modern compilers.
> I asked you for some illustrative example,
void foo(T x) { x[0] = 1; }
T x = {0};
foo(x);
assert(x[0] == 0);
The assertion fails for T = char[1], but succeeds for T = std::array<char, 1>. You could construct a similar example in pure C.
std::array and C arrays compile down to the exact same code for access, have the exact same layout, etc., but C arrays are not copyable or assignable and implicitly convert to pointers without any good reason. This has nothing to do with assembler whatsoever.
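A pure C version of the same experiment might look like the sketch below, with a struct wrapper standing in for std::array. All names here (wrapped, set_raw, set_wrapped and the two checkers) are made up for illustration.

```c
#include <assert.h>

/* Hypothetical wrapper: a struct whose only member is a one-element array. */
struct wrapped { char data[1]; };

/* Declared as an array parameter, but the type is adjusted to char *. */
static void set_raw(char x[1]) { x[0] = 1; }

/* The whole struct, array member included, is copied into the parameter. */
static void set_wrapped(struct wrapped x) { x.data[0] = 1; }

/* Returns 1 if the caller's array is still zero after the call. */
int raw_unchanged(void)
{
    char x[1] = {0};
    set_raw(x);
    return x[0] == 0;
}

/* Returns 1 if the caller's struct is still zero after the call. */
int wrapped_unchanged(void)
{
    struct wrapped x = {{0}};
    set_wrapped(x);
    return x.data[0] == 0;
}

void check_semantics(void)
{
    assert(!raw_unchanged());    /* the callee wrote through a pointer */
    assert(wrapped_unchanged()); /* the callee only modified its copy */
}
```

The two types hold the same bytes, yet only the bare array lets the callee modify the caller's object.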
> it does not, it will be miscompiled by modern compilers.
Sorry, due to a formatting bug I overlooked this.
Can you show me an example of such a modern compiler? I suspect you mean some C++ compiler, and those probably would `miscompile' my example, because they treat a struct in a manner similar to a class with a vtable and all the other stuff. But we are speaking about C, not C++. If I'm mistaken in my assumptions, I'd like to know about a modern C compiler that proves me wrong. Such a proof would help me understand modern C much better.
It is hard for compilers to miscompile this specific example as it doesn't do much at all.
The idea is that a write to pfoo[1] couldn't possibly alias with any write to foo, so the compiler should be free to reorder accesses if profitable. This is the same in C and C++ and has nothing to do with vtables.
For what it's worth, I couldn't get gcc, clang and icc to miscompile [¹] a slightly changed example, so either it is not actually UB or compilers still refrain from making this kind of optimization, as it would break way too much code.
[¹] i.e. they elect to reload from the struct after writing to the array and vice versa even when it would be profitable not to do so.
Okay... Now there's only one thing I can't understand: how do you jump to the conclusion in your last sentence? If you use asm and try to pass an array to a function, you pass the address of the array, not a copy of the array on the stack. That looks similar to C's behaviour, doesn't it?
Whether you copy or pass by reference has everything to do with the language semantics, ABI and calling convention, and nothing to do with asm.
For example, if you look at the generated asm, C on amd64 will happily pass a struct by copy in registers, but will pass an array by address.
The designers of C decided to give arrays pass-by-reference semantics and structs pass-by-value semantics [1]; this was done because it is convenient: you often want to iterate through arrays, and pointers are the most generic way to do that, but it does make arrays not first class.
[1] admittedly traditional C couldn't pass structs at all.
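The difference is visible from inside the callee without looking at any asm. Here is a minimal sketch that compares the parameter's address with the caller's object. The type wrapped and both function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical wrapper struct holding a small array. */
struct wrapped { char data[4]; };

/* Returns 1 if the array parameter refers to the caller's storage. */
int array_param_is_callers(char a[4], const void *caller)
{
    return (const void *)a == caller;
}

/* Returns 1 if the struct parameter refers to the caller's storage. */
int struct_param_is_callers(struct wrapped w, const void *caller)
{
    return (const void *)&w == caller;
}

void check_passing(void)
{
    char raw[4] = {0};
    struct wrapped boxed = {{0}};

    assert(array_param_is_callers(raw, raw) == 1);      /* same object */
    assert(struct_param_is_callers(boxed, &boxed) == 0); /* distinct copy */
}
```

How the copy is materialised (registers, stack, hidden pointer) is up to the ABI, but the language-level result is the same: the struct parameter is a separate object and the array parameter is not.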