A lot of C programmers make assumptions that are valid for their particular system architecture and compiler. Is there any point in trying to unify their obviously different practices while at the same time ignoring the Standard? No.
Many C programmers assume a single flat memory space. Most machines today have that. There's a long history of machines that didn't: the Intel 286 in segmented mode, some Pentium variants in segmented mode beyond 4GB (Linux supports this in the kernel), the IBM AS/400, Unisys Series B 36-bit word machines, the Burroughs 5500/6700/Unisys Series A segmented machines where an address is a file-like path, Motorola CPUs where I/O space and memory space are distinct, and old DEC PDP-11 machines where code and data address spaces are separate. More recently, there are non-shared-memory multiprocessors where addresses are duplicated across processors - the PS3's Cell and many GPUs, for example.
Most programmers today will never encounter any of those.
There's also the assumption that null is address 0 - notably breaking the purity of C++'s type system with magic 0s for twenty-odd years before nullptr came along.
That is the point. They went through the process of tightening down the type system by invalidating automatic casts and making void* less promiscuous, and then they broke that whole philosophy with a magic literal "0" that can work as a normal integer or as an address placeholder for any pointer, even though it won't necessarily evaluate to address zero. At that point, why can't I assign "0" to a float too and enjoy some more magic there?
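A minimal C sketch of the double life of that literal (note that the null pointer it yields need not be all-bits-zero at the machine level):

    #include <stddef.h>
    #include <stdio.h>

    int main(void) {
        int  n = 0;   /* here 0 is just an integer */
        int *p = 0;   /* here the same token is a null pointer constant */
        printf("%d %d\n", n, p == NULL);  /* prints: 0 1 */
        return 0;
    }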
Well, one of the interesting takeaways from the article is that C as specified often does not let you take advantage of your system architecture. The specification has things like GPUs and segmented memory architectures in mind when it forbids you from doing seemingly reasonable things like taking the difference between the addresses of two separately-allocated objects, even though chances are very good that what you're trying to do works just fine on all the architectures you care about.
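For a concrete instance of such a "seemingly reasonable thing" (the function name is made up for illustration):

    #include <stddef.h>

    int a, b;  /* two separately-allocated objects */

    ptrdiff_t distance(void) {
        /* Undefined behavior: the standard only defines pointer subtraction
           for pointers into (or one past the end of) the same array object.
           On a flat address space this "works"; on a segmented machine the
           result may be meaningless. */
        return &b - &a;
    }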
Yes and no. The standard defines a C that is very portable. It is quite reasonable that you cannot portably take advantage of your system architecture, which means you cannot do it using C as defined by the standard.
But you still can write stuff that looks just like C, that a C compiler will accept, and that lets you take advantage of your system architecture. You just need to understand that a new version of your compiler can break all of your nifty tricks.
No, spi.D; is actually important.
No, just because the program never writes to spi.S does not mean you can delete the while loop.
No, you cannot reorder any of this and have it work.
Are you saying something other than "you, the programmer, must declare spi to be volatile, or else the compiler might lay a bear trap for you"?
Based on your comment it seems clear that you understand that... but given that you understand that, I don't get the point you're trying to make (except perhaps that writing low-level hardware code in C often means actively stopping the compiler from screwing you over).
(For non-C programmers: "volatile" can be approximated by "hey compiler, this variable can change unexpectedly even if you didn't do anything to change it, so don't use any optimizations that assume you're the only one changing it".)
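A minimal sketch of the sort of trap being described, assuming a made-up memory-mapped SPI status register (the address and ready bit are hypothetical):

    #include <stdint.h>

    /* hypothetical memory-mapped status register */
    #define SPI_STATUS (*(volatile uint32_t *)0x40003008)

    void wait_until_ready(void) {
        /* Without volatile, the compiler may read the register once, see
           that the loop body never writes it, and turn this into an
           infinite loop - or delete the loop entirely. */
        while ((SPI_STATUS & 0x1) == 0)
            ;  /* spin until the hardware sets the ready bit */
    }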
I think the parent meant "implicit" rather than "abstract". The code is not abstract, but the pragmatic intent of the comment is not made very clear. I believe I follow, but I'm not confident enough to risk confusing things by guessing.
(I believe I follow the intended conversational import; I certainly follow the code.)
Sort of. The point is that in pure languages only the symbolic operations are important; the side effects (memory operations) are unimportant and beyond the control of the programmer. So you can do all sorts of transformations on the code without affecting the end result.
C is definitely impure. There is a large set of use cases where the memory layout is important, and where the order of operations and memory accesses is important.
What I worry about is when the optimizer is allowed to make assumptions about undefined behavior that may actually be important on certain targets. Consider dereferencing a null pointer.
On Windows 7 or Linux, if you do that in userland you'll get a segfault. And if the default handler runs, your program dies.
On the ARM Cortex processor I've been slopping code for, reading a null pointer returns the initial stack pointer, and writing generates a bus fault. Since I actually trap that, an error message gets written into non-initialized memory and then the processor gets forcibly reset.
On an AVR, reading a null pointer gives you the program's entry point. Writing to that location typically does nothing, though you can write to it if you're running out of a special 'boot sector' and have done some magic incantations.
So that's the problem I have with the idea that 'undefined means the optimizer can make a hash of your program' instead of trusting that the back end of the compiler will do something target-appropriate.
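To make the worry concrete, here is the classic pattern (function and variable names made up) where the optimizer is entitled to exploit null-pointer UB regardless of what the target would actually do at address zero:

    #include <stddef.h>

    int read_flags(int *p) {
        int value = *p;   /* if p is null, this is undefined behavior... */
        if (p == NULL)    /* ...so the optimizer may assume p is non-null */
            return -1;    /* and delete this check entirely */
        return value;
    }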
Yes, embedded is very side-effect-ful. When you have to make that kind of thing work, you usually have to know both your hardware architecture and your compiler.
    *(unsigned long*)0xEF010014 = 0x1C;
presumes memory-mapped I/O, a 32-bit hardware register at 0xEF010014, and knowledge of what 0x1C will mean to that register. It also presumes that the compiler will generate a 32-bit access for an unsigned long.
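One way to pin those presumptions down, assuming C99's <stdint.h> is available (the macro name is made up):

    #include <stdint.h>

    /* uint32_t fixes the access width (unsigned long is not guaranteed to
       be 32 bits), and volatile keeps the compiler from caching, eliding,
       or reordering the store. */
    #define CTRL_REG (*(volatile uint32_t *)0xEF010014)

    void set_control(void) {
        CTRL_REG = 0x1C;
    }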
Going back to Gibbon1's code sample, you have to know what your compiler is going to do at what optimization level. You either have to declare spi.S to be volatile, or you have to compile at -O0, or some such. And you may well need to review the generated assembly code a few times in order to really understand what your compiler is doing with your code.
If you're writing the Linux kernel and you want to be portable across hardware architectures, it gets harder. You can't just be processor- or hardware-specific. You probably need the volatile keyword. I don't know what optimization level the kernel is compiled at, but I bet it's not aggressively optimized.
I don't mean that sarcastically; pcwalton's sibling message only begins to mention the ways in which C is not a match to modern systems. Vector processing, NUMA, umpteen caching layers, CPU features galore... it's not really a match to the "architecture" any more.
(To the extent that you may think C supports those things, I don't really think "lets you drop arbitrary assembler in the middle of a function" constitutes "support". YMMV. To be fair to C, there are a lot of features that seem to be unsupported by any high-level language today. Hardware moves way faster than programming languages. If you want to figure out what's coming after the current generation of languages, "a language that actually lets you use all the capabilities of modern hardware without dropping to assembler and giving up all the safety of the higher-level language" is at least one possible Next Big Thing.)
C matches NUMA and caching just as well as assembly does. Basically the only thing C doesn't really expose is SIMD, but compiling "high-level" (i.e. C-level or above) code into SIMD is an open research problem (AFAIK) - I mean, you can easily compile array programs into SIMD, but the more general case is still open.
What reduces the set of compilers is reliance on compiler-specific behaviours, or on recently approved standards while compilers are still catching up.
"but compiling "high-level" (i.e. C-level or above) code into SIMD is an open research problem (AFAIK)"
In general, "taking a language never written for X and applying X to it" is an open problem. See also, for instance, trying to statically type a program in a dynamically-typed language. Of course it's hard to statically type a language that was designed to be dynamically typed. Of course it's hard to take C and make it do SIMD operations. You'd have to write a higher-level language designed to do SIMD from the beginning if you really want it to be slick.
"C matches NUMA and caching just as well as assembly does."
I'm going to make a slightly different point, which is that C does not support caching as well as a language could. For one particular example, C still lays structs out as rows, and has no support for laying them out in columns. (It doesn't stop you, but you're going to be rewriting a lot of code if you change your mind later - or doing some really funky things with macros, which amounts to rewriting lots of code anyway.)
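In code, the rows-vs-columns distinction looks like this (the struct and field names are purely illustrative):

    /* "Rows": array of structs - each particle's fields sit together. */
    struct particle { float x, y, z, mass; };
    struct particle aos[1024];

    /* "Columns": struct of arrays - each field gets its own array, which is
       friendlier to the cache (and to SIMD) when a loop touches only one
       field. Switching layouts later means rewriting every access site. */
    struct particles {
        float x[1024], y[1024], z[1024], mass[1024];
    } soa;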
Of course C doesn't support this crazy optimization... when it was written, the order-of-magnitude gap between CPU and memory speeds was much smaller, indeed outright nonexistent on some architectures of the time (though I can't promise C ran on them). Computers changed.
(I'll give another idea that may or may not work: creating a struct with built-in accessors designed to mediate between the in-RAM representation and the program interface, which the compiler will analyze to determine whether to inline into the in-memory struct or not. For instance, suppose you have two enumerations in your struct, one with 9 values and one with 28. There are 252 possible combinations, which could be packed into one byte, but naively, languages won't do that. This language could transparently decide whether to compute the values upon access or unpack them into two bytes. A hand-rolled sketch follows.)
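Hand-rolled, those accessors might look like this (names made up; 9 * 28 = 252, which fits in one byte):

    #include <stdint.h>

    /* stand-ins for the two enumerations described above */
    enum small { SMALL_COUNT = 9 };
    enum large { LARGE_COUNT = 28 };

    /* pack both values into a single byte: 252 possible combinations */
    static inline uint8_t pack(unsigned s, unsigned l) {
        return (uint8_t)(s * LARGE_COUNT + l);
    }
    static inline unsigned unpack_small(uint8_t b) { return b / LARGE_COUNT; }
    static inline unsigned unpack_large(uint8_t b) { return b % LARGE_COUNT; }

The language being imagined here would decide automatically whether to use this packed form or two plain bytes.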
Of course, virtually nothing does support this sort of thing, which is why I said C isn't really uniquely badly off... almost no current high-level languages are written for the machines we actually have today. (I'm hedging. I don't know of any that truly are, but maybe one could argue there are some scientific languages that are.) C is the water we all swim through, even if we hardly touch it directly, and ever since GPUs became standard-issue, the mismatch between our programming languages and our hardware has become comically large.
Assembly supports everything, but by doing so, supports nothing. It will of course permit you to lay your structs out in rows or columns or anything in between, but it doesn't particularly support any of them, either.
Incidentally, unlike some of my age, I don't actually complain about this; it is what it is, there are reasons for it, and what we have in hand is still pretty darned powerful. But, again, I'd suggest that if you are a language designer and you're looking to write the Next Big Thing, you could do worse than figure out how to start integrating this stuff into a programming language that can cache smarter and use the GPU in some sensible manner and all these other things. It's one way you might actually be able to create a language that will not merely tie C, but straight-up beat it.
I don't know of any way to compile non-array-ish, non-explicitly-SIMD-ish code into efficient SIMD code. Thinking about it, this seems to be the general case with parallelism - you can take fairly naïvely-written code, maybe add some memoization and hash-table dictionaries, and it turns into not-too-inefficient sequential code, but I don't know of a way to do the equivalent with parallel code (especially if you add the memoization) - STM promises, but (AFAIK) still doesn't deliver.
Of course, the best way to have an SoA representation is to use an ECS (entity-component system) :-).
C's main claim to fame is being "an HLL" (in the classical sense) but still being able to do essentially everything Assembly does. Also, having relatively simple semantics surely helps (C is the only imperative language I know of to have completely formalized semantics).
@qznc:
I don't know of a way to do it natively in Rust (you must write accessors).
"I don't know of any way to compile non-array-ish, non-explicitly-SIMD-ish code into efficient SIMD code."
Which is why I'm proposing that somebody creating one might get some traction, after all...
"C's main claim to fame is being "an HLL" (in the classical sense) but still being able to do essentially everything Assembly does."
And I discussed at length the things that assembly can do today that C cannot without callouts to assembler. I mean, I know the party line, I've heard it for like twenty years now, and my entire point is that it's not true anymore. C is not a "high-level assembler" for a 2015 machine. It's a high-level assembler for a PDP-11. Which is still useful enough, thanks to backwards compatibility, but it's high time for it to get out of the way and stop being "the high-level assembler", just like it's high time for it to get out of the way and stop being "the systems language".
C certainly can do SIMD just as well as assembly (via intrinsics). It does not allow high-level code to be compiled to SIMD, but there is no known way to compile high-level general-purpose code to SIMD. Finding such a method is an Open Research Problem AFAIK (of course, solving it would be Very Welcome).
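For instance, explicit SIMD via intrinsics in C (a sketch assuming x86 SSE and an n that is a multiple of 4):

    #include <xmmintrin.h>  /* SSE intrinsics */

    void add_arrays(float *dst, const float *a, const float *b, int n) {
        for (int i = 0; i < n; i += 4) {
            __m128 va = _mm_loadu_ps(a + i);            /* load 4 floats */
            __m128 vb = _mm_loadu_ps(b + i);
            _mm_storeu_ps(dst + i, _mm_add_ps(va, vb)); /* 4 adds at once */
        }
    }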
It's possible to do manually in all those languages, but I'm not sure it can be done without programmer intervention: C++ and Rust both allow interior pointers/references to point to fields, which inhibits automatic application of many of the craziest layout changes.
I had to add "naively, languages won't do that" because of course there's no trick to it otherwise. Even Javascript can do it; it just won't be any faster. It's a simple example to fit a simple paragraph of text. One could imagine other possible optimizations to harness the fact that it's faster to do work on what you have than to pull more stuff in from RAM - like, perhaps, a string type that transparently compresses itself with a fixed dictionary (or an optional dictionary) depending on runtime performance heuristics.
Of course anything I say is possible in an existing language, with enough work, enough assembler, and enough compromises, but it's not what languages are based around.
I suspect the parent is referring more to people writing code that does e.g. foo(i++, i++); or relying on signed integer overflow behavior based on observing a toy program on their own machine - assuming the code will behave the same in all contexts, at all optimization levels, and across minor versions of the compiler.
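In code, the kind of thing being described (foo stands in for any function taking two ints):

    #include <limits.h>

    void foo(int a, int b);  /* stand-in, as in the example above */

    int bump(int x) {
        return x + 1;        /* undefined when x == INT_MAX: signed overflow */
    }

    void call_site(void) {
        int i = 0;
        foo(i++, i++);       /* undefined: i is modified twice with no
                                sequencing between the increments */
    }

Both compile cleanly and "work" on a typical machine at -O0, which is exactly why observing a toy program proves nothing.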