One year of C (floooh.github.io)
303 points by gok on June 2, 2018 | 163 comments



Rediscovering the beauty of C seems quite similar to city-dwellers rediscovering the beauty of nature: everything is rainbows and butterflies, while in reality - if you are away from the safety net of civilisation - there are hundreds of ways nature will crush you if you make even a small mistake.


And then you have the rugged outdoorsmen get serious about nature and venturing into the wilderness. They plan for the worst. They bring enough supplies. They don't try to climb mountains in flip flops. They shake their fists at the city touristers leaving trash at the campsite and causing memory leaks. No garbage collectors out here. They know to respect the ecosystem flora and fauna or get ripped to pieces.


and then there's the guy who's back in the city who knows the world is designed by aliens and that nature is artifice, all decent processor architectures are designed using dataflow HDL languages and the notion of "bare metal" nature is a delusion held by hipsters and unix greybeards


But god created in nature three basic elements (resistance, capacitance and inductance) and with holy will and power forged the foundations. The wisdom of the firstborn says there are more basic elements that god intended (memristors...)


But even these are approximations and false gods.

Electromagnetism and solid-state physics are more fundamental still.


You. I like you. I hope FPGAs become more commonplace and efficient.


FPGAs seem to be encumbered more by the design/synthesis toolchain than by anything else (imho)


> No garbage collectors out here.

Pack in, pack out. :)


malloc, free :) in that order


This is actually an amazing analogy.


THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE

Better to be ashamed than ripped to pieces though.


> They shake their fists at the city touristers leaving trash at the campsite and causing memory leaks.

Do these truly exist? I'm doubting it more and more for every day, with more and more trash being found around campsites used by the most rugged of outdoorsmen.


They certainly do. Are their fences occasionally weak, or their cabins sometimes crude? Yes, but the fundamentals can be overwhelmingly good.

See the Linux kernel which for all its warts is arguably the greatest software engineering achievement, and is by any measure one of the most widely deployed software programs.


Now you're just making me feel bad! Of course, no disrespect to the kernel devs. What they are doing is insane, in the most positive way.

...buuut... they also leave trash around their campsites...

I'm not saying I could do it better. I fully understand the business decisions and they are doing the very best they can. I'm consciously ignoring the big picture and pointing to this one flaw, which isn't even their fault. It's C's fault!


Skim milk!


The author isn't comparing against Rust or Java or something like that — they're comparing C against C++. C++ doesn't really provide much in the way of a safety net and actively gives you tools to hurt yourself.


Smart pointers and vastly more type safety is not a safety net?


It's not a very good safety net when it's completely optional.

Sure, it's technically optional in Rust too, but in Rust, you have to wrap raw pointer dereferences and stuff into unsafe{} blocks. The default is safety. In C++, your whole program is an unsafe block.


Having used smart pointers extensively in C++ I'd much rather debug a malloc/free issue. Finding cycles is not trivial or quick.


You’d rather debug malloc/free than try to find and eliminate shared_ptr cycles? You really think that’s easier? Besides, you shouldn’t be using shared_ptr much at all, and should try to use unique_ptr everywhere.


Yup. Most of the debug CRTs contain fantastic malloc/free debugging and pointer overwrite tools.

FWIW I don't consider unique_ptr a smart pointer, that's just RAII in action.


Can a turing-complete type system be sanely considered safe?


Turing completeness of the type system isn't what causes C++'s unsafety: in a sense, being turing complete theoretically allows encoding more safety into the type system. It also allows encoding more impossible-to-write-a-program into the type system, but that's orthogonal (plus, if you can't compile anything, you can't run anything unsafe!).

C++ improves on C in some respects, but still suffers from undefined behaviours like use-after-free/dangling references, dereferencing of null (e.g. a moved-from unique_ptr) and iterator invalidation, none of which are caused by a Turing complete type system (they all, in various forms, exist in C too).


Because RAII isn't amazing?


living on the edge is also a natural way to experience deeper

balance the comfort of accumulated knowledge with your animal instincts


It's the same for JavaScript: without safety nets (the browser sandbox), it is just a massive security problem.

Looking at the latest security issues, hackers have found a faster way to break in with JavaScript: for example, crypto mining in JS delivered to websites, and so on.


Mining is not a security issue though. It doesn't break into anything. It just wastes CPU cycles.

If by "delivered to websites" you mean without the site owner's consent, via ad networks or comments/whatever user generated content, then it's a problem of trust and/or content filtering, it's orthogonal to browser sandboxing.


> it's orthogonal to browser sandboxing

Not at all, the owner of a website should not be able to coerce your browser to anything but display the page.


"Displaying the page" has meant "running arbitrary application code" since Netscape. If you don't agree, disable JavaScript completely and enjoy the 100% working pages everywhere ;)


There is a big difference between executing a little JS to do a button rollover, and performing arbitrary computation!


What you're saying is basically a city-dweller's perspective of what nature is like. Tonnes of people live close to nature (both literally and figuratively) without any serious threat from nature crushing you.


Really? What's the life expectancy of someone living without doctors or modern medicine? For reference, 17th-century English life expectancy was only about 35 years.


The low life expectancy of old times is almost fully explained by child mortality. If you lived beyond, say, 15 years, you could expect to live nearly as long as we do today.


What? There is a pretty huge distance between what you suggest and living in nature a helicopter ride away from those things. Not that it is any more than an anecdote, but I live in nature and I have not been to a funeral for anyone under 85 so far.


> someone living without doctors or modern medicine?

So-called modern languages are more like junk food and daytime TV.


Bad analogy - in programming languages there's no wild animals waiting to attack, no pathogens in the waters. Your only enemy is the enemies you create in your codebase. And if you know what you're doing, C is as safe as C++. It's just that everyone learns the C++ way of thinking about problems, so they can't figure out how to do the same things in C.


Tricky. I prefer C and use it at home for fun. C++ is strictly for work. But I also like to program in ways that are safe by construction, and C++ gives me tools like unique_ptr and std::vector for doing that.

But I agree those tools are limited. The class of bugs which C++ protects you from are the shallow ones. Coding with discipline will prevent more bugs than switching to C++.

Languages with real memory safety however are a genuine step up. Rust for example makes it painful to not code with discipline. Unfortunately I find it makes programming painful in general -- so I am still waiting for a safe language that is as much fun as C.


>Your only enemy is the enemies you create in your codebase

I'd say your primary enemies are those using your codebase, who can and may exploit any hole. And ofc, it's not just your codebase, as it sits on top of libraries, maybe a runtime, an OS, and hardware; you're not safe just because your own program is fine, within the context of its internal machinations


> no actions need to happen on destruction

so... you like never free, never fclose, never munmap, never pthread_join, never unlock mutexes, never close network connections or what ?

Also, all of your prototypes such as

    sg_pipeline sg_alloc_pipeline() {
should really be

    sg_pipeline sg_alloc_pipeline(void) {
since the first one can actually be called with any number of arguments (in C, not in C++).
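
A minimal sketch of the difference, using a hypothetical `pipeline_handle` type as a stand-in for `sg_pipeline` (all names below are made up for illustration and are not the sokol-gfx API). Before C23, an empty parameter list declares a function without a prototype, so call arguments are not checked; writing `(void)` gives the compiler a real prototype to check against. C23 changed `()` to mean the same as `(void)`.

```c
#include <assert.h>

/* Hypothetical stand-in for sg_pipeline, for illustration only. */
typedef struct { unsigned id; } pipeline_handle;

/* No prototype (before C23): argument count and types are not checked. */
pipeline_handle alloc_unchecked();

/* Prototype: the function takes exactly zero arguments. */
pipeline_handle alloc_checked(void);

static unsigned next_id = 1;

pipeline_handle alloc_unchecked() {
    pipeline_handle h = { next_id++ };
    return h;
}

pipeline_handle alloc_checked(void) {
    pipeline_handle h = { next_id++ };
    return h;
}

/* Example caller: allocates one handle through each function. */
unsigned demo_alloc(void) {
    pipeline_handle a = alloc_unchecked();
    pipeline_handle b = alloc_checked();
    /* alloc_unchecked(1, 2, 3);  typically accepted (maybe with a warning)
     *                            by pre-C23 compilers, but undefined behaviour */
    /* alloc_checked(1, 2, 3);    rejected at compile time: too many arguments */
    assert(b.id == a.id + 1);
    return b.id;
}
```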

Note also that C doesn't have return-value-optimization, hence all your struct-returning functions can possibly cause a call to memcpy (won't happen when compiled in C++ mode of course) and will generally lead to much more binary bloat than the traditional C way of passing outputs as arguments. e.g. given this trivial code in foo.c:

    typedef struct _foo
    {
        int x;
        float foo, bar;
        char z[1024]; // change this to 10024 to get a memcpy! 
    
    } foo;
    
    void do_stuff_to_foo(foo*);
    
    foo blah()
    {
        foo f;
        do_stuff_to_foo(&f);
        return f;
    }
A disassembly after compilation with gcc -std=c99 -O3 gives

    blah:
    .LFB0:
	pushq	%rbp
	movq	%rdi, %rbp
	pushq	%rbx
	subq	$1064, %rsp
	movq	%fs:40, %rax
	movq	%rax, 1048(%rsp)
	xorl	%eax, %eax
	movq	%rsp, %rbx
	movq	%rbx, %rdi
	call	do_stuff_to_foo@PLT
	movq	(%rsp), %rax
	leaq	8(%rbp), %rdi
	movq	%rbp, %rcx
	andq	$-8, %rdi
	movq	%rbx, %rsi
	subq	%rdi, %rcx
	movq	%rax, 0(%rbp)
	movq	1028(%rbx), %rax
	subq	%rcx, %rsi
	addl	$1036, %ecx
	movq	%rax, 1028(%rbp)
	shrl	$3, %ecx
	rep movsq
	movq	1048(%rsp), %rdx
	xorq	%fs:40, %rdx
	jne	.L5
	addq	$1064, %rsp
	movq	%rbp, %rax
	popq	%rbx
	popq	%rbp
	ret

The same thing built with g++ -O3 gives

    _Z4blahv:
	pushq	%rbx
	movq	%rdi, %rbx
	call	_Z15do_stuff_to_fooP4_foo@PLT
	movq	%rbx, %rax
	popq	%rbx
	ret
While clang (in C mode) goes for a memcpy:

    blah:                                   # @blah
	pushq	%r14
	pushq	%rbx
	subq	$1048, %rsp             # imm = 0x418
	movq	%rdi, %rbx
	movq	%fs:40, %rax
	movq	%rax, 1040(%rsp)
	movq	%rsp, %r14
	movq	%r14, %rdi
	callq	do_stuff_to_foo@PLT
	movl	$1036, %edx             # imm = 0x40C
	movq	%rbx, %rdi
	movq	%r14, %rsi
	callq	memcpy@PLT
	movq	%fs:40, %rax
	cmpq	1040(%rsp), %rax
	jne	.LBB0_2
	movq	%rbx, %rax
	addq	$1048, %rsp             # imm = 0x418
	popq	%rbx
	popq	%r14
	retq


> Note also that C doesn't have return-value-optimization, hence all your struct-returning functions will cause a call to memcpy (won't happen when compiled in C++ mode of course).

What ?

RVO is precisely needed because a copy in C++ can run arbitrary code and so is not as easy to elide as a memcpy. RVO is basically a promise you make that your copy constructor & destructor are semantically harmless compared to a memcpy.

There is no need for RVO in C.


That was exactly my thought as well, but the examples seem to show otherwise (at least on gcc and clang[1]).

The compilers are using basically the same underlying optimizer and back-end with different front-ends, and since in C there are no "user-defined constructors" and no destructors, one would expect that you don't need any special RVO rule in C: the compiler can simply observe that a local object is returned and construct it in-place as necessary.

Thinking about this example, this may not be the case: distinct objects have to have distinct addresses, right? So in C you might not be able to make this optimization since the do_stuff_to_foo method (a black box to the compiler) could save its argument, and the caller of blah() could see that the argument it passed has the same address as the local f object in blah, a violation of "distinct objects, distinct addresses".

C++ has the RVO escape hatch for this: it is expected that some objects that appear distinct in the source may not actually be distinct if they fit the RVO (or NRVO) pattern - but C does not. So perhaps gcc and clang are doing the right thing here.

---

[1] All numbered versions of clang up to 6.0 seem to behave the way indicated in the GP post, but trunk in godbolt, which shows version as 7.0.0 (trunk 333657) compiles C efficiently like C++.


very good point on the "addresses compare == iff same object" rule.

In that case though, I think clang is right to optimize the callee (but it does introduce a problem in the caller) :

the only place you could do the equality check and observe the rule being broken is before the callee returns since the lifetime of its variable is bound to the call.

It seems that clang will not let the return pointer alias a local in the caller except when the call is the initialization of said local.

So if the caller goes :

foo x; leak(&x); x = returns_foo();

the memory will be temporary stack (and then memcpy), thus upholding the rule. (and it seems to me that this inefficiency is really required to respect the standard if we actually leak the pointer)

in the case :

foo x = returns_foo();

clang will pass the actual address of x down but that's before the object exists (and its address cannot be known yet) so the rule is still fine.

I stand corrected though, this does mean that RVO would be useful for C as well, as a way to relax the aliasing rule.

edit: nevermind that, in the first case it's perfectly legal to read/write foo through the pointer downstream so you cannot make the optimization anyway.
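
A sketch of the first case, with hypothetical bodies filled in for `leak`, `do_stuff_to_foo` and `returns_foo` (the thread only shows the calls, so everything inside them is an assumption). While `returns_foo()` runs, the caller's `x` and the callee's local `f` are both alive, so a conforming implementation has to give them different addresses. That is why the return slot cannot simply alias `x` here:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x; char z[1024]; } foo;

static const foo *leaked = NULL;   /* address published by the caller */
static int saw_same_address = 0;   /* set if the callee's local aliases it */

/* Hypothetical: remembers the pointer it is given. */
static void leak(const foo *p) { leaked = p; }

static void do_stuff_to_foo(foo *p) {
    p->x = 42;
    /* Both objects are alive at this point, so the addresses must differ. */
    if (p == leaked) saw_same_address = 1;
}

static foo returns_foo(void) {
    foo f = {0};
    do_stuff_to_foo(&f);
    return f;
}

/* Returns nonzero only if the callee observed its local at x's address. */
int caller_sees_alias(void) {
    foo x = {0};
    leak(&x);
    x = returns_foo();
    assert(x.x == 42);
    return saw_same_address;
}
```

If the compiler passed `&x` down as the hidden return pointer and built `f` directly in it, `do_stuff_to_foo` could notice, which matches the reasoning above about why the temporary and the copy are needed in this form.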


Yes, the same thought occurred to me (that perhaps clang is careful in the caller in the case the address escapes), but I seemed to find cases where clang optimizes the caller also, so that two distinct objects receive the same pointer and both pointers escape.

Here's an example:

https://godbolt.org/g/yxFzqT

This happens on clang versions back to 3.6.

Note that if you change the caller to:

    Foo f;
    f = callee(&f);
the code changes and distinct objects are passed. I'm not sure if the first form (all in the definition) has a relevant difference per the standard that lets clang do this.


That seems to match the logic clang-trunk is using:

If the assignment is in the initializer then it's using the simple C++ RVO-style call to blah():

    void zot() {
        foo f = blah();
    }

    zot:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 1040
        lea     rdi, [rbp - 1040]
        call    blah
        add     rsp, 1040
        pop     rbp
        ret
But if the variable has the opportunity to leak then it adds a memcpy in the caller:

    void zot() {
        foo f;
        leak(&f);
        f = blah();
    }

    zot:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 2080
        lea     rdi, [rbp - 1040]
        call    leak
        lea     rdi, [rbp - 2080]
        call    blah
        mov     eax, 1036
        mov     edx, eax
        lea     rdi, [rbp - 1040]
        lea     rcx, [rbp - 2080]
        mov     rsi, rcx
        call    memcpy
        add     rsp, 2080
        pop     rbp
        ret
Although weirdly the memcpy is removed when optimizations are turned on. This may be a bug.


> There is no need for RVO in C.

I updated my comment with some assembly.


Thanks. Unless something is escaping me, that's an optimizer bug. I'm pretty sure the ABI allows you to do whatever you want with the sret pointer, including passing it to another function to chain return for free.

I guess you could say that RVO is more robust since it's implemented in the frontend and does not rely on finding the optimization after a fair amount of lowering.

Edit: I don't have time to debug this but at least for llvm, I think this optimization should trigger in the optimizer's memcpy elimination pass ( https://github.com/llvm-mirror/llvm/blob/0818e789cb58fbf6b5e... ).

However I don't see why clang could not simply apply the RVO logic to C code as well.


I'd wager the optimisations you're talking about happen just fine if blah is a static function, where the compiler can assume nothing from the outside will call this function, so calling conventions can be broken at will.

Seeing as blah isn't a static function, I think the calling convention for C that g++ uses somehow dictates that a memcpy is to be used in this case.

Note that I haven't actually tried this, so no guarantees.


I don't claim to have any answers, but I found all of this interesting and surprising. I wondered about a couple of things. What happens if the do_stuff_to_foo is actually defined (and what happens in that actual function)? And is there a difference between value semantics and pointer/reference semantics?

These questions were my take-away from one of Chandler Carruth's C++ compiler optimization talks, I think it was this talk. https://youtu.be/eR34r7HOU14

My take aways were that the optimizer gets a huge chunk of its performance by inlining. And with value semantics, the optimizer can "cheat like crazy".

So I defined two different variants for do_stuff_to_foo

Original Pointer/Reference Semantics:

    void do_stuff_to_foo(foo* a)
    {
        a->x++;
    }
Value Semantics:

    foo do_stuff_to_foo(foo a)
    {
        a.x++;
        return a;
    }

In both cases, the compiler emits effectively the same output for C and C++ (I only tested clang.) (The main difference was name mangling. I omit stuff for brevity.)

Pointer/Reference Semantics:

    Lcfi2:
        .cfi_def_cfa_register %rbp
        incl	(%rdi)
        popq	%rbp
        retq
        .cfi_endproc
                                        ## -- End function
        .globl	_blah                   ## -- Begin function blah
        .p2align	4, 0x90
    _blah:                                  ## @blah
        .cfi_startproc
    ## BB#0:
        pushq	%rbp
    Lcfi3:
        .cfi_def_cfa_offset 16
    Lcfi4:
        .cfi_offset %rbp, -16
        movq	%rsp, %rbp
    Lcfi5:
        .cfi_def_cfa_register %rbp
        movq	%rdi, %rax
        popq	%rbp
        retq
        .cfi_endproc


Value Semantics:

    Lcfi3:
        .cfi_offset %rbx, -24
        movq	%rdi, %rbx
        incl	16(%rbp)
        leaq	16(%rbp), %rsi
        movl	$1036, %edx             ## imm = 0x40C
        callq	_memcpy
        movq	%rbx, %rax
        addq	$8, %rsp
        popq	%rbx
        popq	%rbp
        retq
        .cfi_endproc
What I find interesting here is that in both the C and the C++ case, the memcpy now appears. And in the C case, there is still only one memcpy, not two.

So as I said, I don't have any answers and really don't know what the take away is. But RVO no longer seems to be a factor in these variants.


Having a module boundary in a place where function call overhead is going to be significant is a code smell. C++ and Rust programmers just don't notice because the entire standard library reeks of it.


I find it bizarre that the compiler isn't more aggressive about this in C (versus C++) as it doesn't need to worry about special member functions. If nothing volatile is involved there's no reason to bother copying anything.


> Also, all of your protoypes such as

> sg_pipeline sg_alloc_pipeline() {

> should really be

> sg_pipeline sg_alloc_pipeline(void) {

Ah thanks, I fixed that :)

In the case of sokol-gfx, missing return value optimization won't be a big deal, but I'll keep that in mind.


You can call those functions without object destructors. That is the point being made in that paragraph: there are other ways to program besides having data owning resources.


Clang (trunk) in godbolt.org seems to produce the same code for both C and C++ (with -O3). Maybe this is some sort of optimization chance that the C-specific parts of both GCC and Clang didn't catch until now?


GCC in godbolt is always C++, even if you pass -std=c89/c99 : https://godbolt.org/g/SFMB1G ; Clang doesn't even compile with -std=c89


Do not pass -std, just select the language from the combobox at the top right side of the source window. Your code doesn't compile (__cplusplus is undefined) if you select C.

(btw i was talking about Clang, not GCC above)


damn! I use godbolt everyday and didn't ever notice that you could change language like this. thanks!


if you find yourself automagically pthread_joining, something is probably terribly terribly wrong. similarly, if you commonly need to release a mutex far away from where you acquired it, something is probably also terribly wrong.


> automagically pthread_joining

What do you mean by this, exactly? You mean leaving it to RAII instead of doing it explicitly?


uhh "binary bloat" ? with that level of detail? not the right thing to cuttingly show as the deep weakness of C

the memcpy feature, on the other hand, might be unexpected enough to escalate to something worth fretting about..


If you have a need for more than a few pthread_join(), fclose(), etc, something needs to be fixed. In my experience, RAII isn't a good idea. With RAII control flow is implicit and all over the place. With a little data structure organization work the C way is so much better. I'd much prefer 5 explicit fopen/fclose pairs in my program, and 5 lock/unlock, 5 malloc/free, etc., to having to code out-of-band and out-of-context ctor/dtor pairs for each type of "resource" (plus the paralysis from having to decide what the resources are, what should happen automatically and what explicitly, etc).

Regarding the assembly example. If the copy matters performance-wise (note: it doesn't!) then DON'T MAKE A COPY. Have a pointer argument and put the result there.
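
A minimal sketch of the two styles side by side, using a hypothetical `big` struct (the type and function names are made up for illustration). The by-value version may cost a copy of the whole struct depending on the compiler, as the assembly upthread shows; the out-parameter version writes straight into storage the caller owns:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical large struct, similar in size to the one in the example. */
typedef struct { int x; char z[1024]; } big;

/* Returns by value: the caller may end up receiving a copy of the struct. */
big make_by_value(void) {
    big b;
    memset(&b, 0, sizeof b);
    b.x = 7;
    return b;
}

/* Out-parameter: fills in the caller's storage directly.
 * Returns 0 on success, -1 if out is NULL. */
int make_in_place(big *out) {
    if (out == NULL) return -1;
    memset(out, 0, sizeof *out);
    out->x = 7;
    return 0;
}
```

The out-parameter form also leaves the return value free for an error code, which is a common reason to prefer it in C regardless of the copy.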


> If the copy matters performance-wise (note: it doesn't!) then DON'T MAKE A COPY. Have a pointer argument and put the result there.

it sure does, especially if you're going to make a high performance graphics library. Some old benchmarks I found showed an average performance loss of 10-15% when calling non-RVO'ed code in a tight loop.


Is there a performance difference between C++ with RVO and C where the functions operate on pointers to structs instead of returning the struct directly? Or is it just a matter of coding style? This is something I could never understand very well.


> (note: it doesn't!)

That was tongue-in-cheek. Of course you can always find a slow instance for any approach.

The more explicit message is: You're optimizing the wrong thing. "Tight loop" and "RVO", these things don't go together. Don't write slow and complicated code and expect the compiler to magically make it fast. Write straightforward code. It's easy. It will be effortlessly fast.

There are very few cases where you need to work hard to get fast code (SIMD etc). Typically, if you need to work hard, the code will be slow.


RAII was created as a solution to the problem of C++ code not releasing its resources when an exception happened. By putting the resource release in a destructor this is fixed through piggybacking on C++'s semantic of calling destructors when objects go out of scope, even through exceptions.

To put it another way, RAII is a hack which co-opts one semantic to cover the drawbacks of another semantic. Without exceptions there's no need for RAII because there's no problem to solve.


RAII isn't only about exceptions (although it helps greatly); see Rust for example.

RAII is about any scope exit automatically cleaning up resources. Exceptions is a big one, but so is return, break, etc. Having it happen automatically in all cases is a win for robust software (in my opinion).
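
For comparison, a sketch of what "every scope exit cleans up" tends to look like in C without RAII: each early exit is routed through a single cleanup label by hand. This is the common goto-cleanup idiom, not anything from the article, and `roundtrip` is a made-up function for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes msg to a temporary file and reads it back.
 * Returns 0 on success, -1 on any failure. */
int roundtrip(const char *msg) {
    int rc = -1;
    char *buf = NULL;
    FILE *fp = NULL;
    size_t len;

    assert(msg != NULL);
    len = strlen(msg);

    buf = malloc(len + 1);
    if (buf == NULL) goto cleanup;

    fp = tmpfile();
    if (fp == NULL) goto cleanup;

    if (fwrite(msg, 1, len, fp) != len) goto cleanup;
    rewind(fp);
    if (fread(buf, 1, len, fp) != len) goto cleanup;
    buf[len] = '\0';

    if (strcmp(buf, msg) != 0) goto cleanup;
    rc = 0;

cleanup:
    /* Every exit path lands here, so each resource is released exactly once. */
    if (fp != NULL) fclose(fp);
    free(buf);   /* free(NULL) is a no-op */
    return rc;
}
```

The release code is explicit and in one place, but nothing stops a later edit from adding a plain `return` that skips it, which is the failure mode the automatic approach rules out.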


Calling free is unnecessary in many kinds of utility code. Writing such utilities in C++ may even slow things down due to useless free calls duplicating the job of the kernel. As for threads and mutexes, in C using multiple forked processes and communicating using pipes is still a valid option that does not require much if any RAII.

Note that I am not arguing that RAII is unnecessary; it is just that I completely agree with the author that RAII is much less needed in C than in C++ due to different code style and structure. Plus there are static code checkers and/or compiler options that make it possible to enforce various RAII patterns in C.


Of course in C++ as in C if there is some case where you want to intentionally leak your allocations or otherwise have the memory be cleaned up on process exit rather than via explicit free() or delete you can do that: it's not as if raw pointers are unavailable to C++, after all.


The author makes an important point about having a list of languages to choose from.

We are working on a JVM-based project at work. We chose to use Kotlin because the initial team was intermediate programmers who didn't have that much experience. When we got an MVP out, and some clients willing to pay for the solution; we got some Java devs involved.

The guys have only known Java 6 and 7. They are struggling around with Kotlin, not because it's difficult, but because they only know "The Java Way". Sadly also, they've been stuck in J6/7 for so long that they don't know Streams or Lambdas. For people making their living from Java, that's dangerous self-extinction.

Their software architect is even worse. Ignorant guy who lives under a rock; if he doesn't understand something or has never heard of it, it's wrong or useless.

We're working on a greenfield project, and the guy wanted to write a REST proxy on top of gRPC because he thinks clients won't like us having ports exposed internally.

I nearly took him for his salary when I told him that *://localhost is only visible to the local machine. He bet me his salary that he could access a service on the server even if it's listening on localhost. I was gracious when it came time to pay up.

Anyways, I wish the guys were the exception, but I've seen many guys in their 30's who still think Oracle Inc. is where the sauce is.

My advice: learn something new every once in a while. The author found that C was better suited for some things than C++. That's a powerful statement to make, something I've learnt a lot over the years when I pick up a new language.


You've got a point, but being stuck on older language versions is quite common in enterprise shops.

If the server they have to take care of only runs language version X, that is what they care about, after work there is another life totally away from computers.

When the time comes for language version X + 1 to be rolled out on the deployment system, there will be one week of web-based training to learn whatever is new in the version now available on the IT development images, and that is it.

Yes, there are lots of modern companies out there, but there are also tons of them, especially where software is not their core business, where development improvements come at a snail's pace, if at all.


This shows the state of domain knowledge in the industry. This is why people like me still exist because despite the huge progress in programming sophistication basic concepts escape everyone.


> The dangers of pointers and explicit memory management are overrated

Hummm, not my experience. The first time I ran AFL on one of my C libraries (a parser/converter of EMF files, really easy to mess up), the result was quite frightening. IIRC, it took only 20 seconds for AFL to find 10 crashes.

Ever since, first I'm more careful, second I run AFL from time to time, and it happens regularly that it catches some edge cases I was not able to see.
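
The comment doesn't show the crashes AFL found, so the following is only an illustration of one typical class of edge case in binary parsers: a length field that claims more bytes than the input actually holds. The record layout here is hypothetical (it is not the EMF format), and `parse_record` is a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout, for illustration only:
 *   byte 0      record type
 *   bytes 1..2  payload length, little-endian
 *   bytes 3..   payload
 */
typedef struct {
    uint8_t type;
    uint16_t length;
    const uint8_t *payload;
} record;

/* Returns 0 on success, -1 if the input is truncated or the length field
 * claims more payload than is actually present. */
int parse_record(const uint8_t *data, size_t size, record *out) {
    size_t claimed;

    assert(out != NULL);
    if (data == NULL || size < 3) return -1;

    claimed = (size_t)data[1] | ((size_t)data[2] << 8);
    /* Without this check, a crafted length walks reads past the buffer. */
    if (claimed > size - 3) return -1;

    out->type = data[0];
    out->length = (uint16_t)claimed;
    out->payload = data + 3;
    return 0;
}
```

A fuzzer tends to reach inputs like a 4-byte file whose header claims a 65535-byte payload far sooner than hand-written tests do.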

I love programming in C, but I definitely know it's easy to make mistakes in it.


I have that experience in pretty much every language, the first time I run a fuzzer on my code to check an invariant. I'm honestly surprised fuzzers aren't more widely used - they're far more effective per line of code compared with classic unit test suites, and easier to extend. I'm not sure I've ever run a fuzzer on virgin code and not found new bugs. Even when the code in question already had allegedly comprehensive unit tests.

The one time I tried AFL, I didn't find any new bugs beyond the embarrassingly large number my own fuzzer had already pointed out.

But I agree with your point. Some double-digit percentage of bugs in a C library will be memory related, with all the remote execution implications that brings. This is arguably not a big issue for games or webassembly code. But for networked services it seems increasingly irresponsible now that Rust, Pony & friends are maturing.


What's AFL?


American Fuzzy Lop.

http://lcamtuf.coredump.cx/afl/

And no, it's not the rabbit ^^.

It's an 'intelligent' (as opposed to purely random) fuzzer that is quite good at finding edge cases (that's a very/too short summary, please read the link for more details).


I think it’s American Fuzzy Lop, a fuzzer that’s good at finding bugs in your code: http://lcamtuf.coredump.cx/afl/


I like this blogpost. But as stated in it, “my project”. Writing a C project alone is pretty different from writing C with one or more other people.

In high school we had a group project: writing a simple TCP/IP protocol in C. Holy shit, did I see some awful things... Maybe writing C is not the bad part, but maintaining it could be.

Simple example: you can read a string from a simple data file in 6 or 7 different ways, yet their effects and behaviors differ severely. Not everyone knows the differences in parameters and return values between scanf, fgets and gets...
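
A short sketch of one of those ways, to make the differences concrete. `fgets` takes a buffer size and keeps the trailing newline; `scanf("%s")` stops at the first whitespace and has no bound unless a width is given; `gets` has no bound at all and was removed from the language in C11. The helper name `read_line` is made up for this example:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (at most cap - 1 characters) and strips the
 * trailing newline, if any. Returns 1 if a line was read, 0 on end of
 * file or error. */
int read_line(FILE *fp, char *buf, size_t cap) {
    assert(fp != NULL && buf != NULL);
    assert(cap > 0 && cap <= INT_MAX);   /* fgets takes an int size */

    if (fgets(buf, (int)cap, fp) == NULL) return 0;
    buf[strcspn(buf, "\n")] = '\0';      /* fgets keeps the '\n'; drop it */
    return 1;
}
```

A line longer than the buffer is returned in pieces by successive calls rather than overflowing, which is the behaviour `gets` could never offer.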


I think your experience writing C with a group might be severely colored by the group being made up of inexperienced highschoolers.


I totally agree with you on that. But the whole point was that there is a much greater chance of things going wrong with inexperienced developers. I won’t quote the comment in the reply below, but that is the thing...

edit: typing on the phone, sorry


In my experience, turning an inexperienced C developer loose with C++ does not result in an improvement :-).


The thing with C is that there are plenty of people that can code C and know what they're doing, but there are many more that either do not know what they are doing, or don't but think they do (which is even more dangerous). My experience mirrors yours, but I do believe these good C programmers exist.


I'm slowly starting to believe that every single C programmer falls into the "either do not know what they are doing, or don't but think they do" category.

I have yet to see a single, non-trivial C program that does not have many dangerous bugs related to language unsafety.

If these mythical good C programmers exist, they certainly have never released any of their source code.


It is like the driver paradox: when asked, everyone asserts they are the best driver in the world; only the others are not.


> btw, did you notice how there are hardly any books, conferences or discussions about C despite being a fairly popular language?

Oh, yes. When I wanted to fiddle with OpenGL (and later SDL), I was baffled to find that absolutely everything documentation-wise was for C++.

EDIT: Sorry, "documentation" was probably the wrong word to use here. I meant "learning resources".


I ran into the same problem when using the libraries behind the C-based ffmpeg (libavcodec, etc). Even worse is trying to search for information only to wind up with hundreds of entries on how to use ffmpeg from the CLI, with maybe 1-2 entries related to binding to the libraries programmatically.

Even ffmpeg's own documentation outside of what they generate for their API is rather sparse if you wish to do more than use it on the command line.


Probably due to the game industry being quite C++ focused, and also Microsoft compilers’ lack of support for C99


Historically the C99 support by MS was quite bad. Fortunately VS 2017 has good (not 100%) C99 support.


Worse than bad: if I recall correctly, there was no C99 support at all until MS VS 2013. Before that you had to compile C as C++ in order to get C99-like features.


To the extent required by ISO C++14, with ISO C++17 requiring C11 library compliance.

It is, after all, called Visual C++.


> When I wanted to fiddle with OpenGL (and later SDL), I was baffled to find that absolutely everything documentation-wise was for C++.

Do you mean tutorials and the like? The official docs are pure C for both.

C.f. the official SDL API doc: https://wiki.libsdl.org/APIByCategory


Hum. That doesn't make sense, the OpenGL API is entirely C oriented and I don't know of any C++ docs.


It doesn’t make sense, but it jibes with my experience. Tutorials and blog posts about OpenGL are super heavy on C++, because gaming.

The official docs are C, of course, because the API is C. But the docs are rather abstruse for a beginner, and really become useful only well after you’ve gotten your bearings. (e.g. my personal beef: function names tend to be two nouns with no verb to say what the thing actually does, so you look in the docs and it says it “binds” noun A and noun B, with no explanation of what “binds” means in context, which direction the data binding goes, whether it persists, etc.)


The OP probably meant most OpenGL blogs, books and tutorials. The actual OpenGL API reference is pure C.


I have more uses for Python than the author, but one of these uses is really obvious: Python as C code generator. Python + C is a very powerful combination: metaprogramming, special constructs etc. Basically, with Python you can get rid of preprocessor hacks and messing with templates.


Ah, right! I actually use Python for code generation as well, just forgot to mention it. For instance the Z80 and 6502 CPU emulators use generated code for the instruction decoder:

https://github.com/floooh/chips/tree/master/codegen


As someone said, "C is an improvement on its successors"


Tony Hoare, “hints on programming language design”:

>> The more I ponder the principles of language design, and the techniques which put them into practice, the more is my amazement and admiration of ALGOL60. Here is a language so far ahead of its time, that it was not only an improvement on its predecessors, but also on nearly all its successors. <<

http://flint.cs.yale.edu/cs428/doc/HintsPL.pdf


Ah, nice. Thank you. I knew I had seen it somewhere.


C is beautiful because it is so barebones. It does not hide data behind complex language features that describe some runtime behaviour; it is just bits and bytes moving around. A bit of a romanticized view, but that is at least the feeling I got when writing some small games with C.

Link: https://github.com/Entalpi/PongC


> As a C++ programmer I developed my own pet-coding-patterns and bad behaviours (e.g. make methods or destructors virtual even if not needed, create objects on the heap and manage them through smart pointers even if not needed, add a full set of constructors or copy-operators, even when objects weren’t copied anywhere, and so on)

Yikes.


I guess you are getting down voted for lack of content, but I agree. Those are all very bad "Java without GC" C++ practices that lead to awful code bases. Thankfully the author seems to have learned that from their experience writing C.

Like the author mentions, C++ leads to overthinking which language features to use. It invites you to seek these perfect abstractions for Python readability with C performance. I think this is a big reason everyone develops their own incompatible style -- not due to passionate philosophical beliefs, just to reduce the size of the solution space.


I agree with you, mainly from the perspective of writing Rust code. When Rust was new, a lot of people said that thinking about ownership all the time is a burden, but I think it's the opposite: it gives a sort of frame that helps to build the software's core abstractions. I think in this regard C++ is much more unopinionated, which means every codebase is different. In Rust, ownership is part of the type system, so a lot of the time learning a new API can be much faster because API usage is easier to understand and this style of programming is already familiar.


yes. ownership is an extremely critical part of software design, especially for complex interactive programs.


I thought when people say memory safety in C is a problem, they usually mean in a larger codebase with hard to define API boundaries and with multiple programmers that can't handle the complexity of the code, which results in a team that can't find a single person that is familiar with the whole codebase.


As a C coder I find myself trampling over my own memory a lot.

Say you have a pointer somewhere that points to a struct, and then once every 10 million iterations of a set of 20 functions that pointer gets overwritten by a string that occasionally lacks a NUL terminator. So your program crashes in a completely random place that has nothing to do with the origin of the bug. That's the problem with memory safety. But the lack of memory safety is also very powerful: you can malloc a chunk of memory and then use it in extremely creative ways. You really see some people's genius shine when you read their source code, in a way that I haven't been able to see with other languages.


This is solved by knowing the destination size, using functions that respect it, and never assuming that an input will fit. (The exception is when whatever guarantees the input size sits right next to the code making the assumption, but even then you'd be taking risks, especially if it's not your code but some foreign library call.)

It's the same with web programming. You always escape on the output, or just avoid escaping by using proper API (el.textContent = 'something').
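
A minimal sketch of the "know the destination size" rule for the string case. The helper name `copy_bounded` is made up for illustration; it relies only on standard `snprintf`, which writes at most the given number of bytes and always NUL-terminates when that number is nonzero.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst without ever writing more than
 * dst_size bytes. Returns 1 if the whole string fit, 0 if it was
 * truncated or the arguments were unusable. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return 0;

    /* snprintf writes at most dst_size bytes, including the terminating
     * '\0', and returns the length the full output would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

The caller passes `sizeof buf` for a real array, so the size travels with the destination instead of being assumed from the input.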


That's why you need an ADT for strings. I've seen (and experienced) this problem multiple times. Any char array needs explicit termination based on use.
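
One possible reading of "an ADT for strings", as a rough sketch: keep the length next to the bytes and let every operation maintain the terminator. The names `str_t`, `str_from`, `str_append` and the fixed capacity `STR_CAP` are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define STR_CAP 32  /* arbitrary fixed capacity for this sketch */

/* Hypothetical length-tracked string: len is always < STR_CAP and
 * data[len] is always '\0', so it can still be handed to C APIs. */
typedef struct {
    size_t len;
    char data[STR_CAP];
} str_t;

/* Append n bytes. Returns 0 on success, -1 if they would not fit;
 * on failure the string is left unchanged. */
int str_append(str_t *s, const char *bytes, size_t n)
{
    if (n >= STR_CAP - s->len)  /* keep one byte for the terminator */
        return -1;
    memcpy(s->data + s->len, bytes, n);
    s->len += n;
    s->data[s->len] = '\0';
    return 0;
}

/* Build a str_t from a NUL-terminated C string; -1 if it is too long. */
int str_from(str_t *s, const char *cstr)
{
    s->len = 0;
    s->data[0] = '\0';
    return str_append(s, cstr, strlen(cstr));
}
```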


Very cool read and a lot of your happy discoveries are similar to reasons why I like coding in Go, especially your "Less Language Feature ‘Anxiety’" section. No guessing about how to do something, there's generally only one way to do something.


I’m not a professional coder atm, but I learned quite a lot from “Understanding and Using C Pointers.” It’s weird knowing that I like to and can code, but being employed as a sys admin. Pretty common I imagine, as well.


I made that 'reverse path' to C quite a while back, more than 10 years ago in fact. For 20 years before that, I was C++, C++, and C++. Then I gradually realised I was trying to slim down my use of the language, due to stuff I just didn't want in the codebase I was working on.

Small details. Like, well, templates -- they were fantastically hard to track and debug at the time, and really, did they actually provide anything in the long run? The answer was... NO, they don't. Turns out it's a LOT easier to have 2 discrete functions explicitly doing the same thing than using a template. Anyone will understand the scope of each if they see them; the argument about "simplicity" doesn't hold once you get templates involved.

So yes, you can 'template' in C as well using the preprocessor, but it still ends up being MORE READABLE than a heap of scopeless C++.

One thing I was fond of in C++ was stack-based objects, i.e., use the constructor to save some state, and use the destructor to restore it. I think it is a lovely concept, and I still miss it -- however, it's a concept 'new hires' very often had problems with -- it was not as intuitive as it felt. It turns out an explicit save/restore is easier to maintain. I wish it was different, but that threw THAT feature of C++ out of the window.

Anyway, long story short, I wrote zillions of lines of C++, and now I just do C. I don't have to fight with the compilers every other year for instance.

Small silly example, for a million years (it felt like it!) I could write pure virtual functions declarations as ..... int blah_blah_method(some cool parameter) = NULL; -- it actually made SENSE. Then one day, someone decided it couldn't possibly be NULL -- it had to be zero as '0' or else, no compily.

^^ that is just one silly old example, but there's a dozen others where your 2-year-old codebase suddenly isn't compiling because someone decided to shift the language under your feet.

THAT doesn't happen with C. C just works. Yes, there's lots of syntax sugar I WISH I would have been able to keep from C++, but ultimately, writing a huge pile of bloat is a lot harder in C than in C++. Some idiot 'new hire' won't make a hash object for 5 elements.

Some idiot colleague won't decide that the string creation class constructor shouldn't take a reference, but a copy, making the app startup take 30 seconds with ONE character '&' removal.

Anyway, rant over. Regardless of what the kids say, C is nice, C is lean, C is a LOT easier to keep clean as long as the adult people using it are aware of the sharp edges :-)


Your post is very ranty, and I disagree with a lot of it, but I think you’re flat out wrong here:

> Turns out it's a LOT easier to have 2 discrete functions explicitly doing the same thing than using a template

How do you write containers? I tried following a series on implementing an interpreter in c a few months ago, but gave up because 30% of the code was implementing std::vector for different types.

Also, math routines definitely work better templated. Code duplication isn’t always bad but if your methods are copy and pasted with different signatures, that sucks. E.g. an interpolation method that works on double and floats.

> Some idiot colleague won't decide that the string creation class constructor shouldn't take a reference, but a copy, making the app startup take 30 seconds with ONE character '&' removal.

This isn’t an argument against c++, this is an argument against your co workers. The exact same argument could be made for C with removing two * signs.

C has its fair share of gotchas that you don’t regularly hit in C++ too - malloc requiring sizeof, not checking realloc’s return values, goto hell for resource cleanup, complicated lifetimes... Personally I’d rather have C++, warts and all, than have to deal with the “simplicity” of reimplementing a stack in every resource owner, and reimplementing vector/map for every type.
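
For readers who have not hit the first two gotchas, here is a hedged sketch of how they are usually defused in plain C. The function `push_int` is a made-up example, not taken from any library.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example: push a value onto a growable int array.
 * Returns 0 on success, -1 on failure (the array is left intact). */
int push_int(int **items, size_t *count, size_t *cap, int value)
{
    if (*count == *cap) {
        size_t new_cap = *cap ? *cap * 2 : 8;

        /* Refuse capacities whose byte size would wrap around. */
        if (new_cap < *cap || new_cap > SIZE_MAX / sizeof **items)
            return -1;

        /* sizeof **items follows the element type automatically, and the
         * old pointer is kept until realloc is known to have succeeded;
         * assigning straight to *items would leak it on failure. */
        int *tmp = realloc(*items, new_cap * sizeof **items);
        if (tmp == NULL)
            return -1;
        *items = tmp;
        *cap = new_cap;
    }
    (*items)[(*count)++] = value;
    return 0;
}
```

Starting from `int *items = NULL; size_t count = 0, cap = 0;` the caller just calls `push_int(&items, &count, &cap, v)` and eventually `free(items)`.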


For C vectors, look at stb's[1] stretchy_buffer[3] or klib's[2] kvec[4].

The basic concept of these is they use macros to specify the type/size of the elements so you can have generic containers in C.

For hashes, you can also look at klib's khash [5]

[1] https://github.com/nothings/stb

[2] https://github.com/attractivechaos/klib

[3] https://github.com/nothings/stb/blob/master/stretchy_buffer....

[4] https://github.com/attractivechaos/klib/blob/master/kvec.h

[5] http://attractivechaos.github.io/klib/#Khash%3A%20generic%20...
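
To give a feel for the macro approach, a rough sketch follows. The names `VEC`, `VEC_INIT`, `vec_push` and `vec_grow_` are hypothetical and are not the actual stb or klib APIs; calling `abort()` on allocation failure is a simplification for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical macros in the spirit of stretchy_buffer/kvec. */
#define VEC(T) struct { T *items; size_t count, cap; }
#define VEC_INIT {NULL, 0, 0}

/* Type-agnostic growth: the element size arrives as a run-time value.
 * Returns the new block, or NULL on failure (old block untouched). */
static void *vec_grow_(void *items, size_t *cap, size_t elem_size)
{
    size_t new_cap = *cap ? *cap * 2 : 8;
    if (new_cap < *cap || new_cap > SIZE_MAX / elem_size)
        return NULL;
    void *p = realloc(items, new_cap * elem_size);
    if (p != NULL)
        *cap = new_cap;
    return p;
}

/* The macro only supplies sizeof for the element type and does the
 * typed store; all real logic lives in vec_grow_. */
#define vec_push(v, value)                                        \
    do {                                                          \
        if ((v)->count == (v)->cap) {                             \
            void *p_ = vec_grow_((v)->items, &(v)->cap,           \
                                 sizeof *(v)->items);             \
            if (p_ == NULL)                                       \
                abort(); /* simplification for this sketch */     \
            (v)->items = p_;                                      \
        }                                                         \
        (v)->items[(v)->count++] = (value);                       \
    } while (0)
```

Usage looks like `VEC(double) v = VEC_INIT; vec_push(&v, 1.5);` followed by `free(v.items)` when done.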


> The basic concept of these is they use macros to specify the type/size of the elements so you can have generic containers in C.

I figured as much. As bad as templates are, I’d take templates over macros any day, especially if I have to debug one.


A 1-line macro is not bad. stb's stretchy_buffer is not bad, but it could do with fewer macros. Take the ones from my own example that I linked elsewhere on this page.

If you use macros "correctly" there is rarely any logic in there, and they are decidedly not hard to debug. Macros let you do things that you could not do otherwise, like getting information about the caller or the size of the element type that is abstracted behind a void ptr.
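
A small sketch of the "information about the caller" case, assuming nothing beyond the standard `__FILE__` and `__LINE__` macros. The names `format_location` and `FORMAT_HERE` are invented for the example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "file:line: msg" into buf.
 * Returns what snprintf returns. */
int format_location(char *buf, size_t size,
                    const char *file, int line, const char *msg)
{
    return snprintf(buf, size, "%s:%d: %s", file, line, msg);
}

/* The macro contains no logic. It only forwards the call site's file
 * and line, which an ordinary function cannot learn on its own. */
#define FORMAT_HERE(buf, size, msg) \
    format_location((buf), (size), __FILE__, __LINE__, (msg))
```

Debugging stays easy because a breakpoint in `format_location` sees ordinary arguments; the macro expands to a single call.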

C++ templates on the other hand - if you have considerable C++ experience you will shy away from them. They drive compile times up insanely, and are actually very difficult to debug, at least under "normal" usage.

Also, code bloat results after a few instantiations in cases where a simple runtime integer could be used to discriminate between different cases, with shared implementation code.

I would never again touch even the most basic C++ template, that is std::vector. I prefer my simple 3 lines of macros.


> How do you write containers? I tried following a series on implementing an interpreter in c a few months ago, but gave up because 30% of the code was implementing std::vector for different types.

You can use the preprocessor to generate these discrete functions and data structures. A simple example implementing just a linked list data structure:

list_generic.h:

    #define CATH(a, b) a ## b
    #define CAT(a, b) CATH(a, b)

    struct CAT(T, _list) {
        T value;
        struct CAT(T, _list) *tail;
    };

list.h:

    #define T int
    #include "list_generic.h"
    #undef T

    #define T float
    #include "list_generic.h"
    #undef T

    typedef char * string;
    #define T string
    #include "list_generic.h"
    #undef T

Then you'll end up with int_list, float_list and string_list structs.

Not saying that this is pretty or particularly satisfies the goal of being explicit and readable, but it's possible if you can accept different type and method names for the different contained types.
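
For comparison, a single-file variant of the same token-pasting trick, as a sketch only: one macro stamps out the struct plus a prepend function per element type. `DEFINE_LIST` and the generated `*_list_push` names are hypothetical additions, not part of the snippet above.

```c
#include <stdlib.h>

#define CATH(a, b) a ## b
#define CAT(a, b) CATH(a, b)

/* For element type T this defines struct T_list and T_list_push().
 * T_list_push prepends a value and returns the new head, or NULL if
 * malloc fails (the old list is untouched in that case). */
#define DEFINE_LIST(T)                                          \
    struct CAT(T, _list) {                                      \
        T value;                                                \
        struct CAT(T, _list) *tail;                             \
    };                                                          \
    static struct CAT(T, _list) *CAT(T, _list_push)(            \
        struct CAT(T, _list) *head, T value)                    \
    {                                                           \
        struct CAT(T, _list) *node = malloc(sizeof *node);      \
        if (node == NULL)                                       \
            return NULL;                                        \
        node->value = value;                                    \
        node->tail = head;                                      \
        return node;                                            \
    }

DEFINE_LIST(int)
DEFINE_LIST(float)
```

The trade-off is the same as with repeated inclusion: every contained type gets its own type and function names, here `int_list_push` and `float_list_push`.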


> Writing C++ classes often involves writing constructors, destructors, assignment- and move-operators, sometimes setter- and getter-methods… and so on. This is so normal in C++ that I only really recognized this as a problem when I noticed that I didn’t do this in C.

POD containers don't ever need manually written copy/move/assign constructors or destructors. In fact hardly anything except resource managing classes need them (eg if you are implementing shared_ptr or mutex yourself). If you are writing your own copy constructor without knowing why, your code is not only needlessly bloated but also most likely broken. Follow "rule of zero" instead.


Does anyone know more examples of C APIs where the user is responsible for allocating data and where all the functions only "borrow" pointers? Is getting rid of malloc/free worth the loss of information hiding?


Pretty much the vast majority of the API out there? Only relatively new ones like asprintf or getdelim do memory allocation for you.

I don't understand your remark about information hiding. How does the difference between returning an owned pointer and borrowing a pointer concern information hiding?


If your function operates with a pointer to a struct you don't need to put the contents of the struct in the header file. You can simply declare the struct with

    struct mystruct;

and then use `struct mystruct *` in the function prototypes. I think in C++ this is known as the PIMPL pattern.

I was also asking more about user-defined libraries than about the standard library...


Opaque types do not force you to do the allocation in the library.

my_lib.h:

  struct my_struct;

  size_t my_struct_size(void);

  #define MY_STRUCT_MAX_SIZE 512
  #define MY_STRUCT_VAR(name) \
    char name##_buf__[MY_STRUCT_MAX_SIZE]; \
    struct my_struct* name = name##_buf__

  void my_construct(struct my_struct*, int, float);

my_lib.c:

  struct my_struct {
    int i;
    float f;
  };

  size_t my_struct_size(void) {
    return sizeof(struct my_struct);
  }

  void my_construct(struct my_struct* ms, int i, float f) {
     ms->i = i;
     ms->f = f;
  }

user.c:

  // On the heap
  struct my_struct* ms = malloc(my_struct_size());
  my_construct(ms, 123, 456.0f);

  // On the stack
  MY_STRUCT_VAR(ms2);
  my_construct(ms2, 123, 456.0f);


Do you know of any libraries that do this?

When I tried it here GCC complained about `ms2 = ms2_buf__` being an assignment between incompatible pointer types. That sounds like an easy way to get into trouble and undefined behavior.


I just wanted to convey the idea so I didn't compile or check the code. But now that you bring up UB, I would change the MY_STRUCT_VAR macro as follows:

  #define MY_STRUCT_VAR(name)                          \
      _Alignof(max_align_t)                            \
       unsigned char name##_buf__[MY_STRUCT_MAX_SIZE]; \
      struct my_struct* name = (struct my_struct*)name##_buf__

I.e., changed to unsigned char and set the buffer to have the maximum (strictest) alignment. And added the cast.

It's a pretty widely-used technique and I have satisfied myself in the past that it's not UB (i.e., by checking the Standard). I am no language lawyer though, so don't take my word for it. The heap example, on the other hand, is pretty much identical to what an allocation inside your library would look like, so feel free to disregard the stack example if you'd like.

One example of something based on the same principle is the 'sockaddr' structure in the BSD sockets API. See https://en.wikipedia.org/wiki/Type_punning#Sockets_example.


  _Alignof(max_align_t)

should be

  _Alignas(max_align_t)

or

  _Alignas(_Alignof(max_align_t))


Take a look at SDL2. It does do some mallocs, but mostly avoids it and is overall a very well designed API and codebase.


Could you give some examples? I looked at a tutorial right now and it is full of SDL_CreateXXX / SDL_DestroyXXX functions. I couldn't find any instances of code that does `XXX foo; SDL_InitXXX(&foo)`, as is proposed in the article.


SDL_Create/SDL_Destroy basically malloc/free under the hood. (SDL has its own custom allocator, so depending on your platform and compile switches, it may use something other than the standard library malloc/free.)

SDL has some value struct types like SDL_Rect which APIs sometimes pass around through pointers. SDL_RenderCopy is one example.
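
The caller-allocates style with a public value struct can be sketched as follows. `rect_t`, `rect_init` and `rect_contains` are hypothetical names loosely modelled on the idea of a rectangle type; this is not SDL code.

```c
/* Hypothetical public value type: the layout is visible to callers,
 * so they can put it on the stack, in arrays, or inside other structs. */
typedef struct {
    int x, y, w, h;
} rect_t;

/* The caller owns the storage; the function only borrows the pointer. */
void rect_init(rect_t *r, int x, int y, int w, int h)
{
    r->x = x;
    r->y = y;
    r->w = w;
    r->h = h;
}

/* Borrowing a const pointer: no allocation, no ownership transfer. */
int rect_contains(const rect_t *r, int px, int py)
{
    return px >= r->x && px < r->x + r->w &&
           py >= r->y && py < r->y + r->h;
}
```

A caller writes `rect_t r; rect_init(&r, 10, 20, 30, 40);` and never touches malloc or free.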


In my above response, it was implicit, but not explicit, that because libraries may use a different memory allocator than your main code, well constructed libraries provide their own Create/Destroy pairs. So to be explicit, you should really never call free() on memory allocated within some library, and that library should provide its own free function. (Windows devs might also remember that crossing DLL boundaries exposes this same issue, so calling free in your program on memory allocated in the DLL is a potential hazard.)

I also wanted to state that "information hiding" is typical for C libraries, because it is often useful for maintaining ABI stability while allowing for future changes.

For example, SDL is very careful about keeping the ABI stable for the life of a series. They broke it between SDL 1.2 and SDL 2.0, but those releases were many years apart.

If they change a public struct that you use, like SDL_Rect, all the memory layouts in that struct may change. That means if your program is dynamically linked, and the SDL that gets used is newer than what you compiled against (and somebody mucked with SDL_Rect), your program is not going to work correctly.

If everything is behind an opaque pointer, then they can change the internals without breaking your binary. So as an example, a running joke is that Apple seems to change the way they do fullscreen every macOS release. This means your game won't work correctly on the next macOS. So SDL will need to add new code to support whatever the new thing is, while still preserving the old code for the existing macOS versions. This could mean a lot of internal changes to SDL's structs. But since they were opaque to your program, it doesn't affect the ABI. So you can just drop in the latest SDL dynamic library with your program (no recompile necessary), and things will work.
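
A compact sketch of the Create/Destroy plus opaque pointer combination, squeezed into one file for illustration. The `counter` type and its functions are hypothetical; in a real library the first part would be the public header and the second part the .c file.

```c
#include <stdlib.h>

/* --- what a hypothetical public header would expose --- */
struct counter;                            /* layout hidden from callers */
struct counter *counter_create(int start);
void counter_destroy(struct counter *c);   /* the only valid way to free */
int counter_next(struct counter *c);

/* --- what would live in the library's .c file --- */
struct counter {
    int value;
    /* fields can be added later without changing the API above */
};

struct counter *counter_create(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

void counter_destroy(struct counter *c)
{
    free(c);  /* matches whatever allocator counter_create used */
}

int counter_next(struct counter *c)
{
    return c->value++;
}
```

Because callers only ever hold a `struct counter *`, the library can change the struct's fields or its allocator without the calling binary noticing.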


If you're explicitly looking for that pattern, the OpenGL API has a lot of it. I'd like some other examples as well.


This could also be viewed as a comparison between two different styles of C++.


With C99 it's not that obvious. Compound literals are outside of the C++ standard. GCC, for example, supports them as an extension, but a bit differently [0].

  $ cat foobar.c
  #include <stdio.h>
  
  struct foo {
  	int x;
  	int y;
  };
  
  int foobar(struct foo *f)
  {
  	return f->x + f->y;
  }
  
  int
  main(int argc, char *argv[]) {
  	(void)argc; (void)argv;
  
  	printf("%d\n", foobar(&(struct foo){10, 20}));
  	return 0;
  }
  $ gcc -Wall -Wcast-align -Wextra -pedantic -std=c99 foobar.c -o foobar && ./foobar
  30
  $ g++ -Wall -Wcast-align -Wextra -pedantic foobar.c -o foobar && ./foobar
  foobar.c: In function ‘int main(int, char**)’:
  foobar.c:17: warning: ISO C++ forbids compound-literals
  foobar.c:17: warning: taking address of temporary
  30
  $ # taking address of temporary means that it's undefined behavior

[0] https://gcc.gnu.org/onlinedocs/gcc/Compound-Literals.html


The author stated he was only using the subset of C99 that could compile as C++.


How do you avoid the need for dynamically growing arrays? For instance, if you are doing any sort of IO, you often don’t know the size of the input beforehand. That either forces you to use malloc or do some strange low-level manipulations of the stack.


Something like this is all you'll need: https://gist.github.com/jstimpfle/562b2c3e9fe537e378351bb9d5...

Explanation: A little macro that adds the correct element size. Of course, you can opt to add it by hand as well, or put it in the run-time data structures and only set it once at initialization time. I personally think these approaches are error-prone / ugly, though.

There is also the stretchy-buffer approach from Sean Barrett, where you don't need to explicitly allocate bookkeeping data (capacity value in my example). That one puts the bookkeeping data and count in front of the actual data.

I prefer the explicit approach, though, and I MUCH prefer having an explicit count for growing vector-like things. Because I hate when I have parallel arrays and each one has its own count. Having a shared count variable for parallel arrays is much clearer.


BTW that code has a security vulnerability in it: the size computation can overflow and you can end up shrinking your buffer, and you will end up smashing your heap, and then bad things.
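
A sketch of the kind of check that closes this hole. It is a standalone hypothetical helper, `grow_array`, and not the code from the gist: the multiplication is validated before it happens, so a wrapped size can never reach realloc.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize ptr to hold count elements of elem_size
 * bytes each. Returns the new block, or NULL if the request is empty,
 * the byte size would overflow, or realloc fails. On NULL the old
 * block is still valid and still owned by the caller. */
void *grow_array(void *ptr, size_t count, size_t elem_size)
{
    if (count == 0 || elem_size == 0)
        return NULL;
    if (count > SIZE_MAX / elem_size)
        return NULL;  /* count * elem_size would wrap around */
    return realloc(ptr, count * elem_size);
}
```

Without the second check, a huge `count` could wrap to a small byte size, realloc would happily shrink the buffer, and later writes would land outside it.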


I know. I don't care. Partly because it's only an example and the requirements are not spec'ed out. Partly because you have potential overflows basically everywhere, if you disregard the context. Let's just say the caller is assumed to make sure there will be no overflow. (Or just add a no-overflow assertion - I don't care).


Or you read a fixed block at a time. This gets awkward if you need to stitch together the end of the previous block with the new one but it does have the advantage of being fast, especially if you fix the block size to the block size of the underlying filesystem (or a multiple).
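
A minimal sketch of block-at-a-time processing, including the "stitching" part: a word that straddles two blocks must be counted once, so a small piece of state is carried across reads. The function name `count_words` and the 4096-byte block size are assumptions for the example.

```c
#include <ctype.h>
#include <stdio.h>

/* Count whitespace-separated words in a stream while holding only one
 * fixed-size block in memory. Returns -1 on a read error. */
long count_words(FILE *in)
{
    char block[4096];  /* assumed block size; tune to the filesystem */
    long words = 0;
    int in_word = 0;   /* carried across blocks so that a word split at
                          a block boundary is counted only once */
    size_t n;

    while ((n = fread(block, 1, sizeof block, in)) > 0) {
        for (size_t i = 0; i < n; i++) {
            int space = isspace((unsigned char)block[i]);
            if (!space && !in_word)
                words++;
            in_word = !space;
        }
    }
    return ferror(in) ? -1 : words;
}
```

Memory use stays constant no matter how large the input is, which is the point of the approach.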


+1, maybe it is because I'm not programming the most complicated things in the world, but almost all of my data structures are fixed size arrays (or are stored in them). In my experience, dynamic memory management simply isn't needed unless you are doing something very very complicated.


Hehe, the lost art of not having everything in RAM at the same time. Async APIs are getting better and better all the time, but it seems like streaming APIs haven't gotten the same love, even though it should be easier than ever now as it's just an application of async primitives.


Yeah, use the heap: realloc() is your friend. But oftentimes with IO you just work in chunks, e.g. read up to X KB from a socket and then write it to a file.


The wonderful thing about C is that bad coding can't hide. The programs blow up early and loudly. In safer languages, the memory bombs go away through garbage collection / safe pointers / borrow checking / etc, but all of the uglier and quieter mistakes are still very much there.

Knuth and von Neumann took the same approach to PRNGs back in the early days...


> The programs blow up early and loudly.

Loudly, sure. Early? I wish that was the case. Heartbleed and other high-profile exploits rely on this sort of thing being able to pass unnoticed.


You got me, fair-and-square! Speaking holistically though, most of the programs I enjoy using on a daily basis were written in either C or C++. Safer languages should have a tremendous edge here, but the proof has not made its way into the pudding.

I just think C works as a kind of "high-pass" filter for programmers--and is net positive to overall software quality. Memory issues? YMMV


Well, the ones that blow up quietly are the ones that you must be really careful about. Imagine a slow memory leak that would only crash the system after a few months of continuous usage, I could throw up just thinking about it :-)


Why not C11 instead of C99? It even brings new, better safety features.


You can't learn C in an afternoon.


You can absolutely learn the syntax of C in an afternoon, or a few days at worst, if you're not having to learn all the other concepts interesting C programs require. Much of the immense learning curve comes from the things that are "outside C", like how to synthesize high level concepts from the lower level primitives in the core language, and the system specific gotchas the language doesn't shield you from.


> You can absolutely learn the syntax of C in an afternoon, or a few days at worst

That depends on what you mean by "learn".

Learn it well enough to make something that works for you, on your architecture, with your toolchain? Sure.

Learn it well enough to know which combinations of the syntax you (thought you fully) learned trigger undefined and wildly variable behavior in the wild or when run with other libraries/build environments? Hell no. And bugs caused by that stuff are distressingly common.
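
One concrete instance of innocent-looking syntax with undefined behavior is plain `a + b` on signed ints when the result does not fit. A hedged sketch of the usual defence, with a made-up helper name `add_checked`:

```c
#include <limits.h>

/* Adds a and b into *out. Returns 0 on success, -1 if the mathematical
 * result does not fit in an int. The check runs before the addition,
 * because testing the result afterwards (for example `a + b < a`)
 * would already have executed the overflowing operation. */
int add_checked(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return -1;
    *out = a + b;
    return 0;
}
```

Nothing in the syntax of `+` hints that this is needed, which is the kind of thing an afternoon of learning does not cover.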

That's not a "system specific gotcha". It's a huge area of skill-requiring behavior that arises directly within the core syntax of the language.

Saying "you can learn C in a weekend" is like saying "you can learn to drive a car in a weekend" in reference to learning how to maneuver a car alone in a big empty parking lot. Technically true, but it'll leave you woefully unprepared for operating that car the way almost everyone wants to: around other cars, in more complicated contexts, subject to many more unpredictable factors (other people).


"Learn C in a weekend" (or afternoon) is akin to getting accustomed to a new model of car, _with prior driving experience_.

Each car has its own quirks, each programming language has its quirks. Learning C from the background of already knowing how to program is like driving a new Ford Focus off the lot and getting used to it.

Learning C++, on the other hand... is expecting you to pilot a jumbo jet after your only experience was the Ford Focus.


> "Learn C in a weekend" (or afternoon) is akin to getting accustomed to a new model of car, _with prior driving experience_.

...in an empty parking lot, sure. C brings a whole host of old issues to the table that you simply do not get to experience in other languages.

C is deceptively easy to think you know, but I have yet to see evidence of someone actually using it responsibly.


A big problem of learning any programming language "in an afternoon" is dodging all the cargo-cult bad advice on the internet. If you know what's bad advice it's easy, but when you start from 0, any tutorial looks equally legit and you might end up a week in a rabbit hole that is more destructive than productive.

An experienced programmer knows when to stop googling, when to use classes from the standard library and when to roll their own algorithm. A beginner does not, and can easily start writing their own timezone-handling class or spend a day finding and configuring a package on npm that they could have written themselves in 3 lines.

So in hindsight it feels like you should have been able to learn all this in an afternoon, but the road there is actually tougher than you remember.


> You can absolutely learn the syntax of C in an afternoon, or a few days at worst, if you're not having to learn all the other concepts interesting C programs require.

It is the same with any reasonably serious language. I mean you can learn Python syntax in an afternoon but it will be 6 months before you can use NumPy (and friends) without spending more time poring through the documentation than actual coding. Same with Java, C++, .NET, to be productive you must internalize a vast ecosystem and if you aren't doing it every (working) day then skill fade is very real.


You don't really know a language until you've programmed with it for a while. You are not going to write many significant programs in 24 hours.


To use an old meme: "One does not simply learn C". Instead, one learns:

1. Everyday C syntax. It's not strictly required to learn every esoteric and historical thing about the language; if you go down that rabbit hole, you'll be down there for years.

2. How to properly structure and organize code and header files. It takes a little while to understand exactly what goes into a header file (and why), how to anticipate and avoid problems like cyclical include dependencies, etc.

3. Data structures and algorithms. While not a feature of the language itself, most projects are going to need non-trivial data structures (one example: hash map/table or alternative). Surprisingly, there are very few freely available implementations one can just import and use. By and large, every developer custom builds their own implementations. It should not be underestimated how much of a time sink this truly is. This aspect is trivialized by veterans, who have spent a decade or longer accumulating and tuning their own personal collection of reusable/tweakable snippets.

4. Compilers, linkers, and build tools. Thankfully the standard compiler options are good enough for nearly everyone, because the options are so numerous and complex that I'd bet even the developers of the compilers themselves get lost in that forest.

5. Finally, you're not doing much with C itself; rather, you're interfacing with everything your operating system provides. Most of your time is spent learning every aspect of POSIX (or equivalent) for your target architecture. This is quite straightforward (just time consuming) if you're targeting one architecture - say, Linux. The moment you want to support anything even remotely cross-platform (Linux, *BSD, macOS, Windows), you're essentially opening Pandora's box.

Note that none of the above is really a criticism of C. There is simply more to using C than its syntax - and those other moving parts are where nearly all of one's time will be spent.


>> So one thing seems to be clear: yes, it’s possible to write a non-trivial amount of C code that does something useful without going mad

As an embedded engineer who has written a lot of low-level code, all I had was C and assembly. I know that the higher the level of abstraction you go to, the larger the number of available languages gets.

I am rather bemused that he had this impression about C. I wonder how many other tech folks have the same impression??


> I wonder how many other tech folks have the same impression??

I do! I've worked with C, C++, Python, Java, Go, JavaScript, Ruby, and Pascal, but for anything that isn't network-oriented or a simple CLI tool, I always reach for C (sometimes a strict subset of C++). Its distinct lack of features allows me to focus more on the problem at hand and less on the best of the hundred different ways to do something in other languages. Go is the same way, which is why I use it extensively for mid-level stuff like CLI tools and servers.


I've used C and Go quite a bit, and I find them a horror to write.

I end up duplicating tons of code - the exact same code copied hundreds of times across the codebase, each time with slightly different changes made to it.

How do you do custom data structures that need to parametrize over multiple types without duplication in C or Go? How do you do sane error handling without 90% of your code being if (...) { // handle error } or if err != nil { return err } ?

I mean, just read https://git.kuschku.de/justJanne/statsbot-frontend/blob/mast... as example. (It's a quickly hacked-together project I wrote in about 10h)


That code can be simplified a lot if the db object stores the error itself so it can be checked only once after a few calls, like what bufio.Scanner does.
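
The same "sticky error" idea also works for the `if (...) { // handle error }` case in C. A rough sketch, with a hypothetical `writer_t` wrapper around a standard `FILE *`:

```c
#include <stdio.h>

/* Hypothetical writer that remembers its first failure, so callers can
 * issue several writes and check the error once at the end. */
typedef struct {
    FILE *out;
    int err;  /* 0 while every write so far has succeeded */
} writer_t;

void writer_puts(writer_t *w, const char *s)
{
    if (w->err)
        return;  /* an earlier write failed: skip further work */
    if (fputs(s, w->out) == EOF)
        w->err = 1;
}
```

A caller does `writer_t w = { stdout, 0 };`, makes as many `writer_puts(&w, ...)` calls as needed, and tests `w.err` a single time afterwards.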


That database object is from https://golang.org/pkg/database/sql/ which is part of the standard library of go.


Linux kernel developers would be bemused, too.


"It takes a lifetime to master C++"


Yeah when I see stuff like this I want to just go be a barista or something: https://stackoverflow.com/questions/13230480/what-does-auto-...


Why do some devs spell JavaScript and TypeScript with the incorrect casing? Not worth their time? Toy languages that aren't to be respected? Not like the rest of the post was riddled with typos to justify it.


But what did you think of the article?


For a small, one person project you can use absolutely any language and feel great about yourself and the language. It's like a stroll in a park - you can do it in any shoes, or even barefoot.

Where language matters is in projects with thousands or millions of lines of code, with multiple teams and people of varying experience and abilities, coming and leaving. With stupidly short deadlines and changing business requirements. With liabilities, support, maintenance, budget, time-to-market, etc.

Then it's like climbing Mt. Everest, in a snowstorm, with limited supplies and low team morale.

This whole article is just "mom look, I can walk". I come to HN to hear an "I've survived hell and here is how my shoes did" story.



