Memory management in C programs (2014) (nethack4.org)
216 points by adamnemecek on July 18, 2016 | 103 comments


I'm going to claim that the problem being solved here fundamentally comes from attempting to shoehorn early exit into an ancient codebase that wasn't designed for it, via longjmp. Exceptions can be a handy language feature, but plenty of modern code written in languages like C++ (whose exception support is often eschewed) and Go gets by without them. It just has to be written so that most functions manually propagate error codes they receive, potentially after manually unwinding some core state. C code can be written that way too: the manual cleanup that must be written then extends to mundane buffer freeing, but that's a tractable and local requirement; no need for complicated global reasoning around static buffers and custom allocation schemes.
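
For concreteness, here's a minimal sketch of that style (the names are invented): each function returns a status code, does its own local cleanup on failure, and the caller just propagates the code.

    #include <stdio.h>
    #include <stdlib.h>

    enum status { OK = 0, ERR_NOMEM = 1, ERR_IO = 2 };

    /* Each function returns a status; on failure it frees whatever it
       allocated locally, and the caller propagates the code upward. */
    static enum status load_file(const char *path, char **out) {
        FILE *f = fopen(path, "rb");
        if (!f)
            return ERR_IO;
        char *buf = (char *)malloc(4096);
        if (!buf) {
            fclose(f);                /* local, manual unwinding */
            return ERR_NOMEM;
        }
        size_t n = fread(buf, 1, 4095, f);
        buf[n] = '\0';
        fclose(f);
        *out = buf;
        return OK;
    }

    static enum status start_up(const char *path) {
        char *cfg = NULL;
        enum status st = load_file(path, &cfg);
        if (st != OK)
            return st;                /* propagate the code we received */
        /* ... use cfg ... */
        free(cfg);
        return OK;
    }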


> plenty of modern code written in languages like C++ (whose exception support is often eschewed) and Go gets by without [exceptions].

I would not consider C++ code that eschews exceptions "modern". To eschew exceptions almost always entails giving up RAII, and if you give up RAII you've done yourself a huge disservice: you're now stuck manually managing memory and resources, and suffering all the bugs that come with that. While RAII does not outright prevent these bugs (which is one of the reasons I'm a huge fan of Rust), coding along its principles greatly reduces the risk that you will run into problems.

If you're wondering: giving up exceptions means a constructor has no way to signal failure, as the result of a ctor in C++ is always either an exception or a constructed object. Most "no exceptions" C++ code I've seen opts to construct what I call "zombie objects": internally, the object tracks that an error has occurred, and every use of the object must first be checked against that internal error flag to ensure the object isn't a zombie (or, if it is, to report an error). The object is effectively a null pointer, and has all the trouble that entails.
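
To illustrate what I mean by a zombie object, a hedged sketch (illustrative names, not from any real codebase):

    #include <cstdio>

    class File {
    public:
        explicit File(const char *path) : f_(std::fopen(path, "rb")) {}
        ~File() { if (f_) std::fclose(f_); }
        File(const File &) = delete;
        File &operator=(const File &) = delete;

        bool ok() const { return f_ != nullptr; }   // the "am I a zombie?" flag
        long size() {
            if (!f_) return -1;                     // every method must re-check
            std::fseek(f_, 0, SEEK_END);
            return std::ftell(f_);
        }
    private:
        std::FILE *f_ = nullptr;
    };

Every caller has to remember to check ok() before trusting the object, which is exactly the null-pointer problem in a different coat.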

(I also won't pretend that exceptions are without problems; in particular, the argument that you can't tell, at a particular point in a function's code, whether an exception can occur, or where it would be caught, is completely valid, but it applies equally to many popular languages, such as Python, C#, and in some ways, Java. However, I think the advantages of RAII — and exceptions — outweigh their disadvantages.)

> It just has to be written so that most functions manually propagate error codes they receive

When I've had to write C, I've found this to be exceptionally tedious to do correctly, and all too easy to ignore.


RAII (or rather the only useful part of the idea: that the destructor is called when an object goes out of scope) without exceptions is totally fine if you're not too dogmatic about initialising everything (especially not things that can go wrong) in the constructor but instead have one or more separate init methods with a return value. The only important part is to cleanup everything in the destructor. This is all IMHO of course, but I've been writing C++ code for 20 years just fine without ever thinking 'gee it would be nice to use exceptions here'.
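
Roughly this shape, as a sketch (hypothetical subsystem):

    #include <cstdlib>

    class Mixer {
    public:
        Mixer() = default;                 // cannot fail, does no real work
        bool init(int channels) {          // the part that can go wrong
            buf_ = static_cast<float *>(std::calloc(channels, sizeof(float)));
            return buf_ != nullptr;
        }
        ~Mixer() { std::free(buf_); }      // cleans up whatever init() managed to do
    private:
        float *buf_ = nullptr;             // free(nullptr) is a no-op
    };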


Strongly agree. RAII without exceptions is fine. Exceptions are deeply flawed for code understandability, and anything you get by doing work that can fail in the ctor is not worth it.


Being able to construct without initialising is the flipside of being able to descope without deinitialising, and introduces much the same problems. It makes it much harder to use immutability (admittedly something that's hard in C++ in any case). It makes sharing objects between threads much harder to reason about, as you need to make sure that a) init only happens once, but more importantly b) its effects are visible to any other thread before it uses the object.
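
For the "exactly once, and visible to other threads" part, std::call_once is one standard tool that gives both guarantees; a small sketch (invented names):

    #include <mutex>

    struct Input {
        std::once_flag once;
        void do_init() { /* open devices, etc. */ }
        void ensure_init() {
            // Runs do_init() exactly once; its effects are visible to every
            // thread that returns from call_once.
            std::call_once(once, [this] { do_init(); });
        }
    };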

How much code have you written with exceptions? A lot of features you don't miss if you don't use them, but once you're used to using them it's very frustrating to live without them. Back when I wrote C I never thought "gee it would be nice to use an anonymous function here", but now that I'm used to them it's very frustrating to not have them.


Admittedly I haven't written much code with exceptions, only when using C++ frameworks that heavily rely on them, but this was always for tools where productivity was more important than performance, my actual background is game development.

I have more recently stumbled over another case where the constructor/destructor paradigm in C++ is not ideal: in some cases it 'encourages' constructing objects on the heap, just because allocation and initialisation are the same operation. I haven't thought this completely through yet, but bear with me:

Basically I'm trying to write code in a way that minimizes heap allocations. In the extreme case I have just one big 'application object', and all other objects are embedded in this application object. For a variable number of objects, pools/arrays with the maximum capacity are embedded. If taken to the extreme, this model does only a single allocation at application startup (or it could even live on the stack).

However, there are systems which need to be initialized later, and in a specific order (for instance rendering, audio, input, etc...). The objects for these systems have already been created and had their constructor run, but they are not ready for use. That's where I need separate initialisation steps, unless I want to put those objects back on the heap. And in reverse, I need to teardown graphics, audio, input before the big application object is destroyed (and the destructors of the embedded objects are called).

Now I don't think that this extreme case of trying to minimize dynamic allocations down to 1 is particularly useful in real world code, it's more of a thought experiment, but in this case the good ol' C way of separating allocation/deallocation from initialization/teardown is actually more useful.
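
To make the shape of this concrete, a rough sketch of what I mean (invented names, trivial bodies):

    struct Renderer { bool init() { return true; } void shutdown() {} };
    struct Audio    { bool init() { return true; } void shutdown() {} };

    struct App {
        Renderer renderer;     // embedded: constructed together with App,
        Audio audio;           // but not usable until init() runs
        bool init() { return renderer.init() && audio.init(); }
        void shutdown() { audio.shutdown(); renderer.shutdown(); }  // reverse order
    };

    // One allocation at startup (or none at all, if App lives on the stack):
    //   static App app;
    //   app.init();  /* ... run ... */  app.shutdown();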

Unfortunately I have stumbled over C++ frameworks which basically require you to create all objects on the heap, which in my opinion is a design fault. A framework should not dictate to its users whether objects are created on the stack, on the heap, or are embedded in other objects.


I agree that developers and code can have a lot of knowledge that lets them make allocations a lot more efficiently than the likes of glibc's malloc() (which have to handle a lot of edge cases and a lot of different workloads). As an extreme example, if you can architect around cooperating short-lived processes (Erlang-style) you can use a simple bump allocator and "dynamic" allocation is effectively (almost) free.

I wouldn't want to be doing anything custom around allocation without the support of a type system though. Having allocated-but-not-fully-initialized objects hanging around and looking exactly like fully-initialized objects is a recipe for disaster, and while it may be possible to keep track of things yourself through constant vigilance, the cognitive load just isn't worth it IMO. So if I had to do that kind of thing in C++ I'd probably look at having a custom allocator and some way to pass hints to it, rather than doing allocation and initialization separately in "normal" code. (Well, in reality I wouldn't use C++)

> Unfortunately I have stumbled over C++ frameworks which basically require to create all objects on the heap, which in my opinion is a design fault. A framework should not dictate to its users whether objects are created on the stack, on the heap, or are embedded in other objects.

Depends on the use case IMO. The flexibility of being able to control all your objects is valuable, but it doesn't come for free. One can't really generalize about what's idiomatic C++, because there's such a broad range of users.


You can perfectly well have classes which have different levels of initialization and clean them all up in the destructor...

RAII is generally useful without exceptions.


I've not seen a large production C++ code base that did anything with exceptions other than to catch them, record some debug information and restart.

This is especially true when third party code is involved. C++ is hard enough to get right without exceptions; dealing with failure in bodies of code that you didn't write and maybe don't have control over (or even source code of) is death on wheels.

If that means the code I'm working on isn't modern . . . I think that's fine. But the first volume (of three?) of Herb Sutter's Exceptional C++ should be enough to convince anyone that the time spent dealing with exceptions "correctly" would be better spent eschewing exceptions and restructuring your code so you can do worthwhile hard things.


If you have a server that should process 100000 requests and processing one of them happens to throw an exception, it's very easy to just wrap request->process() with try/catch and continue processing the rest of the requests.
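
Something like this, roughly (Request here is a stand-in type, not a real API):

    #include <cstdio>
    #include <stdexcept>
    #include <vector>

    struct Request {
        int id;
        void process() const {
            if (id % 2 == 0)                      // pretend even ids fail
                throw std::runtime_error("bad request");
        }
    };

    void serve(const std::vector<Request> &requests) {
        for (const Request &req : requests) {
            try {
                req.process();
            } catch (const std::exception &e) {   // drop this one request,
                std::fprintf(stderr, "request %d failed: %s\n", req.id, e.what());
            }                                     // keep serving the rest
        }
    }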

You can also use it to recover parts of the processing with sane defaults. String parse integer exception? Use constant 0 instead if that makes sense, etc etc.

I don't see why people make exceptions in C++ into a big deal and a big mystery; in other languages people just use them. The only caveat in C++ is throwing from destructors, but that's easy to remember. In one place I worked they had meetings for over a week to discuss whether it was OK to throw exceptions in a constructor T_T.


A colleague of mine is working on a C++ Qt application which can encounter failures in many different ways. It is not feasible to restart the application; the user must get a proper error message instead. He didn't use exceptions because Qt doesn't use them (it returns "empty" objects instead) and... well, error-handling without exceptions is brittle.

In hindsight, he should have coded the logic/business part in "modern" C++ with exceptions and implemented thin adapters to the GUI part. That's what I did in another piece of code, and am extremely happy with the outcome.

It's not even hard to get them right, "the lightbulb" lit up for me after reading the chapter on exceptions in Stroustrup's TC++PL.


They make it easier to diagnose crashes by providing more context than just an error code without extra effort (like a call stack). Fail fast is another option but that takes control away from the caller.


> C++ is hard enough to get right without exceptions

In what manner? If you use exceptions, you must (IMO) use RAII, or yes, it will be painful. Every example of "painful C++" I've seen involved someone ignoring that, and that is fighting the language. I understand that's a bit of a strawman argument, but you didn't elaborate enough in your post to debate it.

If you are using RAII, exceptions shouldn't be terribly hard to "get right"; an exception propagating up the stack will free resources as it goes. (This is the same behavior, again, as numerous other languages; except in C++, I have the benefit of RAII, in others, I generally do not.)

> dealing with failure in bodies of code that you didn't write and maybe don't have control over (or even source code of) is death on wheels.

It's peculiar that I only ever hear this argument against C++, when Python, C#, and to an extent Java¹ work in the same manner. Were you writing a C++ webserver, for example, I expect that you would do exactly what my current job's Python webserver does: if an uncaught exception makes it to the main request handler, it is caught in a catch-all, duly logged, and a 500 served in response. This is appropriate both in Python and C++ IMO. (Whereas in the hypothetical C++ side, I gain the benefits of static typing and RAII that I can never have in Python.)

In my day-to-day Python, we quite often learn about uncaught exceptions — from third-party code that I didn't write and maybe don't have control over — in production; sometimes, we will add a handler for that particular exception where appropriate (it was a bug) or fix the underlying problem (the exception indicates a larger problem, and allowing it to propagate to a good catch-all point was appropriate, as was 500'ing the request).

But again, "death on wheels" is not really something I can provide worthwhile argument against. Perhaps where third-party code has caused pain (at least, for me) is when a very low level (say, socket error) propagates through the stack; there might not be a good way to catch these beyond "catch everything", but again, that's not unique to C++. The biggest problem I've had with these is that they lack context: what caused this socket error? Or, if some library catches & re-throws its own exception class, too often do they discard the inner error, and I lack the root cause (a different lack of context). But you see this in Python as well. Error codes are perhaps the worst, as they will quite often force you to lose context; do you expose error codes for all of your inner error conditions? (And theirs, transitively?). (I've never seen a non-exception C++ or a C codebase use anything aside from error codes as a substitute.)


>> C++ is hard enough to get right without exceptions
> In what manner?

I guess he's referring to noexcept/basic/strong exception guarantees. Yes, writing code with strong guarantees is hard, but what makes it hard is that you have to write your code "transactionally", i.e., either 1) prepare changes and "commit" them in one go, or 2) roll back changes if an exception occurs.

But writing code with "strong guarantee" is just as hard, if not harder, with error codes.


It's not harder. You know exactly where that function can return; with exceptions it's very hard to know what might throw in practice.


With exceptions you can use something like scoped exit (http://www.boost.org/doc/libs/1_61_0/libs/scope_exit/doc/htm...) for automatic rollback and neat cleanup even in multilevel "transactions". With error codes, nothing happens automatically, hence it's easy to mess up cleanup/rollback.

Yes, you can still use RAII for rollback/cleanup in combination with error codes, but then you get the worst of both worlds: the (minor) complication of writing RAII classes AND code cluttered with manual error-checking.
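
If you don't want to pull in Boost, a hand-rolled guard in the same spirit is only a few lines (this is a sketch, not Boost's actual interface):

    #include <utility>

    template <typename F>
    class ScopeGuard {
    public:
        explicit ScopeGuard(F f) : f_(std::move(f)) {}
        ~ScopeGuard() { if (armed_) f_(); }          // rollback on any exit path
        void dismiss() { armed_ = false; }           // commit: skip the rollback
    private:
        F f_;
        bool armed_ = true;
    };

    // Usage sketch (insert_row/delete_row are hypothetical):
    //   insert_row(db);
    //   ScopeGuard undo([&] { delete_row(db); });
    //   do_more_work_that_may_throw();
    //   undo.dismiss();                             // everything succeeded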


I've seen exception classes split into recoverable and non-recoverable. The former clean up as they propagate and move the application to an initial known state in a state machine.

This works fine. I think the worst bug was an exception thrown from the initial state, which led to an infinite loop.


I don't think it's a secret that Google's C++ code base (very large and growing) does not use exceptions. And since starting there I have not missed them. Exceptions make execution flow hard to reason about. And RAII is most certainly still a thing without exceptions. There's lots of other ways to unwind the stack.


Google is an example that writing production code that doesn't use exceptions is possible, not that this is the right thing to do.

They said that their decision was primarily legacy-motivated (in short, they feared that introducing exceptions into a non-RAII codebase would lead to a long period of instability).

I'll add that in some situations (e.g. when freeing resources) the control flow gets very convoluted, and you don't want to deal with it directly. See: Andrei Alexandrescu - Declarative Control Flow ( https://www.youtube.com/watch?v=WjTrfoiB0MQ )


>> It just has to be written so that most functions manually propagate error codes they receive

> When I've had to write C, I've found this to be exceptionally tedious to do correctly, and all too easy to ignore.

Indeed. I write Go all day and I type "if err != nil { return nil, err }" so often I could scream.


They should probably add a keyword for that phrase.


Eh? RAII works perfectly fine without exceptions. Either:
- You write a constructor that cannot fail for RAII types that cannot fail, or
- return some kind of option type from your BuildRAII() function, and handle the failure case however you want.


> To eschew exceptions almost always entails giving up RAII

How so? RAII works just fine when you return errors.


How about in constructors?


If a constructor can fail, don't make that constructor public. Instead, add a trivial constructor to allocate an empty instance of the object, then make your possibly-failing constructor a factory function that takes whatever arguments you would pass to the constructor and returns an error code. The object being "constructed" should then be passed by reference as one of the arguments to the factory function and filled in. (If you're writing this in Rust, you have better options, since you can explicitly take ownership of some of the factory function arguments and avoid the need to allocate a trivial instance of the object by returning it as one of the variants of the sum type Result<T, E>.)
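
In other words, the shape I have in mind is something like this sketch (illustrative names):

    #include <string>

    class Config {
    public:
        Config() = default;          // trivial constructor: cannot fail
        bool loaded = false;
        std::string path;
    };

    // Factory: returns 0 on success, a nonzero error code on failure,
    // and fills in the trivially-constructed object passed by reference.
    int load_config(const std::string &path, Config &out) {
        if (path.empty())
            return 1;
        out.path = path;
        out.loaded = true;
        return 0;
    }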

Note: I don't actually know C++ very well. It's possible there's a more idiomatic way to accomplish this. In particular, another wrapper class could be created that owns an instance of the trivially-constructed instance mentioned above. The factory function could then take an instance of the wrapper class and then modify and return an instance of the class you actually want to obtain.


Constructors are resource allocation, so they can always fail.


You can't really tell if the stack overflowed in a constructor though, and that will essentially /never/ happen, so I think it's okay to exclude stack overflow as a failure condition when creating stack-allocated objects. In that case, no, a lot of constructors won't be able to fail.


Any function can overflow the stack. If the object is constructed on the heap, 'new' can fail but will always throw an exception; if you aren't using exceptions it will just crash the process.


Yeah. That's why I don't say it's a failure condition for a stack-allocated object to overflow the stack. Notice my parent comment said:

> Constructors are resource allocation, so they can always fail.

Which is false.


Can you explain why this is false? If I run out of stack or heap by exceeding the process limit, the constructor must fail. Is this incorrect?

I see your comment above that stack overflows essentially never happen, but having watched it happen, I disagree.


This is not an error that you can reasonably recover from. If you are out of stack, even constructing an exception to throw is likely to fail.

Because the only correct behavior here is to terminate the program immediately, it doesn't matter in the discussion of whether to use exceptions or not.


new can and does fail when you're out of memory, and this is a condition that many programs do need to handle (for some programs crashing is appropriate and in that case sure, you can live without exceptions). In a pedantic technical sense that's not the constructor failing, but your options for handling it are much the same.


std::optional is now confirmed for C++17, which can make the API a bit nicer. Using factory functions as pseudo-constructors also lets you name them, which is often nicer than having to rely on argument types to disambiguate constructors with different purposes.
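
A sketch of what that can look like (C++17, invented names):

    #include <optional>
    #include <string>

    class Socket {
    public:
        // Named factory: failure is an empty optional, no zombie state.
        static std::optional<Socket> connect_to(const std::string &host) {
            if (host.empty())
                return std::nullopt;
            return Socket(host);
        }
    private:
        explicit Socket(std::string host) : host_(std::move(host)) {}
        std::string host_;
    };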


Init methods are the answer to that (but they potentially force some additional complexity around initialized/not-initialized state into the other methods).


As I stated in my original post, these have the same classes of problems as null pointers: you have "constructed" an object whose internal state is invalid; an instance of Foo is either a Foo that init'd successfully or not-a-Foo (one that failed to init or isn't initialized; a "zombie object"), just as a pointer is either a pointer-to-something or a pointer-to-nothing. This is why you say "it potentially forces some additional complexity around initialized/not-initialized state into the other methods".

At least in C++ (and many other OO languages), I find the downsides of having these Jekyll and Hyde classes outweigh any benefits that may come of ignoring exceptions.


In fact, I think managing this extra state, along with dealing with the complications that can arise from object slicing, were the major reasons Martin Sústrik gave when he argued that he should have originally started ZeroMQ in C and not in C++.

< http://250bpm.com/blog:4 >


How does RAII deal with reference loops? It is trivial to create a loop of owned pointers whose reference counts never get decremented, and the pattern isn't so unusual that it can be dismissed as a non-concern.

Just curious; I used a combination of strong/weak reference counting when I dealt with this. It wasn't fast, but it did appear to be less prone to implicit leaking.


One method is using weak_ptr.
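
For example (sketch): make the back-reference weak so the cycle doesn't keep both objects alive.

    #include <memory>

    struct Child;
    struct Parent {
        std::shared_ptr<Child> child;    // owning edge
    };
    struct Child {
        std::weak_ptr<Parent> parent;    // non-owning back edge breaks the cycle
    };

    // Both objects are freed when 'p' goes out of scope, because the
    // Child -> Parent edge doesn't contribute to Parent's refcount:
    //   auto p = std::make_shared<Parent>();
    //   p->child = std::make_shared<Child>();
    //   p->child->parent = p;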


I wrote a C helper library that does poor-man's exceptions using setjmp/longjmp https://github.com/adrianratnapala/elm0.

Partly because of the malloc/free issue, you don't just "throw and forget" like with real exceptions. Instead you jump a short way up the stack, using it as an extra mechanism for error handling inside a module -- living side by side with error returns.

And I found myself using the mechanism quite a lot. To the point where writing a new error-returns based function gave me that itch where I felt it was a mistake and I would end up refactoring to the TRY mechanism anyway.
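
For anyone who hasn't seen the idiom, the core of it looks roughly like this (a generic sketch, not elm0's actual API):

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf err_jmp;                 /* one jump target per "module" */

    static void parse_line(const char *s) {
        if (!s)
            longjmp(err_jmp, 1);            /* "throw": jump a short way up */
        printf("parsed: %s\n", s);
    }

    static int parse_file(const char *lines[], int n) {
        if (setjmp(err_jmp) != 0)           /* "catch" */
            return -1;
        for (int i = 0; i < n; i++)
            parse_line(lines[i]);
        return 0;
    }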


Yeah, I'm surprised they mentioned using longjmp for "exceptions". I think that just shows the age of the nethack codebase - maybe when it was being written people were trying to find ways of incorporating those error handling ideas into C. I doubt any modern C programmers would seriously consider using it over just returning an error code.


It's a common idiom, specifically in modern C code. See "C Interfaces and Implementations", one of the best books on the subject of writing reusable C. The author dedicates an entire chapter to exception handling, with longjmp as the primary facility for it.


Error codes and longjmp are complementary methods of exception handling. Error codes can't handle every circumstance (well, at least not without undue pain). For example, error codes don't deal with nonblocking i/o or coroutines very well.


Completely agree, and it's surprising to see this in Nethack because game programmers almost unanimously shun exceptions.

In games, you have a small number of functions that can tolerate failure, probably relating to IO. Everything else instantly explodes.


Games that don't use exceptions tend to do so because of performance concerns. Nethack is a turn-based game, so performance isn't as relevant here.


Bear in mind this is Nethack4 which is a fork of Nethack.


Go still has (slow, dynamic) unwinding for the express purpose of resource management. While I can't advocate for using it directly, it is enough of a concern to include in the core language.


Hmm, for C programs I use https://talloc.samba.org/talloc/doc/html/index.html or the more cut-down ccan/tal http://ccodearchive.net/info/tal.html

There's sometimes a great insight available in viewing your allocation hierarchy on a running program, too.


Happened to go look up the Nethack.org page, and hey, the author just "ascended" to the DevTeam earlier this month:

http://www.nethack.org/#News

"""

The DevTeam would like to welcome its latest members. Both of these folks should be familiar to the members of the NetHack community:

Alex Smith, who created the AceHack and NetHack 4 variants. Alex is an expert on the inner workings of the game and the ways in which they can be exploited.

Patric Mueller, who is probably best known as the creator of the UnNetHack variant. Patric also created NetHack-De (German translation of NetHack) and has considerable involvement in the Junethack tournament. Before contracting the dreaded coding bug, Patric was a long time player, having started out on NetHack 3.0 on the Amiga back in the late 80s.

Alex has indicated that, at least initially, he'd be concentrating on improved interface, improved internals, and new features in areas that don't affect the game's gameplay, drawing on some of the features introduced in AceHack and NetHack 4.

Patric's initial focus will be to identify and incorporate changes from a variety of variants into NetHack, in order to expand the gameplay while keeping NetHack's spirit and appeal intact.

Please join us in welcoming Patric and Alex to the DevTeam as we start plans for the next major release.

For the DevTeam, Mike Stephenson

"""


In my opinion, writing code that consists of malloc/free pairs is inviting all the problems RAII and garbage collectors were designed to solve and which Rust can now statically check for correctness. So that's basically it: at this point, I don't use the heap to manage memory in C, because if I did I'd instead use either Rust or, more likely, a garbage collected language.

The fact is, despite going out of their way to use the stack as much as possible in this article (including copying into the stack), it's sort of a wonder they didn't realise a second, manually managed stack would be easy to manage and be orthogonal to the program stack. Yes, it has a top, but so does the program stack, and so does the memory backing the heap (yes, yes, pages, et cetera). It doesn't get you all the way there, but for many small programs, which might otherwise do a lot of mallocing and freeing as they dutifully initialize and destroy objects, you can get away with just one call to malloc, and never pop the second stack. Allocate a few stacks in your stack, throw them in a free list, and now you're cooking with gas.
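
A second stack really is only a handful of lines; here's a sketch (one malloc up front, then push and pop-to-mark):

    #include <stdlib.h>

    typedef struct {
        unsigned char *base;
        size_t cap;
        size_t top;                          /* everything below 'top' is live */
    } Arena;

    int arena_init(Arena *a, size_t cap) {
        a->base = (unsigned char *)malloc(cap);
        a->cap = cap;
        a->top = 0;
        return a->base != NULL;
    }

    void *arena_push(Arena *a, size_t n) {
        if (a->cap - a->top < n)
            return NULL;                     /* out of arena space */
        void *p = a->base + a->top;
        a->top += n;
        return p;
    }

    /* Free everything allocated after 'mark' in one O(1) step. */
    void arena_pop_to(Arena *a, size_t mark) { a->top = mark; }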


Yes, because pretending it does not exist and not understanding how things worked prior to Rust is totally realistic.


For all the good things Rust brings, one should not forget that Rust isn't the first language to have RAII or regions, or to be memory safe for systems programming by default.


With C, at least for me, it all starts with memory management and data. Specific use of it dictates what I'm going to do (mostly image manipulation and processing). I tend to allocate one or more big malloc buffers and manage them with TLSF. I've also looked into halloc for hierarchical allocations.


If you remember NASA's rules for programming, and MISRA's rules, then dynamic allocation is out of the question. Certainly on most of the embedded devices I've worked on, all significant memory is statically allocated and the rest are stack variables, ideally with proofs of maximum stack depth.


This page is about an old game, though.


    case OPTTYPE_KEYMAP:
        str = malloc(1 + sizeof "submenu");
        strcpy(str, "submenu");
        return str;
There isn't a good reason for this not to be

    return strdup("submenu");
is there?


> 1 +

That's unnecessary: sizeof "submenu" already includes the terminating NUL. But then again I can't really blame them; it isn't always clear when you have to do that. Also, a previous version might have used strlen() instead of sizeof.


malloc+strcpy is standard C, strdup is not.


While I agree, strdup is POSIX. In the exceedingly rare event that you're on a system without it, it's simple enough to roll your own. (And if this occurred more than a couple of times, I'd roll my own anyways to keep the code obvious.)
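
The roll-your-own version is just a few lines; a sketch (with a different name, since identifiers starting with "str" are reserved):

    #include <stdlib.h>
    #include <string.h>

    char *my_strdup(const char *s) {
        size_t n = strlen(s) + 1;            /* include the terminating NUL */
        char *p = (char *)malloc(n);
        if (p)
            memcpy(p, s, n);
        return p;
    }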


Nethack does have a lot of ports to weird systems (and compilers that might not special-case strdup on a string literal to avoid the strlen, which I didn't think of when I brought this up in the first place), so this seems like a fair enough reason to me.


In the exceedingly rare event that you need to target Windows?


Honestly, when I wrote the post, I thought I recalled that Windows had strdup (Windows does include some limited POSIX functionality, IIRC; e.g., "select" definitely exists, and is also a POSIX function, so I figured surely strdup was there.) MSDN seems to indicate that Windows does in fact have strdup — and that it's "deprecated" in favor of "_strdup", which appears to do exactly the same thing as strdup… what on earth is going on here?


#define strdup _strdup


Windows isn't the only non-POSIX OS out there, though maybe it is for the HN crowd, I guess.


I'd wager a significant amount of production C is running on systems without ANY OS


> In C++, the use of RAII means that exceptions can be made to free any dynamically allocated objects whose owners went out of scope as a result of the exception. In C, we don't easily have that option available to us, and those objects are just going to stay allocated.

In GCC and Clang, there is the "cleanup" attribute which runs a user-supplied function once a variable goes out of scope, so you can have scope-based destructors:

    static void do_cleanup(char **str) {
        free(*str);
    }

    void f(void) {
        char *s __attribute__((__cleanup__(do_cleanup))) = malloc(20);
        // s gets freed once the function exits
    }


Which is inherently not portable across other C compilers.


This is a very interesting intro to some ways that a C program can manage memory.


<offtopic>As someone obsessed with writing C code, something just clicked. I think I understand better why there are so many JavaScript programmers, so many tools and frameworks written in and around JS every day.

For me, C is the gateway to the computer. It's ubiquitous. It's deceptively simple. I can literally write anything I want in C. It's an exciting thing to have an open main.c file sitting in front of me.

That must be how JS developers feel about the browser. The browser is a natural way to start learning programming for someone new to it these days. When I was growing up, we didn't really have JS and the web was brand new. I had to learn from QBasic's "Help" menu!

I bet there's an age correlation, too. I bet over a certain age, it's way more C programmers, and under it is way more JS people. (And above us is probably FORTRAN and above them is probably COBOL.)</offtopic>


I agree completely about C being the gateway to the computer. It really is the only way to go if you want to learn real programming. That doesn't mean I'm discounting JavaScript developers, but I like C better and it's more fun.


I think you're right that C is a gateway to learning how computers work. But I think using the term "real programming" in any context is flame bait...


Sure, with C, you will learn about some "low-level" things about computers that you never will if you only program in most other languages. I am talking about things like memory, bytes, addresses, the stack, binary arithmetic, bit twiddling... But that's pretty much it. Yet, there is much more to how computers work than that, and many (far from all, though) of those things you can only get a feel for if you learn some assembly programming. From there, you will see that C is a high-level, rather abstract, programming language, if only somewhat reflective of the idea of a computer of the von Neumann type. That's all. In the end, even the "machine language", of which assembly is a human-readable representation, is today usually interpreted or otherwise translated into "true" machine instructions that get executed by the hardware. To get but a glimpse of what is going on inside a computer - still on a somewhat high level - take a look at an excellent article by Mark Smotherman at https://people.cs.clemson.edu/~mark/uprog.html. If you really want to go deeper, there are transistor-level CPU simulators that run in the browser, such as the famous "visual 6502".


Slightly off topic: The way I look at it, C and C++ are the bedrock of all programming languages, because the popular compilers (llvm/clang, gcc and Visual Studio) are written in C and C++.

JavaScript, for example, requires a browser to compile it, which in turn requires gcc or clang to compile. Python, Rust, etc. all depend on programs written in C and C++.

Go is one recent exception: a popular language whose main implementation is completely self-hosted. Though Go doesn't host other popular languages.


I hate that way of looking at it because I think it's totally wrong.

C (this view is usually presented as being only about C) isn't fundamental or bedrock in any way. Most of our current stacks just happen to be written in it.

The original Mac OS, for instance, was written in Pascal and C. The most popular open source compilers are written in C++. Most browsers are written in C++, not in C.

And there's no reason these days you couldn't rewrite llvm (a set of compiler libraries and compilers mostly written in C++; note that the Rust compiler uses llvm for code generation, but its frontend is in Rust) in Haskell or Java or JavaScript (they'd be slower, but they would work). Haskell has useful features for writing compilers and a lot of language analysis libraries already. If you rewrote it in Rust it'd be just as fast.

Hell, this guy (https://github.com/jameysharp/corrode) wrote a C-to-Rust source-to-source compiler in literate Haskell. Also, in a few years Servo will be a full browser in Rust, and Firefox is already adding components written in Rust.

Other than the OS interface and the C binary ABI for linking, there is nothing special about C or C++.


> Most of our current stacks just happen to be written in it.

Sure, there's nothing special about the languages themselves, but _currently_ C and C++ are the foundation for (almost) all other languages. In theory you could rewrite llvm (or use a different compiler), but I don't see that happening any time soon. That's why I call it bedrock, because it would be very hard to change.

(Though, I do wonder when llvm will stop depending on Python 2 :)


> Go is one recent exception of a popular language whose main implementation is completely self hosted. Though, Go doesn't host other popular languages.

It wasn't self-hosted in the beginning, only since version 1.5 IIRC.


I'm not sure something can be self hosting from the beginning in the context of computing.


> I agree completely about C being the gateway to the computer. It really is the only way to go if you want to learn real programming.

Nonsense. That's as absurd as claiming assembly is the only way to go if you want to learn real programming.

> but I like C better and it's more fun.

Which is the only reason you think the above; it has nothing to do with reality and everything to do with personal preference.


I used to write C but now I do Coffeescript/JS. When I was just starting in C (as a teenager) I'd get really excited to learn the GNU's C standard library. I wanted to know how to make better and better console applications and networking applications. It felt like there was a huge learning curve to using graphics toolkits. Now that I do JS I feel like Web APIs are the new "C standard library". I can make console or apps with a GUI. I can do more because there already exists a large amount of interfaces abstracted over things that would take me a while to learn. At the same time, people are always figuring out new ways to abuse existing web standards for fun and profit - like how page visibility was done before the Visibility API existed: https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibi...

I like being a JS dev. It's a very fun runtime to poke around... I'd almost prefer all apps to live within the heavily sandboxed browser. It's the best effort made toward portability and it's been a community effort. :x


It depends on what kind of programs you like to write. Personally I like to write things like programming languages or VT100-based text editors. And I like the added challenges that C presents, especially related to data modeling, memory management, lack of first-class functions, etc. So C is a natural choice for me, whereas JS adds little to no value, and actually takes away the challenge.


I hate installing things. :x I get pretty turned on by offering up something like Google Sheets as a webapp that installs and is usable in seconds over the Internet. I used to really love Lua but even though it was <1MB to install people didn't want to download it and then run my script on its interpreter. I remember I made the choice to play in Lua after Python because Python was a large install at the time (60MB for interpreter + standard libs). I honestly feel like the desktop experience is moving into the browser. It's very convenient to "install your runtime" by going to a web address. I love being somewhere near the front lines of that. :3 </zealous-panting>


Sure, unless you're the guy who writes Minecraft. People installed that. And Atom. And Spotify. There's still some room for desktop apps like these, granted not as much as before.


I'm not saying people won't install desktop apps - I just like that there is a diminishing reason to do it outside the browser. [High performance] games, browsers themselves, video players (VLC), and dev environments, are some of the last things left...


As a non-C user, C was always too foreboding. I wrote a bit of it, and read K&R, but places like HN bang into your head that you'll likely make a horrible screw-up writing C, your code won't be secure, and it's better not to bother. Maybe I should spend some more time with it.


You might enjoy trying some embedded development. Memory management and object lifetimes are usually less complex, and security is not an issue unless the device talks to the internet. You rarely feel like you are "fighting" C; it feels like the right tool for the job.


Well, that's because it IS the right tool for the job. C was built for systems programming: kernels, filesystems, and the like. It really wasn't built for applications space programming.


No, C wasn't built for those things. Rather, it just so happens to be suitable for them. C was built as a higher-level alternative to assembly, for writing any kind of computer programs.


You can say that, but Dennis and Ken were writing Unix. The language got its feature set from what they needed for that application. And that application was systems programming.


Right, and Unix had things like calculator programs, spreadsheet programs, email programs.


Yes, but those aren't the projects that dictated the featureset of C.


You _will_ make a horrible screw-up and your code _will_ be insecure. However, it's worth the bother as an educational experience. For anyone wondering, I write C for a living, learned C 22 years ago and I hate writing every line of it (now, I liked it in the beginning).


It's definitely a worthwhile educational experience, even if you never write in it professionally.

If you can solve the problems in K&R, a good book to work through next is Computer Systems: A Programmer's Perspective [0].

Coursera used to have a course called The Hardware Software Interface [1] which covered the same material, but it doesn't seem to be available right now.

[0] http://csapp.cs.cmu.edu/

[1] https://www.coursetalk.com/providers/coursera/courses/the-ha...


Legends say that in the mists of time there was an age of people between FORTRAN and C, who used to know better ways to the soul of the machine than C.

Rumors are coming to town that those druids are now gathering in the valley between the Misty C++ mountains, the Ada forest and the Rusty valley.


> And above us is probably FORTRAN and above them is probably COBOL.

Actually, I think assembly language would probably be the next age "peak". If you know Asm you'll be far better at C; things like pointers and indirection immediately make sense. The effect is weaker, but still there in the other direction.

(FYI, FORTRAN predates COBOL by a few years.)


> If you know Asm you'll be far better at C; things like pointers and indirection immediately make sense.

Be sure not to confuse cause and effect there: if you managed to learn ASM, you must not be the sort of programmer who is confused by pointers and indirection.


No, because in x86-64 assembly main memory is effectively just a very big byte array and a "pointer" is just an index into that array. In C a pointer can be many different things depending on the underlying architecture.

When Intel made their 16-bit CPUs, they still wanted to address 20 bits of memory with 16-bit registers, so they introduced segmented addressing: a segment register (such as DS) combined with a 16-bit offset.

The problem with far pointers is that two of them could have different values but point to the same physical address, and C still carries that legacy with it in the form of undefined behaviour, which makes everything needlessly complex.


Somehow tens of thousands of people from the 60s to the 90s managed to learn asm, so I highly doubt that's the direction the cause and effect goes. Assembly isn't particularly complicated, just tedious.


I never understood what is so complicated to grasp about pointers.

Granted I am a 70's kid that started with Timex BASIC and Z80, but we learned pointers in one afternoon drawing boxes and lines on a piece of paper and that was it.


I don't think you can draw that conclusion either.

At my university, the Electrical & Computer Engineering curriculum teaches freshmen assembly language before C. All of the ECE students understand pointers immediately. The Computer Science curriculum teaches C++ (with raw pointers) before assembly language. Pointers are a significant source of frustration for many students in CS.


> If you know Asm you'll be far better at C; things like pointers and indirection immediately make sense.

A lot of recent issues with compilers "introducing security issues" have come from this sort of understanding of C (overflows, other undefined behavior, etc). The places where C differs from just writing the assembly yourself are really, really hard.


And this is generally about compilers behaving badly.

Some languages do a good job of creating an abstracted universe above the hardware that plays by its own rules. This allows the language/runtime to map programming constructs to hardware behaviour in highly inventive ways.

C never was such a language; its hardware abstractions are -- by design -- as leak-free as a fishnet. And yet compiler writers feel they need to get creative.


> I bet there's an age correlation, too.

Only 90's kids will get JS this.


> Sprintf(...)

Seriously? Why would anyone use sprintf() in 2013 without knowing the size of all possible arguments?


The article does touch on this...




