In C++, many standard library functions can't be implemented in standard C++. It's not just the obvious compiler built-ins that show up in the library; the library can also depend on behavior that is guaranteed by the compiler but otherwise undefined by the standard. So it's not always wise to learn techniques from the standard library: they can depend on implementation-defined behavior, and that dependence can be subtle.
Not too long ago it was impossible to implement std::vector in standard C++, because constructing objects next to each other did not formally create an array. Pointer arithmetic on the resulting pointers was therefore undefined behavior, since pointer arithmetic is only defined within an array.
This was fixed by having the array be implicitly created in certain circumstances (C++20's implicit object creation rules).
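A minimal sketch of the issue (illustrative only, not real std::vector code): elements are placement-new'd one by one into raw storage, then traversed with pointer arithmetic. Before C++20's implicit object creation rules (P0593), that traversal was formally UB because no array object was ever created to do the arithmetic within; real stdlib implementations got away with it by relying on guarantees from their own compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Vector-style storage in miniature: construct elements one at a time
// into a raw buffer, then iterate over them with pointer arithmetic.
int sum_four_squares() {
    alignas(int) unsigned char raw[4 * sizeof(int)];
    int* first = reinterpret_cast<int*>(raw);
    for (int i = 0; i < 4; ++i)
        ::new (static_cast<void*>(first + i)) int(i * i); // placement-new each element
    int sum = 0;
    for (int* p = first; p != first + 4; ++p) // arithmetic over the "array"
        sum += *p;
    return sum; // 0 + 1 + 4 + 9 = 14
}
```

Whether a given variant of this is strictly correct in all corners (laundering, lifetimes) is exactly the kind of subtlety the parent comment warns about.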
My advice would be: to learn a new language, simply start writing some non-trivial projects in it (a few thousand lines of code or so). In some languages (like Python), the standard library is the actually important feature, in other languages (like C), it's better to mostly ignore the stdlib. Example: I started learning Zig by writing a Pacman clone (https://github.com/floooh/pacman.zig) and a home computer emulator (https://github.com/floooh/kc85.zig), the Pacman clone doesn't use any Zig stdlib features at all, and the emulator only minimally for memory allocation, parsing command line args and loading data from files.
Zig's stdlib is much more useful than C's, but it's still entirely possible to write useful programs without it and instead focus on learning Zig's language features first.
But on the other extreme, the whole reason why I learned Python was its "batteries included" standard library.
> To learn a new language, read its standard library
Yes, but also no.
The standard library is often more complex and uses more advanced features than you need for most projects. E.g. collections in std are super general purpose, but when you need to write a collection yourself, it's normally specific to the purpose you need it for.
I.e. std is the most general-purpose library you'll normally find, so even if it's written with "KISS" in mind, it's still often not so simple.
It also sometimes uses unstable language features you normally can't use (as it's often developed "in sync" with the language) and/or optimizations which in most other code would count as "premature".
Though without question you can learn a lot there; you just should be aware of the points above.
For older languages, it's also pretty common for some of the stdlib's corners to be very dusty, with API design, naming conventions, and coding patterns that are at best considered way outdated.
Optimized code is not an anti-pattern! What is an anti-pattern is premature optimization: code you write a certain way because you think it is optimized, but, because you don't understand the compiler and hardware, it actually is not.
Optimizations in standard libraries are generally good ones, and reading it can give a good idea of what is or isn't worth it.
I think the comment simply stated that when learning a language you should focus on idiomatic code and not optimizations. The latter is probably present, and justified, in a standard library.
Take Scala collections. These are written using builders and if-else checks of emptiness which are definitely not idiomatic. Authors of the std-lib did them for you so you do not have to. That is not a red flag.
The other way around: non-idiomatic use of a language in its standard library points to weaknesses in the language.
If emptiness is something that is so important you add extra branchpoints inside collections to check for it, there ought to be some way of disallowing emptiness or {nil,null}ness in the language itself.
Sure, it is nice for people who use the standard library, but it points to things in the language that perhaps should have been given some more thought.
I'm not sure I would call it a weakness. Performance vs expressiveness and maintainability will always be a trade off. It could be a weakness, but it's just as likely that it was a deliberate trade off.
Though this argument actually supports the parent comment: notably, a standard library has a much higher than usual chance of being used in tight loops, with no way to relax that assumption based on actual (often closed-source) usage. So the same optimizations that make sense in the stdlib could perfectly well be considered premature in non-stdlib code (even if we imagined some function was not in the stdlib and one needed to write exactly the same function in one's own code).
A library like the Java Collections API sometimes uses optimizations involving bit shifts etc. which are too confusing for most application-level code. So I have to disagree with your comment that it's not an anti-pattern to do such things in application code.
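For a concrete flavor of what I mean, Java's HashMap rounds a requested capacity up to the next power of two with a cascade of shifts and ORs (its tableSizeFor). Rendered here as a C++ sketch (the function name is mine), it's exactly the kind of code that reads as line noise in application-level code:

```cpp
#include <cstdint>

// Round n up to the next power of two without a loop, in the style of
// Java HashMap's tableSizeFor. Requires n >= 1 (n == 0 wraps around).
std::uint32_t next_pow2(std::uint32_t n) {
    n -= 1;
    n |= n >> 1;  // smear the highest set bit downward...
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16; // ...until every lower bit is set
    return n + 1; // one past an all-ones suffix is a power of two
}
```

Justified inside a hash table hot path; rarely worth the puzzlement elsewhere.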
It's the norm to work together with people of varying skill levels.
Whether people have the skill to figure it out also doesn't matter, as they normally don't have the time to figure it out.
And even experienced programmers can get confused by bit hacks. Sure, only temporarily, but that is already a steep price to pay, and it is seldom worth it. It always comes with the risk of accidentally introducing a bug, even if your team only consists of senior devs. And it always hampers productivity.
I am against the idea of holding back for newbies.
If you think bit hacks are bug prone, needlessly complex, and not worth it, don't use them. But if your only argument is "junior may not understand", then no.
Newbies need to learn, if they always face newbie friendly code, they will be newbies forever. So yes, it takes time understanding a new technique, but it is an investment.
Later, when not-junior-anymore has to decide on an implementation, he can then choose the most appropriate among the many he has seen, including bit hacks. It doesn't mean bit hacks are the best, or that he will use them, but because of his experience he will make a more informed decision.
Note that I assumed code intended for professional developers. The situation is different if you are targeting non-specialists (ex: macros for end users) or if you are making example code.
I agree that complex bit hacks make code less readable and should be avoided. But that's not necessarily what we're talking about here. Most common uses I see of shift operators tend to do the opposite: they make what you are doing more explicit when you do bitwise operations.
If flipping the Nth bit inside an integer or a byte array is an "advanced" technique, unfit for "application programmers" then I'm not sure I want to work with "application programmers".
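For reference, the operation in question is a one-liner per helper; a sketch (the helper names are mine, not from any particular library):

```cpp
#include <cstdint>

// Flip, set, and test the Nth bit of an integer -- the sort of basic
// bitwise idiom under discussion. Requires n < 32.
inline std::uint32_t flip_bit(std::uint32_t x, unsigned n) { return x ^ (1u << n); }
inline std::uint32_t set_bit (std::uint32_t x, unsigned n) { return x | (1u << n); }
inline bool          test_bit(std::uint32_t x, unsigned n) { return (x >> n) & 1u; }
```

If anything, the named helpers make the intent more explicit than the raw expressions.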
I'd go even further. As long as you want to communicate your ideas via code it does not matter whether you work with juniors or seniors. You want to limit the cognitive strain caused by unnecessarily complicated code.
Actually, this isn't so much an argument against reading the standard library as it is another reason to do so.
If the standard library is ugly and complex, it tells you something important about its design, and possibly about the language itself and its maintainers. It is a good indication of where things might well end up for you.
You should look at the ugly bits, and you should ask yourself "am I going to invest in this?"
That's why I decided not to invest in Scala. Its standard collections library was so hard to grok that I just decided to stay on Java. I must understand everything in the entire stack I'm using. If I have blind spots, it makes me nervous.
Java is pretty good in that regard, for example. I regularly browse its standard library and it's quite comprehensible. Concurrent stuff is not easy, but I guess that's the nature of underlying algorithms.
Also that's the reason that I don't like Java streams. They're as hairy as Scala collections. I'm avoiding Java streams.
IMO implementation details of any library matters as much as its public API. If library implementation is not nice to read, it's a very bad smell.
The trick is to read unfaithfully. You don't need to understand everything in the standard library. You just need to swim around in good code for a while. If you're in over your head, chances are you can skip that part.
I find languages are pretty easy to pick up after the first few. What seems to be the barrier nowadays is frameworks. How do people go about learning a new framework? Obviously most of them have an intro project to follow along with, but those tend to be pretty simplistic, and once that's done, getting to the complicated bits that you actually want to implement seems to be the barrier.
If you stay strictly in the procedural/OO paradigm then sure: it's just a new std lib and syntax. When you jump to functional, logical, or more esoteric paradigms things start to get pretty different.
Of course if you've already got one of each major paradigm under your belt things get progressively easier because you're exposed to more ideas...
I usually prefer working with mature stacks that don't require me to constantly work with flavor of the month frameworks.
It's great to broaden your horizons and all, but that is something that needs to be done judiciously and deliberately if it is to actually be of any use.
Me too, but to get started on a non-trivial project in such a stack is, IMO, the tricky part. It would be nice if I could do everything in the stack(s) I already know, but that's not always possible (e.g. backend vs web vs mobile app vs desktop app).
Knowing what a framework does (its function) and knowing how to use it (its functions) are two different levels of knowledge. Sometimes you can work backwards from knowing what it does to make a very quick-and-dirty implementation.
Same experience. I thought I had things figured out, but after a few weeks struggling with OCaml I felt like a complete beginner :-) But it was an eye-opener in some ways on how to write programs differently. I think OCaml also suffers a bit from the same problem as C++ does: you need to learn quite a lot to be able to read others' code. There are many advanced concepts with cryptic syntax that is hard to search for on the net.
In my experience, one of the hard parts when learning OCaml is that you can write your own code without modules, but to use someone else's code, you have to know how modules work.
Completely agree. Once you learn enough languages, it all starts blending together and looking more or less the same. Same control structures, same functions, same structures, same classes, same objects, same lists, same hash tables, same pretty much everything. There's usually a few innovations and peculiarities here and there but it's not that much.
The bulk of the language is actually the standard library: the APIs people will be using to solve the vast majority of problems. Of particular interest are the APIs used for dealing with text and I/O, because everything involves them.
This is why Scheme is so easy to learn. The language itself can be learned in hours. The standard library is so small it's pretty much useless. Learning Racket on the other hand is much harder.
To go even further than the standard library, read the language's source code. This is especially relevant for virtualized languages. The implementation reveals how they actually work and enables a much deeper understanding.
Yes. I learned so much after I started reading source code. I discovered the history and implementation of many language features, including the more obscure ones which are the subject of many fun stackoverflow questions. I obtained a deeper understanding of the threading model. I developed a good sense for the performance cost of my code as well as the optimizations I can expect from the implementation. I was able to create libraries in C that can be loaded at runtime by the language. I was able to embed the language in C projects.
This isn't something beginners should do, of course, but it does lead to a much deeper understanding of things.
> This is why Scheme is so easy to learn. The language itself can be learned in hours. The standard library is so small it's pretty much useless. Learning Racket on the other hand is much harder.
This is why Racket is so easy to learn. The language itself can be learned in hours. The standard library is so big that you can do anything with it. Learning Scheme on the other hand is much harder.
I think the Scheme quote is interesting, because it almost applies to C.
Most people consider learning C hard (for example, compared to python), but it is an extremely small language with a "standard library so small it's pretty much useless".
C is hard because despite being a "small" language, it's full of all kinds of unintuitive rules and behaviour for beginners to stumble upon, and the language provides next to no assistance in avoiding the pitfalls. People have to learn C by making potentially dangerous mistakes, hopefully in code where they don't matter.
What makes C somewhat hard to learn is pointers and strings. Apart from that it's so small that most fulltime C developers will touch upon the entire language on a regular basis.
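A tiny illustration of the pointer/string distinction that trips up newcomers, written as C-style C++ (the helper names are mine):

```cpp
#include <cstring>

// Assigning a char* copies the address, not the characters.
bool assignment_copies_pointer(const char* s) {
    const char* t = s; // t now points at the same characters; nothing copied
    return t == s;
}

// strcpy copies the characters into separate storage.
bool strcpy_copies_characters(const char* s) {
    char buf[64];                 // assumes s fits; fine for a demo
    std::strcpy(buf, s);          // copies the characters
    return std::strcmp(buf, s) == 0 // equal contents...
        && buf != s;                // ...distinct storage
}
```

Once the address-vs-contents distinction clicks, most of the rest of C is small enough to hold in your head.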
Wholeheartedly agree. I was big into Go years ago, and some newer guys tried to make it a functional language. When they argued with me, I couldn't just say 'it's just not right'; instead I pointed them to standard library code.
Protip: That doesn't change their mind, it just makes them hate Go.
Why is that wrong? A language can be multi-paradigm while the “official” documents are written in a specific style. Like human languages, they evolve with their use (or fork into another language completely), while the high literature is the last one to change.
Together with that, I agree that one should read the high literature before inventing a new writing style.
On the other hand, sometimes it really makes sense to strike out and define a new, higher standard for code style and paradigm in a given language, or at least to improve somewhat on the style and paradigm that's handed to you.
JavaScript used to be a messy language "not fit for professional use" (see popular JS books from the year 2000 :shudder:), but it turns out most of the problem was how people were using it.
JS code being written today looks like an entirely different language compared to early idiomatic "JavaScripts" one could use to make pages "more dynamic".
JavaScript today wouldn't be recognizable to a JavaScript developer from 6 years ago, especially when using class syntax, destructuring, and async/await, among many other syntax additions.
Actually it was very messy. People were encouraged to use it as a write-only imperative scripting language.
People thought VBScript was more powerful. JavaScript was like that attention-seeking red-headed stepchild that you see at family reunions.
There's a good reason Douglas Crockford re-introduced JavaScript as "the wwwrld's most misunderstood programming language." He saw beauty within that others didn't.
That's something I don't really understand. The language is a garbage-collected C-like. If you don't abuse objects/this/.call, plain Javascript is very readable. What made it a "write-only imperative scripting language"?
If you look at the "JavaScripts" that were floating around at the time (try looking up dynamicdrive.com from around the year 2003) I think you'll understand what I mean.
Wanting to explore that space as a hobby is not wrong, but insisting on it at work when it's clearly not typical or well-supported is pretty wrong in my opinion.
If it drove them to write in another language, might that be a win? A code base full of non-idiomatic code, where devs think in the style of language A while using language B, isn’t a great place to hang out.
It absolutely would be, if our entire stack wasn't in Go.
One was a JS guru who literally threatened 'I can walk across the street and get another job, you know.' And we let him, and both of us are happier for it.
So I do agree, don't join a team writing a language you hate.
I think I would have left too. Go has its advantages to be sure, but if you can write highly expressive functional code in a language like JS, Go is like wading through a swamp.
Agreed. We're 100% Go in our back end, and in the process of hiring. We literally wrote in the job posting, "If you don't want to write Go, this job is not for you."
Well that's just silly. Why take a job using hammers and nails when you really like screws and screwdrivers, but hate hammers and nails?
I think the argument instead might have been, "Go is not designed for it, and kludging in a language paradigm that the language wasn't designed for is going to make for worse code."
The fact that these engineers didn't understand this suggests to me that perhaps they were not very good engineers as such, even if they might have been skilled programmers.
Sorry mate, your argument doesn't make much sense. People can and do write OOP in C++, and just because it is multi-paradigm does not mean no one should write OOP.
It's like saying, because an artist is using watercolors for this painting, they should therefore never do pencil drawings on other projects.
I don’t disagree, but that’s an entirely different issue than your original argument, which was to suggest that C++ is not an object oriented language.
This doesn’t seem like good advice to me, as others have already said.
The C++ standard library is written in a heavily templated style that most user code isn't written in. That would be a terrible way to learn.
Some of Python’s standard library delegates to C code.
Much of Clojure's standard library builds the language up from a small set of primitives, sometimes out of macros, and isn't a good example of how to write good code yourself.
The standard libraries tend to be written in a different mindset than idiomatic user code is, as the requirements differ. In my opinion, the best way to learn a language is to implement a non trivial but small project in it.
Maybe this works for the language the author is talking about, but in the cases I'm familiar with, standard library code does not give a good idea of what "normal" application code looks like. For example, it often has a bunch of platform-specific logic or low-level optimizations that, as an application author, you want the standard lib to abstract away from you.
This is natural due to the low-level nature of a language implementation. And at some point, what's truly going on is often hidden behind a compiler intrinsic or a foreign function call into the runtime.
At the same time, there are going to be parts of the standard library that are less platform-specific and thus more readable, for example the Java Collection Framework.
In the past, people supported pundits who would sit in monasteries or huts their whole lives perfecting their knowledge of scriptures and their skill at explaining them to others (including picking the right verse when needed and commenting on it in a way helpful to the seeker). Sometimes I feel that today, besides free software projects meant to run on donations, there should be entire separate projects dedicated to writing really good documentation for programming languages and libraries, taking usability, eloquence, ease of understanding, and practical examples very seriously.
Standard libraries are very concrete, not academic at all. Learning the standard library means learning how to open a network connection and making an HTTP request to some server. The language becomes useful and gives results immediately. Theory means stuff like language concurrency models, it becomes important once you start considering how to structure the code at a high level. How do I use this language to handle X requests concurrently, hopefully in parallel?
That highly depends on which language and standard library you are talking about.
Reading the C++ standard library gives you a headache, and doesn't really teach you how to write C++ in a normal application. For a beginner, this is an impossible approach.
Reading the Python standard library means reading C half of the time. Useful for some seasoned Python developers, but not for newcomers who want to read Python.
At the end of the day, standard libraries aren't written to teach, and any of them being a good teacher is a coincidence. When you are learning a new language, you really have no idea whether the standard library is a good source for beginners.
I'd suggest a less poetic, more pragmatic approach to learning a new language: ask the most active forum of the language, and use the most common learning materials.
I don't think this is a good read. A standard library is very concrete - it's the core of the language in action, and the shape that its code takes will likely be repeated regularly in 3rd party code written in that language. There's nothing academic or abstract about it.
EDIT: As others note, a good chunk of a stdlib may involve system interop that isn't representative, so that tempers my comment with regards to whether that code is idiomatic. But my case still stands regarding concrete vs. abstract.
I would not recommend anyone ever look through that as a way to learn C++. Heck just something as basic as a struct that stores two member variables is a 200 line of code horror show in C++:
No, in C++, a pair of member variables is what I wrote.
C++ has enough complexity already, there's no need for you to add additional false claims. The thing you pointed to is the standard library version of std::pair, written to be as platform-independent, performant and correct as possible. But if, as a programmer, you need a 'simple struct with two elements', as you described it, you just define a struct and add the two elements, and that's it. Or you just declare it as std::pair<int, int>.
Your statement that declaring a struct with two members in C++ takes 200 lines is just completely false.
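To make that concrete, here is the entire user-level "struct that stores two member variables" (a sketch; the names Point, IntPair, and same_contents are mine):

```cpp
#include <utility>

// A plain struct with two members -- all user code ever needs for
// "a pair of values". The stdlib's 200-line std::pair exists to be
// maximally generic and correct, not to be imitated.
struct Point {
    int x;
    int y;
};

// Or skip even that and reuse the stdlib without reading its internals:
using IntPair = std::pair<int, int>;

// The two are interchangeable for this purpose:
inline bool same_contents(Point p, IntPair q) {
    return p.x == q.first && p.y == q.second;
}
```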
You are completely ignoring the context of this discussion to make an irrelevant point and doing it in a way that ironically supports my position.
If one wishes to learn how to represent a pair of values in a generic, cross platform and efficient manner, one should NOT consult the standard library to learn such a thing precisely because the standard library includes incredibly irrelevant and archaic details that are entirely unneeded to understand how to idiomatically represent a pair of values.
Your claim that the standard library is written to be cross platform couldn't be further from the truth. The standard library is written to work with a specific compiler, and often times only with a specific version of a compiler on a specific platform. While it's possible to use libstdc++ with clang, it's not possible to use libc++ with GCC. Furthermore no compiler other than MSVC fully supports MSSTL.
The idea that the C++ standard library is a good way to understand in concrete and idiomatic terms how to represent a generic pair of values is simply bad advice.
If you wish to counter that position then you are welcome to do so, but so far you are arguing something entirely different from the actual topic at hand. The argument I'm making isn't that you need 200 lines of code to represent a pair of values, it's that if you tried to understand how to represent a pair of values by consulting the standard library, you'd need to sift through 200 lines of code because the standard library is written in an incredibly complex manner that is entirely unnecessary in order to learn or understand idiomatic C++.
In short, the statement " To learn a new language, read its standard library", is bad advice for someone who is new to learning C++.
Well, that's still instructive about how library code that aims to be efficient and portable has to look in C++. It's not representative of most application / business layer code, and it's definitely not good learning material.
But, if choosing C++, at some point you will have to write code like that as a project matures and must target different compiler versions and platforms.
No, at no point should you use the standard library to understand how to write C++ code. The standard library has a specific clause in the C++ standard that exempts it from rules about undefined behavior. For example, that link I pasted to implement a std::vector is not something that is permissible to use by non-standard library code, it is full of undefined behavior that is permissible for use by the standard library but not permissible for use by ordinary code.
To the extent that one can learn how to write cross-platform efficient and generic C++ code, there is always a better example to learn from than the standard library.
This is a classic hacker/builder take. Start scratching your own itch and make many apps. /s
To be frank I have the same mindset, but it's good to read other people takes on it. Some prefer to have basic knowledge and feel great anxiety when they're jumping to code without any foundation.
Better really depends on who you are. Born academics can learn better by reading foundation libraries and principles, and they write articles advising others to do the same :-)
What you described as better could be a really good fit for business-first coding opportunists who would get annoyed by perceived inefficiencies and petty politics in academia...
The article is a terrific example of basic analysis BTW. This underappreciated skill can be a huge blindspot for opportunists...
Reminds me of Nassim Taleb's works. Reading and rereading the standard library is a good way to learn a language or get depth. I don't see how reading an actual implementation can be considered academic, as it is real-life engineering. It is not pseudocode or theory, some algorithms in Java and C++ are using some ugly shortcuts that only an engineer would find interesting while an academic would think it's heresy.
Cherry picking is 100% compatible with analysis, which is closed subjective logic, not open opportunistic logic. Plus I don't think the author is exactly trying to go head to head with your personal favorite method. Could be wrong though.
IMHO, any language worth learning has a large enough user base in both corporate and academic settings that there are books written on it. That's my criterion. If it doesn't have at least 5 books on Amazon (from real publishers, not publishing mills), I'm suspicious of it. There's too much wheel-reinventing happening for me to get excited about new languages.
In the last decade I've spent time learning five sufficiently novel languages: Scheme, Ada, Erlang, Rust, and Go. Three of those are from the late 70's. I'm really glad I learned Ada and Rust. I started using Ada for embedded programming, and I'm trying to find more opportunities to use Rust because it so robust. I only studied Erlang, Scheme, and Go because smart people I knew told me I should investigate, but I have not used them.
What is novel about Go? It just seems like C repacked with a garbage collector and basic concurrent programming support to me. Which both weren't novel even when Go was designed.
It's just a little bit more work to write a non-trivial program in Go than in a scripting language such as Python, so if you need/want the good tooling and you need/want the significant speed improvement, it could be a good choice.
Go doesn’t have the libraries of python however. In certain computing domains you would first have to spend 10 years rewriting the python frameworks in golang before you could even be on parity with python.
I am most familiar with scientific computing and GIS applications — python is miles ahead of golang in ecosystem support in those domains.
For scripting and "just throwing something together for internal use," Python is a pretty traditional choice, but I think Go is able to outclass it for maintainability -- static typing and somewhat better package management are a big help, and Go has a comparable ecosystem.
And if this wasn't a useful argument many more people would use OCaml, Crystal, D, etc ; and every new language wouldn't be met with "I like it but it doesn't have the libraries I need so I'll keep writing Java".
Naive question, can't you just call Python code from Go? I don't see the problem besides that performance of the Python libs wouldn't be as good as if they were written in Go.
Go doesn't have any substantial innovations in its set of language features, or maybe none at all. I believe it is instead the whole of a sober and efficient toolchain, which yields tedious but simple-to-understand, easy-to-work-with code, that gave it such great appeal.
It's like a modernized C with a good implementation and out-of-the-box no-nonsense tooling, and the tricky bits left out. Nothing really new to see, but there was nothing really like it when it took off. Very utilitarian. Rust, Java, C#, C++ - those are all very 'complex' or 'bloated' in many different ways to an avid Go programmer.
I'm not promoting Go (I personally don't like it), I'm just trying to understand its appeal and popularity and what is different about it. Quality and completeness of implementation, tooling, etc. from having a big company behind it also helps a lot, of course, but that isn't sufficient on its own.
Could I trouble you to compare Ada and Rust, especially in terms of safety? I'd like to be able to write low-level code without dealing with the insanity of C, and my top picks are Ada (traditional answer to "safe language"), Rust (the up and coming answer to "safe language"), and Pascal (friendlier than C and way better at least for memory safety), but I'm reluctant to learn all of them to sufficient depth to be able to decide in retrospect which one was worth learning:)
Memory "unsafety" isn't bad. It's just a feature. In fact it's a feature some people want and need. I'd still start with C. There are all the resources in the world to learn from, and it's better to know how to manage boundaries and memory allocation by hand than the other way around.
There's a significant niche for such features, we're dozens. That's the reason why people like stuff like Zig. Pointer arithmetic isn't just dangerous, it's cool too.
I'm not saying that memory unsafety is useless or even undesirable, I think the "problem" is that it's the default in "current popular and actually used high-performance languages", which means that stuff that needs to be fast and secure (like web browsers) have a hard time.
Maybe my "most of the time" was too strong and I ignored big domains that I don't know. But for the part that I relatively know (web stuff mostly), safety and speed together matters a lot. The thing is, since unsafety tends to be viral (an unsafe part of your stack can compromise everything), people get very paranoid about "unsafe languages".
I understand your point, but for the most part memory-safety features and speed of execution run contrary to one another.
Garbage collection and bounds checking are a big sticking point for systems / high-performance programming, and to an extent real-time programming.
As far as I can tell, Rust should be about the limit of what's possible if you want to have your cake and eat it too. But I'll doubt it'll ever replace C for speed-critical applications.
Of those three, Ada and Pascal are as I understand it pretty closely related syntax-wise (in the same "family"), so learning one of those will help a lot with the other.
I'd recommend starting with Pascal, because of the comprehensive and high quality standard libraries (or "frameworks" -- see related recent HN discussion...) in Free Pascal and Delphi, and because Free Pascal can target a lot of different OSes. OTOH, a downside might be that you learn (to rely on) too much of the FCL/VCL in stead of just syntax, so you get confused at the lack of those libraries in Ada; that could speak for taking it the other way around.
TL;DR: IMO, in a way that's perhaps closer to two new languages to learn than three.
As someone who reads 10-20 tech books per year I rate PragProg the highest, then O’Reilly, though not everything is of the same quality there. PacktPub is absolute trash tier. Don’t remember others you’ve mentioned as good or bad
I do not think advice like this is applicable to every programming language out there. For example, I find it hard to apply to languages which have a long history and change a lot over time, like Python or Java (or C++, as mentioned in other comments). I can easily imagine parts of the standard library having been written a long time ago, with many modern features not available back then.
For example, I do not think it is possible to learn modern Java by reading Java Collections code. You won't see Optional<T> being used there. Instead, your takeaway might be that it is perfectly fine to return nulls all over the place.
I think Rust is in a category of its own, mostly due to circumstances rather than anything else: the ecosystem has grown immensely over the past several years and you can find a crate to solve most of your needs. But here is where things become tricky: more often than not the documentation provided is extremely vague, to put it mildly, and the examples are a hello world at best. With this in mind, in order to truly use it you have little to no choice but to dig deep into the implementation, and at some point you reach the standard library.
What's a good way to learn the proper way of writing Java 11 in 2021/2022? I'm currently trying to escape writing Java 1.5-style code in Java 8.
To anyone that considers this approach for learning C++, I would advise strongly against it. Standard library code has to deal with too many cases and tries to be optimal for as many of them as possible. Also the formatting is highly unusual, compared to other C++ code found in the wild.
That's probably true of quite a few languages, actually. I mean, if there is any library you want to be as fast as possible, it's the standard library. Take Python for example: a big chunk of the standard library is actually written in C (although there is often an equivalent implementation in Python). In Rust, the standard library uses quite a bit of unsafe code, and even nightly-only and standard-library-only features that you don't usually need in real code, because you will be calling something in core or std that does it for you and has been rigorously checked to be safe. The Java standard library has parts that are written in C++. Etc. etc.
> In rust the standard library uses quite a bit of unsafe code, and even nightly only features and standard-library-only features that you don't usually need in real code
I have found the Rust std lib to be a great resource. Much of the unsafe is necessary because there is literally no other way to build up abstractions like Box or certain data structures without it. Obviously once those are written though, you want to build on top of them where possible and not use unsafe.
It's probably important to delineate between a good example of application code vs library code also. The std lib is not going to help much with application code.
Exactly. The rust standard library can be very informative (and the same is true of other standard libraries as well), but it isn't necessarily going to be the same as what you would do in application code, because in application code you have the benefit of the standard library.
There are many, like Firecracker VM, Polkadot, the Linkerd proxy, Tokio, Hyper, etc. And those are just the ones I know; there is plenty of good-quality application code in Rust.
> I mean if there is any library you want to be as fast as possible, it's the standard library.
To a decent extent yeah, though I would emphasize the issue isn't quite being as fast as possible (although in some cases it is, but also see I/O streams...), but rather having near-absolute correctness, and achieving high performance as a secondary goal. You can usually find a faster third-party implementation of anything in the C++ standard library. The problem is third-party implementations always cut corners somewhere, whether it's strong exception-safety, proper trait/concept handling, thorough testing of rare edge cases, extensibility/customizability, or other stuff. Even the syntactic complexity rises as a result, let alone the complexity of the actual semantics. (Even minor stuff like using difference_type/size_type instead of ptrdiff_t/size_t makes things more verbose and harder to read in the standard library; third-party implementations would often opt for the latter.)
Algorithm runtime analysis is usually measured with asymptotic estimates (Big O, Little O, Big Theta, etc). In something as commonly and generically used as the standard library of any performance focused language you're liable to find a focus on optimizing these algorithmic performance guarantees for corner cases over optimizing for simple and straightforward code.
Yeah, the first thing that popped into my head was some poor n00b spending the next six years of their life deciphering glibc. There have got to be easier ways :)
And the C++ standard library has too many fundamental design issues that amount to bad advice for writing actually maintainable and performant code. For example, the standard requires all map containers to provide pointer stability for their elements, which is often not needed in practice. Because of this, all <unordered_map> implementations are hilariously slow. Or what about std::vector<bool>, <iostream>, std::string, you name it.
The frustrations people have with it are sometimes up to the point where they begin writing everything from scratch…
That’s actually what I do whenever I’m lacking humility as a coder. One F12 on std::iterator (which is now deprecated lol [0]) is enough for me to remember that I don’t know anything.
My gripe with C++ is that you need to know so many things to build even the most basic abstractions. When writing a custom data structure you always end up with having headaches about the specifics of std::iterators and rvalue references and all that jazz rather than focusing on the actual algorithm. There’s a reason why people want to go back to Orthodox C++ (using C++ as C with a few extra features)
I was about to post the same, but here you are. The only time I would study the std:: namespace would be if I needed to write some generic library. Trying to learn C++ by reading std:: can make one's head explode.
One thing I have wondered about is the number of multi-arity functions that individually specify 1, 2, 3, 4, 5 argument cases in third-party library code. Is it a result of people thinking that's idiomatic, or is spelling out each arity like ([a b c d]) performant enough to matter in library code instead of using a rest-args signature?
This is also true for .NET, especially modern .NET where performance is critical nearly everywhere. It's representative of perf-sensitive code, but not general purpose C#.
I’m using Zig a lot right now, and I have been pleasantly surprised by the standard library. Maybe it’s due to Zig being a very straightforward language, but I find most everything I read in the standard library to make immediate sense.
I don’t know about reading std alone to learn Zig though, I used other sources like ziglings and ziglearn which taught me syntax and patterns. Had I started with the standard library I doubt I would have picked up the language as quickly as I have.
So having learned Zig I am much more inclined to look at standard libraries for languages I learn in the future, but I don’t think it’s wise to rely only on standard libraries for learning.
Learning Zig by reading its standard library source has worked for me. The source files also include quite self explanatory tests that compensate for incomplete documentation on main Zig website.
I'll be honest: there aren't really any complete learning resources, even for those who do know C. And since the language is still changing, whatever you read in the documentation might not be true between stable and master at any given moment. The stdlib source is probably the best reference, but even reading strictly from that doesn't mean you'll learn to write code that works; it means you'll learn to write code that's correct. There are plenty of known compiler and language issues on GitHub, and the self-hosted compiler is still being written.
At the moment since Zig isn't complete you aren't going to find documentation that is either. Particularly for newcomers.
Ziglings [0] is a series of small problems where you solve bugs in Zig code, and I enjoyed working through them. The readme claims "you are not expected to have any prior experience with... C." When I did ziglings I found the most difficult part was adjusting to the type syntax (which I now love).
I admittedly know C very well, but assuming you already know a language with a C-like syntax (C++, Java, JavaScript, C#, Rust, etc.) most things like functions, control flow, variables, statements map to Zig with only small syntactic differences.
One thing that may be difficult to learn without knowing C is working with strings, since they are just arrays of bytes. Most other languages have a dedicated string type, but Zig is much more like C: string literals are null-terminated, fixed-size arrays of bytes.
Until you realize that most of the time the code is loaded with macros, ifdefs, cross-platform conditions (Windows, macOS, Linux...), and hacks for different versions of OSes/dependencies/whatever. The real meat is hard to spot; it's easy to get lost in the noise.
Ugh Crystal is so beautiful and sorry to make this typical complaint but I wish the compile time was 1/10th of what it is. The dev cycle loop is just not there for me but the language is incredible.
Compile times on M1 seem to be fairly nice, yes, according to those who have tested. But it is still perhaps not what one could wish it to be. So far most of the effort in the compiler seems to have gone into making it correct rather than fast.
I think it's pretty clear the GP was asking if the optimizations implemented for x86 that aren't implemented for aarch64 would actually improve performance of generated aarch64 code. It's a question about CPU architecture/microarchitecture. That's a different question as to if the optimizations improve performance of generated x86 code.
For instance, I imagine the x86_64 register allocation does some variant on graph coloring for register allocation, with an additional pass to assign lettered registers (rax, rbx, etc.) to the most heavily used registers, since using higher numbered registers requires a REX prefix byte. In addition, many instructions have more compact encodings when eax/rax is the destination register. At a minimum, excess REX prefixes take up instruction cache space. There's no parallel for aarch64, so there's no sense in implementing logic to try and make sure the low-numbered aarch64 registers are used more. (Though, on 32-bit ARM with Thumb/Thumb2, only a subset of registers are available, so there is a similar optimization for 32-bit ARM targets that support Thumb/Thumb2 when optimizing for space.)
I imagine there are better examples, but my point is that some optimizations are useless on some architectures.
> I think it's pretty clear the GP was asking if the optimizations implemented for x86 that aren't implemented for aarch64 would actually improve performance of generated aarch64 code.
Oh, of course! That must be what was meant. Sorry.
That was (is?) the benchmark for GCC optimizations: an optimization has to pay for itself when compiling the compiler. If the resulting compiled code is faster but the added compilation time is even greater, it's not worth it.
Crystal core developer here. Happy to see Crystal's stdlib held up as an example of how you can grasp a language by looking into stdlib code. That's what I like a lot about Crystal: even its stdlib is easily readable.
Arguably, this is somewhat deteriorating as algorithms get optimized. Readability and performance being in competition.
I agree with some of the concerns from many commenters. For many languages, especially those that have been around for a long time, it's probably not a good idea to look at the stdlib. C++ or Java, no thanks.
But with some languages like Crystal, Go, or Julia, it can be really helpful to look at the stdlib implementation.
All well and good until you find out half of it is written in C or something like that. I like to go out and find the most starred repos on github or just ask around for a library that people think highly of and read that.
C itself is actually a pretty good example of where this fails as well. Its standard library is usually more macros than code. It's necessary if you are going to ship portable code that deals with every conceivable architecture and endianness and permutation of toolchain from 1976 until now, but it's not how most C code looks.
K&R is a much more meaningful guide than unistd.h.
I actually partially disagree with this. Sure, for the higher level OS independent packages (e.g. “archive/tar”) this is great.
For stuff in the “os” package it’s a different matter. All of that code is a bit messy. Not because the authors did a bad job, but simply because it’s hard to implement the same API on half a dozen operating systems.
Then there is the “runtime” package which relies on a lot of global state, and has to work well in cases where heap allocations are not permitted.
Sure you’ll learn how the language works, but you shouldn’t draw inspiration from those on how to write your own code.
I suspect that the part of the stdlib that deals with the OS and runtime will be fairly messy and un-idiomatic in any language. They are inherently hairy problems.
I agree, Golang is a good language for this. Rust, too.
The Python standard library is pretty crufty — lots of stylistically dated code. Same with Perl 5, and I would guess other similar languages but I can't speak from direct experience. I bet that's more a function of their age — back in the 1990s maybe it might have been unreservedly good advice to read those standard libraries.
I've also learned a lot from writing FFI extensions for languages, which for Rust and Go involves a lot of standard library diving but for languages like Python, Perl 5 and Ruby where the core is mostly written in C, it's a different experience than reading the standard library.
This doesn’t seem like a great idea in general, since the standard library is usually low-level code, aggressively optimized for speed, that is not at all typical of applications written in that language.
Emacs comes with a manual called "An Introduction to Programming in Emacs Lisp". It says:
> I hope that you will pick up the habit of browsing through source code. You can learn from it and mine it for ideas. Having GNU Emacs is like having a dragon’s cave of treasures.
I don't think the advice applies for C++ either. The last time I worked with a C++ code base I tried to understand some std:: API semantics by reading the code but I failed every single time. I'm not an expert C++ programmer by any means but I think I've passed the novice stage where you try to learn the language. Maybe for C++ constant learning of the base language is needed :-)
For Go and C on the other hand I think it works very well since the core languages are so simple it's viable to read others code successfully without being a language expert. I read the Go stdlib code all the time if something is unclear in the documentation (the docs are good but sometimes there are edge cases that are not clearly documented).
This is how I approached learning Nim at the time, and it was quite an excellent approach in my opinion. It even gives you the opportunity to contribute to the language in question when you inevitably come across something that isn't optimal, a bug, or even something as small as a formatting inconsistency, which is a nice feeling.
The goals of library authors are frequently different from those of application developers. Library authors usually have to consider how to make their code work in more environments and on older language versions, be more configurable, and handle more edge cases.
I think there’s definitely value to reading a language’s standard library, and I’ve also gotten lots of value from even reading the language’s implementation! But I hope people take this advice to start with reading the standard library with a grain of salt: it can be one extra resource for you, and likely to show you new things that weren’t written in docs somewhere, but if it doesn’t click or is proving difficult to understand, that’s fine!
In particular, for the “reading code to learn” bit, consider reading an application written in that language (command line tool, web application, etc.) and it might serve as a more reduced intro to the language.
I'm not sure how many languages this works well for, but it certainly doesn't work with C. You won't learn the important distinctions between implementation-defined, unspecified, and undefined behavior. And reading a C standard library implementation will not really teach you to write normal, idiomatic C code. You will also not learn the strict discipline of avoiding constructs unless you know with certainty that they are either strictly conforming or defined for your particular set of target platforms (the second question is usually much harder to answer, which is why, when writing C, you should avoid having to).
That being said, for Erlang, I think this is a good idea. I learned a lot when reading the standard library.
Interestingly, my engineering school adopted the opposite approach to practical teaching: the first couple of months after the preparatory years were all about re-building the entire C stdlib and networking layer, then building your own shell, then building your own assembler and corresponding VM (it was an assembler for a quartet-based system, for whatever reason), then a compiler for a random language (Tiger) in C++, and so forth.
We also built a simple CPU from gates using an electronic simulator.
It left me with a solid confidence that I could actually build these things. There's definite value in learning the basics.
Even though it's not about human languages, it is funny to think about how you would definitely learn a human language too if you pored through its entire literary canon.
I say best way to learn is to start coding and break stuff :-)
Also, once one gets more experienced, it's better to decouple the language from programming concepts like object-oriented programming, continuations, and avoiding side effects in functional programming. It's useful to read books like SICP and algorithm books, then learn multiple languages, and you will see they are not that different overall.
I recommend learning Lisp or Scheme or related languages because they have a lot of good conceptual resources to learn from. But maybe that’s my background talking, I loved SICP. I learnt SO much from studying Common Lisp in the last year. Of course it helps that it has a great language specification.
In terms of choosing a language, the standard library matters a lot. Hence the explosion of JS, due to NPM and also being in the browser. I'm surprised Swift is not more popular, given Apple's place in the ecosystem. I really liked programming in Swift, a very enjoyable language. And it has generic functions too!
It can give you some insight into language features, the memory model, etc., but it is unlikely to help you "learn" the language, as in "be effective in using" it.
For instance, in Scala you'll likely need to know the cats-effect or ZIO ecosystem more than the stdlib. Likewise for Rust, where futures and the async stack (Tokio, async-std) are outside the stdlib.
lol this advice might not apply as much to Scala. All the CanBuildFrom stuff was overwhelming to look at the first time, and a lot of things in the stdlib are implemented in a way that’s not idiomatic in app-level Scala. They’ve simplified things a bit and made it better since I was baffled by it all long ago though.
Reading the Python stdlib was a brilliant move a peer encouraged me to do. I learned three themes of stuff:
1. A good look at long lasting, durable pure python.
2. A good look at long lasting, durable C implementations of python (dict is the core of Python. Read the source!)
3. A look at a bunch of libraries that basically never get used and a sense of how they compare to popular third party libraries. (Doing this is why I’ll never complain when a language’s stdlib is small. I get it now.)
What I learned is that once a library is in the stdlib, it's usually stuck being maintained for a long time. And then you've got all these libraries that see no use because they were ill-conceived or better ideas came along.
I don't think stdlibs should be barren, but I completely get the ideas of languages like Rust that want to include only the most obvious of candidates and leave the rest for the community to organically evolve.
I've just discovered this approach with Python. I usually used Google and cookbooks, but I often found myself in a position where I thought 'surely there must be a better way to do this'. Reading the standard library really helped me understand how to do some things. The best way I found to reinforce what I learnt was to rewrite some code with minimal imports.
To learn a language, read and understand its specification. Then learn the tooling and libs that come with the compiler. Then just start reading other people's source from wherever while you write your own stuff. Reading the standard lib is helpful, as is reading the source of any framework, but you have to know the language spec well for it to make sense.
In C#, the standard library is awesome, and indeed a good starting point for people learning the language.
On the other end of the spectrum there’re languages like C++ with these horrible templates, or Rust with tons of unsafe code. Adopting patterns or coding style from these standard libraries is not the best idea.
I think this applies to a lot of frameworks, especially web frameworks. This has been my experience with Express.js, Angular 2, and jQuery. They may look like they are doing a lot of magic, but underneath it's just simple code, and once you've read it, it's easier to write optimised code.
This works really well for certain languages, and I have done it many times over the years. The first I can actively recall was Turbo Pascal some years ago, which, ironically enough, I was just tweeting about on Friday along with other just-get-started thoughts: https://twitter.com/DotDotJames/status/1451656005155176454
Reading these things is good advice! But it's distinct from the advice given in the article, which is about reading libraries close to the heart of the language that are written in that language itself.
JS is a bit odd here, possibly for good reasons (its ubiquity plus the dynamic nature of the language itself may mean runtime considerations override any degree of self-hosting), but it basically means JS has no standard library in the usual sense. So the question becomes: what are some masterful libraries written in JS?
I learnt more about C and C++ from studying Plauger's The Standard C Library and Plauger/Stepanov's The C++ Standard Template Library. I also remember learning Object-Oriented Framework design from MFC Internals.