I wish this didn't have so many exclamation points in it! There are literally 41 (!!!) exclamation points in this short blog post! It is exhausting to read! Please consider not using so many! There should be ways to convey enthusiasm about a topic without peppering your writing with exclamation points!
We're discussing a pretty solid technical article, one deeper than the typical HN programming article, and this is a very superficial criticism. It's like you're proudly advertising that you're so disengaged with executable file formats that you can be derailed by idiosyncratic punctuation.
I'm actually very interested, which is why I bothered to comment in the first place. I had to download the HTML of the article and search and replace all exclamation points with periods to be able to read it.
Why? It's true. 90% of programming articles lately are "Check out my port of <some marginally interesting library> to Go!" Julia does awesome work and is kind enough to break things down and share them with us.
Well, it's true that the average HN article's level may be somewhat low, but this one is just telling us things that every first-year CS student should know, without actually going into details. "ELF is a file format like any other, and you can read it" — I hope everybody knew that, right? "There exist static and dynamic linking" should be quite widely known as well. But when it comes close to something actually interesting — "_start <..> does a bunch of Very Important Things that I don’t understand very well, <..> so I won’t explain them." What's the point then?
I'm not saying that the article is bad; actually I believe it can be useful for somebody, because there are people who don't know what a computer program is even to this extent. I'm merely surprised that somebody presumably competent would call such a superficial article "deep".
2/3rds of all bad nerd programming message board comments seem, for some reason, to start with "every 1-year CS student should know...".
No, they don't. Thinking that they do suggests to me that you're still in school (at least emotionally), because technology is vast, everybody specializes somehow, and not everything you crammed for in CS 100 remains memory resident and actionable once you're actually working.
I work in software security, where file formats are especially relevant; I write a new debugger an average of once a year. This was a good post. And file formats are in fact arcana to most working programmers, as I've learned by actually having conversations with working programmers that touched on how executable loaders work.
First-year computer science sounds like a lot of work. I've been told, at various times, that every first-year CS student should know Java, Python, C, assembly, data structures, algorithms, operating systems, and compilers.
I guess the second through fourth years are spent sleeping, to make up for the sleep debt from that first year?
Anyway, CS students may very well be polymaths who shoot lightning from their fingertips, but those of us who spent our wayward college years solving the Schrödinger equation and eating pizza are happy to read a fun article about the executable formats that we've never really taken the time to look into.
I know that you work in software security; that's why I'm surprised in the first place. Otherwise I wouldn't have commented at all, because, as I said, I don't think the article is bad or good. It's superficial, with nothing new in it, but it's OK to write superficial articles, because they are useful to somebody as well. The word "deep" is what worried me, not the article itself.
It's perfectly natural that not everyone knows exactly how executable loaders work. The same goes for what you called "file formats". But that article isn't about that, because, as I said, all the interesting parts are skipped. All it reveals is that the computer executes certain files, that they are not arcane magic inside, and that they can be read and decompiled (which isn't even always true, but the author doesn't mention that either, which is natural). Static and dynamic linking are terms everyone should know as well, because even if you use only interpreted languages like Python, once in a while you'll find some library that requires manual compilation and run into all these nasty problems, like required library versions that don't match. Or you might notice that the same Qt app takes a lot more space on disk when compiled for Windows than when compiled for Linux, and ask yourself why. So it isn't something that only specialists know; it's a basic fact about how computers work.
I don't know about today, but 15 years ago you couldn't be considered a programmer without knowing at least that much. It's surprising to hear that it is considered deep.
I know what you mean. I think any person with proper CS knowledge should know these things, sure. What's nice about this article is that it is 1) short, and 2) gives people who roughly know how executables work an introduction to tools like readelf and really basic things like objdump. This is not a deep article, at all, not even in the ballpark. But it can give someone a short bite so they can decide whether they want to explore the deeper references, presumably linked to by the article.
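For anyone who wants that short bite right now, the basic invocations look like this (using /bin/ls as a stand-in for any ELF executable you're curious about):

```shell
# Poking at an ELF binary with standard binutils tools.
readelf -h /bin/ls              # ELF header: magic, class, type, entry point
readelf -d /bin/ls | head       # dynamic section: the NEEDED shared libraries
nm -D /bin/ls | head            # dynamic symbol table (imports/exports)
objdump -d /bin/ls | head -n 20 # disassembly of the code sections
```

Each command reads the same file; they just decode different parts of it, which is the article's point that an executable is a file format like any other.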
> I don't know about today, but 15 years ago you couldn't be considered a programmer without knowing at least that much. It's surprising to hear that it is considered deep.
60 years ago, you couldn't be a programmer without knowing the exact binary code of every opcode your machine executed, and how all of the peripherals worked at the lowest possible level.
Back then, of course, all programs were trivially small to fit in hilarious amounts of memory and mass storage, and graphical programming was a specialized topic, to put it mildly. Networking in the form we know it now flatly didn't exist.
I'm not convinced that the amount of knowledge programmers know has changed, but the kind of knowledge surely has.
That makes me think of the 1986 vintage Mac SE sitting on one of my tables, more as a decoration than anything. It's new enough that it's almost sorta kinda possible to get it on the internet, yet also old enough that it's somewhere between an adventure and a PITA to do so, and pretty challenging to do anything vaguely useful with it once you do.
I don't know as much about the nitty-gritty details as I'd like, but it's damn cool that I can now write 1 line of Ruby that fires off a query at a web server somewhere, gets a JSON reply, parses it into a Hash, and delivers it back to me, doing roughly a kazillion hugely complex things along the way. It lets us all spend a lot more time building things that are useful to customers instead of scrounging around with bits, fun as that can be sometimes.
> Well, it's true that average HN article level may be somewhat low, but this one is just telling stuff that every 1-year CS student should know without actually going into details.
Based on the hundreds of programmers I've met with CS degrees, I would say that 90% of them have probably never even written C; most write Java, to which this still applies, I suppose. And of the other 10%, I doubt any of them had the curiosity to use a tool like readelf to understand symbols and static linking. Given that, I highly doubt most first-year CS students have any idea what this article is even about, much less care.
If everyone in the world shared your attitude about sharing knowledge, the world would have one smart person and a bunch of morons. Get off your pedestal.
In my experience, no, you can't assume a first-year CS student knows that. You can't even assume CS grads who've been working for 10 years know that, or remember it if they did. You're right that it isn't rocket science, but it's more down in the weeds now than ever before.
As somebody who works on operating systems, I share your disappointment that it's not common knowledge, but it really is trivia insofar as it's related to the sorts of things that most working programmers do these days.
However, I do not share your disappointment that somebody had the nerve to write a perfectly fine article about something you already knew about. How dare they! How will you ever deal with the shame you feel on their behalf? These are truly tragic days we live in. :)
Actually, no. First-year CS students shouldn't need to know about ELF. They should know about:
1. basic data structures (variables, arrays, simple binary trees)
2. static control flow (sequential execution and control structures)
3. dynamic control flow (the call stack and how exception handling works (in C: setjmp()/longjmp()))
4. basic program structuring and hygiene (functions, named constants, picking good names)
Focusing on that instead of the details of machine-level knowledge is what separates CS from IT; we need both, so we should not try to make our CS programs bad copies of our IT programs.
> I think ideally a first-year CS student has been programming and learning prior to freshman year, is probably what the parent meant.
It's nice when people come in to a class warm, as opposed to completely cold, but it's bad pedagogy to assume specialized knowledge beyond what's explicitly listed in the course prerequisites.
I like them. I'd imagine that's how I'd feel as I hit those "a-ha" moments while demystifying something. There are so many because the article is only covering those moments, which is another nice feature.
(Here we are writing the list constructor, cons, in infix notation, rather than the prefix notation customary in Lisp.) '[1,2,3]' is shorthand for 1 : (2 : (3 : [])) in Haskell.
Thanks, I appreciate the walk through. So it's essentially just copying an array (in this particular example). I'd still argue that the syntax is not natural - since you have to have specific knowledge of the operator and the function.
Syntactic naturality seems like it's mostly a function of familiarity. That said, the example has, for a lot of mathematical reasons, a great deal of semantic naturality.
It's far from immediately obvious, but `foldr (:) []` is the way to copy singly-linked lists. In particular, if you look at a linked list as a degenerate tree then what `foldr f z` does is replace all of the nodes with `f` and the single leaf with `z`. Since cons, i.e. (:), creates a node and nil, i.e. [], is that single leaf then `foldr (:) []` is a very natural way to do nothing at all.
So it's kind of the most boring interesting example imaginable.
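The "replace the constructors" reading above can be written out directly; a small Haskell sketch (the names copyList and sumList are just illustrative):

```haskell
-- foldr f z replaces every (:) in the list with f and the final [] with z:
--   foldr f z (1 : (2 : (3 : [])))  ==  1 `f` (2 `f` (3 `f` z))
-- So with f = (:) and z = [], every node is rebuilt unchanged:
copyList :: [a] -> [a]
copyList = foldr (:) []     -- copyList [1,2,3] == [1,2,3]

-- Swapping in different replacements gives the other familiar folds:
sumList :: Num a => [a] -> a
sumList = foldr (+) 0       -- every (:) becomes (+), the [] becomes 0
```

Seen this way, `foldr (:) []` being the identity on lists is the degenerate base case of a whole family of list consumers.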
A "well actually" occurs when a person is splitting hairs. This can be desirable, for example when you're trying to obtain a complete understanding of something. But in normal conversation, such pedantry is often annoying. The "well actually" policy, as I understand it, is an attempt to make such hair splitting "opt-in" rather than "opt-out".
The article argues against overly formal approaches, but I actually think it errs too much in that direction. When I'm studying a math thing, at some point I need you to shut up with the English words and simply describe the object in terms of sets, functions, relations and properties. Categories stopped being mysterious for me once I realized they were just multigraphs with a certain algebraic operation defined on edges (composition) such that edges with certain properties (identities) were stipulated to exist.
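In that spirit, the bare-bones definition fits in a few lines (this is the standard textbook formulation, not anything from the article):

```latex
A category $\mathcal{C}$ consists of:
\begin{itemize}
  \item a class of objects $\mathrm{Ob}(\mathcal{C})$;
  \item for each pair of objects $A, B$, a set of morphisms (edges) $\mathrm{Hom}(A, B)$;
  \item a composition $\circ : \mathrm{Hom}(B, C) \times \mathrm{Hom}(A, B) \to \mathrm{Hom}(A, C)$
        that is associative: $h \circ (g \circ f) = (h \circ g) \circ f$;
  \item for each object $A$, an identity $\mathrm{id}_A \in \mathrm{Hom}(A, A)$ with
        $f \circ \mathrm{id}_A = f = \mathrm{id}_B \circ f$ for every $f : A \to B$.
\end{itemize}
```

Objects and morphisms are the multigraph; composition and identities are the algebraic structure on its edges.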
Ha, I actually read his last sentence as sarcasm initially and thought "haven't I heard this joke..." only to find that you had already referenced the one I was thinking of:
"A monad is just a monoid in the category of endofunctors, what's the problem?"
I didn't mean to imply it was trivial to understand, but it's no more difficult than the formal definition of, say, a derivative (which involves a limit, and hence an epsilon-delta construction, which in my experience are not so easy to fit in your head at first). Understanding what a multigraph is should be doable.
I suspect that that explanation only works on people with a fairly specific background. I think the hope of the article is to target a wider range of people who could benefit from understanding the formalism.
Just as it would be unreasonable to require that anyone who drives a car roughly understands how an engine works, it would be similarly unreasonable to require that anyone who uses the internet roughly understands how addressing works. It is of course true that your car ownership experience will be greatly enriched by understanding roughly how an engine works, and that your internet experience will be greatly enriched by understanding how addressing works. But neither should be a prerequisite.
You underestimate how subtle the concepts underlying addressing are, probably because, like many technical people, you have understood how URLs work for so long that you can no longer remember what it is like to not understand them.
Each one of these is a completely different implementation detail which the average user doesn't care about and, honestly, won't necessarily understand without understanding the underlying technology behind these sites.
Remember: most users barely understand, if at all, what a browser is! And if you want to see a comparable challenge, try to explain to someone just one thing: why some sites have www in front and some don't.
The parent's sentence was really great:
"You underestimate how subtle the concepts underlying addressing are, probably because, like many technical people, you have understood how URLs work for so long that you can no longer remember what it is like to not understand them."
> Try and explain to the average user why URLs on HN look like this:
That's easy. "The information following the domain name is used to route your request to the appropriate destination".
> Each one of these is a completely different implementation detail which the average user doesn't care about and, honestly, won't necessarily understand without understanding the underlying technology behind these sites.
Why do they have to understand the "underlying technology" at all?
If you think physical addresses are simpler, you'd be wrong:
Sgt John Smith
Headquarters Company
7th Army Training Center
ATTN: AETT-AG
Unit 28130
APO AE 09114-8130
or:
Mr John Doe
CMR 333 Box 2345
APO AE 09903-0024
or:
John Doe
C/O Acme, Inc.
STE 12
123 Main St NW
Placename, State 12345-1234
or:
Jane Doe
P.O. Box 562
Placename, State 12345
or (this is dual addressing, guess what it means? it doesn't mean the P.O. box is at 123 Main St NW):
Jane Doe
123 Main St NW
P.O. Box 562
Placename, State 12345-1234
or:
Don Johnson
Professor, GIS Studies and Internet Arguments
UCIA Computer Science Department
5th Floor Rockefeller Building East
C/O UCIA
12345 Main Address Street
Placename, State, 12345-1234
These are complicated, and we haven't even delved into common abbreviations, street layout consistency, relative addressing, or, god forbid, international addressing.
Yet somehow people write, address, and successfully send mail, every single day. They get in their cars and navigate the interstates and the weird street grids and one-way streets that they're unfamiliar with, and eventually wind up at the right place.
Or, they plug the address into their GPS and get there, without ever really understanding how the GPS performs the routing, just that it does, but still fully cognizant of what the addresses mean, even if they don't understand the system under which they were allocated.
I don't think your argument holds water; not even a little.
Well, you're absolutely right that I don't understand most of those addresses.
You're right that I could still send them mail, though, in exactly the same way people use URLs: by rote-copying them, then letting the technology (or mail system) work its magic. I don't need to understand your P.O. box example to get it working.
> Try and explain to the average user why URLs on HN look like this:
>
> news.ycombinator.com/?id=123123
Easy. All most people need to get out of that is the fact that there's a domain name there, and something to make each page unique. I suspect most people would also quickly recognize that each page has its own number, similar to street addresses or the serial number found on just about everything these days.
They don't have to actually parse it as a query string. The fact that some URLs reveal a lot more information (like your CNN example) is a bonus.
> Remember: Most users barely understand, if at all, what a browser is!
Remember: 14% of adults[1] in the U.S. are illiterate.
Nobody said that we have everything solved. That doesn't mean we should give up and pretend the problem doesn't exist by saying that "most people don't need to read".
I'm very surprised at that statistic, so thanks for bringing it to my attention.
"They don't have to actually parse it as a query string. The fact that some URLs reveal a lot more information (like your CNN example) is a bonus."
Well, what you're saying is exactly what Chrome is doing - make people only care about the domain, don't bother with the rest of the information as it's "only there to make a page unique". What exactly are we disagreeing on?
I think the grandparent's point was that the part after the domain serves simply as identification of the information that is requested from the domain, just like the foo in foo@bar.com identifies the user at that email domain. The user doesn't need to understand the particular implementation of the identification, just the principle "same string, same page". This matters for understanding that URLs can be copied and used as links on the web. I've seen people not understand this: e.g. someone uploading a video to YouTube, then, when I suggest adding a link to another video or to their Facebook page, not knowing that this can be achieved by copying the URL of the relevant page out of the address bar. This is the missing internet literacy, and it seems unlikely that moving away from the "URL is just a piece of text" principle would make the understanding easier.
Sadly, the simple principle of "same URL, same information" breaks down somewhat because of cookies (and, to a lesser extent, IP localization and user agent). http://facebook.com/ shows completely different information depending on the logged-in account, and in fact the same person mentioned above was aware that a URL without a path part is the home page of that domain, and was thus expecting that http://facebook.com/ would show the same information to everyone. I'm not sure whether he/she really believed that the whole world could see his/her posts, but it surely is a bad move by Facebook; it actively breaks the premise of URLs and is thus arguably damaging to internet literacy (and perhaps they are purposely doing it to mislead some people into a false sense of personal importance).
I agree with the first point, but it breaks down for me when you suggest that the principle of "internet literacy" is more important than making things easier for the user. I would not want the people building my car to have never invented the automatic transmission because they decided it would remove me from understanding how a car works.
The difference is that understanding what a gear is isn't essential to the functioning of a car at all. You could completely remove that knowledge without loss either for that particular user or for the community of car users.
Whereas not knowing that a URL is a piece of text which specifies a particular content on a domain and can thus be copied and used for linking is a loss both for the user and for the community.
You'd have to reinvent the principle of linking in another form to avoid the loss (e.g. every web and local app would need special GUI functionality to use instead, and the need to specify what to link would not be completely removable, unlike the gear).
A misstep with URLs is that they're percent-encoded when shown in the address bar. If spaces etc. were just shown as-is, they'd instantly become more readable. Given that possibility, you might find that developers would drop a lot of the noise like hyphens (which are really a hack to get readable spaces into URLs anyway).
"Try and explain to the average user why URLs on HN look like this"
You're missing the point: you don't have to understand that any more than you have to understand why my street number is 4 digits long or my street name ends in "street" instead of avenue, crescent, drive, lane, etc.
Wait, it's not clear to me from the blog post. Did they make a system that obsoletes reCAPTCHA? If so, it's just a matter of time before the spam systems catch up, correct? If so, what's the successor to CAPTCHA? Or is the web just going to be full of spam in the future?
Digging into the journal article, the technique they use can only scale to CAPTCHAs with 8 characters or fewer, so having a longer word is a simple fix.
I'm not sure what "fit to be a developer" means. People (with very few exceptions) are not born with labels that say "able to write software professionally" or "not able to write software professionally". Try to understand what has prevented you from landing jobs so far. Go through http://sijinjoseph.com/programmer-competency-matrix/ to determine what you lack, and then work on improving those. (Not saying that that link is the ultimate criteria for determining one's suitability as a programmer, just that if you really don't know what you're missing, it will give you some ideas)
I don't like the part about "no circular dependencies." If the language supports them, as C# does, then why not? There are cases where you want class A to use a B and class B to use an A. Unless that's not the kind they're talking about.
I've tried reading Rudin a couple times, but it's a bit of a slog. There are easier analysis texts. Abbott's Understanding Analysis is good, though a bit basic. I'm currently reading Terence Tao's Analysis I, which is very good if you're in the right frame of mind for it. The first 150 pages are spent building the real numbers from scratch, starting with set theory and the Peano axioms. You successively construct the naturals, the integers, the rationals, and then the reals (as equivalence classes of Cauchy sequences of rationals). It's fun to see how the sausage is made, but I can also admit that when I was just starting out in math I might have found this book unbearably tedious.
Find a book about C++. Read as much of it as you can. Write some code. (solving a few Project Euler problems might be worth it as long as you focus on writing dumb brute force solutions). Prioritize the parts of the book that cover things you think will be on the test.
60 pages before defining vector spaces seems like a not so good approach to linear algebra. Since Linear Algebra Done Right has already been recommended, I will suggest Linear Algebra Done Wrong: http://www.math.brown.edu/~treil/papers/LADW/LADW.html
I know; I would have liked linear transformations and vector spaces moved forward too, but without the computational skills (grunt work) you can't really do much with vector spaces and linear transformations except define them.
Then just define them! The entire subject is about linear transformations on vector spaces. The biggest problem with linear algebra as it's taught today is that students uniformly come out with an understanding that linear algebra is about matrices. It should be made clear that a matrix is simply a representation of a transformation, and then when you do teach the grunt work operations, you can teach them as what they actually are (multiplication: composition of transformations, determinant: area of the unit square under the transformation, etc.)
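Those two claims (multiplication is composition, the determinant is the area scaling factor) are easy to check by hand. A small Python sketch with 2x2 matrices as plain lists (the helper names apply, matmul, det are just illustrative):

```python
# A 2x2 matrix is a representation of a linear map of the plane.
def apply(M, v):
    """Apply the linear map represented by M to the vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def matmul(A, B):
    """Matrix product: represents doing B first, then A."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(M):
    """ad - bc: the (signed) area of the image of the unit square."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[2, 0], [0, 3]]    # scale x by 2, y by 3
B = [[0, -1], [1, 0]]   # rotate 90 degrees counterclockwise

# Multiplication is composition of transformations:
v = (1, 1)
assert apply(matmul(A, B), v) == apply(A, apply(B, v))

# The determinant is the area scaling factor, so it multiplies
# under composition: det(A) = 6, det(B) = 1, det(AB) = 6.
assert det(matmul(A, B)) == det(A) * det(B) == 6
```

Taught this way, the grunt-work rules stop being arbitrary: the odd row-by-column multiplication formula is forced on you the moment you decide matrices should represent maps and multiplication should represent composition.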