Graydon Hoare: 21 compilers and 3 orders of magnitude in 60 minutes (lambda-the-ultimate.org)
205 points by eternalban on Sept 9, 2022 | 50 comments



That's the third Hoare for me today.

1. Tony Hoare - "I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years." Null References: The Billion Dollar Mistake - https://www.infoq.com/presentations/Null-References-The-Bill...

2. C.A.R. Hoare - "The most important property of a program is whether it accomplishes the intention of its user."

3. Graydon Hoare - "Rust started as a personal project (on my own laptop, on my own time) in 2006. It was a small but real 17kloc native compiler for linux, mac and windows by the time Mozilla began sponsoring it in 2009-10."

EDIT: C.A.R. = Charles Antony Richard = Tony. Not to be confused with his twin brother C.D.R.


> C.A.R. = Charles Antony Richard = Tony. Not to be confused with his twin brother C.D.R.

I thought you were making a Lisp joke, and I Googled for Cdr Hoare.


2. C.A.R. Hoare - "The most important property of a program is whether it accomplishes the intention of its user."

This is something I am guilty of too often. I get lost in fixing the minute details and edge cases that might only become reality under very special circumstances.

It is good to step back once in a while and focus on whether the thing you are working on actually has an effect on the user experience or user interaction with your product.

Don’t lose sight of the big picture for all the small edge cases


The slide on V8 seems to give the impression that V8 just changed "IRs", but in reality, all of the compilers over the years are so different from each other that they really are totally different systems. V8 has been a microcosm of execution strategies and has completely revamped itself multiple times.


The slides seem like a history talk or a pep talk. I didn't really get anything from them, and I've written a non-trivial compiler, so I would understand the details. This seems like it'd be a fun talk to listen to. A quick search doesn't turn up any audio or video online.


"On March 26, Graydon Hoare, the original creator of the Rust programming language, stopped in to speak about compilers to some lucky University of British Columbia students in the school’s introductory class to compiler construction.

"With the aspiring compiler designers of tomorrow in mind, Hoare’s talk spanned the history of building compilers for programming languages (He didn’t record the talk, so we have the slides to go by). He told the students he wanted to demystify that space “between class projects and industrial compilers” to “reduce terror, spark curiosity, encourage trying it as a career.”

https://thenewstack.io/rust-creator-graydon-hoare-recounts-t...


It's a shame the maru compiler didn't get a mention: self-hosted (x86) in 1800 LOC, bootstrapped from C-maru compiled with gcc or clang, 70% of the speed of C with gcc.

Current maintainer is Attila Lendvai https://github.com/attila-lendvai/maru/tree/piumarta


I often look at C codebases and think "what an unreadable mess!" Most of that is probably because of my own shortcomings, but occasionally I see a codebase like this and realize it doesn't have to be that bad!


Some items of interest mentioned in the talk:

SRI-ARC: https://archive.computerhistory.org/resources/access/text/fi...

Frances Allen (RIP): https://news.ycombinator.com/item?id=24066832


Slide #21 mentions "your textbook has 3 implementation flavours. Java, C, ML." That's probably https://www.cs.princeton.edu/~appel/modern/ then.


> Use compiler-friendly languages, by which he is really talking about languages that are good for implementing compilers, like Lisp and ML

why?


You can easily match over structures, reaching into ASTs to any depth, which makes writing certain optimizations easier. As a very contrived example:

  match expr with
  | Mul (Const x, Const y) -> Const (x*y)
  | Mul (x, Const 2) -> (* optimize as a left shift *)
  | ...
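
To make that compile, here is a minimal sketch assuming a toy expr type with a made-up Shl (shift-left) constructor (not anything from the talk):

  type expr =
    | Const of int
    | Mul of expr * expr
    | Shl of expr * int   (* shift left by a constant number of bits *)

  (* One rewrite step: constant-fold, or strength-reduce *2 into a shift. *)
  let simplify = function
    | Mul (Const x, Const y) -> Const (x * y)
    | Mul (x, Const 2) -> Shl (x, 1)
    | e -> e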


It might not be a great language for creating compilers, but Python also has pretty good structural pattern matching as of version 3.10:

  class BinOp:
     def __init__(self, left, right, operator):
         self.left = left
         self.right = right
         self.operator = operator

  sum_of_ints = BinOp(left=1, right=1, operator='+')

  match sum_of_ints:
     case BinOp(left=int(left), right=int(right), operator='+'):
         print(f'Summing int {left} + {right}')
     case BinOp(left=str(left), right=str(right), operator='+'):
         print(f'Concatenating strings {left} + {right}')
     case _:
         print('Don\'t know how to sum this')


One shortcoming (though it's nice that Python has this) is that the match isn't checked for exhaustiveness by the compiler (because there isn't a Python one), which makes it easy to forget cases and introduce bugs.

Not sure if there are any runtime checks for this?
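
For contrast, here is a minimal OCaml sketch (toy type, not real compiler code) of the static exhaustiveness checking mentioned elsewhere in the thread: the missing case is flagged when you compile, not when you run.

  type op = Add | Sub | Mul

  (* The compiler warns that this match is not exhaustive
     (the Mul case is missing), before the program ever runs. *)
  let eval op a b =
    match op with
    | Add -> a + b
    | Sub -> a - b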


And look at the difference in verbosity. Lisp is to Python as Python is to Java.


In addition to the pattern-matching rwmj pointed out, for ML I'd add strong static typing (because compilers are bug-prone!) with sum types, statically checked pattern matching, implementations that generate efficient code (because compilers are often slow), automatic memory management (usually GC), and duck typing in the form of higher-order functions. Lisp shares the last two of these.

Both ML and Lisp make it pretty easy to write out complex domain-specific data structures (like ASTs) as part of your code, but I haven't found that as valuable as you might expect when writing toy compilers.

Pattern-matching, though, that's super important. Compilers are full of case analysis of sum types, and not just in the sort of optimization transformations rwmj mentions, though in a sufficiently advanced compiler most of the code will probably be in the optimizer.

(For these purposes Python is basically a slow Lisp.)


Big question. Some glittering generalities:

1. These languages, having been used many times over to write compilers, have been battle-tested.

2. These languages do a good job of making immutable data structures easy to reach for and use, with escape hatches for more performant, mutable structures if need be; it's easier to reason about correctness when things are immutable by default. Correctness matters a lot for compilers.

3. Compilers are, in some sense, a bag of functions that iterate or map over trees. Pattern-matching is a way to reify trees: the branches of the trees become branches of your match expression. These languages support pattern-matching in a pervasive, first-class sort of way, as they prefer recursion and pattern matching over other kinds of control flow. (A sketch of this follows the list.)

4. ML-likes have an advanced type system, while still being relatively ergonomic to write code in, as the type system admits some form of Hindley-Milner type inference. Good for correctness; but not strictly necessary for a compiler, maybe.

5. Lisp-likes don't generally have a type system (although some do!). But Lisps have extreme dynamism in the form of macros. Being able to tersely write code that expands out to exactly what you need can carry the day too. Imperative languages, by contrast, sort of strike a compromise: their type systems aren't all that great (although C++, C#, and Java have all improved their type inference in the last decade), nor do they have anything as powerful as Lisp macros.

6. People and culture. Compiler people are generally forged in academia, where the subject is usually taught in some Lisp or ML derivative. Type theory papers are usually written in some Haskell-like language. Implementing a GHC extension is a path to a degree for a grad student.
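
Re: point 3, here is a minimal sketch (toy types, nothing from the talk) of a pass as "a function that maps over trees": the match arms mirror the shape of the tree, and each arm returns a new immutable node.

  type expr =
    | Num of int
    | Neg of expr
    | Add of expr * expr

  (* A whole "pass" is just a recursive map: rebuild the tree,
     folding constants wherever the children are already known. *)
  let rec fold_consts e =
    match e with
    | Num _ -> e
    | Neg x ->
        (match fold_consts x with
         | Num n -> Num (-n)
         | x' -> Neg x')
    | Add (a, b) ->
        (match fold_consts a, fold_consts b with
         | Num m, Num n -> Num (m + n)
         | a', b' -> Add (a', b'))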


I expect the former because sexprs give you AST structure for free, and the latter because pattern-matched dispatch on AST nodes is ubiquitous.


Yeah, compare to C where you’d start by making a struct to implement a tree node.


Does it cover JIT compilation? Or modern runtimes with concurrent garbage collection? What are good resources?


It covers JIT, but not runtime. There is The Garbage Collection Handbook, but yeah, I would love to read similar slides for GC.


This is excellent, thanks for sharing!


You're welcome. Also wanted to give HN a heads up that LtU is back up.


For those who aren't aware, in addition to inventing Rust, Graydon also wrote Monotone, the source control system whose internal design Git is based on. Interpersonally he can be prickly sometimes but his opinions are well worth listening to.


> Monotone, the source control system whose internal design Git is based on

Huh? They both use content-addressing (identify everything by its hash) which leads to the nice property that history is cryptographically immutable, but at least Mercurial also does (and is also older than Git).
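
(As a rough sketch of what content-addressing means here, and not of either system's actual object format: an object's identity is the hash of its bytes, so nothing in history can be altered without every identity downstream of it changing too.)

  (* Toy content-addressed store: the key is the hash of the content.
     MD5 via OCaml's Digest, just for the sketch; the real systems use SHA-1 or stronger. *)
  let store : (string, string) Hashtbl.t = Hashtbl.create 16

  let put content =
    let id = Digest.to_hex (Digest.string content) in
    Hashtbl.replace store id content;
    id

  let get id = Hashtbl.find_opt store id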

But other than that, they're fairly different.

The concept of what a branch is is fundamentally different. In Git, it's a pointer to a revision, and it has a home (it's a mutable scalar that a URI could point to). In Monotone, a branch is "all revisions - whether or not you know about them - that have a certificate that you trust that says they're in that branch".

(I describe that as Monotone being distributed as in Usenet, while Git is federated as in Email.)

Monotone treats a file as an object (so it can be explicitly renamed or deleted), where Git only has "this tree has a file at this location" and then can infer renames or deletes or creates (or copies or splits or whatever).

Plus of course there's the whole "monotone uses SQLite" thing.


This is unfair. I worked with Graydon years ago and he was always pleasant, helpful, and interesting. I wish I worked with more people like Graydon.

Regardless, you are right to point out two of his contributions that have left an indelible mark on our industry.


I want people who don't wish they worked with more people like Graydon to read the slides anyway instead of replying to my comment with "well actually Graydon is the kind of person who..."

I'm glad to hear you got along well.


> Graydon also wrote Monotone, the source control system whose internal design Git is based on.

Huh. I wonder if that played a part in Torvalds's being open to Rust in the Linux kernel.


It's an interesting question, but I suspect that the technical attributes of Rust as it exists today play a much larger part in Torvalds's choices than does its origin many years ago.

In particular, Rust is less error-prone than C, rather than more so (as is arguably the case for C++), and it does not require the importation of a large runtime into the kernel. And, unlike for example Lua or Scheme, a number of kernels have already been written in Rust, some of which are already fairly full-featured: https://github.com/flosse/rust-os-comparison


Also, Rust's semantics are more amenable to formal verification. This might have major implications in 5-10 years' time.


Also the core code (in C++) of the Stellar consensus-based cryptocurrency.


> Interpersonally he can be prickly sometimes

Aren't we all? What's the point in saying this? I find him quite pleasant.


I want people who don't find him quite pleasant to read the slides anyway instead of replying to my comment with "well actually Graydon is the kind of person who..."

I'm glad to hear you get along well.


I think it’s a bit unfair to many people to try to normalize prickliness by saying everyone is.


How much choice do you think prickly people have in who they become?

If I'm an asshole, and I recognize that I'm an asshole, but I'm either unable or unwilling to change, would you kill me if I asked it of you or society at large? I certainly don't want to be this way, but find I have little choice in the matter. Any signs of mercy for those of us who are born to suffer?


(to be clear, I did find your post funny, so this is friendly feedback.)

> How much choice do you think prickly people have in who they become?

It boils down to whether you think being an asshole is congenital and impervious to efforts at self-improvement.

Personally I think we humans all hunger for love and acceptance, and a generous dose of love is alchemical: it can transmute the lead of assholeness into the gold of beautiful conduct. So next time you see an asshole, give them a hug.

> I have little choice

As a last resort, you could always self-isolate. :}


I’m saying not to dismiss prickliness by declaring that everyone is. Everyone isn’t. Some, many, most? are, sure.

“There’s no point singling out people who steal bulk bin candies. Everyone does it.” is the sentiment that sits wrong with me.


Virtually everyone is prickly when suitably provoked. Some people are just more easily provoked to prickliness than others. Some of it is probably just temperament, but for celebrities in the tech space, I wouldn’t be surprised if they have to deal with an aggravating number of annoying people and that sours their disposition.

All that is a long way of saying have some empathy at the very least and some sympathy if possible.


What if, instead, somebody could offer you a path to change, as unlikely as it may seem? How unwilling are you?


I wish people would find a new word, instead of persisting in using gutter English. Not just back alley, but gutter.


The word “prickly” is neither back alley nor gutter English. The relevant dictionary definition is “ready to take offense” or “easily irritated”. Here’s the etymology: https://www.etymonline.com/word/prickly


I was referring to “asshole”.


Did somebody wake up on the wrong side of the bed?


Maybe reading the literal words, we are all a little prickly sometimes. The intent is probably a polite way of saying "he is frequently an asshole for no apparent reason".


No, if I had meant that, I would have said it that way, probably because (much as I wish it were otherwise) I am frequently an asshole.


You’re being a bit prickly about this.


Ha!


Oh, I had no clue he was behind Monotone, nor that Git lifted its internal design.

Have you both collaborated on something?


Linus was hanging out in the monotone list for a bit and was a little frustrated with the performance at the time. https://lwn.net/Articles/132161/


interesting bit of history

also reread about https://en.wikipedia.org/wiki/Cogito_(software)

it's funny how, like a few other projects, the original high-level interface (cogito) got canned and the low-level one ended up as the main one. I've read this about apt (which was a sub-project of a big Debian graphical package-management effort) and Lisp too.



