
"Define errors out of existence" might sound like "make illegal states unrepresentable," it's actually not. Instead it's a pastiche of ideas rather foreign to most FP readers, such as broadening the space of valid inputs of a function. One of his examples is changing the substr function to accept out of bounds ranges.

You might be interested in my review. I'm a Haskeller at heart, although the review draws more from my formal methods background. Spoiler: his main example of a deep module is actually shallow.

https://www.pathsensitive.com/2018/10/book-review-philosophy...



Does Ousterhout actually say modules must always have a longer implementation than their spec, or just that this is a generally desirable feature?

If he did, I agree with you, he was wrong about that. I also agree that the unix file API is probably not a good example.

But whether or not he did, I think the dissection of edge cases would be better off emphasizing that he's got something importantly right that goes against the typical "small modules" dogma. All else being equal, deeper modules are good--making too many overly small modules creates excessive integration points and reduces the advantages of modularity.

P.S. While I'm here, this is not really in response to the parent post, but the example in the article really does not do justice to Ousterhout's idea. While he does advocate sometimes just inlining code and criticizes the pervasive idea that you should shorten any method longer than n lines, the idea of deep modules involves more than just inlining code.


I'd say he's in between — he strongly recommends that most modules be "deep."

I agree that blindly making lots of tiny things is bad, but his criteria for how to chunk modules are flawed.


> Does Ousterhout actually say modules must always have a longer implementation than their spec, or just that this is a generally desirable feature?

I mean the spec is a lower bound on the size of the solution, right? Because if the solution were shorter than the spec, you could just use the solution as the new shorter spec.


Not necessarily. The implementation is very often more defined than the spec. If the implementation is the spec, then even the smallest change in behavior may break callers.
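A toy sketch of the distinction (my illustration, not from the book):

```python
# Spec: return the distinct elements of items.
# The spec says nothing about the order of the result.
def unique(items):
    # This implementation happens to preserve first-seen order,
    # which is strictly more defined than the spec requires.
    seen, out = set(), []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out
```

A caller that relies on the observed first-seen order is coupled to the implementation, not the spec; swapping in `list(set(items))` would still satisfy the spec but could break that caller.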


> his main example of a deep module is actually shallow.

It's not, you're just ignoring what he said:

"A modern implementation of the Unix I/O interface requires hundreds of thousands of lines of code, which address complex issues such as: [... 7 bullet points ...] All these issues, and many more, are handled by the Unix file system implementation; they are invisible to programmers who invoke the system calls."

So sure, the `open` interface is big in isolation but when compared to its implementation it's tiny, which is what you've badly missed.

The book also brings up another example right after this one, that of a Garbage Collector: "This module has no interface at all; it works invisibly behind the scenes to reclaim unused memory. [...] The implementation of a garbage collector is quite complex, but the complexity is hidden from programmers using the language". Cherry picking, cherry picking.

Then you proceed to not mention all the other key insights the book talks about and make up your own example of a stack data structure not being a deep abstraction. Yes, it's not. So? The book specifically emphasizes not applying its advice indiscriminately to every single problem; almost every chapter has a "Taking it too far" section that shows counterexamples.

Just so you don't attempt to muddy the waters here by claiming that to be a cop-out, the very point of such books is to provide advice that applies in general, in most cases, for 80% of scenarios. That is very much true for this book.

Overall, your formal background betrays you. Your POV is too mechanical, attempting to fit the book's practical advice into some sort of a rigid academic formula. Real world problems are too complex for such a simplified rigid framework.

Indeed, a big reason why the book is so outstanding is how wonderfully practical it is despite John Ousterhout's strong academic background. He's exceptional in his ability to bring his more formal insights into the realm of real world engineering. A breath of fresh air.


Hi Mawr,

I don't have much to say to most of your comment --- a lot of the text reads to me like a rather uncharitable description of the pedagogical intent of most of my writing.

I'll just respond to the part about deep modules, which brings up two interesting lessons.

First, you really can't describe an implementation of the Unix IO interface as being hundreds of thousands of lines.

That's because most of those lines serve many purposes.

Say you're a McDonalds accountant, and you need to compute how much a Big Mac costs. There's the marginal ingredients and labor. But then there's everything else: real estate, inventory, and marketing. You can say that 4 cents of the cost of every menu item went to running a recent ad campaign. But you can also say: that ad was about Chicken McNuggets, so we should say 30 cents of the cost of Chicken McNuggets went to that ad campaign, and 0 cents of everything else. Congratulations! You've just made Big Macs more profitable.

That's the classic problem of the field of cost accounting, which teaches that profit is a fictional number for any firm that has more than one product. The objective number is contribution, which only considers the marginal cost specific to a single product.
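With made-up numbers, the trick looks like this (a sketch; the figures are illustrative, not anyone's actuals):

```python
# An ad campaign costs $1.2M. How much of it does each product "cost"?
ad_cost = 1_200_000
units_sold = {"Big Mac": 26_000_000, "McNuggets": 4_000_000}
total_units = sum(units_sold.values())  # 30,000,000 units

# Allocation A: spread the ad cost evenly over every unit sold.
even = {item: ad_cost / total_units for item in units_sold}
# -> 4 cents per unit, for every product

# Allocation B: the ad featured McNuggets, so charge it all there.
targeted = {"Big Mac": 0.0,
            "McNuggets": ad_cost / units_sold["McNuggets"]}
# -> 30 cents per box of McNuggets, 0 cents per Big Mac
```

Both allocations account for the same $1.2M, yet per-product "profit" changes depending on which you pick; only contribution (price minus marginal cost) is independent of the allocation scheme.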

Deciding how many lines a certain feature takes is an isomorphic problem. Crediting the entire complexity of the file system implementation to its POSIX bindings -- actually, a fraction of the POSIX bindings affected by the filesystem -- is similar to deciding that the entire marketing, real estate, and logistics budgets of McDonalds are a cost of Chicken McNuggets but not of Big Macs. There is a lot of code there, but, as in cost accounting, there is no definitive way to decide how much to credit to any specific feature.

All you can objectively discuss is the contribution, i.e.: the marginal code needed to support a single function. I confess that I have not calculated the contribution of any implementation of open() other than the model in SibylFS. But Ousterhout will need to do so in order to say that the POSIX file API is as deep as he claims.

Second, it's not at all true that a garbage collector has no interface. GCs actually have a massive interface. The confusion here stems from a different source.

Programmers of memory-managed languages do not use the GC. They use a system that uses the GC. Ousterhout's claim is similar to saying that renaming a file has no interface, because the user of Mac's Finder app does not need to write any code to do so. You can at best ask: what interface does the system provide to the end-user for accessing some functionality? For Finder, it would be the keybindings and UI to rename a file. For a memory-managed language, it's everything the programmer can do that affects memory usage (variable allocations, scoping, ability to return a heap-allocated object from a function, etc), as well as forms of direct access such as finalizers and weak references. If you want to optimize memory usage in a memory-managed language, you have a lot to think about. That's the interface to the end user.
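As a small taste of that end-user surface, Python's `weakref` module exposes weak references and finalizers directly (a minimal sketch; behavior shown is CPython's, where unreferenced non-cyclic objects are freed immediately by refcounting):

```python
import gc
import weakref

class Node:
    pass

n = Node()
r = weakref.ref(n)                       # observe the collector's decisions
weakref.finalize(n, print, "collected")  # hook that runs when n is freed

assert r() is n   # object still alive
del n             # drop the last strong reference
gc.collect()      # no-op here in CPython; needed only for reference cycles
assert r() is None
```

None of this is "no interface at all": it is a documented, programmer-visible contract with the memory manager.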

If you want to look at the actual interface of a GC, you need to look at the runtime implementation, and how the rest of the runtime interfaces with the GC. And it's massive -- GC is a cross-cutting concern that influences a very large portion of the runtime code. It's been a while since I've worked with the internals of any modern runtime, but, off the top of my head, the compiler needs to emit write barriers and code that traps when the GC is executing, while the runtime needs to use indirection for many pointer accesses (if it's a moving GC). Heck, any user of the JNI needs to interface indirectly with the GC. It's the reason JNI code uses a special type to reference Java objects instead of an ordinary pointer.

If you tally up the lines needed to implement either the GC or the POSIX file API vs. a full spec of its guaranteed behavior, you may very well find the implementation is longer. But it's far from as simple a matter as Ousterhout claims.


The example you quote for "define errors out of existence," while it indeed does not follow "make illegal states unrepresentable," does follow what IMO is also an FP principle: "a total function is better than a partial one."
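A minimal Python sketch of that principle (illustrative names, not from the book):

```python
# Partial: head([]) raises, so every caller must remember the precondition.
def head(xs):
    return xs[0]

# Total: every list is a valid input; the error case is defined away.
def head_or(xs, default=None):
    return xs[0] if xs else default

print(head_or([1, 2, 3]))     # 1
print(head_or([], default=0)) # 0 -- no exception to handle
```

The tolerant substr is the same move: widening the domain until the function is total.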


Your review is great! But I don't think it's right to see it as being in opposition to PoSD; it reads more like a further development and elaboration in the same direction as PoSD.


This is an interesting observation! It seems like the "deep modules" heuristic has validity under it, but Darmani is looking for a more universal, rock-bottom way to define the principle(s) and their boundaries.

Darmani, is it fair to say that each interface should pay us back for the trouble of defining it—and the more payback the better? And given that, accounting for the ROI is either very complex work, or just intuitive gut instinct—as you point out in the Chicken McNuggets example?

On the one hand, this is the stuff of religious wars. On the other hand, I see value in having a mental model that at least prompts our intuition to ask the questions: What is this costing? And how much value is it adding? And how does that compare to some completely different way of designing this system?

E.g., for users of certain systems, the cost of a GC may be roughly 0 as measured by intuition. I'm thinking of a system where the performance impact is in the "I don't care" zone, and no one is giving a single thought to optimizing memory management. For other users in other contexts, the rest of the interface of the GC becomes relevant and incurs so much cost that the system would be simpler overall without garbage collection.

Many other systems sit somewhere in between, where a few hot loops or a few production issues require lots of pain and deep understanding of GC behavior, but 99% of users' work can be blissfully ignorant about that.

And in many of these contexts, well-informed intuition might be the best available measurement tool for assessing costs and benefits.


Hey Jonathan!

> each interface should pay us back for the trouble of defining it—and the more payback the better

This seems to be the core question of this comment. I'll make the boring claim that every piece of code should pay us back for the trouble of defining it, which doesn't leave much more to say in response.

An important thing in this kind of conversation is to keep clear track of whether you're talking about the general idea of an interface (the affordances offered by something for using it and the corresponding promises) or the specific programming construct that uses the "interface" keyword. When you talk about defining an interface, you could mean either.

Another thing to remember is that, when you use "interface" in the former sense, everything has an infinite number of interfaces. For example, your fridge offers a dense map of the different temperatures and humidities at each spot. You could "learn this interface" by mapping it out with sensors, and then you can take full advantage of that interface to keep your veggies slightly fresher and have your meat be defrosted in exactly 36.5 hours. But if you get that information, there are countless ways to forget pieces and go from "the back left of the bottom shelf tends to be 35-36 F" to "the entire back of the bottom shelf tends to be somewhere between 0.5 and 2 degrees colder than the top of the fridge" down to "idk the fridge just keeps stuff cold." These are examples of the infinitely many interfaces you can use your fridge at, each offering a different exact set of abilities, most of which are irrelevant for the average user.


My review has a bit of a negative vibe, but when I look through my paper copy of PoSD, the margins are full of comments like "Yes!" and "Well said."


I haven't looked at the substr function, but is that not similar to how you can `take 5 [1,2,3]` or `zip [1,2,3] ['a', 'b', 'c', 'd']`?
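For comparison, Python's built-ins behave the same totally-defined way (my own illustration):

```python
xs = [1, 2, 3]

# Like `take 5 [1,2,3]`: slicing clamps instead of raising.
print(xs[:5])                 # [1, 2, 3]

# Like zipping lists of unequal length: zip stops at the shortest.
print(list(zip(xs, "abcd")))  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

In both cases the operation is total over its inputs, so there is no out-of-range error for callers to handle.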


Nice review. It reminded me of some of the WTF moments from the book :-) I should go back to it and write my own.


Nice and seemingly balanced review.

Defining errors out of existence should be mandatory for all golang programs.


err, are you serious, sir?




