“The Mess We're In” by Joe Armstrong at Strange Loop [video] (youtube.com)
157 points by sgrove on Sept 20, 2014 | 77 comments



So, while in general I've enjoyed other things Joe Armstrong has written, I think this talk is pretty discombobulated and doesn't have a coherent narrative.

Here are some of the problems Joe posed:

  - bugs in software like Open Office, Keynote, grunt
  - code not being commented
  - computers booting slowly
  - computers using too much energy
  - code being written for efficiency rather than readability
He talks about distributed hash tables at the end. An interesting topic, definitely cool, but they have nothing to do with the problems he posed earlier.

This seems more like a disconnected list of gripes, plus a completely unrelated list of things he currently finds neat to think about. Which is totally fine, but I don't think it makes a particularly great talk.


I think it goes with the title of the talk. 'The Mess We're In' is not limited to any one of those items in particular; it's the total view you get when you add all of them up. Bugs, badly documented code, slow boot, energy consumption, and hard-to-understand code all contribute to the mess we're in. And it's far from an exhaustive list.


Many of those would be fixed if there was a real concern for quality with corresponding responsibility when things go wrong.


I think the bigger issue is the existence of 'disclaimers'. Software production is the only branch of industry I'm aware of that gets out from under manufacturing defect claims: we state categorically (as an industry) that we have no responsibility, no liability, and not even an obligation to fix things if we ship a defective product. That really needs to change.


Disclaiming liability and such is an important F/OSS norm. Of course, proprietary software will improve on this (only) if forced to by competition. The omnipresence of EULAs is a much bigger problem, though. I think the F/OSS norms are better all around.


> Disclaiming liability and such is an important F/OSS norm.

Indeed. So there can be a market for companies that take on liability when serving commercial customers using F/OSS code that they have audited and that they feel exposes no more risks than they can bear. The original authors should definitely not be liable if they label their code as alpha or beta quality and do not wish to be exposed at all. They are doing a service to society. But once you aim your code at being used in production by entities that can suffer vast losses if your code turns out to be defective (in other words, if you sell your stuff to a business or private person) then you should be liable for those damages, or at a minimum you should insure against those damages.

Compare software to, for instance, the engineering profession to see how strange this anomaly is.


Granted, then the price of such software will skyrocket. The price people pay for most software accounts for the fact that such liability is not covered. Couple that with common software development practices, as well as time invested.

It's not merely accepting liability, there are a whole slew of changes that need to come before this, and frankly, I doubt most people would pay for that. Indeed, if people want to be covered now, they can be. They just have to pay for it.


One way or the other, the problems with software are mostly a matter of economics and incentives.

Computers and software are the way they are because of the set of tradeoffs that the market rewards.

It's certainly possible to write software with fewer bugs that consumes fewer CPU cycles and less memory, starts faster, etc., but it does less. So far, most people and businesses prefer software that does more at the cost of slower boot times, more CPU usage, and a few more bugs.


I think it is mostly a lack of choice. If everybody does it then 'the market' becomes a de-facto monopoly and someone trying to do it right would not stand out in a meaningful way until it is too late. After all, all software is presented as 'bug free' until proven otherwise. Your bug free (really) software looks just as good as my bug free (really not) software on the outside.

Six months down the line, when my not so bug free code eats your data I will point to that line in my EULA that says I'm not liable. Nobody will care, after all it is your data that got lost, not theirs. The fact that your EULA does not have that line and that you offer a warranty does not count for anything until someone would be willing to pay a premium. The only people that would like to pay that premium are the ones that lost their data...

So it's an industry phenomenon. Imagine extrapolating this to buildings. Engineers claim their buildings will stand. Those engineers that talk nonsense will be sued out of business. But if they could disclaim responsibility they would continue to happily practice their borked trade and as a rule people would suffer from this. And so engineer became a word that actually meant something.

But in software 'engineer' is roughly equivalent to 'can hold keyboard without dropping it'.


Some industries have decided that software really does matter, and go to greater lengths to make sure it works.

It'd be annoying if Things for iOS crashed and lost all of my data. It'd be horrifying if flight control software crashed and all aboard a plane were killed. It stands to reason that some software is and should be held to higher standards than other software. It probably doesn't make sense that all software should be held to the same high standard, as it is extremely time- and resource-consuming to ship avionics software. Do folks really want to dish out a few $thousand for a copy of Things for iOS?

And some companies do already take responsibility for open source software. In aerospace development, we routinely use GNU software that has been thoroughly inspected and certified as good by companies that accept many thousands of dollars to stand behind it. (Of course, if we were to upgrade from their GNU Foo 2.1.0 to the FSF's copy of GNU Foo 2.2.0, then all bets are off.)


> Your bug free (really) software looks just as good as my bug free (really not) software on the outside.

No, actually, it looks a lot worse: given the same time and developers, the bug free software will do way less than the buggier software. That, or at feature parity, the bug-free software takes more time and/or requires more people, so arrives later or costs more.

I don't have any direct experience, but I suspect there are niches here and there where the market and/or regulations put a premium on no bugs. Avionics? Some categories of medical software?


Being able to sell software has precious little to do with the actual product and everything to do with marketing. So my crap software might (on the outside) look even better!

You can only tell good quality software from bad quality software by auditing the code, not by observing the software from a user's perspective (unless it refuses even to perform the basics).


Observing the software from a user's perspective is all that counts, though. Marketing is important, yes, but if you're in a niche where quality counts more because bugs cost your users money, then people will sit up and take notice, eventually.


This is why companies that really need good code have internal developers.


There are a couple of things that seem incoherent, but maybe I'm missing something:

- Is the only reason he touches on the limits of computation and on computing efficiency that the former secures distributed hash tables and that the latter shows there is room for improvement in energy consumption, respectively?

- It seems contradictory that he advocates biologically inspired systems and lowering entropy at the same time. Aren't biological systems even messier than current computer systems?

- Wouldn't the 'condenser' very likely require AGI to be of any use for us?


DNA is the poster child for spaghetti code.

You have a function; it codes for a gene. But then you have its anti-sense translation, which can also code for a gene. And then you have post-translational processing, which takes that gene product and makes it into any number of other things. And then you have DNA-binding proteins which affect readability, so that gene can code for a different gene when the normal start/stop codons are made accessible or inaccessible further up the DNA strand. And then the whole program also grinds to a halt if you remove any of the "junk", because the junk is used to control execution (translation) speed and inhibit the program from self-destructing (cancer).


Who designed that? Unbelievable.


A wise old graybeard who wrote the book on evolutionary and genetic algorithms.


Came here to see this. I expected to see a well reasoned argument for functional programming and how entrenched the OO mentality is. What's the mess we're in again?


Complexity, the Destroyer of Simplicity. (Joe's talk is broad, perhaps intentionally so, and hopefully will promote some big-picture thinking.)


What's the mess we're in again?

Capitalism. Nobody gets paid to think about the big problems.


"In the middle of the pattern matching algorithm, there was this single comment that read # and now for the tricky bit." (~10:40)

At ~28:00 is he saying that the optimal computing device could operate 27 orders of magnitude more efficiently than what we use today?

At ~36:00, finally something to make me sit up. "content addressable store".


He mentions these topics with regard to distributed hash tables:

  * https://en.wikipedia.org/wiki/Chord_(peer-to-peer)
  * https://en.wikipedia.org/wiki/Kademlia
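To get a feel for how these DHTs decide which nodes hold which keys, here's a minimal sketch of Kademlia's XOR distance metric (Python, with made-up 8-bit node IDs; real Kademlia uses 160-bit IDs and k-buckets):

    # Kademlia-style lookup in miniature: a key lives on the nodes whose IDs
    # are closest to the key under XOR distance. (Toy 8-bit IDs for illustration.)

    def xor_distance(a: int, b: int) -> int:
        return a ^ b

    node_ids = [0b00010011, 0b01100001, 0b10110100, 0b11101010]
    key = 0b01100111  # hash of the content we're looking for

    closest = sorted(node_ids, key=lambda n: xor_distance(n, key))
    print([bin(n) for n in closest[:2]])  # the two nodes responsible for this key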


Thermodynamically speaking, a perfectly reversible computer which erases no bits (and creates no entropy) approaches zero energy per operation.

Kurzweil has pointed out that one can think of the 10^15 state changes per second going on inside a 1Kg rock with no outside energy input as a computation device.

Ok, so at 28 minutes, he's referring to some quantum mechanical lower bound to change a bit. I imagine the rock is using that really tiny amount of energy from the outside environment.
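For what it's worth, the thermodynamic version of that lower bound is Landauer's limit: irreversibly erasing one bit dissipates at least kT ln 2 of energy. A quick back-of-the-envelope (Python) at room temperature:

    import math

    k = 1.380649e-23  # Boltzmann constant, J/K
    T = 300           # room temperature, K

    # Landauer's limit: minimum energy dissipated to erase one bit
    landauer_j_per_bit = k * T * math.log(2)
    print(f"{landauer_j_per_bit:.2e} J per bit")  # ~2.9e-21 J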


> At ~36:00, finally something to make me sit up. "content addressable store".

"content addressable store" or cam (content-addressable-memory) is pretty well-known in cpu-arch, n/w equipment etc. domains. hashtables are s/w counterparts :)


Great talk. Very light-hearted and entertaining.

Always impressed by Joe. Programming since the 60s and still programming, still writing, still giving talks. He is a great role model. I hope I'm still programming, and just as excited about it, when I'm his age.


I wish I were half that excited about it at my present age!


I like his speaking style and appreciate the intro to Kademlia and Chord. This page has some visual representations: http://tutorials.jenkov.com/p2p/peer-routing-table.html


I am lucky enough to work in an "internal open source environment" - I can and do search the whole code base of a major Fortune 500 daily for pieces that fit my needs. And I often find them - but the process of getting them refactored to fit my exact needs (and so improving their code and reducing overall entropy) is mostly impossible - because of humans. No one is really willing to change someone else's code without talking to them, agreeing, and getting past their "yes I have tests but if you change it then I don't really know ..."

It's a fundamental problem - good, well-maintained tests help, but this is a cultural problem, not a technical one.


Joe and I are thinking similarly :D I'm going to dump some ideas here.

---

# JRFC 27 - Hyper Modular Programming System

(moving it over to https://github.com/jbenet/random-ideas/issues/27)

Over the last six months I've crossed that dark threshold where building a programming language has become an appealing idea. Terrifyingly, I _might_ actually build this some day. Well, not a language, a programming _system_. The arguments behind its design are long and will be written up some day, but for now I'll just dump the central tenet and core ideas here.

## > Hyper Modularity - write symbols once

### An Illustration

You open your editor and begin to write a function. The first thing you do is write the [mandatory](https://code.google.com/p/go-wiki/wiki/CodeReviewComments#Do...) [doc comment](http://golang.org/doc/effective_go.html#commentary) describing what it does, and the type signature (yes, static typing). As you write, [your editor suggests](http://npmsearch.com/?q=factorial) lists of functions [published to the web](npmjs.org) (public or private) that match what you're typing. One appears promising, so you inspect it. The editor loads the code. If it is exactly what you were going to write, you select it and you're done.

If no result fits what you want, you continue to write the function implementation. You decompose the problem as much as possible, each time attempting to reuse existing functions. When done, you save it. The editor/compiler/system parses the text, analyzes + compresses the resulting ASG to try to find "the one way" to write the function. This representation is then content addressed, and a module consisting of the compressed representation, the source, and function metadata (doc string, author, version, etc.) is published to the [(permanent) web](http://ipfs.io/), for everyone else to use.
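A toy sketch of the "content address it and publish" step (Python; the ASG compression is waved away with trivial whitespace normalization, and the IPFS multihash/base58 details are skipped):

    import hashlib
    import json

    def content_address(source: str, metadata: dict) -> str:
        """Toy content addressing: the real system would hash a compressed
        abstract semantic graph (ASG), not lightly normalized text."""
        normalized = "\n".join(line.rstrip() for line in source.strip().splitlines())
        module = json.dumps({"src": normalized, "meta": metadata}, sort_keys=True)
        return hashlib.sha256(module.encode()).hexdigest()

    addr = content_address(
        "def factorial(n):\n    return 1 if n < 2 else n * factorial(n - 1)\n",
        {"doc": "factorial of n", "type": "int -> int", "author": "example"},
    )
    print(addr)  # importing code would then bind: factorial = import <addr>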

### Important Ideas

- exporting a symbol (functions, classes, constants, ...) is the unit of modularity (node)

- system centered around writing _functions_ and writing them once (node)

- stress interfaces, decomposition, abstraction, types (haskell)

- use doc string + function signatures to suggest already published implementations (node, Go)

- content address functions based on compressed representations

- track version history of functions + dependents (in case there are bug fixes, etc). (node)

- if a function has a bug, can crawl the code importing it and notify dependents of bugfix. (node, Go)

- use static analysis to infer version numbers: `<interface>.<implementation>` (semver inspired)

- when importing, you always bind to a version, but can choose to bind to `<interface>/<implementation>` or just `<interface>`

- e.g. `factorial = import QmZGhvJiYdp9Q/QmZGhvJiYdp9Q` (though editors can soften the ugly hashes) (node + ipfs)

- all modules (functions) are written and published to the (permanent) web (public or private)

- when importing a function, you import using its content address, and bind it to an explicit local name (`foo = import <function path>` type of thing)

- the registry of all functions is mounted locally and accessible in the filesystem (ipfs style)

- _hyper modular_ means both "very modular" and "modules are linked and on the web"

Note: this system is not about the language, it is about the machinery and process around producing, publishing, finding, reusing, running, testing, maintaining, auditing, bugfixing, republishing, and understanding code. (It's more about _the process of programming_, than _expressing programs_). This means that the system only expresses constraints on language properties, and might work with modified versions of existing languages.


Interesting. I had an idea similar to the first part of your comment, in which 'reinventing the wheel' every time you start to write a new program is avoided by 'autocompleting your code' based on the vast database of open source projects on the internet. If a programmer starts writing code that declares some variables and opens a for loop, the smart editor starts searching the open source projects and lists the code chunks whose beginnings most resemble what you've typed so far, and then you can pick one of those if you like, or keep writing. I haven't thought of what could be done afterwards though.

I tend to think it also has connections with a recent discussion here: https://news.ycombinator.com/item?id=8308881


"The first thing you do is write the (mandatory) doc comment describing what it does, and the type signature. As you write, your editor suggests lists of functions published to the web that match what you're typing."

As a concrete example, if I say (picking a syntax at random; underscore denotes where my cursor is):

  add :: int -> int -> int
  def add(a, b)
    """add two numbers"""
    _
and if the permanent web contains two functions:

  proj1_add :: int -> int -> int
  def proj1_add(a, b)
    """returns sum of arguments"""
    a+b

  proj2_sub :: int -> int -> int
  def proj2_sub(a, b)
    """returns difference between a and b"""
    a-b
Would you envision your system "matching" both or just the first? If just the first, on what basis do you imagine figuring this out?


Just the first, based on the strings "sum" and "add", and potentially on the symbolic operation `args[0]+args[1]`.
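A naive version of that matching might look like this (Python sketch; the synonym table and scoring are made up for illustration, and the symbolic `args[0]+args[1]` comparison is left out):

    # Toy version of the matching described above: require an identical type
    # signature, then score by name/docstring word overlap. (A real system would
    # also compare the symbolic form, e.g. args[0] + args[1].)

    SYNONYMS = {"add": {"add", "sum", "plus"}}  # made-up synonym table

    def match_score(query_name, query_sig, cand_name, cand_sig, cand_doc):
        if query_sig != cand_sig:
            return 0
        wanted = SYNONYMS.get(query_name, {query_name})
        words = set(cand_doc.lower().split()) | set(cand_name.lower().split("_"))
        return len(wanted & words)

    sig = "int -> int -> int"
    print(match_score("add", sig, "proj1_add", sig, "returns sum of arguments"))           # 2
    print(match_score("add", sig, "proj2_sub", sig, "returns difference between a and b")) # 0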


Yes. And yes. I doubt that our conception of these things will be in use in twenty years - but I do think that assisting the process of taking an idea and putting it into production (the process of programming) is going to be more and more a feature of our world.

It allows for bringing up the worst ten percent of code without limiting the top ten. It is far far better than the Java idea of making it hard to shoot your foot off with the language.


It is a very interesting talk. Though it might take a while for someone to connect the dots Joe has sprayed all over.

Somewhat related to this, I have been annoyed by the way apps scatter information and have been working to find a way of managing the mess.

http://productionjava.blogspot.in/2014/07/the-broken-web.htm...

and

http://productionjava.blogspot.in/2014/08/coding-can-be-puni...


This talk attempts to provide a strategy for reducing complexity within software. The whole talk is really valuable, but if you're short of time.. skip to 36:00. In a real hurry? Start at 44:17.


During the last minute of the talk, he says:

"Computers are becoming a big environmental threat. They're using more energy than air traffic."

Is this actually true? Sure, the average person spends a lot more time on a computer than in a plane, but still it seems crazy that they'd be comparable. Or at least the comparison isn't very relevant, because the Internet can significantly reduce people's need to travel.


"Save the environment, code in C."


There's a lot to be said for keeping things as simple as possible. Although what qualifies as simple varies from application to application.

I was doing some preliminary analysis for a small project recently, and considering various frameworks and tools. Eventually I realized I could implement what was needed using four JSPs producing static html, with a bit of styling in CSS. No AOP, no injection framework, no JavaScript. And no explicit differentiation between device types.

The resulting application will start up quickly -- which is important when running in a PaaS environment -- and should work on any browser, including weird old stuff like Lynx. Less butterfly. More rat.


You are getting downvotes, but energy efficient computation has been a large driver in the recent upsurge in interest in C++.


Wait, C++ has been getting more popular? I'm not complaining (I mostly like C++), I guess I just didn't get the memo.


Let's see.

- Symbian was coded in C++

- GCC moved to C++

- Clang and LLVM are done in C++

- Mac OS X drivers are written in Embedded C++ subset

- Windows team is moving the kernel code to be C++ compilable

- JPL is moving to C++

- AAA game studios dropped C around the PS3/Xbox 360 timeframe

- It is the only common language available in all major mobile OS SDKs


I mean I was confused by the word 'recent', which to me means the last 2-3 years.


A few items on my list cover your 'recent'.

The decision to move Windows to C++ was announced by Herb Sutter in one of his Microsoft talks, and the Windows 8 DDK was the first to support C++ in kernel code.

JPL's move to C++ was covered in their CppCon 2014 talk.

GCC switched to C++ as implementation language in 2012.

In any case, I would say, all languages with native code compilers can be used in relation to energy efficient computation.

It is just that C and C++ are the only ones on the radar of developers (and managers) looking for mainstream languages with native compilers.


True. Facebook has also been doubling down on C++ development it seems.


There is a talk from Andrei Alexandrescu at Going Native 2013 where he mentions that one of Facebook's KPIs is requests per watt.


"Save the environment. Make black hat hackers, anti-virus tool vendors jobless. Code in Ada."


And run it on OS/370.


I don't know whether it's true or not, but the computer you're typing on is likely far from the only one you have running. Think of all the microprocessors in TVs, microwave ovens, thermostats, cars (I think most cars have about 30-50 MPUs nowadays)...

Let's ballpark it (fill in better numbers if you have them). Assume that a person takes an average of one 5,000 km trip by plane per year. ATW, a Boeing 737-900ER uses 2.59 liters of fuel per 100 km per seat, so that would be about 130 liters of fuel. Also ATW, Jet A-1 fuel has an energy density of 34.7 MJ/l, so about 4,500 MJ for the trip. A watt is one joule per second, and there are about 31 million seconds in a year, so the continuous power output equivalent for the plane flight would be about 145 watts. So, if all your computers put together are consuming more than ~150 watts, they're consuming more than the equivalent of a 5,000 km plane flight over the course of a year.
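Redoing that arithmetic in a few lines (same assumed figures):

    trip_km = 5000
    fuel_l_per_100km_seat = 2.59     # Boeing 737-900ER figure from above
    energy_mj_per_l = 34.7           # Jet A-1
    seconds_per_year = 31_000_000

    fuel_l = trip_km / 100 * fuel_l_per_100km_seat   # ~130 litres
    energy_mj = fuel_l * energy_mj_per_l             # ~4,500 MJ
    watts = energy_mj * 1e6 / seconds_per_year       # ~145 W continuous
    print(round(fuel_l), round(energy_mj), round(watts))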

This is a very rough estimate, but it doesn't appear unreasonable on its face that his statement could be true.


How much energy is used by inefficient government systems that are running on old minicomputers and such? I bet government data centers have hundreds of old clunkers that could all be run on one modern rack of 1U boxes.


Well modern aircraft are pretty efficient and probably not the top offender.

However, he also mentions we probably can't do a lot more with computers to reduce their power consumption.


> Well modern aircraft are pretty efficient and probably not the top offender.

It depends entirely on how much they get used! An SUV that gets driven once a year contributes less carbon than a Prius that is driven all day every day.

Comparing the aggregate numbers is obviously the only way to compare computers and airplanes anyway.


This was definitely not one of the better talks of the conference (though I was embarrassed to say so given the way people [rightly] idolize Armstrong), so I highly recommend checking out the rest: https://www.youtube.com/channel/UC_QIfHvN9auy2CoOdSfMWDw/vid...


While I agree that his talk is a little bit "disconnected", I think that it served its purpose when I am reading some "think big" discussion that is going on here. While his talk doesn't have one clear topic, it does deliver huge and interesting thoughts.


I totally agree with the "abolish names and places". Why can't I just write:

    $ cp hash://<somehash> .
and have my computer do whatever it takes to retrieve a file with this hash and copy it to my disk?


I'm not sure if this is sarcasm or not. The usability of such an approach is terrible: humans like names, and like hierarchy. This is the same reason we use DNS instead of IP addresses.

There was URN https://en.m.wikipedia.org/wiki/Uniform_Resource_Name many moons ago, and it is still used. A URN resolver is a software library that could convert that identifier to a URL.

URLs aren't much different from URNs but they actually specified a default resolution algorithm that everyone could fall back on. They were more successful because there was less need to separate identifiers and locators than originally thought, though it's still a debatable point whether the results are intuitive (eg. HTTP URLs for XML namespace identifiers which may or may not be dereferenceable).

HTTP URLs took advantage of DNS as an existing, globally deployed resolver; coupled with a universally deployed path resolver (the web server), the rest was history. You could create a URL scheme called "hash", but it would be hard to see how you could design a standard resolver unless it was one big centralized hash table in the sky - you would still need to, at the very least, map objects to IP addresses.


> humans like names, and like hierarchy.

They do, but that does not mean there should not be other ways to access data. Hashes are universal and unambiguous. There should be a way to retrieve a file given its hash.

> You could create a URL scheme called "hash" but it would be hard to see how you could design a standard resolver unless it was one big centralized hash table in the sky - you still would need to, at the very least, map objects to IP addresses.

There would be an underlying P2P protocol that cp would use. On the other hand, cp doesn't even use FTP or HTTP so maybe that's too much to ask.

Maybe with curl or wget, then.
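Whatever the transport ends up being (a P2P protocol, plain HTTP mirrors, or a gateway), the nice property is that the result is self-verifying: the content either matches the hash you asked for or it doesn't. A minimal sketch in Python, with entirely hypothetical mirror URLs:

    import hashlib
    import urllib.request

    MIRRORS = ["https://mirror-a.example/", "https://mirror-b.example/"]  # hypothetical

    def fetch_by_hash(sha256_hex: str) -> bytes:
        """Try each mirror; accept the first response whose SHA-256 matches the address."""
        for base in MIRRORS:
            try:
                data = urllib.request.urlopen(base + sha256_hex).read()
            except OSError:
                continue
            if hashlib.sha256(data).hexdigest() == sha256_hex:
                return data  # self-certifying: the bytes match the name we asked for
        raise LookupError("no mirror returned matching content")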


> Hashes are universal and unambiguous. There should be a way to retrieve a file given its hash.

I'm not sure you've thought through the complexity of what you're asking for.

Hashes require (a) a hash function everyone agrees to, (b) a way to resolve them to an IP address.

Unless you synchronized all global hashes across the Internet on everyone's computer (the git hashed-project model -- which we know doesn't scale beyond a certain point unless you bucket things into independent hashes you care about), you'd basically have to do something like hash://ip_address/bucket/hash, or hash://bucket/hash if you want to give a monopoly to one IP address that manages a giant hash table in the sky.

Which is back to URLs and HTTP, and no different from, say, Amazon S3.


Why should there be that? You're talking about an enormous, complicated system. What's the use case that justifies the effort?


BitTorrent magnet links already kind of do this.

Theoretically speaking, isn't it possible to create a virtual BitTorrent FUSE filesystem?


Because this blocks the very human needs for error-checking and maintaining awareness of context.

It's not like you're going to type that in. You're going to copy and paste it from somewhere. So it's just as good to use

http://releases.ubuntu.com/14.04.1/ubuntu-14.04.1-server-amd...

as

hash://b4ed952f6693c42133f73936abcf86b8

In either case, your computer can do whatever it takes to get the file. With a useful URL, you'll have a reasonable notion about what's coming down and whether it matches your intentions.

Without that, the very natural question is, "Did I get the thing I wanted?" For example, it would be easy to paste the wrong hash code.

There are other benefits, like real-time binding. A hash is going to point to a particular sequence of bits. But you may not want a particular file, but rather the best current mapping from an idea to a file. E.g., if Ubuntu discovers an issue with their released ISO, they can make a new one and replace what gets served up by the URL.


How would you remember the hash? I guess one could have some sort of directory-like system for mapping human-memorable names to hashes...


> How would you remember the hash?

I wouldn't. I'd make a symbolic link.

Basically the current directory/names structure would be an abstract layer above the hash-based system.
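For illustration, that layer could be as thin as a mutable name table pointing at immutable hashes (made-up example, reusing the hash from earlier in the thread purely for illustration):

    # Sketch of a name layer on top of a hash-addressed store.
    # The human-readable name is a mutable pointer; the hash is the real identity.

    NAMES = {
        "ubuntu-14.04.1-server-amd64.iso": "hash://b4ed952f6693c42133f73936abcf86b8",
    }

    def resolve(name: str) -> str:
        """Look up the current hash for a human-memorable name (like a symlink)."""
        return NAMES[name]

    print(resolve("ubuntu-14.04.1-server-amd64.iso"))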


Plan 9 already did that in its file system...


And of course it would have to be tree structured to avoid naming collisions and bloat. Oh, wait...


You can.

> aria2c magnet:?xt=urn:btih:1e99d95f....


Didn't know about this. Thanks :-)


I'm guessing this is about what a disaster it is to use software, still to this day, but I just can't deal with his storytelling style. Takes forever to say "OpenOffice has a shitty bug".


What is this about?

Edit: Strange Loop is a multi-disciplinary conference that aims to bring together the developers and thinkers building tomorrow's technology in fields such as emerging languages, alternative databases, concurrency, distributed systems, mobile development, and the web.

Strange Loop was created in 2009 by software developer Alex Miller and is now run by a team of St. Louis-based friends and developers under Strange Loop LLC, a for-profit but not particularly profitable venture.


new to coding, and volunteered for a couple shifts in exchange for entry. first conference ever. this was AWESOME! incredibly well run, and obv not about the $$ - but they freaking deserve all the success they get, financial and otherwise!


Bloat.


this is so true.. we already spend a lot more time fixing and tweaking code than actually creating.


Wow this guy doesn't know what he is talking about. Just a bunch of numbers without any arguments.

I had to stop watching when a laptop was compared to a black hole.

I'm sure the laymen are impressed though.


I think you missed the point, that slide was about the theoretical limits of computation, it is a very weak upper bound that won't be achieved.


Your bio bit says:

> Please comment if you downvote. Or even better; just comment.

So here you go: you are utterly clueless; before you write comments like these, do your homework, or you will get downvoted a lot.

Or even better; just don't comment. Until you've done your homework, that is. It just increases the noise and does not add to the conversation at all.


FYI this guy designed Erlang :)


We are.



