The Heart of Unix (2018)

chubot · 2024-10-04T04:08:45 1728014925

I generally agree with this article in that PROGRAMMABILITY is the core of Unix, and it is why I've been working on https://www.oilshell.org/ for many years

However I think the counterpoint is maybe a programming analog of Doctorow's "Civil War on General Purpose Computing"

I believe the idea there was that we would all have iPads and iPhones, with content delivered to us, but we would not have the power to create our own content, or do arbitrary things with computers

I think some of that has come to pass, at least for some fairly large portions of the population

(though people are infinitely creative -- I found this story of people writing novels their phone with Google Docs, and selling them via WhatsApp, interesting and cool - https://theweek.com/culture-life/books/the-rise-of-the-whats... )

---

The Unix/shell version of that is that valuable and non-trivial logic/knowledge will be hidden in cloud services, often behind a YAML interface.

And your job is now to LLM the YAML that approximates what you want to do

Not actually do any programming, which can lead to adjacent thoughts that the cloud/YAML owners didn't think of

In some cases there is no such YAML, or it's been trained out of the LLM, so you can't think that thought

---

There's an economic sense to this, in some ways, but personally I don't want to live in that world :)

camgunz · 2024-10-04T14:45:44 1728053144

I agree so strongly. I'm a Vim user and from time to time discuss Vim/Emacs vs. $IDE, and there are usually good considerations of how the various differences might affect a code base: would the ease of refactoring in an IDE lead to better consistency, does IDE autocomplete scatter typos throughout, etc.

But I can't recall a discussion of how they affect us. The tools we use, the techniques those tools allow and foreclose, profoundly shape our thoughts and feelings. This applies to any creative practice (I'm not one of those "code is art" people, but you are creating something), not just software.

I think I share your worry, but in a more abstract sense: how does the act of thinking about building software shape us, and what would we lose without it? Would we be better off? Would it have been better if we applied those mental resources elsewhere? Has society benefited from a huge swell of humans thinking this way?

Something that heartens me a little is that I think the rich world is on the cusp of being able to do things broadly only because we want to. I may never write another Django app again, unless I want to experience how they did it in the early 21st century. I think this culture is emerging--I wish it were more widespread, and we were more focused on bringing it to all humanity, but its emergence gives me hope.

chubot · 2024-10-05T06:11:54 1728108714

Agreed, I definitely like how Vim makes me feel, and how it opens up some head space

I recently wrote a comment about how I started using it 20 years ago, and even then it was viewed as OLD !! My older co-workers were using Java IDEs, wondering why I started using such an old editor

https://lobste.rs/s/20t1jj/wonderful_vi#c_rnkwas

And the original article is that the creator of Rails has switched to Linux + vim, from Mac + SublimeText I think.

So it's funny how Vim is timeless, and people keep re-discovering it. I discovered it in 2005 and never looked back!

bitwize · 2024-10-04T20:48:51 1728074931

> And your job is now to LLM the YAML that approximates what you want to do

s/YAML/JCL/g

s/LLM/clone and edit/g

And you've pretty much described the mainframe world.

For this reason, one of the things that AT&T thought to do back in the 70s with its new OS, Unix, was to give it to their engineers as a more sensible interface with which to write programs for, and submit jobs to, the mainframe. The version that was built for this purpose was called PWB/Unix (for Programmer's Workbench).

chubot · 2024-10-05T06:05:07 1728108307

Yes! And I remember "Back to the 70's with Serverless" (2020) as a great article which specifically mentioned JCL and the clunkiness of the modern cloud:

https://archive.ph/sGwuv

https://news.ycombinator.com/item?id=25482410

I referenced it on my blog -- it's a shame the link has rotted now

syndicatedjelly · 2024-10-04T05:23:13 1728019393

I see your concern, but don't think it's anything to be worried about. Is an electrician's job at risk because homeowners can purchasing wiring and outlets from a big box store and tap a new outlet in their home? Are mechanics worried about people who do oil changes at home?

There will always be a demand for skilled labor, but the definition of "skilled" is going to continue changing over time. That's a good sign, it means that the field is healthy and growing.

donatj · 2024-10-04T09:50:46 1728035446

My fear, perhaps ill founded, is that the "electricians" of the machines will just age out like COBOL devs. Highly sought after and in demand, yet work no one new is learning to take over.

A large percentage of the current software workforce, professional and open source, are people who learned these skills casually growing up rather than explicitly in school as a career. I'm not sure this demographic exists in any meaningful numbers in younger generations.

Will there be enough people to maintain our foundations when the only ones who understand them are the ones formally educated? What happens to the actual number of people even interested in a computing career path when they didn't grow up with "classical" computers?

I am happy to be totally wrong here, it's just the kind of thing that keeps me up at night.

chubot · 2024-10-04T12:24:07 1728044647

My concern is not really about jobs! It's about the "thoughts you're able to think"

I guess whether you think this is real or not is a similar question to whether you think the iPad/iPhone thing is real

Did that happen, or not? (honest question)

The irony is that if it did happen to a certain person, that person won't notice it

Personally I do think it's very real, because thoughts are correlated with what's "ready at hand", what you can accomplish in an environment

syndicatedjelly · 2024-10-04T23:50:47 1728085847

You can think whatever you want, despite what Paul Graham says. No one is required to pay you for the thoughts you think though

pjmlp · 2024-10-04T05:48:06 1728020886

Usually in many countries, insurances won't pay if something went bad by not being done by a professional electrician or mechanic.

coliveira · 2024-10-03T22:03:54 1727993034

The biggest disadvantage of the shell is that, by exchanging data using text, you lose opportunities to check for errors in the output. If you call a function in a programming language and an erroneous output happens, you get a crash or exception. In a shell, you'll get empty lines or, worse, incorrect lines, that will propagate to the rest of the script. This makes it impractical to write large scripts and debugging them gets more and more complicated. The shell works well for a few lines of script, any more than that and it becomes a frustrating experience.

hggigg · 2024-10-03T22:52:05 1727995925

It's even worse than that. Most non-trivial, and some trivial scripts and one liners rely on naive parsing (regex/cut etc) because that's the only tool in the toolbox. This resulted in some horrific problems over the years.

I take a somewhat hard line that scripts and terminals are for executing sequential commands naively only. Call it "glue". If you're writing a program, use a higher level programming language and parse things properly.

This problem of course does tend to turn up in higher level languages but at least you can pull a proper parser in off the shelf there if you need to.

Notably if I see anyone parsing CSVs with cut again I'm going to die inside. Try unpicking a problem where someone put in the name field "Smith, Bob"...

Bluecobra · 2024-10-04T01:58:26 1728007106

> Notably if I see anyone parsing CSVs with cut again I'm going to die inside. Try unpicking a problem where someone put in the name field "Smith, Bob"...

How do you tackle this? Would you count the numbers of commas in each line then manually fix the lines that contain more fields?

hggigg · 2024-10-04T08:56:55 1728032215

You parse them and reject CSVs that do not conform. There is absolutely no way to reason about a malformed CSV.

ReleaseCandidat · 2024-10-04T04:28:01 1728016081

Yes, as in "check the number of parsed fields for each line" and don't forget about empty fields. Throw an error and stop the program if the number of columns isn't consistent. Which doesn't mean that you can't parse the whole file and output all errors at once (which is the preferred way, we don't live in the 90s any more ;), just don't process the wrong result. And with usable error messages, not just "invalid line N".

harry8 · 2024-10-04T03:01:05 1728010865

http://www.catb.org/~esr/writings/taoup/html/ch05s02.html

paragraph titled DSV style. (Yeah esr, not a fan, whatever...)

Csv sucks no matter what, there is no one csv spec. Then even if you assume the file is "MS Excel style csv" you can't validate it conforms. There's a bunch of things the libraries do that cope with at least some of it that you will not replicate with cut or an awk one liner.

chasil · 2024-10-04T01:00:21 1728003621

You might have enjoyed DCL under VMS.

It did not immediately succumb to envy of the Korn shell.

aadhavans · 2024-10-04T03:10:21 1728011421

What if you had constraints on the CSV files? Suppose you knew that they don't contain spaces, for example. In that case, I don't see the problem with using UNIX tools.

throw10920 · 2024-10-04T03:18:25 1728011905

Then you're not actually processing the CSV format, you're processing a subset of it. You'll also likely bake that assumption into your system and forget about it, and then potentially violate it later.

Well-defined structured data formats, formal grammars, and parsers exist for a reason. Unix explicitly eschews that in favor of the fiction of "plain text", which is not a format for structured data by definition.

throw10920 · 2024-10-04T03:21:18 1728012078

> The biggest disadvantage of the shell is that, by exchanging data using text, you lose opportunities to check for errors in the output.

That's pretty bad, but isn't the complete lack of support for structured data an even bigger one? After all, if you can't even represent your data, then throwing errors is kind of moot.

chubot · 2024-10-04T03:50:31 1728013831

Oils/YSH has structured data and JSON! This was finished earlier this year

Garbage Collection Makes YSH Different (than POSIX shell, awk, cmake, make, ...) - https://www.oilshell.org/blog/2024/09/gc.html

You need GC for arbitrary recursive data structures, and traditionally Unix didn't have those languages.

Lisp was the first GC language, and pre-dated Unix, and then Java made GC popular, and Java was not integrated with Unix (it wanted to be its own OS)

----

So now you can do

    # create some JSON
    ysh-0.23.0$ echo '{"foo":[1,2,3]}' > x.json

    # read it into the variable x -- you will get a syntax error if it's malformed
    ysh-0.23.0$ json read (&x) < x.json


    # pretty print the resulting data structure, = comes from Lua
    ysh-0.23.0$ = x
    (Dict)  {foo: [1, 2, 3]}

    # use it in some computation
    ysh-0.23.0$ var y = x.foo[1]
    ysh ysh-0.23.0$ = y
    (Int)   2

throw10920 · 2024-10-04T04:06:17 1728014777

Structured shells are neat and I love them, but the Unix philosophy is explicitly built around plain text - the "structured" part of structured shells isn't Unixy.

chubot · 2024-10-04T04:21:51 1728015711

It's not either-or -- I'd think of it as LAYERED

- JSON denotes a data structure, but it is also text - you can use grep and sed on it, or jq

- TSV denotes a data structure [1], but it is also text - you can use grep on it, or xsv or recutils or ...

(on the other hand, protobuf or Apache arrow not text, and you can't use grep on them directly. But that doesn't mean they're bad or not useful, just not interoperable in a Unix style. The way you use them with Unix is to "project" onto text)

etc.

That is the layered philosophy of Oils, as shown in this diagram - https://www.oilshell.org/blog/2022/02/diagrams.html#bytes-fl...

IMO this is significantly different and better than say PowerShell, which is all about objects inside a VM

what I call "interior vs. exterior"

processes and files/JSON/TSV are "exterior", while cmdlets and objects inside a .NET VM are "interior"

Oils Is Exterior-First (Code, Text, and Structured Data) - https://www.oilshell.org/blog/2023/06/ysh-design.html

---

[1] Oils fixes some flaws in the common text formats with "J8 Notation", an optional and compatible upgrade. Both JSON and TSV have some "text-y" quirks, like UTF-16 legacy and inablity to represent tabs

So J8 Notation cleans up those rough edges, and makes them more like "real" data structures with clean / composable semantics

throw10920 · 2024-10-04T11:58:29 1728043109

> It's not either-or -- I'd think of it as LAYERED

You can almost always represent one format inside of another. I can put JSON in a text system, or I can hold text in Protobuf. That doesn't mean that the systems are equivalently powerful, or that a text-oriented system supports structured data.

> on the other hand, protobuf or Apache arrow not text, and you can't use grep on them directly

This is kind of tautological, because of course you can't grep through them, because Unix and grep don't understand structure, because they're built around text.

> IMO this is significantly different and better than say PowerShell, which is all about objects inside a VM

Why?

enriquto · 2024-10-04T06:20:52 1728022852

> if you can't even represent your data

any data can be represented as text

coliveira · 2024-10-04T15:12:29 1728054749

Yes you can represent anything as text, but then you create another problem: parsing the representation. You always need to balance the advantage of sending data in a free format and the need to parse it back into usable input.

throw10920 · 2024-10-04T11:41:05 1728042065

Only in the least useful, most pedantic sense of the word. I can "represent" image data as base64 text in my terminal and yet have no tools to meaningfully manipulate it with. This is a gotcha with no interesting thought behind it.

enriquto · 2024-10-04T12:12:59 1728043979

Your example is particularly funny, because one can easily undecode the base64 image, convert it to sixel so that it's displayed in the terminal verbatim, all in half a line of shell pipes. If the image is large, you can also pipe it to an interactive visualizer to look at it. Or, you just "cat" your base64 image as is, together with some recently echoed mail headers and pipe it to mail as an attachement. The tools to do that are standard and come with all unix distributions.

The greatness of unix is that none of the intermediary programs need to know that they are dealing with an image. This kind of simplicity is impossible if the type of your data must be known by each intermediary program. "Structured data" is often an unnecessary and cumbersome overkill, especially for simple tasks.

throw10920 · 2024-10-06T04:06:47 1728187607

> Your example is particularly funny, because one can easily undecode the base64 image

I don't see how that's relevant - the fact that a very small number of specific terminal programs can decode limited data formats and communicate over terminal control sequences that only work for that data format doesn't seem to matter discussion of whether the shell or OS supports structured data in general or whether it gives you the tools to manipulate that data.

Moreover, this is actually an example that further proves my point: you can't use classic Unix tools to process that image data. You can't use cut to select rows or columns, head or tail to get parts of an image, grep to look for pixels - of course you can't, because it's not text, and one of the core tenets of the Unix philosophy is that everything is plain text.

Now, this wasn't a great example to begin with, because media is somewhat of a special case (and it's somewhat sequential data, as opposed to hierarchical). Let's take something like an array of transactions for a personal budget in JSON form, where each transaction has fields like "type" (e.g. "credit", "debit"), "amount", and "other party", and "other party" further has "type" (e.g. "business", "person", "bank") and "name" fields. The Unix philosophy does not allow you to directly represent this data - there is no way to represent key-value pairs or nested data and then do things like extract fields or filter on values. Your primitives are lines and characters, by definition.

> This kind of simplicity is impossible if the type of your data must be known by each intermediary program.

I think you're misunderstanding what "structured data" is, because what you just described isn't a property of structured data. I can write a "filter" program that takes an arbitrary test function and applies it to a blob of data, and that program needs zero information about the data, other than that it's an array.

Or, it's possible that you're just not familiar with the concept of abstraction, where programming systems can expose an interface that requires other systems to not understand their internals.

> The greatness of unix is that none of the intermediary programs need to know that they are dealing with an image.

This has nothing to do with Unix. It's trivial to conceive of a shell or OS where programs pass around structured data and are able to productively operate on it without needing to know the full structure of that data.

> "Structured data" is often an unnecessary and cumbersome overkill

What does this even mean? Data is structured by definition. If something doesn't have internal structure to it, it is not data. The fact that Unix decides to be willfully ignorant of that structure and force you to ignore it doesn't mean that it doesn't exist.

I would hope you didn't mean "representing structured data as such is inconvenient" because that would be a very ignorant statement to make that conflates tooling with data format and that has zero empirical evidence behind it.

enriquto · 2024-10-06T09:25:16 1728206716

> there is no way to represent key-value pairs or nested data and then do things like extract fields or filter on values

man, ever heard of awk? It was designed exactly for that purpose.

throw10920 · 2024-10-06T15:53:57 1728230037

I'm very well aware of awk. This proves my point - Unix doesn't have a way to represent structure, so the only way that you can manipulate it is with purpose-built tools. There's no way to tell "sort" to sort on a specific field, for instance - sort does not understand structure. Things that are trivial in a structured/typed language like PowerShell or Python require a lot of sed/awk glue in Unix and are far less reliable as a result.

Anyone who has a basic understanding of computing theory understands that shell is obviously Turing-complete and can compute anything that another Turing machine has - which means that being able to hack something together does not mean that the system was designed for that paradigm. You can implement Prolog in C, but that does not mean that C is a declarative logic programming language, and if you find yourself repeatedly re-implementing Prologs in your C programs, you'd do very well to have the self-awareness to realize that C is probably not the right choice for the problems you're trying to solve. Similarly, if you have to repeatedly shell out to a tool that understands structure like awk or Perl or jq, you'd do well to consider as to whether the shell doesn't actually fit the problem you're trying to solve.

There's a reason why there are precisely zero large pieces of software written in bash - because it's an extremely poor paradigm for computation.

agumonkey · 2024-10-05T10:49:59 1728125399

reminds me of 'parse, don't validate'

chasil · 2024-10-04T00:23:06 1728001386

At the same time, the POSIX shell can be implemented in a tiny binary (dash compiles to 80k on i386).

Shells that implement advanced objects and error handling cannot sink this low, and thus the embedded realm is not accessible to them.

pjmlp · 2024-10-04T05:50:03 1728021003

Sure they can, Smalltalk and Lisp environments didn't had the luxury of 80k when they were invented.

amszmidt · 2024-10-04T06:13:17 1728022397

No, they had the luxury of having much more.

pjmlp · 2024-10-04T09:26:15 1728033975

Learn computing history, starting with Lisp REPLs on IBM 704 in 1958!

20 years later we had 64 KB to fit an whole OS and applications on home computers.

amszmidt · 2024-10-04T13:25:28 1728048328

I am quite aware of computer history. The Xerox Alto had between 90k and 512k of memory, and a disk drive that managed 2.5M.

Lisp Machines required a 80M disk with at least 512k of memory.

Neither Smalltalk nor Lisp fit in 64k in 1980 (~20 years later). Even the IBM 7094 which ran a very tiny Lisp (1.5) had around 32k 36bit words.

xkriva11 · 2024-10-04T15:12:01 1728054721

The very first version of Smalltalk was written in BASIC on Data General Nova.

Then, in 1983, there was a version of Lisp for ZX Spectrum (which had 48k of RAM and no floppy).

There was also Rosetta Smalltalk:

> ROSETTA SMALLTALK now runs on the Exidy Sorcerer computer! It requires 48K of memory, a disk, and CP/M.

amszmidt · 2024-10-04T15:51:49 1728057109

True, the absolutley first versions were entirely unusable. Glacial is what one of the authors said.

First version of Unix had a 512k disk pack, and no shell -- that 80k binary for shell would have been a dream.

"20 years later we had 64 KB to fit an whole OS and applications on home computers." is quite the claim when we are talking about Unix, Lisp or Smalltalk and comparing them to CP/M with DOS which .. does nothing in comparison, and ignoring literally all other aspects of a computer system.

pjmlp · 2024-10-04T13:29:23 1728048563

Unless I got my math wrong, 32 is less than 80.

Also where did I on my comment mentioned Lisp Machines?

amszmidt · 2024-10-04T13:36:08 1728048968

"20 years later" -- nobody was running IBM 705 in 1980 for Lisp or Smalltalk. The 7601 was backed by hard disk as well. These machines used quite a bit of paging.

You are purposefully confusing multiple decades of computing. Smalltalk was much later, and required quite large machines, so did Lisp when it became popular on machines like the PDP-10 and ITS which had much more memory, just running Macsyma was a PITA.

pjmlp · 2024-10-04T14:03:11 1728050591

We were Lisp on CP/M, ever heard of it?

Also now we are getting MMU and page loading into the argument?

amszmidt · 2024-10-04T14:41:30 1728052890

And CP/M wasn’t running the first versions of Lisp or Smalltalk.

Your claim that Lisp and Smalltalk didn’t have the luxury of 80k when they got invented, when intact they did. Much of the programs that ranusing Lisp specifically was VERY memory hungry.

Now your just arguing for the sake of arguing and just being antagonizing. Have fun.

anthk · 2024-10-04T14:05:43 1728050743

Could Macsyma be run on CP/M on an Altair?

pjmlp · 2024-10-04T14:33:23 1728052403

There were other Z80 computers able to run Lisp and CP/M.

Here he goes again with another application I didn't mention at all.

anthk · 2024-10-04T14:05:12 1728050712

Macsyma on the Simh emulator, Ka10, appears to do it fine under ITS.

amszmidt · 2024-10-04T14:19:38 1728051578

Your not sharing the system with a dozen other people running Emacs and Macsyma :-)

pjmlp · 2024-10-04T14:33:49 1728052429

Timesharing wasn't part of the discussion.

kragen · 2024-10-04T02:07:46 1728007666

that's dramatically larger than any pdp-11 executable, including the original bourne shell, and also, for example, xlisp, which was an object-oriented lisp for cp/m

advanced objects and error handling do not require tens of kilobytes of machine code. a lot of why the bourne shell is so error-prone is just design errors, many of them corrected in es and rc

ronjakoi · 2024-10-04T03:47:48 1728013668

What are es and rc? Can you give some links?

chubot · 2024-10-04T04:02:50 1728014570

Search for "rc shell" and "es shell"

https://en.wikipedia.org/wiki/Rc_(Unix_shell)

https://wryun.github.io/es-shell/

They are alternative shells, both from the 90's I believe. POSIX was good in some ways, but bad in that it froze a defective shell design

It has been acknowledged as defective for >30 years

https://www.oilshell.org/blog/2019/01/18.html#slogans-to-exp...

---

es shell is heavily influenced by Lisp. And actually I just wrote a comment that said my project YSH has garbage collection, but the es shell paper has a nice section on garbage collection (which is required for Lisp-y data structures)

And I took some influence from it

Trivia: one of the authors of es shell, Paul Haahr, went on to be a key engineer in the creation of Google

kragen · 2024-10-04T17:54:41 1728064481

Thanks, I didn't know that about es!

chasil · 2024-10-04T12:51:24 1728046284

David Korn would not agree with you.

"A lot of effort was made to keep ksh88 small. In fact the size you report on Solaris is without stripping the symbol table. The size that I am getting for ksh88i on Solaris is 160K and the size on NetBSD on intel is 135K.

"ksh88 was able to compile on machines that only allowed 64K text. There were many compromises to this approach. I gave up on size minimization with ksh93."

https://m.slashdot.org/story/16351

kragen · 2024-10-04T14:14:44 1728051284

it sounds like he did agree, even laboring under the heavy burden of bourne shell compatibility

userbinator · 2024-10-03T23:27:33 1727998053

The && and || operators let you branch on errors.

anthk · 2024-10-04T14:04:06 1728050646

Not like a Lisp REPL, you have to start over with pipes. That's my main issue with Unix.

theamk · 2024-10-03T22:35:16 1727994916

that's why the rule #1 of robust shell scripts is "set -e", exit on any error. This is not perfect, but helps with most of the errors.

mananaysiempre · 2024-10-03T23:04:54 1727996694

set -euo pipefail, if you're OK with the Bashism.

thristian · 2024-10-04T01:34:20 1728005660

Since POSIX 2024, `set -o pipefail` is no longer a bashism!

nxobject · 2024-10-04T08:33:17 1728030797

I have a slight bone to pick with the author's statement that Unix is homoiconic – sure, I can tail and patch a file, but it doesn't mean that I can seamlessly and quickly manipulate, generate, and execute the canonical representation of executable code in the same way that I can do with s-expressions, quote/quasiquote and eval. I think the bar for meaningful homoiconicity is should at least be raised to include that.

If "I can read my source just into the canonical datatype" was the standard for an environment to be meaningfully homoiconic, you could easily argue that bare metal was homoiconic for the same reason. And in fact I'd argue that it would be easier to hand-assemble VAX instructions to write more opcodes in memory (because the VAX-11 had such an extensive and convenient instruction set, especially all of the three-operand bit swizzling and manipulation instructions) than to do C code generation with a base Unix environment.

ColonelPhantom · 2024-10-04T15:16:03 1728054963

I find the word 'homoiconic' very hard to define anyway. For example, I think Tcl is also often described as homoiconic, which manifests in being able to make e.g. custom control structures (https://wiki.tcl-lang.org/page/code+is+data) although something like Haskell also deals with that quite well. However, in say Javascript, you would need to use special syntax when using every {} so that the block content can be seen (e.g. by wrapping it in a function, or passing it as a string to be eval()ed).

In the case of Unix, the question is probably "what is the canonical representation of executable code". I think the author's intent was to mostly talk about shell scripts, and not something like C code. Manipulating shell scripts via e.g. sed is, I think, pretty doable.

I'm also a bit confused why homoiconicity needs to be convenient to be real. Most people don't find shell a very convenient language in other aspects, either.

anthk · 2024-10-03T20:28:12 1727987292

Today Unix philisophy it's better done at 9front than the Unix clones themselves.

>Functional + universal data structure + homoiconic = power

It everything used TSV or tabular data, yes. But is not the case. With lisp you can always be sure.

>I edit my entries in Emacs.

Emacs can do dired (ls+vidir), eshell, rsync maybe to s3 (emacs package+rclone), markdown to HTML (and more from ORG Mode) and tons more with Elisp. With ORG you can basically define your blog and with little of Elisp you could upload your blog upon finishing.

>21st Century Terminal

Eshell, or Emacs itself.

>. What if we take the idea of Unix programs as pure functions over streams of data a little further? What about higher-order functions? Or function transformations? Combinators?

Hello Elisp. On combinators, maybe that shell from Dave from CCA. MPSH? https://www.cca.org/mpsh/

9dev · 2024-10-03T22:06:48 1727993208

Every time people praise Emacs like this, I wonder if I just don’t get it or they have an Emacs-shaped hammer and only see Emacs-shaped nails. Lots of braced nails, naturally.

lotharcable · 2024-10-04T01:54:59 1728006899

Emacs is unique because it is self-editable. That is you can edit and modify the program on the program realtime. There is a C-based core that can't be updated on the fly, but by and large Emacs a self-mutable Lisp virtual machine that comes with a built-in editor and repl.

Depending on how you want to look at it it is possible to say that Emacs editor you use when you first install it is just the default application for the ELisp machine. This is why people talk about things like Org-Mode as if it is this separate thing. It kinda really is. Sure it is included with Emacs nowadays, but it really is just another Elisp application. And, yes, it is a editor first and the machine is based around concepts like buffers, but it is still a full fledged programming environment.

Which also means that if you don't like Emacs as a editor you can write your own. Which people have done. It makes a great Vi/Vim editor with Evil that is far more compatible with Vim then most people imagine. I use "Meow-mode" which is another model editor that adopts some more modern approaches from things like Helix and puts a lot of focus on improving the efficiency of Emacs keyboard macros.

So saying that Emacs users just have a "Emacs-shaped hammer" makes as much sense as saying that all Java authors have is a big Java hammer or that Linux users can only see problems as Linux nails, or whatever.

There is a downside to all of this, of course.

Emacs where-everything-is-changeable-and-accessible-all-the-time doesn't lend itself to multi-threading, so if you have a lot of stuff going on in the "background" it can cause performance problems. The newer "native compilation" that became standard in the past few years does helps a lot, but there is a still a single thread deep down.

Also if you want to get very productive in Emacs there is a learning curve. If you are a sysadmin type that has been using Vi for decades then going to Emacs is going to be very painful. The best bet for becoming a advanced user very quickly is to learn just enough Emacs to do basic editing and navigating info files... and then just put the effort into learning Elisp. You don't have to do this, lots of people use it for years without learning any real elisp, but it does limit you. Of course thanks to things like Doom Emacs you don't lose much compared to other editors/IDEs.

Also things like Eshell and GNU Calc are criminally underrated and misunderstood. (hint: Eshell is not a terminal emulator and doesn't use a external shell program, so don't confuse it with things like ETerm)

And, hey, I can now have conversations with my editor with the help of ollama. So there is that.

anthk · 2024-10-04T14:10:34 1728051034

Calc is no Maxima but it's very powerful.

skydhash · 2024-10-04T00:26:18 1728001578

The nice thing about emacs is the customization. Unix utilities like ls, grep,… are opaque blob with switches. With Emacs, you have direct access to the functions and variables. Instead of praying for a switch or a configuration option, you just write or alter the code and integrate them together.

And while you have libraries and code access, elips is easier than the unix way (writing c/go/rust/… programs or bash/perl/awk/python/… scripts). Except for few cases.

amy-petrik-214 · 2024-10-04T03:16:19 1728011779

>Functional + universal data structure + homoiconic = power

>It everything used TSV or tabular data, But is not the case. With lisp you can always be sure.

basic unix kit is built around line-separated lines which are field-separated and you even get to choose your own separators and not get locked into tab. You can use this kitset, a common one, or other different kit. But with this kitset, yes, everything is indeed a table

Re: emacs https://www.youtube.com/watch?v=urcL86UpqZc&t=253s

sevensor · 2024-10-04T12:12:41 1728043961

I have a lot of respect for emacs, but my favorite editor takes what somebody in this conversation called an “exterior” approach instead. This is kakoune, and it gets a lot more attention for its editing model (which is quite good) than for its approach to dealing with the outside world. But the latter deserves some attention, because you extend kak by calling other programs for everything, and kak makes it very easy to do. For instance, I have a key bind that invokes the date command to insert a timestamp at the cursor. I’m sure there’s a very reasonable bit of elisp that produces the same result, and it probably looks a lot more like strftime. I think this makes for an instructive contrast; I’ve never seen another editor take integration with external processes quite so far.

coliveira · 2024-10-03T22:06:44 1727993204

Elisp is dependent on Emacs. It is useful to have a language that you can run without loading Emacs.

anthk · 2024-10-04T00:42:27 1728002547

Guile supports Elisp, albeit it's far slower. Also you can run

      emacs -q --script "foo.el"

foo.el being

   (/ 2.0 3.0)
   (princ "Hello")
   (terpri)

mmcgaha · 2024-10-03T21:11:59 1727989919

I always tell people the best way to learn how to use linux is to read The Unix Programming Environment.

anthk · 2024-10-03T22:40:12 1727995212

Perl superseded it for almost all of the chapters, except for the C ones. Altough for small programs, for sure it did.

Perl used to have an AWK to Perl converter because most of the language could be mapped 1:1 to Perl.

UPE would be fine under 9front save for sh (rc) and make (mk).

buescher · 2024-10-03T23:17:06 1727997426

I liked awk and perl was even better where either more structured (I know, I know) constructs were comfy or I needed perl dbi (which was awesome, what do people use now?) but that was a while ago. Sort of nuts that awk is much faster on really big columnar (csv etc) data, though.

cafard · 2024-10-04T01:26:39 1728005199

>> what do people use now?

Well, sometimes Perl DBI. But the young seem to learn Python about the time they get their drivers' licenses, and some unfortunate among them will inherit my code, so these days I use more psycopg or cx_Oracle (the latter now superseded, yes).

niobe · 2024-10-04T00:25:15 1728001515

Great article. I was only just thinking this week, "are there really still only 3 channels?".

But short of a massive overhaul and in spite of the shortcomings the current system still _works_ better than any other platform.

I would like to see unix stay relevant for the long-term however. It's possible these shortcomings lead one day to a the trade-off against newer systems not being worth making, or being just incompatible.

buescher · 2024-10-03T23:26:49 1727998009

With image-capable terminals and funky enhanced cli utilities we are sort of slouching towards something like a CLIM listener or a notebook interface at the shell. What would something in that vein that was really, really nice look like?

enriquto · 2024-10-04T16:03:14 1728057794

> What would something in that vein that was really, really nice look like?

No need to say "would". You can have exactly this experience today using gnuplot with sixel terminal.

buescher · 2024-10-04T16:15:18 1728058518

Yeah, that's where we are now. What would even better look like?

enriquto · 2024-10-04T17:14:29 1728062069

It looks perfect to me, it's really beautiful and I don't see how could it improve anymore.

A real improvement would be for some form of ipython to support in-terminal sixel graphics. There is no real technological impediment for that.

gavinhoward · 2024-10-04T03:15:02 1728011702

> I hope to see more "sugar" in languages to take advantage of calling out to other programs for help.

How about [1] and [2]?

My language has those because its first program was its own build script, which requires calling out to a C compiler. It had that before printing to stdout.

Turns out, that made it far more powerful than I imagined without a standard library. Calling out to separate programs is far better than a standard library.

[1]: https://git.yzena.com/Yzena/Yc/src/commit/95904ef79701024857...

[2]: https://git.yzena.com/Yzena/Yc/src/commit/95904ef79701024857...

emmelaich · 2024-10-04T03:04:46 1728011086

Nice article.

The criticism of the file system as overly simple or archaic is often been made, ever since the 70s. However the fact is that it IS use-able as a base for ACID capable software. Numerous reality based evidence attests to that.

I remember in Rochkind's book[0] there is a quote criticising Unix being inferior to IBM's MVS because it didn't have locking. As Rochkind retorts, MVS didn't either! Not as a kernel feature, but via user space software, which is eminently do-able in Unix too.

[0] https://www.oreilly.com/library/view/advanced-unix-programmi...

dvektor · 2024-10-04T13:31:05 1728048665

This was a fantastic post. It sums up many of the reasons why I love daily driving linux, and why the majority of my workflow happens in the terminal. Vim + Unix is the best IDE

anthk · 2024-10-04T14:02:19 1728050539

I agree; but entr does it better reacting to events than serializing pipes of code.

With entr you can watch a directory of souces and run make upon writing a file.

That's superior to the pure Unix <<batch jobs'>> philosophy over plain pipes.

golly_ned · 2024-10-04T01:40:49 1728006049

What does this article add to the countless others espousing the Unix model for exactly the same thing?

sim7c00 · 2024-10-04T09:49:36 1728035376

I'm not sure why but i kind of feel this is just the general purpose of an operating system, to provide an environment for things to 'be done in' using the system's resources. And you can't pre-build everything that needs to be done, so it's important to make it extensible and allow for interoperability between programs relatively easily.

Windows does this too... it's just less touted to be such an environment as it has a lot of applications built in binary form which do a lot of work for you. But essentially, most of it's functionality is exposed via scripting interfaces, a lot of programs can also be extended with simple scripts. This even without including powershell into the mix, which allows to really go the next mile. you can even createRemoteThread (maybe a bad idea, but an example of how extensible it is! ;D)... - it's not posix etc. but definitely programmable.

don't get me wrong, i do love that people on unix aim to make things pipeable. I'd hope someday they will make their outputs easier to parse though rather than to have to have cut,awk,sed in the mix every-time to reformat stuff into a structure more easily interpreted. A common difficult task is to parse output from 'ls'. The underlying data is quite organized at every level, but the tool outputs are a big struggle to parse if you don't want to rely on strict formatting of filenames.

this last bit ofcourse can be said about many things about computers and data storage/exchane - it's kind of always a mess. there's so many standard ways to output things, and non-standard, that it's just a zoo of stuff to parse...

I'd be delighted if someday there's an OS which requires things sent into another program to adhere to an open and well defined data formatting standard, and just one at that. I guess no one wants to reinvent the wheel though and make each piped data piece be strictly json or something like that. it would make life a lot easier and can even be serialized/deserialized fairly generically to optimise transfer where needed...

It's what i want to do for my own OS, sadly no one will ever use that... :D but i am free to dream as i type a million lines of defines and bit twiddles! :D

wwalexander · 2024-10-04T10:07:41 1728036461

> I'd be delighted if someday there's an OS which requires things sent into another program to adhere to an open and well defined data formatting standard, and just one at that

Emacs?

Jokes aside, I really do think S-exprs would be a good candidate for what you’re talking about as they can represent the same structures as JSON et al with less ceremony. Since S-exprs separate atoms with whitespace, and most UNIXy programs parse structured text using whitespace, you could implicitly wrap those programs’ input/output with parens to make them work on S-exprs.

II2II · 2024-10-04T10:03:18 1728036198

I agree with your points, heck even the classic Macintosh operating system exposed interfaces for end user programmability, but I think the article's author was illustrating that this is much closer to the norm for Unix based operating systems.

As for piping text around, that is largely a product of a different era. an era when resources were more limited and ideas less developed. Keep in mind that Unix was developed at a time when most interactive systems used glorified typewriters. (While a terminal doesn't have to be a computer, the complexity pretty much necessitates IC's and they quickly evolved into simple computers.)

immibis · 2024-10-04T11:57:05 1728043025

The decline of COM and Visual Basic has been a disaster for the human race. You could script GUI applications, in an object-oriented fashion, more easily than any Unix shell. You didn't just have to concatenate processes in a pipeline; you could arbitrarily reach inside them.

And remember being able to embed an arbitrary application into a rectangle in your Word document or PowerPoint presentation, with OLE? That was related.

metadat · 2024-10-04T00:28:56 1728001736

> We see that languages like Perl and Python have huge numbers of libraries for doing all sorts of tasks. Those libraries are only accessible through the programming language they were developed for. This is a missed opportunity for the languages to interoperate synergistically with the rest of the Unix ecosystem.

What would this interoperability look like, in practical terms?

For example, how would you invoke a program in language A from language B, other than the typical existing `system.exec(...)'.

paulddraper · 2024-10-04T06:21:59 1728022919

The author is saying these libraries expose Perl/Python/etc functions, that can only be invoked by Perl/Python/etc code.

Whereas Unix functions (programs) can be invoked by any programming language.

---

The C ABI would be a second place, as many languages can interact with it.

jeremyjh · 2024-10-04T01:04:27 1728003867

Its nonsense. They interoperate just as well as any other programs in UNIX. You can pipe stdin to them, pipe their output to other programs, or invoke the shell or other programs. The fact that they have libraries that don't require integration through text streams doesn't take anything away from the text processing interfaces and programs. Shell scripts have their place, and UNIX is beautiful, but that doesn't mean everything has to work this way.

ColonelPhantom · 2024-10-04T15:20:17 1728055217

It's not nonsense; the point is that calling a Numpy function needs to be done from Python. You cannot pipe stdin to np.array().

jeremyjh · 2024-10-04T16:40:26 1728060026

Yes exactly, and you shouldn't complain that Numpy isn't directly available to perl and grep. I think we're in agreement.

whartung · 2024-10-03T22:58:21 1727996301

I'm on board with this.

Unix is my favorite OS.

I like that it's fundamental unit of work is the process, and that, as users, we have ready access to those. Processes are cheap and easy.

I can stack them together with a | character. I can shove them in the background with a & (or ^Z and bg, or whatever). Cron is simple. at(1) and batch(1) are simple.

The early machines I worked on, processes were a preallocated thing on boot. They weren't some disposable piece of work. You could do a lot with it, but it's not the same.

Even when I was working on VMS, I "never" started new processes. Not like you do in Unix. Not ad hoc, "just for a second". No, I just worked directly with what I had. I could not compose new workflows readily out of processes.

Processes give a lot of isolation and safety. If a process goes mad, it's (usually) easily killed with little impact to the overall system. Thus its cheap and forgiving to mess up with processes.

inetd was a great idea. Tie stdin/stdout to a socket. Any one and their brother Frank could write a service managed by inetd -- in anything. CGI-BIN is the same way. The http server does the routing, the process manages the rest. Can you imagine shared hosting without processes? I shudder at the thought.

Binary processes are cheap too, with shared code segments making easy forks, fast startup, low system impact. The interpreters, of course, wrecked that whole thing. And, arguably, the systems were "fast enough" to make that impact low.

But inetd, running binary processes? That is not a slow server. It can be faster (pre-forking, threads, dedicated daemons), but that combo is not necessarily slow. I think the sqlite folks basically do this with Fossil on their server.

Note, I'm not harping on "one process, one thing", that's different. Turns out when processes are cheap and nimble, then that concept kind of glitters at the bottom of the pan. But that's policy, not capability.

But the Unix system is just crazy malleable and powerful. People talk about a post-holocaust system. How they want something like CP/M cuz its simple. But, really? What a horrific system! Yes, a "unix like system" is an order of magnitude more complex than something like CP/M. But its far more than an order of magnitude more capable. It's worth the expense.

Even something weak, like Coherent on a 286. Yea, it had its limitations, but the fundamentals were there. At the end of the world, just give me a small kernel, sh, vi, cc, and ld -- I can write the rest of the userland -- poorly :).

anthk · 2024-10-03T20:47:36 1727988456

On shells for Unix, this can be really useful to cut script regex matching in half:

https://www.cca.org/mpsh/docs-08.html

mustache_kimono · 2024-10-04T06:24:23 1728023063

> Compare that to Clojure, where you constantly define and redefine functions at the REPL.

It's an interactive shell FFS, does it get more REPL than that?!

`set -x` is what you want brother.

zzo38computer · 2024-10-04T05:02:32 1728018152

Being a programmable environment is one of the good benefits of UNIX, and piping programs together is also a good benefit of UNIX.

"Write programs that do one thing and do it well" and "Write programs to work together" are good ideas, too (unfortunately many programs don't).

I think that using a text stream for everything is not the best idea though. In many cases binary formats will do better. I think XML and JSON are not that good either.

I think "cache your compiler output to disk so you wouldn't have to do a costly compile step each time you ran a program" is a good idea, although this should not be required; REPL and other stuff they mention there is also very helpful.

They say the file system is also old. My idea is a transactional hypertext file system. It doesn't have metadata (or even file names), but a file can contain multiple numbered forks and you can store extra data in there.

(Transactional file system is something that I think is useful and that UNIX doesn't do.)

They are also right about the terminal is old, although some of the newer things that some people had tried to do have different sets of problems.

They also say another unfortunate thing is layering, and I agree that this layering is excessive.

Interoperating without needing FFI is also helpful (and see below what I mention about typed initial messages, too).

About the stuff listed in "Text streams, evolved", my idea of the operating system design, involves the "Common Data Format" (which is a binary format, somewhat like ASN.1 BER but different), and most data, including the command shell and most files, would use it; this also allows for common operations.

I agree with "a program which displays all of the thumbnails of the files listed on stdin would be much more useful to me than a mouse-oriented file browser", and I do not have a GUI file browser anyways. I do use command-line programs for most things, even though I have X Windows to run some GUI programs and to be able to have multiple xterms at once (I often have many xterms at once). However, it could be improved as I describe above, too.

They mention the shell. I agree that it could be greatly improved, and I think that it would go with the other improvements above. My operating system design effectively requires "programs as pure functions over streams of data" (although it is functions over "capabilities", and not necessarily "streams of data") due to the way that the capability-based security is working, and the way the linking and capability passing is working also allows working like higher-order functions and transformations and all of that stuff. Even, my idea also involves message passing (all I/O is done by passing messages between capabilities), too.

I had also considered programs that require types. One of the forks (like I mentioned above) of a executable file can specify the expected type of the initial message, and the command shell can use this to effectively make them like functions that have types.

Something they don't mention is security. That can also be improved; the capability-based security that I mention above, if you have proxy capabilities too, will improve it. There is also the possibility that users can use the command shell and write other programs to make up your own proxy capabilities, and this allows programs to be used to do things that they were not necessarily designed to do, in addition to improving security. Instead of merely a user account, it might e.g. allow to write to only one file, or allow connecting to only one remote computer (without the program knowing which one it is, and perhaps even with data compression that the application program is unaware of), etc.

I still think that, even if you have powerful computers, you should still program it efficiently anyways.

The new one won't be UNIX; it will be something else.

Tor3 · 2024-10-04T06:45:33 1728024333

Using text streams between piped-together processes is not a requirement though. I'm using binary streams for some of the stuff I do, as I write simulators for some hardware (and other things) which gets processed by something else through a pipe or two (and may end up being parsed into text at or near the final point).

paulddraper · 2024-10-04T06:17:21 1728022641

> Unix is homoiconic

Wild, very cool

gregw2 · 2024-10-03T21:26:57 1727990817

The author of the article seems unaware of awk or jq or perl one-liners for handling JSON or other forms of data from UNIX command line.

taejavu · 2024-10-03T21:53:03 1727992383

The contents of the article indicate you're mistaken:

> You really can use the best tool for the job. I've got Bash scripts, awk scripts, Python scripts, some Perl scripts. What I program in at the moment depends on my mood and practical considerations.

anthk · 2024-10-03T22:41:21 1727995281

You often have to do dances with JSON, XML, TSV... converters before parsing the actual data.

If you use something like Emacs, you just handle s-exps.

cutler · 2024-10-04T01:17:25 1728004645

What about jq and family?