> Ultimately IO is [an] intrinsically complex topic, and trying to paper over that complexity with simple interfaces is disingenuous and falls flat on edge cases.
I’m not against pointing out the edge cases in the Plan 9 file model[1,2], but the thing is, I haven’t seen complex I/O interfaces that aren’t a horror show, either. Granted, I haven’t seen that many of those at all, but I’ve had a thorough look at the ones in OS/2, Win32, and NT, and none of them seem particularly inspiring.
I’d very much like to see some nice alternatives, to be clear!
The reference to io_uring also doesn’t seem all that strong of an argument, honestly. I’d like to say there are three layers to the idea of “traditional” “Unix” “files” as an OS (not storage) interface:
- System and user resources you have access to are identified by unforgeable references (called “fds”; the merits of allowing userspace to control their naming as opposed to having the kernel assign the names are debatable). You can feed bytes into these, (ask to) get bytes out of them, and perhaps have an out-of-band call to e.g. transmit one of the other references you hold to a peer. You can of course also delete a reference.
So far this is just dynamically typed object-capabilities by another name. It’s going to require higher-level protocols on top, but so does basically everything else on this level of generality.
Plan 9 mostly (if not completely) eliminated ioctls here by using separate control files instead.
(The part where the OS merges the payloads of write calls into a byte stream then cuts it back up into reads is more opinionated, but I don’t think even Bell Labs systems ever adhered to that principle strictly[1].)
- You obtain (most of) these references by navigating a stringy hierarchical namespace. There are additional calls to do so and to modify that namespace.
This is less of an obviously correct least common denominator, and as history shows the consensus is less strong here as well. Mountpoints, symlinks, namespaces per TFA, even the *at() calls all change how this part functions. On the other hand, I don’t think anybody longs for version numbers or nesting limits (or even drive letters) of other systems that have used naming approaches similar enough for a comparison.
- You access these services through synchronous system calls write(), read(), ioctl(), close(), open(), etc.
This is the part that is changed by the introduction of io_uring... But I don’t feel it’s all that important for the conceptual model, unlike the preceding points.
[1] Cutting a bytestream into packets is, as always, a tedious slog of buffering, so some of the protocol implementations use write() / read() boundaries instead, and thus actually (depend on being able to) use the API in a datagram-like fashion (which 9P enables, IIUC).
[2] Auth is based on a /proc/self-like hack wherein the kernel-side implementation of the “file” inspects the opening process through kernel-side knowledge you can’t access nor proxy from userspace.
I have never found the Be Book to be particularly engaging reading, but this might finally give me a reason to work through some parts. Thanks!
I’m not sure how much stock to put into the multimedia claims, however—a lot has changed since then, both in the state of human knowledge about low-latency multimedia, A/V sync, network streaming, etc., and in what we can and can’t afford on machines we perform multimedia processing on. How relevant and how commonly known are the insights that BeOS incorporated today?
(You can see that I expect the answers to be “not very” and “extremely”, but sometimes life surprises us. For example—did you know that Microsoft shipped a renderer for 2D animation based on FRP ideas, designed with the direct participation of Conal Elliott himself, in 1998? It was called DirectAnimation and released as part of DirectX 5.)