Regarding "We could conceivably have some version of AI that reliably performs any task you throw at it consistently" - it is very clear to anyone who looks at the recent work by Anthropic analyzing how their LLM "reasons" that such a thing will never come from LLMs without massive unknown changes - and definitely not from scale alone - so I guess the grandparent is absolutely right that OpenAI is not really working on this.
I think this is right but also missing a useful perspective.
Most HN people are probably too young to remember that the nanotech post-scarcity singularity was right around the corner - just some research and engineering away - which was the widespread opinion in 1986 (yes, 1986). It was _just as dramatic_ as today's AGI.
That took 4-5 years to fall apart, and maybe a bit longer for the broader "nanotech is going to change everything" to fade. Did nanotech disappear? No, but the notion of general purpose universal constructors absolutely is dead. Will we have them someday? Maybe, if humanity survives a hundred more years or more, but it's not happening any time soon.
There are a ton of similarities between the nanotech singularity and the modern LLM-AGI situation. People point(ed) to "all the stuff happening" - surely the singularity is on the horizon! Similarly, there was the apocalyptic scenario that got a ton of attention and people latching onto "nanotech safety" - instead of runaway AI or paperclip engines, it was Grey Goo (also coined in 1986).
The dynamics of the situation, the prognostications, and aggressive (delusional) timelines, etc. are all almost identical in a 1:1 way with the nanotech era.
I think we will have both AGI and general purpose universal constructors, but they are both no less than 50 years away, and probably more.
So many of the themes are identical that I'm wondering if it's a recurring kind of mass hysteria. Before nanotech, we were on the verge of genetic engineering (not _quite_ the same level of hype, but close, and pretty much the same failure to deliver on the hype as nanotech) and before that the crazy atomic age of nuclear everything.
Yes, yes, I know that this time is different and that AI is different and it won't be another round of "oops, this turned out to be very hard to make progress on and we're going to be in a very slow, multi-decade slow-improvement regime" - but that has been the outcome of every example of this that I can think of.
I won't go too far out on this limb, because I kind of agree with you... but to be fair -- 1980s-1990s nanotech did not attract this level of investment, nor was it visible to ordinary people, nor was it useful to anyone except researchers and grant writers.
It seems like nanotech is all around us now, but the term "nanotech" has been redefined to mean something different (larger scale, less amazing) from Drexler's molecular assemblers.
> Did nanotech disappear? No, but the notion of general purpose universal constructors absolutely is dead. Will we have them someday? Maybe, if humanity survives a hundred more years or more,
I thought this was a "we know we can't" thing rather than a "not with current technology" thing?
Specific cases are probably impossible, though there's always hope. After all, to use the example the nanotech people loved: there are literal assemblers all around you. Whether we can have a singular device that can build anything (probably not - energy limits and many, many other issues) or factories that can work at the atomic scale (maybe) is open, I think. The idea of little robots was kind of visibly silly even at the peak.
The idea of scaling up LLMs and hoping is... pretty silly.
Every consumer has very useful AI at their fingertips right now. It's eating the software engineering world rapidly. This is nothing like nanotech in the 80s.
Sure. But fancy autocomplete for a very limited industry (IT), plus graphics generation and a few more similar items, is indeed useful. Just like "nanotech" coatings on, say, optics, or in precision machinery, or all the other fancy nano films in many industries. Modern transistors are close to the nano scale now, etc.
The problem is that the distance between a nano-thin film, or an interesting but ultimately rigid nano-scale transistor, and a programmable nano-sized robot is enormous, despite the similar sizes. The distance between an autocomplete that relies heavily on preexisting external validators (compilers, linters, static code analyzers, etc.) and a real AI capable of thinking is equally enormous.
Another possibility is that OpenAI thinks _none_ of the labs will achieve AGI in a meaningful timeframe, so they are trying to cash out with whatever you want to call the current models. There will only be one or two of those before investors start looking at the incredible losses.
If they think AGI is imminent, the value of that payday is very limited. I think the grandparent is more correct: OpenAI is admitting that near-term AGI - and the only kind anyone really cares about is the one with exponential self-improvement - isn't happening any time soon. But that much is obvious anyway, despite the hyperbolic nonsense now common in AI discussions.
If I were a person like several of the people working on AI right now (or really, just heading up tech companies), I could be the kind to look at a possible world-ending event happening in the next - eh, year, let's say - and just want to have a party at the end of the world.
Every single time this has been tried it has gone wrong, but sure.
Almost all of the operations done on actual filesystems are not database-like; they are close to the underlying hardware for practical reasons. If you want a database view, add one in an upper layer.
BeFS wasn't a database. It had indexed queries on EAs, and they had the habit of asking applications to add their files' indexable content to the EAs. Internally it was just a mostly-not-transactional collection of btrees.
There was no query language for updating files, or even for inspecting anything about a file that was not published in the EAs (or implicitly, as with adapters); there were no multi-file transactions, no joins, nothing. Just rich metadata support in the FS.
Yeah, I am talking more about the deep architecture, and BeOS is notable here mostly at the user-interface level.
However, I think it is reasonable to think that with way more time and money, these things would meet up. Think about it as digging a tunnel from both sides of the mountain.
> they are close to the underlying hardware for practical reasons
Could you provide reference information to support this background assertion? I'm not totally familiar with filesystems under the hood, but at this point doesn't storage hardware maintain an electrical representation relatively independent from the logical given things like wear leveling?
- You can reason about block offsets. If your writes are 512B-aligned, you are assured of minimal write amplification (see the sketch after this list).
- If your writes are append-only, log-structured, that makes SSD compaction a lot more straightforward
- No caching guarantees by default. Again, even SSDs cache writes. Block writes are not atomic even with SSDs. The only way to guarantee atomicity is via write-ahead logs.
- The NVMe layer exposes async submission/completion queues, to control the io_depth the device is subjected to, which is essential to get max perf from modern NVMe SSDs. Although you need to use the right interface to leverage it (libaio/io_uring/SPDK).
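To make the alignment and caching points above concrete, here is a minimal sketch of an aligned direct write on Linux. The file path and the 4K block size are assumptions for illustration; O_DIRECT only bypasses the page cache, so the explicit fsync stands in for the durability caveat about device caches.

    /* Minimal sketch (assumptions: Linux, scratch file "/tmp/aligned.bin"):
       a 4K-aligned buffer written at a 4K-aligned offset with O_DIRECT,
       so the write maps cleanly onto whole device blocks. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        const size_t block = 4096;              /* assumed logical block size */
        void *buf;
        if (posix_memalign(&buf, block, block)) /* buffer must be aligned for O_DIRECT */
            return 1;
        memset(buf, 'x', block);

        int fd = open("/tmp/aligned.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd < 0) return 1;

        /* offset and length are both multiples of the block size,
           so the device sees a whole-block write (minimal amplification) */
        ssize_t n = pwrite(fd, buf, block, 0);

        fsync(fd);  /* O_DIRECT skips the page cache, not the device's write cache */
        close(fd);
        free(buf);
        return n == (ssize_t)block ? 0 : 1;
    }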
> You can reason about block offsets. If your writes are 512B-aligned, you can be ensured minimal write amplification.
Not all devices use 512-byte sectors, and that is mostly a relic from low-density spinning rust.
> If your writes are append-only, log-structured, that makes SSD compaction a lot more straightforward
Hum, no. Your volume may be a sparse file on a SAN system; in fact that is often the case in cloud environments. Also, cached RAID controllers may behave differently here - unless you know exactly what you're targeting, you're shooting blind.
> No caching guarantees by default. Again, even SSDs cache writes. Block writes are not atomic even with SSDs. The only way to guarantee atomicity is via write-ahead logs.
Not even that way. Most server-grade controllers (with battery) will ack an fsync immediately, even if the data is not on disk yet.
> The NVMe layer exposes async submission/completion queues, to control the io_depth the device is subjected to, which is essential to get max perf from modern NVMe SSDs.
That's the storage domain, not the application domain. In most cloud systems, you have the choice of direct-attached storage (usually behind a proper controller, so what is exposed is actually the controller's features, not the individual NVMe queues), or SAN storage - a sparse file on a filesystem on a system at the end of a TCP endpoint. One of those gives you easy backups, redundancy, high availability, and snapshots; with the other you roll your own.
I'm not sure how any of these negate the broader point: filesystems provide a lower-level interface to the underlying block device than a database.
To say that that's not true would require more than cherry-picking examples where some filesystem assumption may be tenuous; it would require demonstrating how a DBMS can do better.
> Not all devices use 512 byte sectors
4K then :). Files are block-aligned, and the predominant block size changed once in 40 years from 512B to 4K.
> Hum, no. Your volume may be a sparse file on SAN system
Regardless, sequential writes will always provide better write performance than random writes. With POSIX, you control this behavior directly. With a DBMS, you control it by swapping out InnoDB for RocksDB or something.
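As a rough illustration of "controlling this behavior directly" with POSIX, here is a sketch of the append-only pattern. The log file name is a placeholder; the point is simply that O_APPEND keeps every write at the tail, so the device sees sequential I/O.

    /* Sketch of an append-only log (assumption: file "app.log" is ours to create):
       each write lands at the current end of the file, so I/O stays sequential. */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int append_record(int fd, const void *rec, size_t len) {
        /* O_APPEND positions every write at the current end of file */
        return write(fd, rec, len) == (ssize_t)len ? 0 : -1;
    }

    int main(void) {
        int fd = open("app.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) return 1;
        const char rec[] = "event=example\n";
        append_record(fd, rec, strlen(rec));
        fsync(fd);   /* make the appended record durable */
        close(fd);
        return 0;
    }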
> Thats storage domain, not application domain
It is a storage-domain feature accessible to an IOPS-hungry application via a modern Linux interface like io_uring. NVMe-oF would be the networked storage interface that enables that. But this is for when you want to outperform a DBMS by 200X; aligned I/O is sufficient for 10-50X. :)
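For a sense of what "controlling the io_depth" looks like from the application side, here is a minimal liburing sketch (an assumption on my part that liburing is available; the file name and buffer sizes are placeholders). Several writes are submitted before any completion is reaped, so the device sees a deeper queue than one blocking pwrite at a time would give it.

    /* Sketch: batch four writes into one submission so they are in flight together. */
    #include <liburing.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        struct io_uring ring;
        if (io_uring_queue_init(32, &ring, 0) < 0)   /* 32-entry submission queue */
            return 1;

        int fd = open("data.bin", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) return 1;

        static char buf[4][4096];
        for (int i = 0; i < 4; i++) {
            memset(buf[i], 'a' + i, sizeof buf[i]);
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            io_uring_prep_write(sqe, fd, buf[i], sizeof buf[i], (off_t)i * 4096);
        }
        io_uring_submit(&ring);                      /* all four writes queued at once */

        for (int i = 0; i < 4; i++) {                /* reap the completions */
            struct io_uring_cqe *cqe;
            io_uring_wait_cqe(&ring, &cqe);
            io_uring_cqe_seen(&ring, cqe);
        }

        close(fd);
        io_uring_queue_exit(&ring);
        return 0;
    }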
Is file alignment on disk guaranteed, or does it depend on the file system?
The NVMe layer is not the same as the POSIX filesystem; there is no reason we need to throw that in as part of knocking the POSIX filesystem off its privileged position.
Overall you are talking about individual files, but remember that what really distinguishes the filesystem is directories. Other databases, even relational ones, can have binary blob "leaf data" with the properties you speak about.
> Is file alignment on disk guaranteed, or does it depend on the file system?
I think "guaranteed" is too strong a word given the number of filesystems and flags out there, but "largely" you get aligned I/O.
> The NVMe layer is not the same as the POSIX filesystem, there is no reason we need to throw that as part of knocking the POSIX filesystem off it's privileged position.
I'd say that the POSIX filesystem lives in an ecosystem that makes leveraging NVMe-layer characteristics a viable option. More on that in the next point.
> Overall you are talking about individual files, but remember what really distinguishes the filesystem is directories. Other database, even relational ones, can have binary blob "leaf data" with the properties you speak about.
I think regardless of how you use a database, your interface is declarative. You always say "update this row" vs "fseek to offset 1048496 and fwrite 128 bytes and then fsync the page cache". Something needs to do the translation from "update this row" to the latter, and that layer will always be closer to hardware.
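To spell out the file-level half of that comparison, here is a sketch of the "fseek/fwrite/fsync" sequence from the comment above. The offset and length figures are the ones quoted there; "records.dat" and the fixed-width record layout are hypothetical. A DBMS hides exactly this layer behind "update this row".

    /* Sketch: overwrite one fixed-width record in place at a known byte offset. */
    #include <stdio.h>
    #include <unistd.h>

    int update_record(const char *path, long offset, const char *rec, size_t len) {
        FILE *f = fopen(path, "r+b");
        if (!f) return -1;
        if (fseek(f, offset, SEEK_SET) != 0) { fclose(f); return -1; }
        fwrite(rec, 1, len, f);       /* overwrite the record bytes in place */
        fflush(f);                    /* push the stdio buffer to the kernel */
        fsync(fileno(f));             /* ask the kernel to push it to the device */
        fclose(f);
        return 0;
    }

    /* usage (hypothetical): update_record("records.dat", 1048496, padded_row, 128); */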
Mature database implementations also bypass a lot of kernel machinery to get closer to the underlying block devices. The layering of a DB on top of an FS is a failure.
You are confusing the fact that databases implement their own filesystem-equivalent functionality in an application-specific way with the idea that FSs can or should be databases.
I am not confusing any such thing. You need to define "database" such that "file system" doesn't include it.
Common usage does this by convention, but that's just sloppy thinking and populist extensional defining. I posit that any rigorous, thought-out, not-overfit intensional definition of a database will, as a matter of course, also include file systems.
I'm OK including that - tmpfs is similar, but we can easily exclude that by requiring persistence. The intensional definition doesn't need to be expansive to the point of being useless!
I had the unique experience as a youth in attending a school where a substantial portion of the school was funneled there by one of the many 1970s and 1980s troubled teen corporations that spun out of Synanon after it collapsed. This one specialized in drug addicts.
Almost all of my classmates (not me, unfortunately) were from exceptionally wealthy families, and excepting one, none of them ever mentioned any childhood trauma. Instead they were precocious partiers who, despite being underage, got into drugs by going to the nightclubs in the seedy part of town - no one at the time was turning away hot young women or gay (for pay or real) young men. And the club scene was a drug scene. It still is.
I don’t think trauma is actually at the root of almost all drug abusers. The only first class abusers (pot and alcohol in serious quantities daily) that I know at the moment grew up in perfectly fine suburban families and are in good, non-narcissistic/controlling/etc relationships with their families. They’re just addicts who can’t stop. One of them is going to die from it, eventually, given his level of alcohol consumption.
It isn't close at all.