I found it funny because in the opposite direction, people accused Tesla of naming “autopilot” misleadingly, because it gave them the impression of fully unattended self-driving.
In aviation, autopilot features were until recently (and still for GA pilots) essentially just cruise control: maintain this speed and heading, maintain this climb rate and heading, maintain this bank angle, etc.
I just did a quick once-over on the PR and am pretty shocked by how "simple" it is. For switching the background graphics library, I would have expected this to be some pretty delicate surgery. But the "meat" is swapping out the old abstraction layer for the new one (500–600 LOC) then `s/blade/wgpu/g`. There's a little more to it than that, but not much.
I'm already a Zed user, but to me that's an extremely good indicator of the engineering quality of the project.
I’m not really sure what point you’re trying to make with “as long as you don’t mind dying early and painfully from easily preventable diseases technically you can live in utopia”. Would you mind clarifying your position here?
When chess engines were first developed, they were strictly worse than the best humans. After many years of development, they became helpful to even the best humans even though they were still beatable (1985–1997). Eventually they caught up and surpassed humans but the combination of human and computer was better than either alone (~1997–2007). Since then, humans have been more or less obsoleted in the game of chess.
Five years ago we were at Stage 1 with LLMs with regard to knowledge work. A few years later we hit Stage 2. We are currently somewhere between Stage 2 and Stage 3 for an extremely high percentage of knowledge work. Stage 4 will come, and I would wager it's sooner rather than later.
There's a major difference between chess and scientific research: setting the objectives is itself part of the work.
In chess, there's a clear goal: beat the game according to this set of unambiguous rules.
In science, the goals are much more diffuse, and setting those in the first place is what makes a scientist more or less successful, not so much technical ability. It's a very hierarchical field where permanent researchers direct staff (postdocs, research scientists/engineers), who in turn direct grad students. And it's at the bottom of the pyramid that technical ability is most relevant/rewarded.
Research is very much a social game, and I think replacing it with something run by LLMs (or other automatic process) is much more than a technical challenge.
The evolution was also interesting: first the engines were amazing tactically but pretty bad strategically, so humans could guide them. With the new NN-based engines it was reversed: they were amazing strategically but sucked tactically (the first versions of Leela Chess Zero). Today they have closed the gap and are amazing at both strategy and tactics, and there is nothing humans can contribute anymore - all that is left is to watch and learn.
With a chess engine, you could ask any practitioner in the 90's what it would take to achieve "Stage 4" and they could estimate it quite accurately as a function of FLOPs and memory bandwidth. It's worth keeping in mind just how little we understand about LLM capability scaling. Ask 10 different AI researchers when we will get to Stage 4 for something like programming and you'll get wild guesses or an honest "we don't know".
That is not what happened with chess engines. We didn’t just throw better hardware at it, we found new algorithms, improved the accuracy and performance of our position evaluation functions, discovered more efficient data structures, etc.
People have been downplaying LLMs since the first AI-generated buzzword garbage scientific paper made its way past peer review and into publication. And yet they keep getting better and better to the point where people are quite literally building projects with shockingly little human supervision.
Chess grandmasters are living proof that it’s possible to reach grandmaster level in chess on 20W of compute. We’ve got orders of magnitude of optimizations to discover in LLMs and/or future architectures, both software and hardware, and with the amount of progress we’ve seen basically every month, those ten people will answer ‘we don’t know, but it won’t be too long’. Of course they may be wrong, but the trend line is clear; Moore’s law faced similar issues and they were successively overcome for half a century.
> With a chess engine, you could ask any practitioner in the 90's what it would take to achieve "Stage 4" and they could estimate it quite accurately as a function of FLOPs and memory bandwidth.
And the same practitioners said right after Deep Blue that Go was NEVER gonna happen. Too large. The search space is just not computable. We'll never do it. And yeeeet...
We are at level 2.5 for software development, IMO. There is a clear skill gap between experienced humans and LLMs when it comes to writing maintainable, robust, concise and performant code and balancing those concerns.
The LLMs are very fast but the code they generate is low quality. Their comprehension of the code is usually good but sometimes they have a weightfart and miss some obvious detail and need to be put on the right path again. This makes them good for non-experienced humans who want to write code and for experienced humans who want to save time on easy tasks.
> The LLMs are very fast but the code they generate is low quality.
I think the latest generation of LLMs with Claude Code does not produce low-quality code. It's better than the code pretty much every dev on our team can write, outside of very narrow edge cases.
Does it matter? Microsoft won by default with Teams because it actually turns out no one cares about chat or even has a choice in it: employees use whatever the company picks.
If you're going to say "other than the US" then you've got to say at a minimum "other than the US and China", but really "other than the US and China and Japan and Korea and Taiwan and Thailand and Russia and most of Central Asia".
Only mentioning the US is wildly americentric even by HN standards.
And 20 years ago people were making the exact same kinds of comments and everyone had the same reaction: yeah, MySQL has been putting numbers up like that for a decade.
What’s so hard about adding a feature that effectively makes a single-user device multi-user? Which needs the ability to have plausible deniability for the existence of those other users? Which means that significant amounts of otherwise usable space need to be inaccessibly set aside for those other users on every device—to retain plausible deniability—despite an insignificant fraction of customers using such a feature?
> despite an insignificant fraction of customers using such a feature?
Isn't that the exact same argument against Lockdown mode? The point isn't that the number of users is small it's that it can significantly help that small set of users, something that Apple clearly does care about.
Lockdown mode costs ~nothing for devices that don't have it enabled. GP is pointing out that the straightforward way to implement this feature would not have that same property.
The "fake" user/profile should work like a duress PIN with the addition of deniability. So as soon as you log in to the second profile, all the space becomes free. Just by logging in you would delete the encryption key of the other profile. The metadata showing what is free or not was encrypted in the locked profile. Now gone.
Sorry I explained it poorly and emphasized the wrong thing.
The way it would work is not active destruction of data, just a different view of the data that doesn’t include any metadata that is encrypted in the second profile.
Data would get overwritten only if you actually start using the fallback profile and populating the "free" space because to that profile all the data blocks are simply unreserved and look like random data.
The profiles basically overlap on the device. If you would try to use them concurrently that would be catastrophic but that is intended because you know not to use the fallback profile, but that information is only in your head and doesn’t get left on the device to be discovered by forensic analysis.
Your main profile knows to avoid overwriting the fallback profile’s data but not the other way around.
But also the point is you can actually log in to the duress profile and use it normally, and it wouldn’t look like destruction of evidence, which is what GrapheneOS’s current duress PIN does.
The main point is that logging in to the fake profile does not do anything different from logging in to the main profile. Suppose you image the whole thing and somehow completely bypass the secure enclave (but let's assume you can't actually brute-force the PIN, because that's not feasible), then enter the duress PIN in a controlled environment and watch what writes/reads it does and where: even then you would not be able to tell you are in the fake profile. Nothing gets deleted eagerly; just the act of logging in is destructive to overlapping profiles. The only thing different about the main profile is that it knows which data belongs to the fallback profile and will not allocate anything in those blocks. However, it's possible to set up the device without a fallback profile, so you can't tell whether you are in the fallback profile or just on a device without one set up.
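If I understood the scheme right, it can be sketched as a toy block allocator. Everything here (class names, the fixed block count, the in-memory block maps) is made up for illustration; a real implementation would store each profile's block map encrypted under that profile's key, so the other profile's data is indistinguishable from random free space:

```python
import os

BLOCK_COUNT = 16  # toy disk size


class Profile:
    """One profile's view of shared block storage.

    `owned`  - blocks holding this profile's data (kept in its own
               encrypted metadata).
    `avoid`  - blocks this profile must never reuse. Only the main
               profile carries this map; the fallback profile has none,
               so it looks identical to a profile on a device where no
               fallback was ever set up.
    """

    def __init__(self, name, avoid=None):
        self.name = name
        self.owned = set()
        self.avoid = avoid if avoid is not None else set()

    def free_blocks(self):
        # Anything not owned and not explicitly avoided looks free:
        # another profile's ciphertext is just random bytes to us.
        return [b for b in range(BLOCK_COUNT)
                if b not in self.owned and b not in self.avoid]

    def write(self, disk, data):
        block = self.free_blocks()[0]  # naive first-fit allocation
        disk[block] = data
        self.owned.add(block)
        return block


# The disk starts as (and always resembles) random data.
disk = [os.urandom(4) for _ in range(BLOCK_COUNT)]

fallback = Profile("fallback")
fallback.owned = {0, 1}  # pretend fallback data lives in blocks 0-1

# The main profile knows (via its own encrypted metadata) to skip them.
main = Profile("main", avoid=fallback.owned)
b_main = main.write(disk, b"mail")    # lands in block 2, not 0 or 1

# The fallback profile has no `avoid` map, so actually using it will
# overwrite the main profile's blocks -- the "destructive login".
b_fb = fallback.write(disk, b"junk")  # also picks block 2: main data gone
```

The asymmetry is the whole trick: `main` protects `fallback`'s blocks, but not the other way around, and the knowledge of which situation you are in lives only in the user's head.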
Hopefully I explained it clearly. I haven't seen this idea anywhere else so I would be curious if someone smarter actually tried something like that already.
What you say makes sense, just like the TrueCrypt/VeraCrypt hidden-volume theory. I can't find the head post to my "that's why you image" post, but what concerns me is that differing profiles may have different network fingerprints. You may need to keep Signal and BitLocker on both. Every time my desktop boots, a cloud provider is contacted -- it's not very sanitary?
It's a hard problem to properly set up even on the user end, let alone the developer/engineer side, but thank you.
Maybe one PIN could cause the device to crash. Devices crash all the time. Maybe the storage is corrupted. It might have even been damaged when it was taken.
This could even be a developer feature accidentally left enabled.
It doesn't seem fundamentally different from a PC having multiple logins that are accessed from different passwords. Hasn't this been a solved problem for decades?
You can have a multiuser system but that doesn't solve this particular issue. If they log in to what you claim to be your primary account and see browser history that shows you went to msn.com 3 months ago, they aren't going to believe it's the primary account.
My browser history is cleared every time I close it.
It's actually annoying because every site wants to "remember" the browser information, and so I end up with hundreds of browsers "logged in". Or maybe my account was hacked and that's why there's hundreds of browsers logged in.
Doesn't having standard multi-user functionality automatically create the plausible deniability? If they tried so hard to create an artificial plausible deniability that would be more suspicious than normal functionality that just gets used sometimes.
What needs to be plausibly denied is the existence of a second user account, because you're not going to be able to plausibly deny that the account belongs to you when it resides on the phone found in your pocket.
Never ever use your personal phone for work things, and vice versa. It's bad for you and bad for the company you work for in dozens of ways.
Even when I owned my own company, I had separate phones. There's just too much legal liability and chances for things to go wrong when you do that. I'm surprised any company with more than five employees would even allow it.
What's the risk? On Android, the company can remotely nuke the work profile. The work profile has its own file system and apps. You can turn it off when you don't want work notifications.
iPhone and macOS are basically the same product technically. The reason iPhone is a single user product is UX decisions and business/product philosophy, not technical reasons.
While plausible deniability may be hard to develop, it’s not some particularly arcane thing. The primary reason against it is the political balancing act Apple has to perform (remember San Bernardino and the trouble the US government tried to create for Apple?). Secondary reasons are cost to develop vs. the addressable market, but they did introduce Lockdown mode, so it’s not unprecedented for them to improve security for those particularly sensitive to such issues.
> iPhone and macOS are basically the same product technically
This seems hard to justify. They share a lot of code yes, but many many things are different (meaningfully so, from the perspective of both app developers and users)
You think iPhones aren’t multi-user for technical reasons? You sure it’s not to sell more phones and iPads? Should we ask Tim “buy your mom an iPhone” Cook?