stouset's comments

I found it funny because, in the opposite direction, people accused Tesla of naming “Autopilot” misleadingly, because it gave them the impression of fully unattended self-driving.

In aviation, autopilot features were until recently (and still for GA pilots) essentially just cruise control: maintain this speed and heading, maintain this climb rate and heading, maintain this bank angle, etc.


Because Tesla was claiming in 2016 that "next year" it would be able to drive across the United States without any inputs.

Well, okay, but that’s like 95% of flying.

It’s the other 5% that takes 90% of effort :)

Though it's used by the 0.1% who are highly qualified and extensively trained, so the chance of a pilot misunderstanding it is more like 0.00001% or less.

I just did a quick once-over on the PR and am pretty shocked by how "simple" it is. For switching the background graphics library, I would have expected this to be some pretty delicate surgery. But the "meat" is swapping out the old abstraction layer for the new one (500–600 LOC) then `s/blade/wgpu/g`. There's a little more to it than that, but not much.

I'm already a Zed user, but to me that's an extremely good indicator of the engineering quality of the project.


I’m not really sure the point you’re trying to make behind “as long as you don’t mind dying early and painfully from easily preventable diseases technically you can live in utopia”. Would you mind clarifying your position here?

> the pre-industrial utopia has been created

When chess engines were first developed, they were strictly worse than the best humans. After many years of development, they became helpful to even the best humans even though they were still beatable (1985–1997). Eventually they caught up and surpassed humans but the combination of human and computer was better than either alone (~1997–2007). Since then, humans have been more or less obsoleted in the game of chess.

Five years ago we were at Stage 1 with LLMs with regard to knowledge work. A few years later we hit Stage 2. We are currently somewhere between Stage 2 and Stage 3 for an extremely high percentage of knowledge work. Stage 4 will come, and I would wager it's sooner rather than later.


There's a major difference between chess and scientific research: setting the objectives is itself part of the work.

In chess, there's a clear goal: beat the game according to this set of unambiguous rules.

In science, the goals are much more diffuse, and setting those in the first place is what makes a scientist more or less successful, not so much technical ability. It's a very hierarchical field where permanent researchers direct staff (postdocs, research scientists/engineers), who in turn direct grad students. And it's at the bottom of the pyramid where technical ability is the most relevant/rewarded.

Research is very much a social game, and I think replacing it with something run by LLMs (or other automatic process) is much more than a technical challenge.


The evolution was also interesting: first the engines were amazing tactically but pretty bad strategically, so humans could guide them. With the new NN-based engines it flipped: they were amazing strategically but sucked tactically (the first versions of Leela Chess Zero). Today they have closed the gap and are amazing at both strategy and tactics, and there is nothing humans can contribute anymore - all that is left is to watch and learn.

With a chess engine, you could ask any practitioner in the 90's what it would take to achieve "Stage 4" and they could estimate it quite accurately as a function of FLOPs and memory bandwidth. It's worth keeping in mind just how little we understand about LLM capability scaling. Ask 10 different AI researchers when we will get to Stage 4 for something like programming and you'll get wild guesses or an honest "we don't know".

That is not what happened with chess engines. We didn’t just throw better hardware at it, we found new algorithms, improved the accuracy and performance of our position evaluation functions, discovered more efficient data structures, etc.

People have been downplaying LLMs since the first AI-generated buzzword garbage scientific paper made its way past peer review and into publication. And yet they keep getting better and better to the point where people are quite literally building projects with shockingly little human supervision.

By all means, keep betting against them.


Chess grandmasters are living proof that it's possible to reach grandmaster level in chess on 20W of compute. We've got orders of magnitude of optimizations left to discover in LLMs and/or future architectures, both software and hardware. With the amount of progress we're getting basically every month, those ten people will answer 'we don't know, but it won't be too long'. Of course they may be wrong, but the trend line is clear; Moore's law faced similar issues, and they were successively overcome for half a century.

IOW respect the trend line.


And their predictions about Go were wrong, because they thought the algorithm would forever be α-β pruning with a weak value heuristic.

> With a chess engine, you could ask any practitioner in the 90's what it would take to achieve "Stage 4" and they could estimate it quite accurately as a function of FLOPs and memory bandwidth.

And the same practitioners said right after Deep Blue that Go was NEVER gonna happen. Too large. The search space is just not computable. We'll never do it. And yeeeet...


so we are going back to physical labor then

We are already at Stage 3 for software development and arguably Stage 4

We are at level 2.5 for software development, IMO. There is a clear skill gap between experienced humans and LLMs when it comes to writing maintainable, robust, concise and performant code and balancing those concerns.

The LLMs are very fast but the code they generate is low quality. Their comprehension of the code is usually good, but sometimes they have a weightfart and miss some obvious detail and need to be put back on the right path. This makes them good for non-experienced humans who want to write code and for experienced humans who want to save time on easy tasks.


> The LLMs are very fast but the code they generate is low quality.

I think the latest generation of LLMs with Claude Code does not produce low-quality code. It's better than the code that pretty much every dev on our team can produce, outside of very narrow edge cases.


Google built ten different chat products, how did that go?

Does it matter? Microsoft won by default with Teams because it actually turns out no one cares about chat or even has a choice in it: employees use whatever the company picks.

No one uses Teams for personal use. LLMs are used daily for personal use by hundreds of millions of people at this point.

It's bundled with Office, and no serious business can live without Excel.

The world, other than the US, runs on WhatsApp. Business, support and payments are done there. So people do care.

If you're going to say "other than the US" then you've got to say at a minimum "other than the US and China", but really "other than the US and China and Japan and Korea and Taiwan and Thailand and Russia and most of Central Asia".

Only mentioning the US is wildly americentric even by HN standards.


Gosh doesn't that sound familiar.

That was 2024. I wonder what could possibly be different about today, in 2026?


And 20 years ago people were making the exact same kinds of comments and everyone had the same reaction: yeah, MySQL has been putting numbers up like that for a decade.


Even more complications for a “why can’t they just…”. It’s almost as if this kind of thing is difficult to do in practice.


Absolutely every aspect of it?

What’s so hard about adding a feature that effectively makes a single-user device multi-user? Which needs the ability to have plausible deniability for the existence of those other users? Which means that significant amounts of otherwise usable space needs to be inaccessibly set aside for those other users on every device—to retain plausible deniability—despite an insignificant fraction of customers using such a feature?

What could be hard about that?


> despite an insignificant fraction of customers using such a feature?

Isn't that the exact same argument against Lockdown mode? The point isn't that the number of users is small it's that it can significantly help that small set of users, something that Apple clearly does care about.


Lockdown mode costs ~nothing for devices that don't have it enabled. GP is pointing out that the straightforward way to implement this feature would not have that same property.


Lockdown mode doesn’t require everyone else to lose large amounts of usable space on their own devices in order for you to have plausible deniability.


now I want to know what dirty laundry their upper management is hiding on their devices...


The 'extra users' method may not work in the face of a network investigation or typical file forensics.

Where CAs are concerned, not having the phone image 'cracked' still does not make it safe to use.


Android phones are multi-user, so if they can do it then Apple should be able to.


And how do you explain your 1TB phone that has 2GB of data, but only 700GB free?


The "fake" user profile should work like a duress PIN with the addition of deniability. As soon as you log in to the second profile, all the space becomes free: just by logging in you delete the encryption key of the other profile. The metadata showing what is free or not was encrypted in the locked profile; now it's gone.


Good idea, but this is why you image devices.


Sorry I explained it poorly and emphasized the wrong thing.

The way it would work is not active destruction of data, just a different view of the data that doesn’t include any metadata that is encrypted in the second profile.

Data would get overwritten only if you actually start using the fallback profile and populating the "free" space because to that profile all the data blocks are simply unreserved and look like random data.

The profiles basically overlap on the device. If you would try to use them concurrently that would be catastrophic but that is intended because you know not to use the fallback profile, but that information is only in your head and doesn’t get left on the device to be discovered by forensic analysis.

Your main profile knows to avoid overwriting the fallback profile’s data but not the other way around.

But also the point is you can actually log in to the duress profile and use it normally and it wouldn’t look like destruction of evidence which is what current GrapheneOS’s duress pin does.


The main point is that logging in to the fake profile does not do anything different from logging in to the main profile. Suppose you image the whole thing and somehow completely bypass the secure enclave (but let's assume you still can't feasibly brute-force the PIN), then enter the duress PIN in a controlled environment and watch what writes/reads it does and where: even then you would not be able to tell you are in the fake profile. Nothing gets deleted eagerly; just the act of logging in is destructive to overlapping profiles. The only thing different about the main profile is that it knows which data belongs to the fallback profile and will not allocate anything in those blocks. However, it's possible to set up the device without a fallback profile, so you can't know whether you are in the fallback profile or just on a device without one set up.

Hopefully I explained it clearly. I haven't seen this idea anywhere else so I would be curious if someone smarter actually tried something like that already.
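The overlapping-profile idea above can be sketched in a few lines of Python. This is purely a toy model of the scheme the comment describes, not any real OS implementation; the block sizes, class names, and `allocate` helper are all hypothetical.

```python
import os

BLOCKS = 16  # toy disk of 16 blocks

class Disk:
    def __init__(self):
        # Every block starts as random bytes: free space and
        # encrypted data are indistinguishable to forensics.
        self.blocks = [os.urandom(4) for _ in range(BLOCKS)]

class Profile:
    """A profile only knows about blocks listed in its own
    (encrypted) metadata; everything else looks like free space."""
    def __init__(self, disk, reserved_for_other=frozenset()):
        self.disk = disk
        self.owned = set()
        # The main profile's metadata also records the fallback
        # profile's blocks so it never allocates over them. The
        # fallback profile has no such list, so to it those same
        # blocks are simply "free".
        self.reserved = set(reserved_for_other)

    def allocate(self, data):
        for i in range(BLOCKS):
            if i not in self.owned and i not in self.reserved:
                self.disk.blocks[i] = data
                self.owned.add(i)
                return i
        raise RuntimeError("disk full")

disk = Disk()

# The fallback (duress) profile holds benign-looking data.
fallback = Profile(disk)
decoy_block = fallback.allocate(b"msn!")

# The main profile knows to steer around the fallback's blocks.
main = Profile(disk, reserved_for_other={decoy_block})
secret_block = main.allocate(b"key1")

# Using the fallback profile under duress looks like normal use,
# but it cannot see the main profile's blocks, so new writes can
# land right on top of them, silently destroying the secret data.
overwritten = fallback.allocate(b"pics")
assert overwritten == secret_block
```

The asymmetry is the whole trick: only the main profile's (encrypted, inaccessible) metadata records the overlap, so logging in to the fallback profile performs no special deletion step that an investigator imaging the device could distinguish from ordinary use.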


What you say makes sense, just like the TrueCrypt/VeraCrypt hidden volume theory. I can't find the head post to my "that's why you image" post, but what concerns me is that differing profiles may have different network fingerprints. You may need to keep Signal and BitLocker on both; EVERY TIME my desktop boots, a cloud provider is contacted -- it's not very sanitary?

It's a hard problem to properly set up even on the user end, let alone the developer/engineer side, but thank you.


The same way when you buy a brand new phone with 200GB of storage that only has 50GB free on it haha


System files officer ;)


"Idunno copper, I'm a journalist not a geek"


That is about one fiftieth of the work that needs to go into the feature the OP casually “why can’t they just”-ed.


This is called whataboutism. This particular feature aside, sometimes there are very good reasons not to throw the kitchen sink of features at users.


TrueCrypt had that a decade+ ago.


Not sure if you know the history behind it, but look up Paul Le Roux.

Also, I would recommend the book The Mastermind by Evan Ratliff.


imo Paul Le Roux has nothing to do with TrueCrypt


He wrote the code base that it is based on, in combination with code he stole. The name is also based on an early name he chose for the software.

Whether he was involved in the organization and participated in it, is certainly up for debate, but it's not like he would admit it.

https://en.wikipedia.org/wiki/E4M


Maybe one PIN could cause the device to crash. Devices crash all the time. Maybe the storage is corrupted. It might have even been damaged when it was taken.

This could even be a developer feature accidentally left enabled.


It doesn't seem fundamentally different from a PC having multiple logins that are accessed from different passwords. Hasn't this been a solved problem for decades?


Apple's hardware business model incentivizes only supporting one user per device.

Android has supported multiple users per device for years now.


You can have a multiuser system but that doesn't solve this particular issue. If they log in to what you claim to be your primary account and see browser history that shows you went to msn.com 3 months ago, they aren't going to believe it's the primary account.


My browser history is cleared every time I close it.

It's actually annoying because every site wants to "remember" the browser information, and so I end up with hundreds of browsers "logged in". Or maybe my account was hacked and that's why there's hundreds of browsers logged in.


Multi-user has been solved for decades.

Multi-user that plausibly looks like single-user to three letter agencies?

Not even close.


Doesn't having standard multi-user functionality automatically create the plausible deniability? If they tried so hard to create an artificial plausible deniability that would be more suspicious than normal functionality that just gets used sometimes.


What needs to be plausibly denied is the existence of a second user account, because you're not going to be able to plausibly deny that the account belongs to you when it resides on the phone found in your pocket.


Android has work profiles, so that could be done in Android. iPhone still does not.


> Android has work profiles

Never ever use your personal phone for work things, and vice versa. It's bad for you and bad for the company you work for in dozens of ways.

Even when I owned my own company, I had separate phones. There's just too much legal liability and chances for things to go wrong when you do that. I'm surprised any company with more than five employees would even allow it.


What's the risk? On Android, the company can remotely nuke the work profile. The work profile has its own file system and apps. You can turn it off when you don't want work notifications.


you're surprised corporations are cheap?


Police ask: give me the password for the work profile. If you don’t: prison.


iPhone and macOS are basically the same product technically. The reason the iPhone is a single-user product is UX decisions and business/product philosophy, not technical reasons.

While plausible deniability may be hard to develop, it’s not some particularly arcane thing. The primary reason against it is the political balancing act Apple has to perform (remember San Bernardino and the trouble the US government tried to create for Apple?). Secondary reasons are cost to develop vs. addressable market, but they did introduce Lockdown mode, so it’s not unprecedented for them to improve security for those particularly sensitive to such issues.


> iPhone and macOS are basically the same product technically

This seems hard to justify. They share a lot of code yes, but many many things are different (meaningfully so, from the perspective of both app developers and users)


You think iPhones aren’t multi-user for technical reasons? You sure it’s not to sell more phones and iPads? Should we ask Tim “buy your mom an iPhone” Cook?


The TL;DR is they radiate it into space via large, high surface area arms that stick out of the station.

