It is great to live in an ideal world, and in fact, in most software, you can do what you are suggesting quite cheaply. But once you get past the sort of "quip on hacker-news" level of thinking about this, or trivial and cheap production testing scenarios, people have to make real tradeoffs because it's never that simple.
Talking about those is much more interesting than just asserting that everything should be a certain way, without any consideration for real world constraints, like cost of units, etc.
It would be interesting if you could elaborate on what you mean by real-world constraints and how the cost of units affects what I wrote, which was based on real-world experience.
I'm not defensive, I just find your extreme position remarkably silly.
You included nothing about how costs affect anything. You simply assert that you should always test prod on pristine production units.
There are plenty of times outside of software where production units cost millions or you can only produce them so quickly, or both, and where your extreme take would result in remarkable cost or a competitor eating your lunch.
Which is precisely why it's not done, and in the real world tradeoffs are made between what really needs 100% assurance and what does not. Spending money or losing customers to get five nines of reliability through testing when two are needed is not a best practice, and is often explicitly called out as such.
In the case of Rivian, maintaining a significant fleet of expensive, pristine, exact-customer-spec (i.e. not debuggable) cars just to try to get 100% prod OTA success assurance is unlikely to provide more value than settling for 98% assurance without such a fleet (by rough calculation, it stands at 98% after this incident).
My position is neither silly nor extreme. It's the way it is usually done and other comments here have been along the same lines.
In fact you are trying to spin what I wrote to an extreme to make your point.
By the way, it is not about 100% success assurance but about assurance that a failure does not brick the unit. That assurance should be, and can be, close to 100%, indeed a good number of nines, because obviously you cannot brick 2 cars out of 100 with every software upgrade!