It is great to live in an ideal world, and in fact, in most software, you can do...

mytailorisrich · on Nov 15, 2023

I am not sure why you are being so defensive.

It would be interesting if you could elaborate on what you mean by real-world constraints and how the cost of units affects what I wrote, which I wrote from real-world experience.

DannyBee · on Nov 17, 2023

I'm not defensive, I just find your extreme position remarkably silly.

You included nothing about how costs affect anything. You simply assert that you should always test prod on pristine production units.

There are plenty of times outside of software where production units cost millions or you can only produce them so quickly, or both, and where your extreme take would result in remarkable cost or a competitor eating your lunch.

Which is precisely why it's not done, and in the real world tradeoffs are made between what really needs 100% assurance and what does not. Spending money or losing customers for five 9s of reliability through testing when two are needed is not a best practice, and is often explicitly called out as such.

In the case of Rivian, maintaining a significant fleet of expensive, pristine, exact-customer-spec (i.e. not debuggable) cars just to try to get 100% prod OTA success assurance is unlikely to provide value vs. getting 98% assurance and not doing that (by rough calculation, it stands at 98% after this incident).

mytailorisrich · on Nov 17, 2023

My position is neither silly nor extreme. It's the way it is usually done and other comments here have been along the same lines.

In fact you are trying to spin what I wrote to an extreme to make your point.

By the way, it is not about 100% success assurance but about assurance that a failure does not brick the unit. This is an assurance that should be, and can be, close to 100%, indeed a good number of 9s, because, obviously, you cannot brick 2 cars out of 100 for every software upgrade!