Yep, that’s the same reason why it runs on a PowerPC750 chip. They cost $200k+ f...

velvetz · on March 3, 2021

Maybe a stupid question, but couldn’t they put something “cutting-edge” there, next to the existing system? If that fails, then whatever: the rest of the system is still functional as expected, but at least they would know what they would need to change in the future to be able to use the new technologies in that environment. Although I can imagine that even the smallest weight increase could cause issues, so that’s probably the reason.

wongarsu · on March 3, 2021

Ingenuity (the Mars helicopter) has a much more modern processor that isn't particularly radiation hardened, and instead just restarts within milliseconds on any fault (fast enough to recover if it happens in flight). If Ingenuity does well I wouldn't be surprised to see a similar design in a future rover.
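To make the "restart on any fault" idea concrete, here is a minimal software analogy in Python. It is only a sketch of the general pattern (detect a fault, throw away the state, start again), not how Ingenuity's flight computer actually works; `supervise`, `make_flaky_task`, and the simulated fault are all made up for illustration.

```python
def supervise(task, max_restarts=5):
    """Run task(); on any exception, restart it from scratch.

    Returns (result, number_of_restarts). Gives up and re-raises
    once max_restarts has been exceeded.
    """
    restarts = 0
    while True:
        try:
            return task(), restarts
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise


def make_flaky_task(failures_before_success):
    """Hypothetical task that faults a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def task():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("simulated fault")
        return "frame processed"

    return task


result, restarts = supervise(make_flaky_task(2))
```

The point of the pattern is that recovery time, not fault avoidance, is the design target: if a restart is fast compared to how quickly the vehicle gets into trouble, an occasional fault is tolerable.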

josho · on March 3, 2021

But now you’ve added weight and need to remove something else.

As a tangent, I remember a talk from Grady Booch where he put software cost in terms of mass that needed to be added to your rocket. So every new feature cost weight by requiring more chips and more fuel.

m4rtink · on March 3, 2021

Scott Manley mentioned in one of his recent videos about Perseverance that they have an FPGA on board that the rad-hardened PPC main CPU can offload tasks to, such as computer-vision-based navigation during landing.

Not sure what the long-term plan is for that hardware - I guess they can just do stuff more slowly on the main CPU if it gets fried by radiation. Or maybe, being an FPGA, they can just isolate the radiation-affected regions over time and use the rest?

reilly3000 · on March 3, 2021

You got me curious and I found a product listing with a rather amazing explanation of what they're doing for radiation-proofing.

Radiation Performance

> RTG4 FPGAs are immune to radiation-induced changes in configuration, due to the robustness of the flash cells used to connect and configure logic resources and routing tracks. No background scrubbing or reconfiguration of the FPGA is needed in order to mitigate changes in configuration due to radiation effects. Data errors, due to radiation, are mitigated by hardwired SEU resistant flip-flops in the logic cells and in the mathblocks. Single Error Correct Double Error Detect (SECDED) protection is optional for the embedded SRAM (LSRAM and uSRAM) and the DDR memory controllers. This means that if a one-bit error is detected, it will be corrected. Errors of more than one bit are detected only and not corrected. SECDED error signals are brought to the FPGA fabric to allow the user to monitor the status of these protected internal memories.

https://www.microsemi.com/product-directory/rad-tolerant-fpg...
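For anyone wondering what "Single Error Correct Double Error Detect" means in practice, here is a toy sketch in Python using an extended Hamming(8,4) code: 4 data bits, 3 Hamming parity bits, and 1 overall parity bit. This is an assumption-laden illustration of the general SECDED idea, not the code the RTG4 actually uses (real memories protect much wider words); `encode` and `decode` are names made up here.

```python
def encode(nibble: int) -> int:
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8  # index 0: overall parity; indices 1..7: Hamming(7,4)
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]  # covers positions 1, 3, 5, 7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]  # covers positions 2, 3, 6, 7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]  # covers positions 4, 5, 6, 7
    bits[0] = sum(bits[1:]) % 2            # makes the total parity even
    return sum(b << i for i, b in enumerate(bits))


def decode(word: int):
    """Return (data, status); status is 'ok', 'corrected' or 'double_error'."""
    bits = [(word >> i) & 1 for i in range(8)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    overall_odd = sum(bits) % 2 == 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        # Exactly one bit flipped; the syndrome is its position
        # (0 means the overall parity bit itself was hit).
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome but even overall parity: two bits flipped.
        return None, "double_error"
    data = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return data, status


word = encode(0b1011)
clean = decode(word)                          # no upset
one_flip = decode(word ^ (1 << 5))            # single-bit upset
two_flips = decode(word ^ (1 << 5) ^ (1 << 2))  # double-bit upset
```

A single flipped bit is located by the syndrome and repaired; two flipped bits leave the overall parity looking fine while the syndrome is non-zero, so the decoder can only flag the word as bad, which matches the "detected only and not corrected" wording in the quote.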

m4rtink · on March 3, 2021

Wow, a rad hard FPGA! Now that's something I didn't expect - thanks for digging this up! :)

pkaye · on March 3, 2021

I know the camera image compression happens on an x86 processor. Not sure if it is rad hardened. It would be okay if the processor is not doing critical work and can be power cycled after a rare SEU failure. The same goes for the helicopter, which uses a mobile phone processor.

Also some of the components are not rated for long term use. The microphones they installed will probably fail due to extreme temperature cycling. The helicopter is expected to last only 4 uses and anything beyond that is a bonus.

grishka · on March 3, 2021

They did this exact thing with the helicopter — it's using a Snapdragon SoC that runs Linux.

Cthulhu_ · on March 3, 2021

Especially since hardware is generally designed for speed and performance instead of reliability. Didn't NASA use a lot of 486es for a long time for this same reason?

But yeah. It's not a problem if a CPU gets buggy or dies in a datacenter with 10K similar ones, because of redundancy and a completely different approach to software workloads nowadays. I mean a lot of production systems barely even care what CPU they run on.

And ~200 MHz is still powerful enough for a LOT of applications.

darkr · on March 3, 2021

> Didn't NASA use a lot of 486es for a long time for this same reason?

They were also using Amigas until ~2006 for various duties, including launch control.

wiz21c · on March 3, 2021

Could you explain why such a processor is 1000x more expensive than a non-rugged one? I mean, considering the future of Earth, I'd certainly welcome a processor that could withstand a lifespan of decades (including the harsh "I'm going to unsolder it to put it in another device" environment).

TeMPOraL · on March 3, 2021

Part of it is low demand, so you don't benefit from economies of scale. Also the customers buying this type of equipment are able to bear such costs. They aren't gonna haggle over $100k on a critical component to a $3B mission.

dylan604 · on March 3, 2021

a $100k here, a $100k there, suddenly, we're talking real money.

reilly3000 · on March 3, 2021

Oh we're talking real money. JPL will destroy 78 production grade parachutes to find all the ways they can break. There is tremendous energy poured into eliminating risk, because launch is so expensive. As it becomes cheaper, perhaps these measures won't be as necessary.

derekp7 · on March 3, 2021

With SpaceX being able to launch a car out beyond the orbit of Mars, is their current capability being considered for future JPL missions? Or is everything waiting on Starship?

dylan604 · on March 5, 2021

Thankfully, SpaceX isn't waiting for anyone. They are in the front with a machete whacking down the limbs, leaving a trail for others to follow. Of course SpaceX is standing on the shoulders of those missions that led the way before, but let's be honest, after Gemini, Mercury, Apollo, Shuttle, everything else has been a bit stagnant. Not taking anything away from all of the probes and rover missions. Those are the only things that make NASA relevant. Finally, we have a space agency actively working on humans in space again.

TeMPOraL · on March 5, 2021

And it's not like people at JPL are clenching their fists and screaming, "Damn you, Shotwell! Begone, Mueller! SEC eat you, Musk!". NASA is fully on-board with what SpaceX is doing - cutting straight into and reversing the dreaded Space Mission Costs Spiral. They may be cautious now, but eventually, they'll just start putting scientific probes on SpaceX rockets too (and those of competitors). SLS notwithstanding, NASA is happy to outsource launches and focus on bleeding edge missions.

And if SpaceX blazes the trail? Expect more scientific missions to follow, as cheaper and more frequent launches - the Spiral put into reverse - cut mission costs across the board. NASA will be able to do more cool stuff with their budget. And so will everyone else.

pja · on March 3, 2021

IIRC it’s a custom process using wide-bandgap silicon (part of the whole rad-hardening thing) that no-one else needs or wants, so you have to pay the full costs of a fab run for a relatively small number of chips. Hence the cost per chip is huge.

_9vzr · on March 3, 2021

The low demand definitely leads to higher prices but the other aspect is all of the additional testing involved for every chip. There's standard testing at every level of the manufacturing process and there will be further environmental testing at the package level. They will even run radiation tests on the wafer to characterize any quirks.

m4rtink · on March 3, 2021

Hopefully as more space & deep space stuff gets going there will be more demand and the unit cost will go down thanks to economies of scale. Or alternative ways of computing are tried and found viable (massive shielding, standardized massive redundancy, etc.).

ben_w · on March 3, 2021

From this list, I assume the isotopic purity is the biggest contribution to the price, though I expect relative lack of mass production doesn’t help: https://en.m.wikipedia.org/wiki/Radiation_hardening

wiml · on March 3, 2021

For radiation-hardened semiconductors sometimes you use specialized processes like silicon-on-insulator. Even if you're using a more conventional process, you're probably using specialized upset-resistant gate designs and layout. And even your gate-level or HDL-level design might have changes to make it possible to detect and recover from some kinds of bitflips. It really affects the whole technology stack.
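One well-known way to "detect and recover from some kinds of bitflips" at the design level is triple modular redundancy (TMR): keep three copies of a value and take a bitwise majority vote. The Python sketch below is just an illustration of that voting logic, under the assumption that upsets hit the copies independently; it is not taken from the RAD750 or any specific design, and `majority_vote` is a made-up name.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three copies: each output bit is whatever
    at least two of the three inputs agree on."""
    return (a & b) | (a & c) | (b & c)


stored = 0b10110010

# One copy takes a single-event upset in bit 3; the other two outvote it.
recovered_one_hit = majority_vote(stored, stored ^ 0b00001000, stored)

# Two copies are hit, but in different bit positions: still recoverable,
# because every individual bit keeps a 2-of-3 majority.
recovered_two_hits = majority_vote(stored ^ 0b00000001, stored ^ 0b10000000, stored)

# The same bit flipped in two copies defeats the vote.
defeated = majority_vote(stored ^ 0b00000100, stored ^ 0b00000100, stored)
```

In hardware the voter is a few gates per flip-flop, which is part of why this kind of hardening costs area and power on top of everything else.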

WarOnPrivacy · on March 3, 2021

Specs: https://en.wikipedia.org/wiki/RAD750