Yep, that’s the same reason it runs on a PowerPC 750 chip. They cost $200k+ for under 200 MHz on a 150-250 nm process, but can withstand up to 1,000,000 rads of radiation and function within extreme temperature ranges for decades. They have been proven to work in the harshest environments on prior missions, and the risk of an unproven processor is huge; a CPU failure could jeopardize the mission just as much as a crash landing. Design constraints are so very different when you shoot something 140M miles away!
Maybe a stupid question, but couldn’t they put something “cutting-edge” in there next to the existing system? If that fails, then whatever, the rest of the system is still functional as expected, but at least they would learn what needs to change before the new technologies can be used in that environment in the future.
Although I can imagine that even the smallest weight increase could cause issues, so probably that’s the reason.
Ingenuity (the Mars helicopter) has a much more modern processor that isn't particularly radiation hardened, and instead just restarts within milliseconds on any fault (fast enough to recover if it happens in flight). If Ingenuity does well I wouldn't be surprised to see a similar design in a future rover.
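To make the "just restart on any fault" idea concrete, here is a minimal sketch of a supervisor that retries a step from the last good checkpoint. This is purely illustrative and not Ingenuity's actual flight software; `run_with_restart`, `make_flaky_step`, and the `altitude` state are all made-up names, and a real system would do this with hardware watchdogs rather than Python exceptions.

```python
def run_with_restart(step, initial_state, n_steps, max_restarts=3):
    """Run `step` n_steps times, retrying from the last good checkpoint on a fault."""
    checkpoint = dict(initial_state)  # last known-good state
    restarts = 0
    i = 0
    while i < n_steps:
        try:
            # Work on a copy so a fault cannot corrupt the checkpoint.
            new_state = step(dict(checkpoint), i)
        except RuntimeError:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up: too many faults
            continue  # "restart": retry the same step from the checkpoint
        checkpoint = new_state
        i += 1
    return checkpoint, restarts


def make_flaky_step(fail_at):
    """Hypothetical workload that faults once at each step index in `fail_at`."""
    pending = set(fail_at)

    def step(state, i):
        if i in pending:
            pending.discard(i)
            raise RuntimeError("simulated radiation upset")
        state["altitude"] += 1
        return state

    return step


final, restarts = run_with_restart(make_flaky_step({2}), {"altitude": 0}, 5)
print(final, restarts)  # {'altitude': 5} 1
```

The point is only that if recovery is fast and state is checkpointed, a transient fault costs you one retry instead of the mission.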
But now you’ve added weight and need to remove something else.
As a tangent, I remember a talk from Grady Booch where he put software cost in terms of mass that needed to be added to your rocket. So every new feature cost weight by requiring more chips and more fuel.
Scott Manley mentioned in one of his recent videos about Perseverance that they have an FPGA on board that the rad-hardened PPC main CPU can offload tasks to, such as computer-vision-based navigation during landing.
Not sure what the long-term plan is for that hardware. I guess they can just do stuff more slowly on the main CPU if it gets fried by radiation. Or maybe, being an FPGA, they can just isolate the radiation-affected regions over time and use the rest?
You got me curious and I found a product listing with a rather amazing explanation of what they're doing for radiation-proofing.
Radiation Performance
> RTG4 FPGAs are immune to radiation-induced changes in configuration, due to the robustness of the flash cells used to connect and configure logic resources and routing tracks. No background scrubbing or reconfiguration of the FPGA is needed in order to mitigate changes in configuration due to radiation effects. Data errors, due to radiation, are mitigated by hardwired SEU resistant flip-flops in the logic cells and in the mathblocks. Single Error Correct Double Error Detect (SECDED) protection is optional for the embedded SRAM (LSRAM and uSRAM) and the DDR memory controllers. This means that if a one-bit error is detected, it will be corrected. Errors of more than one bit are detected only and not corrected. SECDED error signals are brought to the FPGA fabric to allow the user to monitor the status of these protected internal memories.
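For anyone wondering what SECDED looks like in practice, here is a toy sketch using an extended Hamming (8,4) code: 4 data bits, 3 Hamming parity bits, and 1 overall parity bit. This is the textbook construction, not necessarily what the RTG4 implements (real memories use much wider words), and `encode`/`decode` are just names I picked.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    code = [0] * 8  # index 0 = overall parity, 1..7 = Hamming(7,4) positions
    # Data bits go in the non-power-of-two positions.
    for pos, i in zip((3, 5, 6, 7), range(4)):
        code[pos] = (nibble >> i) & 1
    # Each parity bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Overall parity makes the total number of 1s even.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (nibble, status): status is 'ok', 'corrected', or 'double_error'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    overall_bad = sum(code) % 2 == 1
    if syndrome == 0 and not overall_bad:
        status = "ok"
    elif overall_bad:
        # Odd number of flips: assume one, and the syndrome says where
        # (syndrome 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Syndrome is nonzero but overall parity looks fine: two bits flipped.
        return None, "double_error"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
word[5] ^= 1          # simulate a single-event upset
print(decode(word))   # (11, 'corrected')
word[2] ^= 1          # a second upset in the same word
print(decode(word))   # (None, 'double_error')
```

The overall parity bit is what buys the "double error detect" part: plain Hamming(7,4) would silently miscorrect a two-bit error.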
I know the camera image compression happens on an x86 processor. Not sure if it is rad hardened. It would be okay if the processor is not doing critical work and can be power cycled after a rare SEU failure. The same goes for the helicopter, which uses a mobile phone processor.
Also some of the components are not rated for long term use. The microphones they installed will probably fail due to extreme temperature cycling. The helicopter is expected to last only 4 uses and anything beyond that is a bonus.
Especially since hardware is generally designed for speed and performance instead of reliability. Didn't NASA use a lot of 486es for a long time for this same reason?
But yeah. It's not a problem if a CPU gets buggy or dies in a datacenter with 10K similar ones, because of redundancy and a completely different approach to software workloads nowadays. I mean a lot of production systems barely even care what CPU they run on.
And ~200 MHz is still powerful enough for a LOT of applications.
Could you explain why such a processor is 1000x more expensive than a non-rugged one? I mean, considering the future of earth, I'd certainly welcome a processor that could withstand a lifespan of decades (including the harsh "I'm going to unsolder it to put it in another device" environment).
Part of it is low demand, so you don't benefit from economies of scale. Also the customers buying this type of equipment are able to bear such costs. They aren't gonna haggle over $100k on a critical component to a $3B mission.
Oh we're talking real money. JPL will destroy 78 production grade parachutes to find all the ways they can break. There is tremendous energy poured into eliminating risk, because launch is so expensive. As it becomes cheaper, perhaps these measures won't be as necessary.
With SpaceX being able to launch a car out beyond the orbit of Mars, is their current capability being considered for future JPL missions? Or is everything waiting on Starship?
Thankfully, SpaceX isn't waiting for anyone. They are in the front with a machete whacking down the limbs, leaving a trail for others to follow. Of course SpaceX is standing on the shoulders of those missions that led the way before, but let's be honest, after Gemini, Mercury, Apollo, Shuttle, everything else has been a bit stagnant. Not taking anything away from all of the probes and rover missions. Those are the only things that make NASA relevant. Finally, we have a space agency actively working on humans in space again.
And it's not like people at JPL are clenching their fists and screaming, "Damn you, Shotwell! Begone, Mueller! SEC eat you, Musk!". NASA is fully on board with what SpaceX is doing - cutting straight into and reversing the dreaded Space Mission Costs Spiral. They may be cautious now, but eventually, they'll just start putting scientific probes on SpaceX rockets too (and those of competitors). SLS notwithstanding, NASA is happy to outsource launches and focus on bleeding edge missions.
And if SpaceX blazes the trail? Expect more scientific missions to follow, as cheaper and more frequent launches - the Spiral put into reverse - cut mission costs across the board. NASA will be able to do more cool stuff with their budget. And so will everyone else.
IIRC it’s a custom process using wide-bandgap silicon (part of the whole rad-hardening thing) that no-one else needs or wants, so you have to pay the full costs of a fab run for a relatively small number of chips. Hence the cost per chip is huge.
The low demand definitely leads to higher prices but the other aspect is all of the additional testing involved for every chip. There's standard testing at every level of the manufacturing process and there will be further environmental testing at the package level. They will even run radiation tests on the wafer to characterize any quirks.
Hopefully as more space & deep space stuff gets going there will be more demand and the unit cost will go down thanks to economies of scale. Or alternative ways of computing will be tried and found viable (massive shielding, standardized massive redundancy, etc.).
For radiation-hardened semiconductors sometimes you use specialized processes like silicon-on-insulator. Even if you're using a more conventional process, you're probably using specialized upset-resistant gate designs and layout. And even your gate-level or HDL-level design might have changes to make it possible to detect and recover from some kinds of bitflips. It really affects the whole technology stack.
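As one example of what "detect and recover from bitflips" can mean at the design level, a commonly used pattern is triple modular redundancy (TMR): keep three copies of a value and take a bitwise majority vote. The sketch below is a software toy of that general idea, not a claim about how any specific rad-hard chip is built; `vote` and `scrub` are names made up for illustration.

```python
def vote(a, b, c):
    """Bitwise majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def scrub(copies):
    """Vote across three copies, then rewrite all of them with the result.

    Returns (voted_value, repaired_copies, upset_detected).
    """
    a, b, c = copies
    good = vote(a, b, c)
    upset_detected = not (a == b == c)
    return good, [good, good, good], upset_detected


regs = [0b1010, 0b1010, 0b1010]
regs[2] ^= 0b1000            # simulate a bitflip in one copy
value, regs, hit = scrub(regs)
print(bin(value), hit)       # 0b1010 True
```

Any single corrupted copy gets outvoted, and flips in different bit positions of different copies are also masked. It only fails when two copies are wrong in the same bit, which is why the periodic rewrite (scrubbing) matters: it stops upsets from accumulating.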