99% is absolutely reasonable for one layer of defense among many! That is one of the best methods to achieve truly high reliability, as is needed in this case: stack many reliable systems in such a way that they all must fail to get an overall failure. It is not perfect, of course, and things can always cascade, but it is a powerful technique.
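To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes the layers fail independently, which is exactly the assumption that cascading failures break, so treat the result as a best case. The function name and the 99% figures are illustrative only, not taken from any real system.

```python
def combined_failure_probability(layer_reliabilities):
    """Probability that every layer fails at once.

    Assumes failures are independent (hypothetical best case);
    correlated or cascading failures make the real number worse.
    """
    p_fail = 1.0
    for reliability in layer_reliabilities:
        # Each layer fails with probability (1 - reliability).
        p_fail *= 1.0 - reliability
    return p_fail


# Stack 1 to 4 layers that are each 99% reliable.
for n in (1, 2, 3, 4):
    p = combined_failure_probability([0.99] * n)
    print(f"{n} layer(s): overall failure probability ~ {p:.0e}")
```

Under the independence assumption, each extra 99% layer cuts the overall failure probability by another factor of 100.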
I'm guessing you've never worked in system-critical infrastructure? Airplanes are another level up from that. I'm not the one saying this, the FAA and NTSB are. Nothing is allowed to go wrong, ever.
I've been stuck at SFO for hours at least once a year because our flight had some mechanical issue. (Always on a Delta flight to ATL or MSP, don't know why.)
They did fix it and then eventually we took off. In a way that's "not going wrong". On the other hand, they didn't cancel it and send the plane to be disassembled for failure analysis. That'd certainly be safer.
Everything involved in aviation is designed to be extremely reliable, but parts are still expected to break. Airplanes have a list of parts which are allowed to be broken without grounding the airplane. Every part has a well-documented procedure for inspection, maintenance, and replacement.
Investigations happen when, despite following the documented procedures, stuff somehow still goes wrong. They are done to improve the procedures so that it can never happen again.
Inspection and maintenance are a significant source of errors, though. Nobody wants to disassemble an entire plane when a single part develops a well-isolated failure.
Air Transat Flight 236 had its engine swapped out with a spare during routine maintenance. However, the engines had a different "patch level", which led to a hydraulic hose of the wrong length being installed. This hose rubbed against the fuel line, causing it to develop a leak. The subsequent flight ran out of fuel halfway over the Atlantic, and they narrowly avoided having to ditch it into the ocean.
American Airlines Flight 4439 was flown on an aircraft with a faulty trim switch. Prior to the flight, maintenance engineers wanted to replace it, but this was cancelled mid-process due to the time required to acquire a replacement part. They re-installed the switch and marked it as inoperable, which is not an issue because, despite the switch being safety-critical, there are two other trim switches available. However, the faulty trim switch was reinstalled backwards, and the pilot still tried to use it due to muscle memory. This nearly led to a pilot-induced stall.
There are literally dozens of stories like that. In aviation, there is no room for error.
The fact they didn’t take off with the mechanical issue should be a point in their favor, not against them.
More accurately, though, safety-critical things are not allowed to go wrong. If they do, they get investigated. What is and isn’t deemed safety-critical is a document written in blood, unfortunately.