If it happens extremely rarely (like, once every 6 months) or it’s super transient and low impact, we kick it and move on.
If it happens a 3rd or 4th time, or the severity increases, we start to dig in and actually fix it.
So we’re not giving up and losing all diagnosis/bugfixing ability, just setting a threshold. There’ll always be issues, and some of them will always be mystery issues. You can’t solve everything, so you’ve got to triage appropriately.
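
As a rough sketch, that threshold could be written down something like this. The names `Issue`, `triage`, and `dig_in_after` are made up for illustration, and the default of 3 is just the "3rd or 4th time" rule of thumb, not a real tool or a hard number:

```python
from dataclasses import dataclass


@dataclass
class Issue:
    occurrences: int          # how many times we've seen it so far (hypothetical field)
    severity_increased: bool  # did it get worse than last time? (hypothetical field)


def triage(issue: Issue, dig_in_after: int = 3) -> str:
    """Return 'investigate' once an issue recurs enough or gets worse, else 'restart'."""
    # Assumption: recurrence count and rising severity are the only two signals.
    if issue.severity_increased or issue.occurrences >= dig_in_after:
        return "investigate"
    # Rare, transient, low impact: kick it and move on.
    return "restart"
```

A first-time blip comes back as `"restart"`, while the same blip on its third occurrence, or one that got worse, comes back as `"investigate"`.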