> And we define service health purely using SLIs that reflect actual service usage (or ability to use the service). As such, we do not rely on Docker HEALTHCHECKs. We don't want to rely on a container measuring its own health
I'm not sure you fully grasp the issue or understand how Docker works. Docker's healthchecks are not a "container measuring its own health". They are a standard interface designed to let container orchestration services poll containers and check whether they are still in working order.
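To make the polling side concrete, here is a minimal sketch of how an external supervisor might read the health status Docker records for a container. The function names are my own invention, and it assumes the `docker` CLI is on the PATH. The only Docker-specific detail it relies on is that `docker inspect` reports the status (`starting`, `healthy`, or `unhealthy`) under `State.Health.Status` when the container defines a HEALTHCHECK.

```python
import json
import subprocess
from typing import Optional


def parse_health_status(inspect_json: str) -> Optional[str]:
    """Extract State.Health.Status from `docker inspect` JSON output.

    Returns None when no container is listed or when the container
    defines no HEALTHCHECK (Docker then omits the Health object).
    """
    containers = json.loads(inspect_json)
    if not containers:
        return None
    health = containers[0].get("State", {}).get("Health")
    return health.get("Status") if health else None


def container_health(name: str) -> Optional[str]:
    """Ask the Docker CLI for a container's current health status.

    `name` is a hypothetical container name or ID supplied by the caller.
    """
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True,
        text=True,
        check=True,  # raise if the container does not exist
    )
    return parse_health_status(result.stdout)
```

Something like `container_health("my-service")` (a made-up container name) could then be called from whatever loop or scheduler does the polling. The parsing is kept separate from the CLI call so it can be exercised without a running Docker daemon.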
From your own description, it sounds like you tried to reinvent the wheel, and did it poorly.
And I'm sorry to break it to you, but if you have developers faking health checks in production, then your choice of container runtime or container orchestration system is not the problem you need to worry about.