> Sadly, the effect of this is that there's less incentive to build resilient systems, since there's always someone available to fix things quickly.
I've found the opposite, as long as the developer who wrote the system also owns running it. It doesn't take many late-night or weekend calls before the developer makes it more resilient.
Selling this to the business can be easy, because even if someone is there to fix it, the system was presumably still down. This is why I push every developer to learn communication skills and rudimentary business skills. If the developer (or their manager) cannot communicate to the business why a more resilient system is better for the business, that's on them. This also requires developers to let go of the "I only want to do it perfectly to 999999s" mindset and, again, think about what the business can reasonably afford to do.
What they do is make it resilient to the failures that need multi-hour fixes, while still letting through the light issues that look like emergencies but can be fixed by running a bat file. That minimizes the work they have to do personally while still making them look like a hero.
Of course, this will fall apart once their manager catches on, but in some monolithic organizations that could take years.