I really wish we would stop calling it technical debt. Every team/org I’ve worked in with “tech debt” issues has had very tactical problems that could have been communicated, invested in, and solved. But instead the org talked about “tech debt”: an immeasurable boogeyman that nobody outside the engineering org has any grasp of. Most importantly, the management chain’s mental model of the investment/pay-off gets weaker the further up toward the C-suite you go.
Teams saying “tech debt” are perpetually underfunded and underappreciated.
Instead, speak a language your management chain understands.
* These specific services have outgrown their architecture, and back pressure keeps outgrowing their current scale. We need to invest in a more reactive architecture. It’s going to cost 3 teams 1Q and we will prevent N outages based on historical data.
* In 2023, engineers fat-fingered the deployment of these services N times, causing various levels of service outages. One made the news. We need to invest in guardrails in our CI/CD to prevent that. It’ll cost one team 2Q and we will prevent N outages.
* We had 4 employees across our engineering org quit last quarter because holding the pager burned them out. We need to stand up a tiger team that can help kick our metrics into shape.
Speak a language your management understands. Speak in terms of delivering features (feature velocity), reliability (outages), employee retention, hiring through resume-driven development, etc.
You’ll find you’re negotiating in a positive-sum game if you do this. You give me 1 unit of investment for this problem this quarter and I’ll give you 1.3 units of return next quarter. Maybe there are greater returns elsewhere, so your bid isn’t competitive, and that is okay. Or maybe your management will invest in you and you just signed yourself up to deliver 1.3 units. But don’t handwave and ask for budget.
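To make the "1 unit in, 1.3 units out" framing concrete, here is a rough sketch of how the first two bullets could be put on the same scale. Everything numeric is a placeholder I made up for illustration: the outage counts stand in for the "N" above, and `cost_per_outage` (what one outage costs you, expressed in team-quarters) is an assumption you would have to estimate from your own incident history. Only the 3 team-quarters and 2 team-quarters come from the bullets.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    name: str
    team_quarters: float      # investment asked for: teams x quarters
    outages_prevented: float  # placeholder for "N", estimated from history
    cost_per_outage: float    # assumed cost of one outage, in team-quarters

    def expected_return(self) -> float:
        # Total cost avoided if the estimate holds.
        return self.outages_prevented * self.cost_per_outage

    def return_per_unit(self) -> float:
        # Units returned for each unit of investment.
        return self.expected_return() / self.team_quarters


def rank(proposals):
    # Highest return per unit invested first.
    return sorted(proposals, key=lambda p: p.return_per_unit(), reverse=True)


# Hypothetical numbers, not real data.
proposals = [
    Proposal("CI/CD guardrails", team_quarters=2.0,
             outages_prevented=4, cost_per_outage=0.5),
    Proposal("reactive architecture", team_quarters=3.0,
             outages_prevented=6, cost_per_outage=0.65),
]

for p in rank(proposals):
    print(f"{p.name}: {p.return_per_unit():.2f} units back per unit invested")
```

The point is not the arithmetic, which is trivial, but that each ask ends up as a single number your management can compare against whatever else is competing for the same budget.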
But here's a crucial pattern in your proposed language (not sure if you noticed it): you have to let bad things happen first. You need N outages to happen. You need N people to quit.
I still believe it's the right thing to do. Humans suck at being objective. The moment we find a "flaw" in the architecture it becomes the most important thing in the world to fix it. Even if the "flaw" was there for 5 years and never caused an issue.
Sticking to objective signals (outages, quitting, bugs, etc.) is the only way to stay grounded in reality. But you have to let those signals happen first. More than that: they need to happen often enough to start forming a pattern. It's just the cost you have to accept, because the alternative is much worse.
E.g. it is impossible to invest in reliability, refactoring, or bug fixing right on time. You can either be too late or too early. Counterintuitively, being too late is almost always the better option. The reason is that there's a virtually unlimited number of improvements you can make too early.
That said, none of the engineering teams I worked with could accept that. I know I couldn't.
Yes! And if they don't care about velocity and reliability, don't tell them that. If they've been going on about hiring and employee retention, tell them how this tech debt thing is going to be such a huge change that you can turn it into a kickass conference talk and hire more 10x programmers, which is of more value to them than "I made the app more maintainable". They don't care much what your team does; just give them something they want to buy.
I had to laugh. You're (correctly) saying: "Describe it all as a positive additional software feature you're about to introduce." Quite right. Sell the sizzle, don't use the word "vegetable" when telling children what's for dinner. "Internal features" not "tech debt."
A software company is a software company, not a technical company.
So, in terms of Risk and Responsibility, discussions of Technical Debt don't sufficiently examine the nature of Risk (across-org complexity sources, wetware, workflows, market feedback cycles etc.). The concept also skews and pigeonholes the Responsibility of dealing with it to a small subset of people.
Not writing for your audience is a losing game. Your managers may have been fantastic developers once, but having answered the siren call of upper management, its language may be what they think and speak now.