Simple and brief rules are more successful in practice than long and complicated rules.
I feel a briefer and more-to-the-point "When To Refactor" guide is to ask the following questions in the following order and only proceed when you can answer YES to every single question.
1. Do we have test coverage of the use-cases that are affected?
2. Are any non-trivial logic and business changes on the horizon for the code in question?
3. Has the code in question been undergoing multiple modifications in the last two/three/four weeks/months/years?
Honestly, if you answer NO to any of the questions above, you're in for a world of hurt and expense if you then proceed to refactor.
That last one might seem a bit of a reach, but the reality is that if there is some code in production that has been working unchanged for the last two years, you're wasting your time refactoring it.
More importantly, no changes over the last few years means that absolutely no one in the company has in-depth and current knowledge of how that code works, so a refactor is pointless because no one knows what the specific problems actually are.
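As a rough sketch of the checklist above, the three questions can be read as an ordered gate that stops at the first NO. The class, field, and function names here are hypothetical and exist only for illustration; the answers themselves still have to come from the team.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefactorCandidate:
    # Hypothetical yes/no answers to the three questions, in order.
    has_test_coverage: bool    # question 1
    changes_on_horizon: bool   # question 2
    recently_modified: bool    # question 3


def first_blocker(c: RefactorCandidate) -> Optional[str]:
    """Return the first question answered NO, or None if all are YES."""
    questions = [
        ("test coverage of affected use-cases", c.has_test_coverage),
        ("non-trivial changes on the horizon", c.changes_on_horizon),
        ("multiple recent modifications", c.recently_modified),
    ]
    for label, answer in questions:
        if not answer:
            return label
    return None


def should_refactor(c: RefactorCandidate) -> bool:
    # Proceed only when every question is answered YES.
    return first_blocker(c) is None
```

Returning the first failing question, rather than just a boolean, keeps the "in the following order" part of the rule visible.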
I'd argue with that. Small-scale "micro-refactoring" operations are safe almost by definition. [1]
It depends on your language. In a large Python system [2], all bets are off, but in Java, if you use your IDE to rename a method to something clearer, change a method's signature, or extract or inline a method, the risk of breaking anything is close to zero whether or not you have tests.
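One way to see why a rename is riskier in Python: a method can be looked up by a string built at runtime, so no tool can reliably find every caller. The sketch below is invented for illustration (ReportExporter and export are hypothetical names, not from any real codebase).

```python
class ReportExporter:
    def export_csv(self, rows):
        return "\n".join(",".join(str(v) for v in row) for row in rows)

    def export_tsv(self, rows):
        return "\n".join("\t".join(str(v) for v in row) for row in rows)


def export(exporter, fmt, rows):
    # The method name is assembled from a string at runtime, so a tool that
    # renames export_csv has no reliable way to know this call site exists.
    method = getattr(exporter, "export_" + fmt)
    return method(rows)
```

If export_csv were renamed to, say, write_csv, nothing would fail at import time; export(exporter, "csv", rows) would raise AttributeError only when that code path actually runs, which is exactly the case where test coverage matters.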
Personally I think there is no conflict between feature work and micro-refactorings. If micro-refactorings are helpful for a feature you're working on, jump to it!
Personally I've had the experience of being a Cassandra [3] when it comes to YAGNI; [4] maybe it is different for a junior dev but so often I go to a meeting and say "what if you decide you need to collect N phone numbers instead of 2?" and over the next few months there is a stream of tickets for 3, 4, 5 phone numbers until they finally make it N. Or a problem w/ the login process that is obvious to me becomes the subject of a panic two months later.
As such, I see the rule of three [5] for applying DRY as too conservative; it is way too common that the senior dev writes two cases and then the junior dev comes in and copies them 15 times. At least on the projects I've worked on (mainly web-oriented, but many involving 'intelligent systems', data science, etc.), people have made way too many excuses for why repeating themselves is good, and it has had awful consequences.
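To make the phone-number example concrete, here is a minimal sketch contrasting a copy-per-field shape with one that handles N numbers from the start. The field names and the validation rule are assumptions made up for this illustration.

```python
import re

# Hypothetical validation rule, just for illustration: optional "+", 7 to 15 digits.
_PHONE = re.compile(r"^\+?\d{7,15}$")


def is_valid_phone(number: str) -> bool:
    return bool(_PHONE.match(number))


# Copy-paste shape: every new phone field needs another near-identical branch.
def validate_contact_copy_paste(contact: dict) -> list:
    errors = []
    if contact.get("phone1") and not is_valid_phone(contact["phone1"]):
        errors.append("phone1 is invalid")
    if contact.get("phone2") and not is_valid_phone(contact["phone2"]):
        errors.append("phone2 is invalid")
    return errors


# N-number shape: the rule is written once and applies to any count.
def validate_contact(contact: dict) -> list:
    return [
        f"phone {i} is invalid"
        for i, number in enumerate(contact.get("phones", []), start=1)
        if not is_valid_phone(number)
    ]
```

With the second shape, the tickets for 3, 4, and 5 phone numbers need no code change in the validator at all.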
I'd extend #2 to any change, not just non-trivial. It's the classic Kent Beck tweet, "for any change, make the change easy (warning: this may be hard), then make the easy change"
Fixing my mental model of thinking about refactoring as a separate thing from "normal" development was key for me. Once I viewed refactoring as a thing that you do all the time as part of development, then I stopped even asking this question.
I somehow ended up doing the same. Probably as a result of too many failed heroic attempts at modifying read-only code in one big go.
Now any heroic modification of code comes in as a series of uncontroversial, isolated, and testably idempotent modifications, followed by a minimal change to business logic.
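A minimal sketch of that workflow, using an invented pricing function (every name and number here is hypothetical): step 1 is a pure extraction whose output can be checked against the old code, and step 2 is the small business change that the extraction made easy.

```python
# Before: the discount rule is buried inside the total calculation.
def order_total_before(prices_cents, is_member):
    total = sum(prices_cents)
    if is_member:
        total = total * 90 // 100
    return total


# Step 1 (refactor only): extract the rule. Output must match the old code.
def discount_percent(is_member):
    return 90 if is_member else 100


def order_total_step1(prices_cents, is_member):
    return sum(prices_cents) * discount_percent(is_member) // 100


# Step 2 (the easy change): the business rule now changes in one place.
def discount_percent_v2(is_member, is_sale_day=False):
    if is_member and is_sale_day:
        return 80
    return 90 if is_member else 100


def order_total_step2(prices_cents, is_member, is_sale_day=False):
    return sum(prices_cents) * discount_percent_v2(is_member, is_sale_day) // 100
```

Step 1 can be reviewed and merged on its own because the old and new functions can be compared on the same inputs; only step 2 needs a conversation about business rules.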
> More importantly, no changes over the last few years means that absolutely no one in the company has in-depth and current knowledge of how that code works, so a refactor is pointless because no one knows what the specific problems actually are.
That is the actual reason you might need to rewrite it.
If your critical system is written in an ancient script that nobody can understand, is no longer supported, and is a security risk, at some point it will simply stop working. And there will be nobody who can fix it.
Yes, a rewrite may be painful, but if you can no longer find the people to support the old thing, it may be necessary.
> If your critical system is written in an ancient script that nobody can understand, is no longer supported, and is a security risk, at some point it will simply stop working. And there will be nobody who can fix it.
> Yes, a rewrite may be painful, but if you can no longer find the people to support the old thing, it may be necessary.
I agree, but we weren't talking about rewrites, we were talking about refactors.
If you refactor some ancient old COBOL application, the result would be a new COBOL application.
If you rewrite some ancient old COBOL application, the result would most probably not be a new COBOL application.
> the reality is that if there is some code in production that has been working unchanged for the last two years, you're wasting your time refactoring it.
So much this. I recall watching people early in their careers who wanted to make their mark go after code like this; it was just a waste of time and more likely than not to blow up things downstream that they didn't understand. And sadly, I watched managers praise their tenacity rather than understand the explosions that were being created.
#2 Are there related changes on the horizon for the code being refactored?
I think more qualifications than that probably miss times you should refactor.
#3 Has the code been changing recently, OR have changes been delayed because modifying the unrefactored code is considered too difficult?
I've seen too many cases where unrefactored code is considered too dangerous or difficult to modify even with total test coverage. Refactoring is a necessary step towards self-documenting code in those cases.
One reason I might accept one or more NOs to your questions:
Does the refactor support pending work, which isn’t directly related to the refactored code, but benefits from the lessons learned and applied in the refactor… even in some indirect way?
This might be providing a clearer pattern you’ll apply to similar new functionality; or it might be providing a new abstraction or even eliminating a failed abstraction which sets that pending work on the right path.
Yeah. I did a (very small) refactor. It took, IIRC, four days. When I was done, I could write the new thing I was implementing in 10 new lines that used the newly-refactored existing code.
> Do we have test coverage of the use-cases that are affected?
With statically typed languages and good tooling like JetBrains ReSharper, there are guaranteed-safe automated large refactorings that can be done as long as someone isn’t using reflection.
I regularly make large changes to legacy code bases with many millions of lines and no unit tests. It can be done; it just requires a lot of work. The only factor that matters is whether we need to do this; the rest, not so much.