I've been trying to introduce this into my current company for as long as I've been working there, with no success so far, with varying reasons:
- Unit tests for existing projects often fail
- Unit tests for existing projects take several minutes to run
- Our current build process cannot be automated.
Quite frustrating, especially since I've seen a lot of cases where it could've saved us from near-disasters.
EDIT:
I think on the subject of Emergence, the writer has missed one key point: emergence requires interplay between different levels of the system.
> - Unit tests for existing projects often fail
> - Unit tests for existing projects take several minutes to run
> - Our current build process cannot be automated.
Same here. Except we have almost no unit tests, because:
1) Most of the devs don't know how to write unit tests
2) Most of the devs don't know how to write code that is testable, even if they could write the actual unit tests.
About a year or so ago we had a project that got a few unit tests added to it; within days they were broken and failing. The dev just added a [TestCategory("Blah")] and excluded them from being run...
>tests that run like shit off a shiny shovel, because everything is mocked, meaning that nothing is actually tested.
Can you expand on this a bit?
I've always worked under the impression that tests should be focused on testing a narrow slice of code. So if my SUT has some dependencies, those will be mocked with the expected result from the dependency, and possibly even with the expected parameters in the arguments. The asserts will check the result from the SUT, but also that the expected method on the mock was called. This way, I'm just testing the code for the SUT, nothing more.
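For concreteness, this is roughly what that style looks like in practice. A small sketch in Python (the thread spans C# and C++, but the shape is the same), with made-up names like PriceService and get_rate: the dependency is mocked with a canned result, and the asserts check both the SUT's output and the interaction with the mock.

    # Hypothetical example of the mock-heavy style described above.
    from unittest.mock import Mock

    class PriceService:                      # hypothetical SUT
        def __init__(self, rate_provider):
            self.rate_provider = rate_provider

        def price_in_eur(self, usd_amount):
            rate = self.rate_provider.get_rate("USD", "EUR")
            return round(usd_amount * rate, 2)

    def test_price_in_eur_uses_rate_provider():
        rate_provider = Mock()
        rate_provider.get_rate.return_value = 0.9   # canned "expected result from the dependency"

        sut = PriceService(rate_provider)
        assert sut.price_in_eur(10) == 9.0           # result from the SUT
        rate_provider.get_rate.assert_called_once_with("USD", "EUR")  # interaction check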
The problem is that bugs rarely occur in one unit. They occur in the interaction between multiple units. The smaller you choose to define what a "unit" is (because everybody has their own definition of what a unit test is), the more true this rule becomes. The extreme case of a small unit would be testing every single line individually, which obviously nobody has the resources to do and which tells you nothing about the complete system, yet still gives you 100% code coverage!
Your "expected result from dependency" might be volatile or hard to mock due to bugs, state, timing, configuration, unclear documentation, version upgrades or other factors inside the dependency. So when the system breaks while all unit tests are still passing you get this blame-game where one team is accusing the dependency for not behaving as they expect, when the truth is that the interface was never stable in the first place or was never meant to be used that way.
What you have to do is choose your ratio of system tests vs. unit tests. The scenario the GP describes is companies that spend 99% of their testing budget on unit tests and 1% on system tests, instead of a healthier 40-60.
Thanks. That makes a lot of sense. So while testing a given class, it may have some dependencies, but those may be external resources (a DB, an API, etc), or internal ones. It sounds like the recommendation is only to mock where those external dependencies lie, and leave the internal dependencies. Eventually, as you go down the chain, those internal dependencies will get to external ones (which will likely still need some sort of mock/fake/stub), but you're allowing more of the logic and interaction of the system to be tested, rather than just the logic in the one class that's directly being tested.
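If I've understood that right, the shape would be something like this sketch (Python for brevity, all names invented): only the external boundary gets a hand-rolled fake, while the internal collaborator is the real implementation, so the test actually exercises the interaction between the two units.

    class TaxCalculator:                     # internal dependency: use the real thing
        def add_tax(self, net):
            return round(net * 1.25, 2)

    class FakeRateApi:                       # fake only the external dependency
        def get_rate(self, src, dst):
            return 0.9

    class InvoiceService:
        def __init__(self, rate_api, tax_calculator):
            self.rate_api = rate_api
            self.tax = tax_calculator

        def total_in_eur(self, usd_net):
            eur_net = usd_net * self.rate_api.get_rate("USD", "EUR")
            return self.tax.add_tax(eur_net)

    def test_total_in_eur_covers_internal_interaction():
        sut = InvoiceService(FakeRateApi(), TaxCalculator())
        assert sut.total_in_eur(100) == 112.5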
I'm not the GP in question, but I have worked on code that I think fits this phrasing.
In that project's case, 99% of tests were mocks where the only thing being tested was whether or not the mocked function got called the expected number of times or with expected arguments.
So the many thousands of tests ran very quickly, and over 90% of the code was covered by tests; however, nothing was actually being functionally tested in those cases.
In other words, the tests ran like shit off a shiny shovel.
What then happens is that the mocked-out functions change what they return, or the order of what they do inside (e.g. a given set of inputs now throws a different exception). Someone forgets to update the mocks. All the tests continue to pass, even though none of the conditions they encode are actually possible in the program.
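A tiny illustration of that drift, sketched in Python with invented names: the real dependency now raises for unknown ids, but the mock still encodes the old "returns None" behaviour, so the test stays green for a code path that can no longer happen in the running program.

    from unittest.mock import Mock

    def real_lookup(user_id):
        # new behaviour after a refactor: raises instead of returning None
        raise KeyError(user_id)

    def greet(lookup, user_id):
        user = lookup(user_id)
        return f"Hello, {user}" if user else "Hello, stranger"

    def test_greet_unknown_user_still_green():
        lookup = Mock(return_value=None)    # stale assumption: "returns None when unknown"
        assert greet(lookup, 42) == "Hello, stranger"   # passes, although real_lookup would raise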
I'm old to development, and would like to ask for the same thing. But something beyond "Pragmatic Unit Testing".
You see, I've read that. I've read one other book on unit tests too, been on a 2-day-long training in TDD, and spent many hours trying to write unit tests, and yet the skill still eludes me. It's like I have a blind spot there, because I can't for the life of me figure out how to test most of the code I write.
In the projects I'm working on, I find roughly 10% of the code to be unit-testable even in principle. That's the core logic, the tricky parts - like the clever pathfinding algorithm I wrote for routing arrows in diagrams, or the clever code that diffs configurations to output an executable changeset (add this, delete that, move this there...). This I usually write functional-style (regardless of language): I expect specific inputs and outputs, so I can test such code effectively. Beyond that, I can also test some trivial utilities (usually also written in functional style). But the remaining 80-90% of any program I work on turns out to be a combination of:
- code already tested by someone else (external dependencies)
- code bureaucracy, which forms the vast majority of the program - that is, including/injecting/managing dependencies, moving data around, jumping through and around abstraction layers; this I believe is untestable in principle, unless I'm willing to test code structure itself (technically doable for my Lisp projects...)
- the user interface, the other big part, which is also hilariously untestable, and rarely worth testing automatically, as any regression there will be immediately noticed and reported by real people
I'm having trouble even imagining how to unit-test the three things above, and it's not something covered in the unit testing books, tutorials or courses I've seen - they all focus on the basics, like assertions and red-green-refactor, which are the dumb part of writing tests. I'm looking for something for the difficult part - how to test the three categories of code I mentioned above.
You test whether the integration works correctly. For example, if you use a library for computation, you test whether your function which uses the library produces the expected result.
> - code bureaucracy, which forms vast majority of the program - that is, including/injecting/managing dependencies, moving data around, jumping through and around abstraction layer; this I believe is untestable in principle, unless I'm willing to test code structure itself (technically doable for my Lisp projects...)
Here again you do some integration testing. For example, when handling database operations, you can set up a test that loads fake data into the database, runs operations on this fake data and compares the results, and then purges the fake data.
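Something like this minimal sketch, using an in-memory SQLite database (table and function names are made up; with ":memory:" there's nothing left to purge afterwards):

    import sqlite3

    def total_paid(conn, customer_id):
        row = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ? AND paid = 1",
            (customer_id,),
        ).fetchone()
        return row[0]

    def test_total_paid_sums_only_paid_orders():
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL, paid INTEGER)")
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(1, 10.0, 1), (1, 5.0, 0), (2, 99.0, 1)],   # fake data
        )
        assert total_paid(conn, 1) == 10.0
        conn.close()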
> - the user interface, the other big part, which is also hilariously untestable, and rarely worth testing automatically, as any regression there will be immediately noticed and reported by real people
This sort of testing has become much more common for web apps and mobile apps. There are two things being tested. One is whether the interface loads correctly under different conditions (device types, screen resolutions, etc.); this is tested by comparing the image for the 'correct' configuration with the test results. The other is whether the interface behaves correctly for certain test interactions; this is tested by automating a set of interactions using an automation suite and then checking whether the interface displays/outputs the correct result.
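A hedged sketch of the interaction-driven kind, using Selenium WebDriver from Python; the URL, element ids and expected text are all hypothetical:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    def test_login_shows_welcome_banner():
        driver = webdriver.Firefox()
        try:
            driver.get("https://example.test/login")
            driver.find_element(By.ID, "username").send_keys("demo")
            driver.find_element(By.ID, "password").send_keys("secret")
            driver.find_element(By.ID, "submit").click()
            banner = driver.find_element(By.CSS_SELECTOR, ".welcome-banner")
            assert "Welcome" in banner.text    # the interface displays the expected result
        finally:
            driver.quit()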
I've also struggled with this notion of "untestable" code. Untestable code seems to usually be a big pile of function calls. Testable code seems to be little islands of functionality which are connected by interfaces where only data is exchanged.
Practically, this seems to be about shying away from interfaces that look like 'doThis()' and 'doThat()', and moving towards ones that look like 'apply(data)'. Less imperative micro-management and more functional-core/imperative-shell.
Edit: a network analogy might be: think about the differences between a remote procedure call API vs. simply exchanging declarative data (REST).
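A rough sketch of the 'apply(data)' shape in Python (names invented, the repository is a stand-in for whatever does the I/O): the core is a pure function over plain data, which is the easy part to unit test, while the shell does the fetching and saving around it.

    def apply_discount(order: dict, percent: float) -> dict:
        """Functional core: data in, data out, no side effects."""
        discounted = round(order["total"] * (1 - percent / 100), 2)
        return {**order, "total": discounted}

    def process_order(order_id, repository, percent):
        """Imperative shell: fetch, delegate to the core, persist."""
        order = repository.load(order_id)         # I/O at the edges
        updated = apply_discount(order, percent)  # pure, testable island
        repository.save(updated)                  # I/O at the edges
        return updated

    def test_apply_discount():
        assert apply_discount({"id": 1, "total": 200.0}, 10)["total"] == 180.0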
Have you heard of the ideas in https://www.destroyallsoftware.com/screencasts/catalog/funct...? Essentially, his position is that it's only worth unit-testing those tricky bits, and the rest is inherently not worth unit-testing, because you'll almost-certainly embed the same assumptions you're attempting to test into the tests themselves.
In other words, I agree with you, but would question why you're even _trying_ to unit-test those other areas that aren't conducive to unit testing.
Yes, I've heard of the idea of functional core, imperative shell. In fact, that's how I write most code these days. But I don't think I've watched this talk before, so it just went straight into my todo list.
> In other words, I agree with you, but would question why you're even _trying_ to unit-test those other areas that aren't conducive to unit testing.
Well, other people give me funny looks when they see how few tests I write...
No, but honestly, I see so much advocacy towards writing lots of tests, or even starting with tests, and so I'm trying (and failing) to see the merit of this approach in the stuff I work on.
> you'll almost-certainly embed the same assumptions you're attempting to test into the tests themselves.
That was my main objection when I was on a TDD course - I quickly noticed that my tests tend to structurally encode the very implementation I'm about to write.
The only thing that smelled like a path to enlightenment seemed to be Haskell's QuickCheck [0]. It's actually going out searching for bugs for me, rather than me having to pretend like I can think of all the failure cases. I haven't implemented it in a project yet though so I don't know how much easier it is than unit testing.
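To give a flavour of the idea without Haskell (the closest Python analogue to QuickCheck is the Hypothesis library; the encode/decode pair below is made up): you state a property, and the tool goes searching for counterexamples instead of you enumerating failure cases by hand.

    from hypothesis import given, strategies as st

    def run_length_encode(s):
        out = []
        for ch in s:
            if out and out[-1][0] == ch:
                out[-1] = (ch, out[-1][1] + 1)
            else:
                out.append((ch, 1))
        return out

    def run_length_decode(pairs):
        return "".join(ch * n for ch, n in pairs)

    @given(st.text())
    def test_decode_inverts_encode(s):
        # property: decoding an encoding gives back the original string
        assert run_length_decode(run_length_encode(s)) == s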
To be fair: much of it is practice and prior knowledge. I personally found the book Pragmatic Unit Testing helpful, as well as the Clean Code chapter on unit tests.
Others may be more able to find you more readily available sources.
Unit testing is easy for beginners. Unit testing done well is much less so. This results in many inexperienced people writing poor unit tests.
In my last job, I worked on a legacy C++ code base: 500K lines of thorough testing, but none of it was unit tests. It took 10 hours to run the full test suite. I set about the task of converting portions for unit testability.
I was surprised how much of a challenge it was (and I learned a lot). There were few resources on proper unit testing in C++. Mostly learned from Stack Overflow. Lessons learned:
1. If your code did not have unit tests, then likely your code will need a lot of rearchitecting just to enable a single unit test. The good news is that the rearchitecting made the code better.
As a corollary: You'll never know how much coupling there is in your code until you write unit tests for it. Our code looked fairly good, but when I wanted to test one part, I found too many dependencies I needed to bring in just to test it. In the end, to test one part, I was involving code from all over the code base. Not good. Most of my effort was writing interface classes to separate the parts so that I could unit test them (there's a sketch of the idea after this list).
2. For C++, this means your code will look very enterprisey. For once, this was a good thing.
3. Mocking is an art. There's no ideal rule/guideline for it. Overmock and you are just encoding bugs into your tests. Undermock and you are testing too many things at once.
4. For the love of God, don't do a 1:1 mapping between functions/methods and tests. It's OK if your test involves more than one method. Even Robert Martin (Uncle Bob) says so. I know I go against much dogma, but make your unit tests test features, not functions.
5. If your unit tests keep breaking due to trivial refactors, then architect your unit tests to be less sensitive to refactors.
6. For classes, don't test private methods directly. Test them through your public interface. If you cannot reach some code in a private method via the public interface, throw the code away!
7. Perhaps the most important: Assume a hostile management (which is what I had). For every code change you make so that you can write a unit test, can you justify that code change assuming your project never will have unit tests? There are multiple ways to write unit tests - many of them are bad. This guideline will keep you from taking convenient shortcuts.
This advice is all about unit tests, and not TDD. With TDD, it is not hard to test yourself into a corner where you then throw everything away and restart. If you insist on TDD, then at least follow Uncle Bob's heuristic. For your function/feature, think of the most complicated result/boundary input, and make that your first unit test. This way you're less likely to develop the function into the wrong corner.
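To make point 1 concrete: the rearchitecting mostly meant introducing interface classes as seams. Our real code was C++, but the same move sketched in Python with invented names looks like this - the report logic stops reaching across the code base for the concrete billing subsystem and only sees a narrow interface it can be handed a stub of.

    from abc import ABC, abstractmethod

    class InvoiceSource(ABC):                 # the extracted interface (the seam)
        @abstractmethod
        def open_invoices(self, customer_id): ...

    class ReportGenerator:
        def __init__(self, invoices: InvoiceSource):
            self.invoices = invoices          # depends on the interface, not the whole subsystem

        def overdue_total(self, customer_id):
            return sum(i["amount"]
                       for i in self.invoices.open_invoices(customer_id)
                       if i["overdue"])

    class StubInvoiceSource(InvoiceSource):   # test double, no coupling to the real billing code
        def open_invoices(self, customer_id):
            return [{"amount": 10.0, "overdue": True}, {"amount": 99.0, "overdue": False}]

    def test_overdue_total():
        assert ReportGenerator(StubInvoiceSource()).overdue_total(1) == 10.0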
When I completed my proof-of-concept work, the team rejected it. The feedback was that it required too much skill for some of the people on the team (40+ developers across 4 sites), and the likelihood that all of them would get it was minuscule. And too much of the code would need to change to add unit tests.
Well, that they fail and that they take minutes to run isn't in itself a bad thing; what is bad is that broken things end up on master.
Think of it this way: Any time the build is broken, everyone working on the project is interrupted. If you have 10 people working on the codebase and the build is broken for an hour, that's a whole workday and then some wasted.
You could start a grassroots movement - create a pre-push or pre-commit hook that runs tests before anything ends up on the remote. Don't worry about the tests taking minutes to run; if you're waiting on that several times a day, you're probably publishing too many small changes over the course of a day.
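Git hooks are usually shell scripts, but any executable works. A minimal sketch in Python, saved as .git/hooks/pre-push and made executable (the pytest command is just an assumption - substitute whatever runs your test suite):

    #!/usr/bin/env python3
    # Refuse the push when the test suite fails.
    import subprocess
    import sys

    result = subprocess.run(["python", "-m", "pytest", "-q"])
    if result.returncode != 0:
        print("Tests failed; push aborted.", file=sys.stderr)
        sys.exit(1)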
> Well that they fail and that they take minutes to run isn't a bad thing; what is bad is that broken things end up on master.
You don't understand; they are continually broken on master.
Furthermore, you assume that we only commit to master after reviews etc. have been done. This isn't the case. Commits, even intermediate commits, are pushed to master, and reviewed from there.
You have to make the people love you. Then they will follow your good advice joyfully. In fact they'll follow your bad advice just as much, so be careful you don't get promoted to VP.
It's an important distinction between "our current build process cannot be automated" and "our current build process is needlessly complicated and thus the ROI from automating it is questionable."
You can do a lot with bash scripts, PowerShell, etc. Even custom executables.
I'm a big believer in unit tests, but to be fair, this rule isn't quite accurate. Suppose I write a test as follows:
Assert(rand() % 2 == 0);
When the test gets run the first time, it can pass and the code can be committed, but when the next person goes to add code, it can fail. At this point, you've broken your defining principle: that all the code in the code base is correct. The answer of what to do here isn't clear, as it's a tradeoff between time spent fixing the tests (or potentially the test framework) and developing features. If you have deadlines this becomes more tenuous. Then you figure out that the reason the test is flaky is that it isn't being run in a real environment... etc.
All I'm saying is that it isn't as simple as this "Rule" would have you believe.
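One common way out of that particular flakiness, sketched in Python with invented names, is to make the randomness an injected (or seeded) dependency, so the test controls it instead of the environment:

    import random

    def pick_winner(entries, rng=random):
        # the random source is a parameter, so tests can pin it down
        return entries[rng.randrange(len(entries))]

    def test_pick_winner_is_deterministic_under_a_seeded_rng():
        entries = ["alice", "bob", "carol"]
        assert pick_winner(entries, random.Random(42)) in entries
        # with the seed pinned, the pick is reproducible on every run
        assert pick_winner(entries, random.Random(42)) == pick_winner(entries, random.Random(42))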