Serious question: I've been programming for 15-odd years, somewhere around a million people used my code last year, and I've written entire sites on my own that happily run today.
I don't write tests. On projects that already had tests, those tests caught a single bug in about 3.5 years of working on that code base.
Yes, 1 bug. We'd have picked it up anyway when actually testing it.
Why do you feel #1 or #16 are actually true?
Am I (and colleagues) a magic snowflake, or are tests a massive waste of time, or is it somewhere in the middle?
>I've written entire sites on my own that happily run today.
This is probably the key issue. You're either working by yourself or you are on small enough team projects that it's easy to recognize when you are breaking things with a change.
Tests are about rapidly detecting regressions in a programmatic way. You shouldn't have to manually test everything every time you want to make a small change to the code base, and if you depend on reviewers spotting that a change will break something, you are in for a world of hurt on any large project.
Every large project I've been on that has skipped tests has ended up turning into a shitshow where master is broken for days at a time and you can't even tell the cause without hours of manual testing with bisection.
> if you depend on reviewers seeing that a change will break something, you are in for a world of hurt on any large project.
I don’t agree with this at all. Here’s a counter example: Linux.
I think the real problem here is that people don’t take review seriously and/or just don’t care about the project. We build these complicated, expensive systems that do all the thinking for us, not because it’s better, but because we simply don’t care.
Having tests is better than letting sloppy developers run amok. In my experience though tests aren’t really needed on projects with a maintainer that actually gives a damn.
Bad example, because a massive chunk of Linux is independent subsystems that have no overlap at all. The netlink code has jack-shit to do with a graphics driver from Intel, etc.
Within subsystems regressions sneak in all of the time that don't directly cause compile-time failures.
I'm in the same boat and have thought a lot about it. The biggest correlation I see among automated-testing advocates is that they are usually using dynamic languages for non-trivial core business logic. I mean, that's pretty much it. If you're using a dynamic language for such things, like anything past 10k lines of code, then I agree, you pretty much need automated tests, since they're partially filling in the role of a compiler and type system. It raises the question of why you'd use a dynamic language for tricky business logic in the first place, but that's a whole different flamewar!
If the testing advocates are using something with even a basic compiler/type-system (e.g. Java), then IME it often comes down to not being fluent with the design patterns, tricks, IDE shortcuts/plugins, tools, etc. that compilers and type-systems can provide to work with you to catch bugs at build time. I mention Java because it's a language that allows you to keep writing "dynamic" code and avoid the compiler/type-system to a good degree if you want (and boy do people want to). E.g. passing top-level Object types around and casting, using HashMaps instead of a proper Class, heavy reflection use, "stringly" typing, etc. Unlike say Haskell or Rust which force correctness down your throat to a much higher degree.
Obviously, no matter the language choice, there comes a point where automated testing becomes cost-effective (e.g. safety-critical software). At that point, however, there should be formal methods and such in play also. And I'd say that for the majority of us working on CRUD apps, we're not at that cost-effectiveness point.
You are able to go without tests because you're writing entire apps on your own. You know what each file, class, and module is supposed to do, and how to test it for regressions manually.
If you're working on a large team with a large suite of applications, this isn't the case. Very often I'll need to make changes to code that I don't explicitly own, or that I've never seen before today. If this code doesn't have tests, this is a recipe for disaster. I don't really understand what this code is supposed to do. How do I know that my change didn't break any existing functionality? Is it obvious how or even possible to test the functionality in this file manually? Maybe it's a background job that processes a file off of an SFTP server, how long will it take me to set up a manual test for that?
It's also about iteration time. Automated tests allow you to check the correctness of every single change nearly instantaneously. I don't want to have to switch to a browser and wait for a form to submit just to check that my code still works after a minor refactor, or to ensure that my typo was fixed successfully. Tests mean that code is much more likely to be refactored often, which leads to much cleaner and easier-to-maintain codebases.
This argument, and most of the others at this same level, have a glaring error. They only justify integration tests. They do not justify unit tests (in nearly all cases).
Unit tests wouldn't catch the examples you gave:
> How do I know that my change didn't break any existing functionality?
You don't. And with many unit tests ... you still don't. A software engineer with a few years experience will know this.
> Maybe it's a background job that processes a file off of an SFTP server, how long will it take me to set up a manual test for that?
This especially, no unit test is ever going to catch. This behavior would depend on so many things: encryption, networking, scheduling, starvation, graceful error handling, retry logic ... Not something any amount of unit tests could reasonably verify.
So this must be an argument against unit tests? And yet, I don't think so.
In practice, you see the opposite. These arguments are designed to force people to write unit tests, and are actually used against integration tests (which are complex tests that require tweaking in intelligent ways every time you make a serious change). Of course, integration tests and "instantaneous" are usually opposites.
So I don't understand this argument. It only makes sense on the surface; it does not actually hold up on real, large code projects. So I don't understand why you make this argument.
And I especially do not understand why nearly every large company follows advice like this. I mean, I can come up with cynical reasons: "more pointless work with zero verifiable results, but nice "quantifiable" (if meaningless) numbers! Of course we're in favor!". But really, I cannot come up with a coherent reason to explain the behavior seen in developers at a large company.
Most statistics about unit tests make no sense. Coverage, for instance, says nothing about how many edge cases you handle, and nothing about other guarantees (like bounded and reasonable allocation for "weird" inputs, the same for CPU usage, or that it still works together with the OS and other daemons on the system it'll be installed on, ...). Since tests have no "real" purpose for the shipped software (they're code that doesn't run in production), the number of tests is meaningless too. Number of lines of test code... that's beyond useless.
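To make the coverage point concrete, here's a minimal hypothetical Python sketch (function and test names are my own invention): one test gives this function 100% line coverage while exercising none of its interesting edge cases.

```python
def parse_port(value: str) -> int:
    port = int(value)   # raises ValueError on non-numeric input
    return port         # happily returns -5 or 99999

def test_parse_port():
    # Every line of parse_port executes, so coverage reports 100%...
    assert parse_port("8080") == 8080
    # ...yet non-numeric input, negative ports, and ports > 65535 are all untested.
```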
The bigger a piece of software, the more useless unit tests become and the more critical realistic integration tests become ... and the less likely you are to actually find them.
And frankly, for small pieces of code, I find that specialized ways to "test" them that provide actual guarantees of correctness (like Coq) do work. Unit tests... never catch edge cases, not even with the best of developers. Of course, even Coq sort-of only makes sure you KNOW about every edge case and requires you to verify its correctness before letting you pass, so bugs in your understanding of the problem will still be bugs.
But at least you have a theoretical guarantee that a developer actually thought about every edge case. Also, it's pretty hard to use. The language could be clearer, but that's not really the problem. The problem is that Coq WILL show you that no matter how well you think you understand a problem, your understanding is not complete. Often in surprising, interesting ways.
The SFTP file processor is actually a quintessential example of how unit tests can help new developers edit existing code.
A unit test for the processor would mock out all of the SFTP fetching, or move that to a different class, and focus on only the core business logic of processing the file. The core logic could easily be unit tested, and changes could be made to the file processor without needing to replicate the entire SFTP environment in order to determine if there were regressions in the core business logic.
The alternative is needing to spin up a test SFTP environment, or somehow do that mocking in my manual test, just in order to do something as simple as refactor the class or make a small business logic change. A unit test empowers any developer to make those changes, without needing much knowledge of the environment the code runs in.
> without needing to replicate the entire SFTP environment in order to determine if there were regressions
Yep. And then it doesn't actually work. Because it tries to read every file at the same time. Because the strings it passes in to resolve the server don't work. Because it never schedules the thread reading from the socket. Because it uses a 100-byte buffer (plenty for tests, after all). Because...
And even then, what you're describing is not a unit test. A unit test tests a single unit of code. Just one function, nothing more, and mocks out everything else.
So you can never test a refactor of code with a unit test. Because a refactor changes how different pieces of code work together, and unit tests by definition explicitly DON'T test that (all interactions should be mocked out). So a unit test would be useless.
The fact that something might go wrong in the integration test doesn't mean unit tests for the core logic aren't helpful. Besides, you're probably going to be using an external library for the SFTP connection, so it's very likely to go just fine.
And you can totally use unit tests for what I'm describing. Two classes (rough sketch below):
- SFTPHandler. Connects via SFTP, downloads latest files off the server, passes contents as a string to the processor class `FileProcessor.process(downloaded_file)`
- FileProcessor. Has one public function, process, which processes the file - doing whatever it needs to. This function can then very easily be unit tested, just passing strings for test files into the function. You can also refactor the `process` function as much as you like, not needing to worry about the SFTP connection at all. The `process` function probably calls a bunch of private functions within that class, but your unit tests don't need to worry about that.
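Here's a rough Python sketch of that split, just to make it concrete; the file format, the injected client, and every name beyond `FileProcessor.process` are my own assumptions, not anyone's real code:

```python
class FileProcessor:
    """Pure business logic: takes the downloaded contents as a string."""

    def process(self, downloaded_file: str) -> list[dict]:
        records = []
        for line in downloaded_file.splitlines():
            if not line.strip():
                continue                        # skip blank lines
            name, amount = line.split(",", 1)   # hypothetical CSV-ish format
            records.append({"name": name.strip(), "amount": float(amount)})
        return records


class SFTPHandler:
    """Owns the messy I/O: connects, downloads, hands strings to the processor."""

    def __init__(self, client, processor: FileProcessor):
        self.client = client                    # some SFTP client, injected
        self.processor = processor

    def run(self) -> list[dict]:
        contents = self.client.download_latest()   # hypothetical client method
        return self.processor.process(contents)


# The unit tests never touch SFTP at all -- they just feed strings in.
def test_process_parses_well_formed_lines():
    result = FileProcessor().process("alice, 10.5\nbob, 2\n")
    assert result == [{"name": "alice", "amount": 10.5},
                      {"name": "bob", "amount": 2.0}]
```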
I've used a setup like this in production, it works just fine, and allowed us to improve the performance of the file processing logic and make changes to it very easily and often - without worrying about regressions to the core business logic.
In my experience, if there's one thing absolutely guaranteed, it's that unit tests decrease the performance of whatever they're testing (because optimizing against them eventually leads to local wins that are big losses for the program as a whole, and because it encourages doing idiotic stuff like allocating large buffers and keeping enormous non-shared caches and maps).
Now in the example given, performance does not matter, so I do wonder why you'd mention it at all.
How about you just answer me this question: did you still see significant bug volumes after implementing the unit tests for the FileProcessor?
Obviously I believe the answer to be "yes". I feel like your statement that changes were made "very easily and often" sort of implies that yes, there were many bugs.
Note that testing based on past bugs is not called unit testing. That is, as you might guess, regression testing (and has the important distinction that it's a VERY good practice to go back through your regression tests once a year, and throw out the ones that don't make sense anymore, which should be about half of them)
Besides, I've very rarely seen tests actually catch bugs. Bugs come from pieces of code not doing what developers expect them to do, in three ways:
1) outright lack of understanding what the code does (this can also mean that they understand the code, but not the problem it's trying to solve, and so code and tests ... are simply both wrong)
2) lack of consideration for edge cases
3) lack of consideration for the environment the code runs in (e.g. scaling issues. Optimizing business logic that processes a 10M file then executing it on 50G of data)
None of these has a good chance of getting caught by unit tests in my experience.
But developers seem to mostly hate integration tests. Tests that start up the whole system, or even multiple copies of it, and then rapidly run past input through the whole system. When they fail, it takes a while to find out why. They may fail despite all components, potentially written by different people, being "correctly written", just not taking each other into account. They may fail because of memory, CPU starvation, or filesystem setup. They may fail occasionally because the backend database decided to VACUUM and the app is not backing off. They may fail after a firmware upgrade on equipment they use.
The problem I have with these "issues" is simple: they represent reality. They will occur in production.
And in some ways I feel like this is a fair description: unit tests are about "proving you're right", even, and perhaps especially, if you're wrong. "You see, it isn't my code! Not my fault!"
It still seems like the core of your argument is "Some things are hard to test, so you might as well not test anything at all" - which I really don't buy.
> How about you just answer me this question: did you still see significant bug volumes after implementing the unit tests for the FileProcessor?
Kinda tricky to answer this, since there were tests for this class from the start. But overall the answer is "no" - we did not see significant bugs in that class. Occasionally we'd get a bug because the SFTP connection failed, or execution took too long and the connection closed - the type of bug that to you seems to negate the value of unit testing the FileProcessor. But without the unit tests for the FileProcessor, I'd have those bugs plus more bugs/bad behavior in the business logic. How is that better, exactly?
The idea that tests reduce performance is ridiculous. Improving performance requires making changes to a system while ensuring its behavior hasn't changed. This is exactly what testing provides. Without tests, you can't optimize the code at all without fear of introducing regressions.
> It still seems like the core of your argument is "Some things are hard to test, so you might as well not test anything at all"
Nope. My argument is twofold:
1) unit tests don't actually provide the guarantees that people keep putting forward as reasons to write them
2) this makes them dangerous, as they provide "reassurance" that isn't actually backed by reality
> But, without the unit tests for the FileProcessor, I'd have those bugs plus more bugs/bad behavior in the business logic. How is that better, exactly?
So the tests failed to catch problems with the program's behavior. Maybe it's just me, but I call that a problem.
Testing the business logic as a whole is not a unit test, except in the marginal cases where all your business logic fits neatly in a single function, or at least a single class. If it's actually a few algorithms, a few files, a bunch of functions, and you test everything together, that's an integration test, not a unit test.
If you use actual (or slightly changed) data to test that business logic, as opposed to artificially crafted data, that again makes it not a unit test.
> The idea that tests reduce performance is ridiculous
If you use tests to optimize code, you're optimizing a piece of code in isolation, without taking the rest of the system into account. That works for trivially simple initial optimization, but falls completely on its face when you're actually writing programs that stress the system.
> Improving performance requires making changes to a system while ensuring it's behavior hasn't changed.
The system as a whole, sort of. Individual units? Absolutely not.
Besides, tests merely provide FALSE assurance that behavior hasn't changed. Time and time again I've had to fix "I've optimized it" bugs. The TL;DR is always the same: unit tests pass, so the code "must be right" (meaning they don't run the code outside of unit tests, and the unit tests only test the code directly, not as a black box). Then, in the actual run, edge cases were hit.
And "edge cases" makes it sound like you hardly ever hit this. Just yesterday we had a big issue. What happened ? Well we had a method that disables an inbound phone line (e.g. for maintenance, or changes). All unit tests passed. Unfortunately we really should have tested that it does NOT disable anything else (method was essentially "go through list, if it matches, disable". Needless to say, it disabled every line). Regression testing added.
We had someone optimize the dial plan handling. He didn't realize that his "cache" was in fact recreated at the very beginning of every request, whenever the code evaluated a dial plan; in other words, it was a serious performance regression rather than an improvement. It really looked like an improvement, though. Unit tests... passed. Of course, "the behavior hadn't changed". About 2 calls got through for ~6 minutes (normally thousands). Now we have a test that brings up the whole system, with the actual production dial plan, and then pushes 10,000 calls through it. If it takes more than 1 minute, the test fails. Developers hate it, but it's non-negotiable at this point.
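The pattern is easy to miss in review. A hypothetical Python sketch of the shape of that bug (the real system was Java, and every name here is invented); note that a unit test asserting the returned route would still pass either way, since only throughput changes:

```python
_shared_cache: dict = {}

def evaluate(dial_plan: list[str], number: str, cache: dict) -> str:
    # pretend this is the expensive dial-plan evaluation being memoised
    if number not in cache:
        cache[number] = next((rule for rule in dial_plan if number.startswith(rule)), "reject")
    return cache[number]

def handle_call_buggy(dial_plan: list[str], number: str) -> str:
    cache = {}                                         # recreated on every request,
    return evaluate(dial_plan, number, cache)          # so the cache never actually hits

def handle_call_fixed(dial_plan: list[str], number: str) -> str:
    return evaluate(dial_plan, number, _shared_cache)  # survives across requests
```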
I can go on for quite a while enumerating problems like this. Billing files erased (forgot to append). Entire dialplan erased on startup (essentially same problem). Lots of ways to trigger cpu starvation. Memory exhaustion. Memory leaks (the system is mostly Java). Connecting unrelated calls together (that was pretty fun for the first 2-3 minutes). Ignoring manager signals (one server having to do all calls ... which is not happening)
This is a repeating pattern in our failures. Unit tests ... essentially always pass. And time and time again someone tries to make the point that this must mean the code is correct.
OK, I get that you don't like unit tests. You also seem to have a very strict and unhelpful definition of what a unit test is. I don't really care about the academic taxonomy; I just want to know that my code works. So whether you call it a unit test or an integration test or a functional test or a behavioral test - whatever. The important thing is that it allows me to get some idea of whether or not the code is doing what it's supposed to do.
What do you propose instead of testing? Manually QA every single piece of the system for every change? Not a lot of companies have the headcount for the fleet of QA people that would require.
For what it's worth, I think you both make great points. Who one agrees with perhaps mostly hinges on whether unit tests can get in the way of integration tests.
I'm inclined to believe that some of the exhaustive unit testing culture can provide a false sense of security, and the work involved to write them can get in the way of writing proper integration tests (or 'degrees of' if you see it as a spectrum).
Provided that proper test coverage includes both integration tests and unit tests, it probably won't hurt to do both. I like unit tests as part of my process (the red-green TDD process can be quite addictive and useful), but under pressure I'd prioritize integration tests, knowing that the more time I spend unit testing, the less I'll spend integration testing.
It comes down to the cost of failure. If the cost of failure for a particular module (say, payment processing or a rocket guidance system) is high, then writing tests can be much cheaper than waiting for a failure, especially in production.
Your code is full of bugs, you just haven't noticed yet.
Whether that matters or not (to you or your end users) is your grey area in the middle.
It's possible you've only ever worked on things that are simple or obvious enough not to need tests, and on code that never changes. Especially if that's UI code, where you're testing it manually as you write it, you may be able to live with this on small code bases.
If you have never worked in an environment where the benefit of at least some level of automated testing is blindingly obvious, my advice to you is to stop doing the same thing over and over and find a new job.
I hardly think that encouraging someone to seek out alternative positions from which to view the world, and a suggestion that there is a grey area for all this constitutes zealotry. Quite the opposite, in fact.
And in my experience projects with unit tests are far _less_ riddled with bugs, which is the whole point, no?
And I don't understand why on earth you think I'm telling anyone they are "shit". I'd advise someone to broaden their horizons if they said they thought functional programming was worthless, or that assembly language was always pointlessly low-level, or any one of a number of possible assertions that wouldn't sit well with me. That is far from "you are shit" - it's much more a case of assuming they are not shit, but haven't had the right context yet.
There is something in the middle, actually. Just write tests for the hard tasks or the algorithms you implemented yourself. (Some stuff that I don't understand ends up much easier to write once I've created some tests to help me iterate faster.)
Testing basic class reuse (I'm thinking about serializers, or ViewSets like in Django) doesn't make much sense to me, and in fact was 100% automated.
Those "simple" test end-up being useful when you start hitting your app with the hammer to change a lot of stuff in it. I havn't had any use for these otherwhise.
Many if not most tests are a waste of time (especially when the same guarantees are provided by, say, a type system).
That said, automated tests catch a problem that would make it to production just about weekly in my experience. If you're trying to do any kind of HA/CI/CD they're really really helpful. Beyond that, isolated "unit tests" are somewhat helpful too in my experience, but only so far as they allow you to easily run your test suite (and rarely, find bugs faster).
Broadly speaking, there are several steps in the development process where defects are caught:
1. During development: you code something, run it, see something fail and fix it. Most people don't consider this part of QA because it's coding/debugging.
2. Code review: a teammate reads your code and catches a bug or an opportunity for a bug to arise.
3. Manual QA: a teammate tries the new feature, finds a defect and reports it.
4. Production: a user reports the error.
5. Monitoring: tools like Sentry may warn you before a user reports the issue.
With your experience, you probably have learned to avoid patterns in your code that make #1 occur often. Maybe you use static typing, linters, an IDE or some other tools to help you avoid silly mistakes.
If you perform code review (#2), then you help less-experienced teammates avoid mistakes even if they haven't learned to avoid error-prone patterns or don't use tooling like you do. They spend more time in #1 than you do.
If you care (above average for developer standards) about your job, you probably manually test the software you make to see how your users experience your creation (#3) before checking-in code, while code-reviewing, and/or afterwards to help testers. This takes time and if you find mistakes, that means other teammates that did this before you invested time also.
If your company releases progressively (using feature flags) you get info from users (#4) and fix before it's considered an issue. Also, if using monitoring tools (#5) you may avoid users ever noticing there was an issue. These all lead to rework and are risky.
I don't have enough information about what process you follow, so I tried making references to generic practices. In theory, writing tests first reduces the amount of time reacting in situations 1-5. I have felt the benefits in my own team and seen them when coaching other teams.
If none of what I have written applies to you and your team, please let me know. Would love to buy you all a meal to hear about what you do to avoid defects in your software. I'm not being snarky here. Have genuinely been asking myself how do these developers I read about online manage to ship bug-free software really fast without writing tests.
It's probably a combination of #3 (I always do that now, use the feature I just made/improved/fixed) and that I always make sure I can reliably repro a bug before I try and fix the bug, so I always know it's fixed.
Also, a significant majority of my code has been statically typed, as you say, and I'm moving even further toward static typing now that I'm using TS over JS in all new code.
> Am I (and colleagues) a magic snowflake, or are tests a massive waste of time, or is it somewhere in the middle?
Tests are definitely a massive waste of time if you never need them. Write and forget code exists and can certainly run happily for years. On the other hand, frequently changing something in a large existing codebase, especially without static typing, is almost certainly going to end up more expensive without automated tests.
I'm not a professional programmer, but I think it comes down to whether you expect to ever do major refactoring. If requirements change substantially, would you write new code, or would you modify the existing code and attempt to maintain some kind of backward compatibility? If the latter, tests can be worth it.
I have a library that never had any tests until I wanted to re-architect it in a fundamental way whilst porting it from Python 2 to Python 3. Since it was a wire-protocol type of thing, the Python bytes–unicode distinction made this hard. So I wrote tests for each piece of functionality as I ported that piece. I keep the tests around nowadays even though the porting is complete, but am kind of lazy writing ones for new functionality.
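To give a flavour of what those tests look like, here's a minimal sketch; the framing format and every name are invented for illustration rather than taken from the library:

```python
def encode_frame(payload: str) -> bytes:
    """Wire format (hypothetical): 2-byte big-endian length prefix + UTF-8 payload."""
    body = payload.encode("utf-8")
    return len(body).to_bytes(2, "big") + body

def decode_frame(frame: bytes) -> str:
    length = int.from_bytes(frame[:2], "big")
    return frame[2:2 + length].decode("utf-8")

def test_round_trip_is_lossless():
    assert decode_frame(encode_frame("héllo")) == "héllo"

def test_encode_returns_bytes_not_str():
    # the Python 3 port kept tripping exactly on this boundary, so assert the type explicitly
    assert isinstance(encode_frame("x"), bytes)
```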
Of course tests also seem really important when you're accepting 3rd party contributions to your codebase. I've made what I thought were totally innocuous pull requests to projects only to see in the CI tests that my code causes regressions.
I usually focus on end-to-end tests: when I add a new feature or do some refactoring, I don't have to try things manually. I just launch my test suite, go prepare some coffee, and if the suite passes I'm confident I haven't broken anything.
Also, writing new tests for new features helps me think about the edge cases I'll encounter.
When someone reports a bug, I reproduce it. Then I write a test reproducing the problem. And only then do I fix it. Now I know that 6 months or a year from now I won't have a regression and see this bug coming back. People really hate when something they reported months ago is a problem again.
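As a tiny sketch of that workflow (the bug and all names are hypothetical, Python/pytest style):

```python
def sum_amounts(text: str) -> int:
    # the fix: skip blank lines, e.g. the trailing newline from the reported upload
    return sum(int(line) for line in text.splitlines() if line.strip())

# Written from the bug report *before* the fix, and kept around forever afterwards,
# so this particular bug cannot silently come back in 6 or 12 months.
def test_issue_142_trailing_newline_does_not_break_totals():
    assert sum_amounts("10\n20\n") == 30
```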
Tests fail all the time during development. Test failures indicate a bug. You fix the bug before you commit. The bug never makes it to prod. It's impossible that only one test ever failed during development.
When I was a rookie programmer 12 years ago, I was assigned a task to optimise a routine. It took the current vehicle speed as input, performed integration, and calculated the total distance traveled. I made a mistake that was let through code review and was caught by an automated test. My calculations were off by 1%, which accumulated to a larger error over 10 minutes. After that incident we never had another bug in that module for 5 years. Does this make the tests invalid? Or does this improve confidence in our code?
God help the person who has to pick up after you, do refactoring, etc. Not writing tests is unprofessional, as it introduces risk. If code is being used by 1 person or is some non-critical internal application, then fine. If it is being used by 1 million people to earn a living, the people responsible should lose their jobs. I spend a significant amount of my time tidying up after people who don't write tests and leave a mess, and it's a nightmare (but it keeps my family fed, housed, and clothed, so...).