It's funny. Just as the author reached this realisation, I've reached the opposite one: tests are killing my productivity and are exactly the wrong strategy for my nascent product.
I've been testing obsessively for the past 18 months but just realised that I've got a stack of perfectly tested code that's perfectly wrong for what I need it for. I'm going to have to tear it apart and rebuild.
I've been thinking about what the right balance is between the two. As code matures the value of tests becomes higher and higher but early on when you need to shape and reshape they really kill momentum.
This is one of the reasons why I don't make testing into a religion but pick a time and a place to add the tests. The 'write your tests first' school is right in that it can create another boost but it comes with the problem that it puts a damper on any exploratory programming. I typically take three tries to get it 'right', and only after the third try, once the interface is reasonably stable, do I add tests.
Maybe I'm a lousy designer in that I can't nail down an interface on the first try but on non-trivial code this is unfortunately my experience so far.
First try: very quick, rough and dirty. Second try: a bit slower, mostly right but still needs major rework. Third try: minor tweaks in the longer term but mostly finalized.
The elapsed time is usually something under a day for the first try, a few days to a week for the second and after the third try it goes into 'maintenance' pretty quickly.
I think that what you say makes a lot of sense and I completely agree with doing code-test-code. True TDD (which I have tried) requires too much constant change to be practical.
The churn I was referring to though is less that of short-term interface changes and more to do with long-term, multi-month revamping of the product proposition. During these highly exploratory phases, the main risk with errors is not that they multiply but that they never surface, because the code simply isn't run.
I always knew this in theory but like all lessons I needed to learn it the hard way to really appreciate.
This makes me wonder even more though when you put it together with this part of your post:
...I have good hope that in the long run this will tremendously increase my productivity. Even if in the short term it seemed as though my productivity took a hit (after all, writing code that tests stuff does not add any functionality to the program itself).
You seem to be saying that even though you've been doing it for a year, you're still suffering a drop in productivity. Have I misunderstood you? Doesn't that indicate that it's actually not a good idea? Or do you have evidence that there are fewer customer-visible bugs to deal with? (Evidence rather than your hopes would be better :)
It doesn't really feel like you've communicated why all that extra testing code is so great. And let's not forget that's what it actually is: code, which can itself be wrong. And you're not even doing TDD, but adding the tests way down the line!
There definitely is a drop in initial productivity: after the exploratory phase is over there is a hiccup in the output because I'm writing test cases. Then, as soon as that is done, there is a huge boost in productivity because everything becomes much (and I can't emphasize enough how much that really is) easier.
So it is more a matter of uneven output than an overall loss; overall it is definitely a net gain. It is hard to quantify though, because you will never know how much time you would have spent debugging and testing if you had not written those tests first.
It is more of a feeling than something measurable. In fact it is measurably faster overall, but while writing the test cases it feels slower. I hope that clears it up, sorry for the confusion!
As for fewer customer visible bugs: yes, absolutely, no doubt about that.
> The 'write your tests first' school is right in that it can create another boost but it comes with the problem that it puts a damper on any exploratory programming.
As far back as 1999 I can remember the "write your tests first" people saying "... unless you're doing exploratory programming and are likely to throw away the prototype."
> why I don't make testing into a religion but pick a time and a place to add the tests
Quite. Testing is great. "Test Driven Development" where expensive Agile Consultants who haven't written code in ten years tell teams they always have to write tests first for everything is bullshit: http://xrl.us/bmebj3
Many successful devs also preach TDD. TDD is great when your client has a high quality spec, or you are being paid for meeting that spec. TDD is less great when your project goal is "build something neat" with a very loose spec and wide tolerances.
Well it really depends. You can make a religious Process out of TDD where you start with something like Cucumber and work your way down through the design. You can TDD your glue code. You can TDD trivial stuff, like when you put a link in your web app to your new feature, you could write a view test that verifies that the link appears. You can, if you really want to, go through an obnoxious iterative process of writing the same function in a progressively less and less broken state. There might even be value to this approach. I wouldn't know.
Or you could do something like this: whenever you have a function that seems to do an actual algorithm, or process some data, or provide a predictable result given an input, you write up a few asserts and programmatically see whether it seems to work immediately after writing it instead of writing it and wasting your time futzing about with it by hand, or tracking down a bug in it after noticing strange results coming out of the whole program.
Or you could do something like this: whenever you get a bug report, create a test case that reproduces the bug first.
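To make both of those concrete, here's a minimal sketch in pytest; the function, its bug, and the test names are all made up for illustration:

    # A few quick asserts written right after the function, plus a regression
    # test that pinned down a (hypothetical) bug report before the fix went in.

    def slugify(title):
        # join on "-" so tabs and repeated spaces collapse as well
        return "-".join(title.lower().split())

    def test_slugify_basic():
        # the "does it seem to work?" checks, written immediately after slugify()
        assert slugify("Hello World") == "hello-world"
        assert slugify("  padded  ") == "padded"

    def test_slugify_bug_report():
        # written first to reproduce the reported bug (tabs leaking into slugs),
        # then slugify() was fixed until it passed
        assert "\t" not in slugify("tab\tseparated")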
Agile Consultants are like anyone else who tries to sell you enlightenment: they wouldn't eat unless there was actually a kernel to truth behind what they were saying. Writing your tests first is a fantastic idea sometimes.
Even if the components built of units change, and thus trash their tests, won't the tests covering the units still be useful?
Also: I find that a lot of component changes are interface changes -- the test code has to change to match the new interface and return values, but still remains useful.
I think it depends a lot on the nature of the problem that you're tackling. If you know in advance that your task is to turn input 'A' into output 'B', then I think it makes sense to write some of your tests before writing any of your actual code.
Most of the projects I'm involved with, however, are highly experimental, both in terms of the inputs and outputs that we're dealing with, as well as the ways we interact with the user. Writing tests or even specifications (except in the most generic sense) upfront makes no sense in this case: what we need to do is quickly make some crude stuff, play around with it, and then make slightly less crude stuff. Wash, rinse, and repeat.
After several cycles of this, we know what we actually want to do, and can start buckling down and writing production code. Some of it may or may not be borrowed from the prototyping cycles. In any case, testing becomes CRITICAL at this point.
That's what I use tests to do. Because of the effort of setting up and re-factoring the tests I have a long hard think before I write any code.
I consider it using an API before writing any of the code behind it. Doing this forces me to consider how I would actually use it, and usually my initial assumptions about how the code should work are wrong.
I usually end up with the correct solution this way. This is of course for small parts of the application. The overall idea may be wrong, but since everything is now in small reusable components it's easy to plug and play to get the correct solution.
Probably the same result as rapid iterations, throwing away portions and then building correctly the last time. I wonder which one is more effective, or if it's dependent on the person.
It depends on the stage of your company and product, which I think of as a continuum: on the far right is banking software, life support machines, and airline code; on the far left are prototypes and fun hacks. The amount of testing you do should depend on where on that continuum your project lies.
Let's say it's a scale from 1 to 10: 1 being no testing, 10 meaning hardcore static analysis and tests for your tests' tests. Life support machines are 10, as are aircraft, and please god the software in my car. Most projects start at 1, become roughly 3 in the early stages, and settle on 6 when they become more mature.
Using the wrong level at the wrong stage is a recipe for disaster. No tests for your mature product instils fear of change - anything you do will likely break something, and you will have no warning about it. But writing lots of tests for your prototype will kill your productivity.
I think there are two different kinds of testing here. One answers "Does the code work as I intended?" For that, I write unit tests, integration tests, or end-to-end tests as needed.
The other answers "Is this product useful?" For that, we do things like user tests, A/B tests, and landing page tests.
I think you need both. If you do too much programmer-focused testing, then you can end up with something that's well-built and useless. If you do too much user testing and iteration, you end up with something that people like, but that is buggy and hard to evolve.
> I've been testing obsessively for the past 18 months but just realised that I've got a stack of perfectly tested code that's perfectly wrong for what I need it for.
I couldn't agree more. TDD has its place, but it so greatly reduces your ability to react to change that it's a non-starter for me.
Excellent points. At the company that I work for, rigorous testing is an absolute necessity. Our web applications are critical to our users' business and significant bugs in production are not acceptable. We have unit tests, selenium tests AND manual testing that goes on before any release.
As you mentioned, in the early stages of a product, bugs in production may be tolerable especially when you're not sure you have a viable product. However, once you hit a significant user base that is paying for the service, service disruptions from bugs can really hurt business.
Obviously we're not trying to be NASA, but I would say we spend 25-30% of our development time writing tests.
> Obviously we're not trying to be NASA, but I would say we spend 25-30% of our development time writing tests.
With absolutely no disrespect intended, this seems low. I've read some studies about testing, and I chat to my customers a lot about it (my startup, Circle, does hosted continuous integration), and I would estimate that most projects spend over 50% of time testing.
I initially wrote 35-40% but wasn't sure if that was realistic. No disrespect taken at all - I'm pushing for a more concerted effort on better unit test coverage. We also have a very large set of Selenium tests that run in a distributed build environment.
The convergence between Minimum Viable Products and Hacking is an apt one, and suitable for this audience, I suppose.
People who develop new security breaks -- let's call them inbreakers because they break in -- are generally either Hackers or Script Kiddies, and not End Users or Software Engineers. I would broadly file computer users into those four categories, and maybe add some other categories on a good day. What does this mean?
Let's start with Hackers. It means that by the time the inbreaking software is finished, it's quite possibly buggy, it quite possibly doesn't work on all infectable PCs, and it's built to be a proof-of-concept which can be expanded. It's supposed to work, but it's not necessarily supposed to be clean and efficient and well-documented. That's someone else's problem. This is the Hacker mentality, and it is the same mentality you should adopt for your Minimum Viable Products. Get version 0.1 out really quickly: it doesn't have to handle any edge cases, instead focus on just getting the core logic right. Everything else can wait. That is what "hacker" connotes in software circles. It's going to be ugly. It's going to be hacks.

In the context of web development you should think, "Version 0.1 won't work on IE 6 or 7. It won't have graphics. It might have seconds of latency. It won't use a database -- I'll just use an array. Unless I'm in PHP and I absolutely have to use a database -- then I'll store everything in a JSON string stored in one text field. Usernames will just be stored in the query string part of the URL. My checkIfUserIsAdmin() function will return true. People can sign up for an email list and that's about it -- no automatic emails, I will solve that later." The core rule for hacking is: if you spend time doing something which isn't fun, what the hell are you doing?
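A deliberately ugly sketch of what that version 0.1 mentality might look like in code (Python/Flask here; aside from checkIfUserIsAdmin(), all the names are invented for illustration):

    # Version 0.1: no real auth, no schema, no edge cases -- just the core flow.
    import json
    from flask import Flask, request

    app = Flask(__name__)
    EMAIL_LIST_FILE = "emails.json"   # the whole "database" is one JSON blob

    def checkIfUserIsAdmin(username):
        return True  # good enough for now; everyone is an admin

    @app.route("/signup")
    def signup():
        # username comes straight off the query string, warts and all
        user = request.args.get("user", "anonymous")
        try:
            emails = json.load(open(EMAIL_LIST_FILE))
        except FileNotFoundError:
            emails = []
        emails.append({"user": user, "email": request.args.get("email", "")})
        json.dump(emails, open(EMAIL_LIST_FILE, "w"))
        return "signed up (no confirmation email yet -- solve that later)"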
The Script Kiddies are the more common inbreakers; for our applications, let's broadly define them as "people who use your application warts and all." They know it breaks. That doesn't matter, they're surprised that they can use it at all. They're not relying on your code to provide them anything, and they're not going to program against your code either. For web development, these are the folks who use Mailinator and DownForEveryoneOrJustMe; perhaps, more broadly, you could describe searchers (especially Scroogle users), blog readers, and so forth, as this sort. They're a natural match for hackers: hackers put out code which does the right thing but has warts, script kiddies use bundles of that code to actually compromise lots of systems and don't necessarily care about the warts.
End Users, by contrast, are people who want to pay you, or at least consume your ads, for the services you provide. They expect something to Work As Expected. Maybe your boss has to negotiate with them to figure out what exactly the required functionality is, and then a contract has to be worked out. Or maybe you're just large enough that a substantial userbase relies on you -- Gmail and Google aren't technically doing something different from Mailinator and DuckDuckGo, handling emails and searching the web respectively, but you expect your email to Work and if Gmail is down for an hour a day next month you might huffily leave to some other provider.
As you start to provide this reliability to attract End Users and Get Money, you'll need to work towards version 1.0 of your product, slowly patching in the things that were missing from the original idea. You'll have to have real logins with salted passwords. Your system won't just concatenate to the Email List -- it will send out live emails and handle bounces. For those types of work, you want Software Engineers. You want someone who can weave edge cases and apply them, you want tests for new code, you want someone who studies the API of all the languages reasonably available to figure out which calls can be made more efficient.
I think these are not different people, but different hats. If you're cooking, you wear all of these hats. Sometimes you have a great idea for throwing some ingredients together: hey, I've never tasted chicken with avocados, that might actually work together, let's combine last night's guacamole with the night before's roast chicken and see how it tastes -- hacker. Okay, it tasted good, but to really serve this to my friends, I suppose I'll want a much less ugly presentation -- engineer. Okay, I pick up the knife to slice the avocado -- script kiddie. Then we eat up and my friends compliment me on my crazy ideas -- end users.
For a lark I'm going to take an old project that has been running for years without a hitch and I'm going to retro-actively add tests to the code. I'm really curious how many bugs and unexpected behaviors will turn up.
Last week I got an enhancement request from a customer that basically said, "Add B capability and make it work exactly like A capability."
"Cool," I thought, "This should be easy."
So I examined all the "A" stuff, which had been in production since 2008. Then I cut, pasted, modified, and added a whole bunch of stuff. (I know, I know, every once in a while, a programmer's just gotta take the lazy way out.)
When I started unit testing my B stuff, I broke everything on almost every try. This was before I even assembled a test plan, it was just a hacker beating up his own work.
How could this be? So I went back and beat up the A stuff and broke it in all the same places. Stuff thousands of people have used thousands of times. Sigh.
How is the A stuff wrong when thousands of people have used it successfully (presumably for its intended purpose) thousands of times? I understand where you're coming from. But the longer I do this, the less dogmatic about testing I get. If the code works for its intended purpose then it's probably all right. Now, adding features and having confidence in refactoring is another story with untested code.
I recently discovered something interesting. The requirements gathering we do kinda sucks. How often do we bring in end users and watch them use the system as they actually do? Why do we refuse to trust user input in our code, yet trust whatever reaches our ears? I recently got the opportunity to do this for a project that's being rewritten. The information you gather from this is gold.
The reason thousands of people have used it successfully is because most people are not developers, aren't lazy and willingly accept having to do something inefficiently over and over again. If there is some flaw in the way the system works then people will adapt and go around the mountain. They won't even think about this and, if it's so automatic that it's buried in their unconscious, they may never even think to tell you.
The other reason people have used it successfully is because they assumed that the broken output was correct. If it's not tested, how do you know that your report generator is spitting out the correct figures?
Also, kind of related, just as one of the benefits of unit testing is that you write well structured and testable code, one of the benefits of functional testing is that you think about the interface. I'd consider any inefficient interface to be a bug.
Do you write lots of security fuzz unit tests? Cause I don't. I write tests that hit the edge cases I can think of, but invariably, I can't think of everything. I really don't think the security argument has much to do with unit testing.
A boy turns up to school half an hour late, out of breath as he runs into the schoolyard pushing his bike. "Sorry miss, it took me 45 minutes to run here," he explains to the schoolteacher. "Why are you pushing your bike? Why didn't you cycle?" the schoolteacher asks. "I was already late when I left the house, I didn't have time to get on my bike."
As mentioned elsewhere, turning testing and code coverage into a dogmatic religion is obviously a bad idea, but when I talk to people about testing it seems that we err hugely on the side of not testing. When you don't think something is testable, it is usually because you didn't design it to be testable from the outset. There is definitely no easier programming guide I have found than a little light that goes green when I have done the right thing. If I have tested code properly it is an order of magnitude less likely to take on technical debt, and huge sweeping refactorings are no longer big scary tasks.
The reason for lack of tests I see most often is that the setup cost (before writing any tests) can be high. This is especially true for inexperienced developers, and for people using new technology stacks.
Setting up unit tests is (in my experience) considerably cheaper than setting up integration tests. And it's cheaper to set up a test framework for a project when you've already done so for a similar project, but the time spent learning the required techniques might've put management off allowing time for testing.
For example, consider writing an Eclipse plugin. It's very easy to get a plugin started; there's a template to get you going. But there's no template for a project that will run UI tests with SWTBot. Instead, you have an uphill battle. Create a new project; find the magic incantation to run such tests; realise it's not quite applicable to your project; set up a headless X server on the CI server; learn the test API. It's hard to explain how long all these small things take to someone who's never tried it.
The effort is certainly worth it, but it can be very difficult to justify spending the time on familiarisation and integration of a test framework. It depends on the company, of course. It's good to work somewhere where testing is considered part of software development. Others are more short-sighted.
I think lack of testing is part of a more general problem of not wanting to invest in improving skills and processes, because it doesn't align with short-term goals.
Completely agree. People tend to dismiss testing rather than balance the depth of testing that they do. 100% code coverage doesn't mean you've tested every conceivable combination of parameters to a method.
One of the biggest benefits of testing in my mind is improving the design of code. If you have code that is very difficult to test, there is likely something wrong in your design.
Any way you cut it, you need to become knowledgeable about testing to be able to apply it effectively.
At work, for one small subsection of the project I'm on, I wrote the regression test. I'm testing a "program" that consists of 56 processes (what I'm testing) across three machines (and requires around four other processes across two machines to stub out some services we require but aren't technically part of what I'm testing). It can take up to half an hour to set up (one of the reasons it's not fully automated is that the third party network stack we rely upon will shut down if there are too many errors) and it takes around four hours to run (except for two test cases that require manual intervention to run properly).
And that's just for the back-end processing (nearly 300k lines of C/C++ code). Unit testing? Okay, for large values of "unit", and most of the "units" being tested require almost as much set up as the entire "program".
Is something wrong with the design? Given the constraints and how the project evolved, I can't see it being any simpler. And I'm somewhat overwhelmed with the thought of testing the frontend (which requires Android phones).
My metric here is always finding the most valuable way to use my time long term. In the short term, test automation always seems wasteful. But in the long term, it's great. Solid product, little debugging, and minimal manual QA.
You're in a situation with a lot of legacy code. Testing shapes design, but it sounds like you're trying to retrofit testability onto an existing mess. People cut corners for years, and now it's your problem. That sucks.
In your shoes I'd either start improving it or find a new job. I think life's too short to spend my time doing something a computer could and should be doing.
Heh ... it's actually a new project. Yes, the majority of the code is third party software. And while we do have a bit of "legacy code" in the project (in the form of the third party proprietary network stack that literally is in pure maintenance mode) we're mostly working in a legacy system (telephony network) that requires very high degrees of redundancy (hence the number of processes on the number of machines).
And for the most part, I was able to get the regression test for the backend process(es) to run unattended once started (thankfully---I (along with two others) did it once manually and it was horrible). I have no idea how to do that for the frontend Android cell phone client. Sure, we can run tests on an emulator, but there are issues with the Android emulator (it exhibits different buggy behavior than the physical hardware) so that only gets you so far. It's an interesting (if somewhat overwhelming) problem.
Personally, most of my testing code is unit tests. I try to isolate my code from third-party services in a variety of ways. I do have some integration tests that check that it all works together, but those are never as useful or as maintainable as I'd like.
I'm not sure. Yes, we can (to some degree) independently test parts, but like I said, each part requires a significant portion of the environment to be up (or simulated). And "unit" testing (that is, testing an individual routine or module) doesn't really make sense given how the code is written (receive a message via SS7---the network stack I mentioned---and convert it to an IP based message). To test the portion that talks to the telephony network requires a telephony network (very hard to mock out---lord knows I would love to) and another major unit we wrote (which is another part I test) to even be testable.
And to test that other part? Well, it requires I mock out the previous unit (or run it), plus three other parts (one including a cell phone---which is really a simple script at this point). And again, it doesn't really make sense to test individual routines because this part takes the translated IP packets from the SS7 module and makes several queries to other IP based services. So a lot of what's going on is just simple translations (in a multithreaded/multiprocessor environment---more fun!).
I went to an Agile class where the lecturer compared unit tests to double-entry bookkeeping. An accountant doesn't say "oh, I don't need to add up both columns here, I know it's just trivial addition".
Once I got in the habit of writing tests of even the most simple transformations, the code complexity and my test complexity grew at the same rate, so it's much harder to end up with a giant untestable mass.
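In that spirit, even a "trivial addition" style transformation gets a couple of asserts. A made-up, pytest-style example with a double-entry flavour:

    # The books balance when total debits equal total credits -- trivial, but tested.
    def balance(entries):
        # entries is a list of (debit, credit) pairs
        return sum(d for d, _ in entries) - sum(c for _, c in entries)

    def test_books_balance():
        assert balance([(100, 0), (0, 60), (0, 40)]) == 0

    def test_unmatched_entry_shows_up():
        assert balance([(100, 0)]) == 100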
I once spent over a month tracking down a bug (in a different project than the one I mentioned above) that I have a hard time seeing how unit testing would have caught. The program: a simple process (no threads, no multiprocessing) that would, depending on which system it ran on, crash with a seg fault. The resulting core files were useless as each crash was in a different location.
It turned out I was calling a non-re-entrant function (indirectly) in a signal handler (so technically it was multithreaded) and the crash really depended on one function being interrupted at just the right location by the right signal. That's why it took a month of staring at the code before I found the issue. Individually, every function worked exactly as designed. Together, they all worked except in one odd-ball edge case that varied from system to system (on my development system, the program could run for days before a crash; on the production system it would crash after a few hours). The fix was straightforward once the bug was found, but finding it was a pain.
So please, I would love to know how unit tests would have helped find that bug. Yes, it is possible to write code to hopefully trigger that situation (run the process---run another process that continuously sends signals the program handles) but how long do I run the test for? How do I know it passed?
No, unit testing doesn't tell you if your constructs aren't safely composable. So: it will pretty much never find a threading bug, a concurrency bug, a reentrancy bug, etc.
I only know three ways to detect this sort of bug, and they all suck: 1) get smart people to stare at all of your code, 2) brute force as many combinations as possible, or 3) move the problem into the type system of your language so you can do static analysis of the code.
I don't think even the most hardcore TDD zealots would come anywhere close to claiming that testing is a silver bullet. There will always be cases where you didn't think of a particular edge case, or when some environment-based issue makes covering something in a test impossible. That doesn't negate its benefits in preventing the 99% of bugs that aren't an insanely rare edge case.
I don't think you should expect every bug to be caught by unit testing. But where it helps with a problem like that is eliminating a lot of other possible causes of bugs. Debugging something like this is often a needle-in-a-haystack problem, but it's nice if you can rule out most of the hay from the beginning.
In this case, once I discovered the cause of the bug I would have written a unit test that exposed it, probably a very focused one. Then I would have gone hunting for other missed opportunities to test for this, and I imagine my team would have come up with some sort of general rule for testing signal handlers.
Heh, I wonder if we have the same SS7 stack. Your description sounds disturbingly similar to our experience. Does your stack also have a wait of several minutes before reporting that starting it up went ok? (To the logfile, of course. It is too good to actually report that to the console or to have service scripts that can be trusted.) That wait is very popular with our testers.
Oh well, at least in our case our signals originate in IP and we only have to check against an HLR, which does have a semi-decent mockup.
From what I understand, there are only two commercially available SS7 stacks, and the one we use is the better of the two (which I find a frightening thought). So there's a 50/50 chance. I don't know enough of the stack to start it (or restart it) so I can't say for sure if that's how our stack works.
I don't agree. I obviously don't know the specifics of your project and I certainly don't always unit test code either (even though I know better - though I do unit test actual important code, just not my own experimental or prototype code), but your comment sounds to me like you're trying to rationalize not testing your code (or you are frustrated by the amount of third party code that's making it hard to test...). Maybe it would be too expensive to test...
> receive a message via SS7 and convert it to an IP based message. To test the portion that talks to the telephony network requires a telephony network
I worked on an SMS anti spam/fraud system for a few years and we unit tested and simulated everything.
For unit testing we mocked all the network/hardware stuff so that each part of our code could be tested in isolation. I firmly believe that there is no code which cannot be unit tested[1], though obviously some code is easier to unit test than other code.
For more end-to-end simulation, we wrote a test suite that would simulate the SS7 network and allow us to test our system under all kinds of message flows - testing not just that the system worked for each variant of the message flows, but also stress testing and performance testing our system. It worked with raw SS7 messages received from a number of commercial gateways and also with SIGTRAN messages (which are almost the same thing anyway). This worked pretty well for us.
> just simple translations
That should be the easiest type of code to test! Pure functional translation is ideal for testing: if I put in X, I expect to get Y back (for a bunch of X/Y pairs).
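For what it's worth, that X/Y-pair style can be little more than a table of cases fed to the pure translation function. A sketch in pytest (the field names are invented for illustration, not the real SS7 mapping):

    import pytest

    # Pure translation: SS7-ish message dict in, IP-ish message dict out.
    def translate(ss7_msg):
        return {"dest": ss7_msg["called_party"],
                "src": ss7_msg["calling_party"],
                "payload": ss7_msg["user_data"]}

    CASES = [
        ({"called_party": "123", "calling_party": "456", "user_data": b"hi"},
         {"dest": "123", "src": "456", "payload": b"hi"}),
        ({"called_party": "", "calling_party": "456", "user_data": b""},
         {"dest": "", "src": "456", "payload": b""}),
    ]

    @pytest.mark.parametrize("x,expected", CASES)
    def test_translate(x, expected):
        assert translate(x) == expected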
You mention multiple machines and multithreading - obviously this makes testing pretty damn hard (though unit testing should generally not be too affected), but possibly also more critical since multiprocessing is hard anyway. Anyway, like I said, I don't know your system.
> most of the "units" being tested require almost as much set up as the entire "program"
It sounds to me that the design isn't modular enough (by design or by evolution), or the units are much much too large. Each unit should be fairly simple and reasonably self-contained.
[1] Nowadays I do some embedded systems stuff, which at first I considered really hard to unit test, but changed my mind after reading this book: http://pragprog.com/book/jgade/test-driven-development-for-e... If you can abstract away microcontrollers and other hardware for the purpose of testing in an embedded scenario, you can abstract pretty much anything away.
One flaw with your analogy. I've found based on direct experience that having to write/update tests is like walking. Not having/updating tests is like riding a bicycle. I get to my destination faster and with less work.
Let me just state a fact: every programmer tests code. Whether you're checking a command line output, experimenting in the REPL or reloading a browser, you're testing your code.
What rubs me the wrong way is that, instead of a simple "you know all that ad hoc testing that you do? There's a way to automate that that'll probably save you some time and let you test the same things, in an automated fashion, with the press of a key...", non-testers usually get a condescending "oooh my sweet summer child, what do you know of code?"
This is exactly the problem that I see with the testing/TDD culture. Automated tests are wonderful because they save me time, not because they are the singular Holy Magic Grail Bullet of software development.
My software tends to be obsessively modular as it is, because I’m lazy. I can hold a complex tangled system in my head. It just sucks. So even when I only test manually, I am reasonably assured that nothing’s broken. But there’s no need to do manually what your computer can do automatically.
This is a great point and one I often try to make (with differing levels of success). I believe Brooks says you're going to spend 50% of your development time testing (25% unit, 25% system). You're going to spend that amount of time in one chunk or you're going to amortize it (and probably make the total time spent testing longer).
I'm afraid the statement that "every programmer tests code" is not a fact. Not by a long shot.
In fact the vast majority of bugs are the result of not testing changes at all, in any way, shape or form. Committing code changes without even running them (or only running the most simple and predictable of scenarios) is not an exception.
Many programmers, especially those that don't like writing tests, simply assume it works "because it was simple". It's this utterly unrealistic and unprofessional hubris that gets condescending reactions.
Given the damage it does to both the product at hand and our profession in general, I would say condescension is a rather mild response to this behaviour.
> In fact the vast majority of bugs are the result of not testing changes at all
I never said every programmer tests 100% of their code all of the time. Even when you're just checking the output of a "hello world", you're effectively testing your code. It might be the only test ever made, but it is a test.
> In fact the vast majority of bugs are the result of not testing changes at all
Now you're making a bold claim that, AFAIK, isn't backed up by research.
Just to be clear, I like tests, and I like automated tests even more. I just don't think they necessarily merit 50% of my own time because they're not some panacea that will magically auto correct badly designed software. Test are written to find bugs, not to eliminate them altogether; you can't guarantee your code is bug free because of tests.
At one of my past jobs, the threshold for "should I commit this?" was "does it compile?". The rationale was that if there were problems with it, they'd be caught during "acceptance testing" (which was probably 6 months down the road). I didn't stick around for very long...
I've seen PHP committed and deployed to production with syntax errors, which means nobody ever tried the offending pages even once. I also left that shop pretty quickly, because I don't think options keep vesting if I garrote somebody with a network cable.
And I think that is warranted. It's unprofessional and sloppy to not write automated tests, and developers who are unprofessional should be called out on it.
The problem is, in my experience, I've tried your first line and it just doesn't register (for whatever reason). So, I use the second now.
Hey, I have a question for all you TDD fans. In my (still short) programming career, I have only stumbled across situations where automatic testing looks impossible to do in a sensible way - deploying a patchwork of code against huge platforms like SharePoint, or working with APIs like COM that don't lend themselves very well to testing, and where the code is "interface" heavy rather than "logic" heavy. Recently I've been looking into iOS development.
I also get the impression that most IT work today mainly involves working with huge libraries and APIs, and that automatic testing therefore is hard to implement in a sensible manner. I very much get the idea of automatic tests (and I've wished for them quite a few times, where they've been very hard to implement because the API seems to get in the way). But are there really so many applications for them unless you are building a huge, monolithic application where everything is defined in advance? Seems to me, like a lot of people have pointed out, that it will slow you down when prototyping.
This is not a criticism of TDD/automatic testing, but I just don't see how to create good tests when most of your code is just glue between different libraries and most of your time is spent reading documentation and chasing bugs in your library. Would be really cool if someone could point me to an overview of these things. Am I just in the wrong organization?
You're asking for integration tests, and yes those are hard to do, esp when the integration is against a large deployment of third-party or existing API code like you describe.
What you want to do in that case is isolate the "glue code" if you can and test its assumptions in isolation. Wrap the API dependencies in an interface and inject mock objects to play the role of that API. This is really what mocks do well.
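A rough sketch of that wrap-and-inject idea; all the class and method names here are invented, and the real SharePoint/COM surface would sit behind the wrapper:

    # The glue code depends on a thin interface, not the real API, so a fake
    # can stand in for the API in tests.
    class DocumentStore:                 # thin wrapper around the real API
        def fetch(self, doc_id):
            raise NotImplementedError

    class GlueCode:
        def __init__(self, store):
            self.store = store

        def title_of(self, doc_id):
            doc = self.store.fetch(doc_id)
            return doc.get("title", "(untitled)")

    class FakeStore(DocumentStore):      # used only by the tests
        def __init__(self, docs):
            self.docs = docs
        def fetch(self, doc_id):
            return self.docs[doc_id]

    def test_title_falls_back_when_missing():
        glue = GlueCode(FakeStore({1: {}}))
        assert glue.title_of(1) == "(untitled)"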
If your code is also bootstrapped by some special plug-in hook that is hard to emulate in a test environment, like a MS SharePoint or Dynamics thing, then you should isolate the code in question from the Class that implements that hook, so that a test can boot up that code just like the plugin would. Interfaces are probably a good option on this end as well.
So, you often can't test your production code in an integration environment exhaustively, but that's OK in most cases because 1) a truly exhaustive integration test is probably a combinatorial problem and not realistic anyway, and 2) you'll just be proving that your third-party API works as guaranteed, which is probably not your highest risk and not worth the trouble.
Your real concern is to test the assumptions of new code and also create the TDD discipline around that code which tends to make for better code.
At Circle, we write three kinds of tests for this sort of thing:
1) "Are we using the API right?". So we have a test that logs a test user into github and reads their API info, for example. This sort of test shows that we have integrated the API correctly. If it passes, we can assume the rest of the API works, because we are using it correctly.
Obviously, we can't truly know this, so it depends a little on the quality of the API - kinda like trusting your compiler or OS. But the APIs we use - EC2 and Github, are largely bulletproof so long as the service isn't experiencing failures.
2) Stubbing out the API code and checking our logic. For example, we need to test that the code which manages how many builds we run simultaneously works, but we don't want to run builds, pull from github, etc. So we make the function calls return fake values, and test the logic (see the sketch after this list).
3) Integration tests: Run the full code, with no mocks, no stubs, across an entire "process": do an entire build from webhook to UI, including starting up machines for it; or maybe selenium tests that the OAuth login works.
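Roughly what the stubbing in point 2 can look like, sketched with unittest.mock (the scheduler, its limit, and the function names are invented, not Circle's actual code):

    from unittest import mock

    MAX_CONCURRENT = 2

    def running_builds():
        # in production this talks to the real build machines; never in tests
        raise RuntimeError("not stubbed")

    def can_start_build():
        return len(running_builds()) < MAX_CONCURRENT

    def test_respects_concurrency_limit():
        with mock.patch(__name__ + ".running_builds", return_value=["b1", "b2"]):
            assert can_start_build() is False
        with mock.patch(__name__ + ".running_builds", return_value=["b1"]):
            assert can_start_build() is True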
I visualize tests as a graph: integration tests and API tests provide thin edges between strongly connected components of unit testing.
I like to think in these scenarios, you write your tests for your code as:
given input foo, my code calls library bar and expects a return of baz
You don't need to test the APIs or the libraries. You can mock the return and stub the original method call (if you need to test in isolation -- think APIs over the network).
I think going back into some production code you wrote years ago that doesn't have tests and putting them in is a really nice way to self-reflect. It would provide at least two benefits: 1) improving the code, and 2) showing you just how far you've come since writing the code, which in turn shows you how far you can keep going.
> For a lark I'm going to take an old project that has been running for years without a hitch and I'm going to retro-actively add tests to the code.
Be prepared to possibly rewrite most, if not all, of your code. I have found, at least in my old code -- which admittedly was written well before I became familiar with the concept and discipline of testing -- that my code is just not testable. At all.
It is very tightly coupled, and it is almost impossible to write any sort of meaningful tests against it.
But maybe you were a better designer than I was, and if so, you may be able to fairly easily add tests to your code.
If your code has been "running without a hitch" and is "an old project" (which I assume means you are no longer actively updating it), why are you adding tests?
Out of curiosity, what's the ROI that you see in adding tests to that project?
For some reason clients tend to get upset when things that were working suddenly break (they don't even care if it is a legacy code base!).
Joking aside, in order to effectively prevent that, a test suite must be in place to catch regressions when you change or add code. If there isn't one, you need to decide between cranky clients or the nastiness of adding tests to legacy code.
> For some reason clients tend to get upset when things that were working suddenly break (they don't even care if it is a legacy code base!).
Another sad truth to consider is that just the process of refactoring your code to add tests can just as easily break functionality. I've learned this the hard way.
> (which I assume means you are no longer actively updating it)
I believe __abc was basing his comment off the assumption that the code wasn't being actively developed, which significantly reduces the ROI of adding tests to an existing project.
Sounds like it's more back to the OP's response below, that it's an exercise in learning.
I think the author is looking at it as more of an experiment than a valuable project with a positive ROI. As jader201 said he will probably discover that it's a pretty tough thing to do when the code wasn't written with testing in mind. I also believe he will find issues exposing parts of his code to be tested.
There's also the Woz method of thinking about the problem so intently that you can simulate the whole system in your head without even looking at any code. Being unable to write your program bug-free would be like not being able to recognize your mother.
In one project, I refactor my code late, late at night. I do it in almost a dream state; it's a process of nearly pure symbolic manipulation, involving none of the complex mental model that we're used to needing to maintain while programming. I've been doing this a few times a month for a year, and have introduced one known bug. I have a very modest test suite.
I'm no programming god, I'm just writing in haskell. Referential transparency, purity, and strong type checking for the win.
I've taken a long, meandering road to appreciating TDD/BDD. When I started programming, I looked up to _why and his hacking approach to coding. Sadly, I could not express his brilliance and my code was not just untested and sloppy, but fragile and inundated with smells. As the scale of the projects I develop increases, I've learned to use testing to decrease the potential breakage and to better understand the libraries and features I'm working on. Of course there is an exploratory spike here and there, with tests coming in later to glue it all together, but those are now exceptions to my normal practice. When debugging legacy applications, simply creating test coverage for problem areas goes a long way in solidifying the patches. Testing is not a fail-proof elixir, but it certainly improves my workflow and my product, and those results are hard to argue with.
Related: I am a big fan of testing in my personal projects, but have never written a test for a client because they don't want to pay the additional hours. I have quoted about 25%-30% of the time for writing tests, if they want them. Am I wrong? Ditto on documentation; no biters.
If you read Pragmatic Unit Testing[1], it talks about how writing tests actually takes less time than building a project without tests, in the long run.
First of all, it's possible to actually ship a product that was built with tests quicker than one that was not built with tests. This may not always be the case, but adding tests doesn't necessarily mean that it will add time overall. It may feel like it's quicker to build an app without tests, but the testing and bug fixing that happens at the end often exceeds the time it would have taken to build tests and eliminate most of that late testing/bug fixing.
Second, bypassing testing rarely saves time in the long run, especially for apps that continually require maintenance to existing code. Regression almost always occurs, and sometimes this isn't caught until production.
Unfortunately, it's a hard sell to clients, and depending on how good you are at covering this cost up, it often goes unnoticed.
While I agree with you, it's a very tough sale if you're consulting for a company without a strong programming department. I find that companies plan for best-case scenario and deal with the consequences thereafter. Bugs are usually considered a programming mistake, even while acknowledging that poor planning plays a part.
I think it makes me faster long term, so yes I think you're wrong. I now think charging separately for tests is like charging extra for human-readable variable names or a well-factored code base.
If you're talking about writing tests after you've completed the implementation, then I wouldn't pay for those either. :)
Test-first development is a practice that helps you design software, and ultimately (arguably) reduces the number of defects. It's tied to development, so I wouldn't quote that separately.
If they don't want to pay for documentation, maybe they end up paying for a lot of support calls....
Your customer isn't paying for the code you write, they are paying for a finished product that works. How much code you write should be irrelevant to them.
In fact, one could argue it is far more dishonest to quote without testing since that does not include the maintenance costs that will be incurred by them after the initial product delivery, which will almost certainly be higher than a project with automated tests.
I completely disagree; you're quoting them the number of hours it will take you to complete their project with the understanding that it will be at a certain level of quality. As long as you're both on the same page as to the expected quality, how you get to that point is your problem, not the client's. When you bill hours, you do include any time spent sketching a design or debugging a problem, right? How is ensuring quality in a different fashion and including that time in your quote dishonest?
I would argue it's the same as trying to quote separately for debugging. You could take the approach that the unit tests are part of the implementation, but integration tests can be billed separately.
Someone needs to spell out the difference between "tests" vs. "testing", might as well be me. Tests are something you, a developer, or a test engineer writes. Really important. Testing is something that the dev, a test engineer, or someone off the street can do. That is the most important part towards making working software. Unit tests are important, but are in a vacuum. The only way you can look at your code in context is to have someone use it.
Lots of games (including ones I worked on) ship with hundreds of thousands or millions of lines of code and practically no unit tests (if any). According to this article, they shouldn't be working at all, let alone making billions of dollars. Do I think it's a best practice? No. But on those projects we had an army of QA to test builds and producers obsessed with their features who were always in there making sure that things worked. The most effective testing was "everyone play the game day" (or weekend). Only then can you find edge cases that unit tests can't. Dogfooding is another important take on this concept.
tl;dr: unit tests are important in their own right, but even 100% code coverage can't tell you if the thing as a whole works as the user expects. That's "testing" as opposed to "tests".
Someone in the test industry here. I had a stint in the games industry, as well.
An army of QA is... problematic. Inevitably it turns into a death march, which is a huge waste. Worse, developers feel much less inclined to own the quality of their code (consciously or unconsciously) because "QA will catch it." I suspect that even if you wanted to be more rigorous, the incentives are against you.
Fixing bugs filed by QA is also expensive, relative to fixing the code before it's checked in. By the time a bug makes it to QA, it's a ton of patches later, the developer is working on another feature, and it's not at all obvious which patch introduced the bug. The most expedient strategy of "revert the culprit" is difficult if not impossible, and checking in new code for a fix introduces further risk.
There's certainly something to be said for expert/exploratory testing. Hell, it puts food on my table. :) But as a tester, I'm far less inclined to work on a product where developers aren't concerned about code correctness or quality even at a micro level. It says to me that they don't value my time.
Then again, I suppose in the games industry it matters less what QA thinks, given that the people actually doing the testing are often temporary/contract employees.
> Lots of games (including ones I worked on) ship with hundreds of thousands or millions of lines of code and practically no unit tests (if any).
Lots of games are buggy as hell as well. In the past year, I gave up on a number of AAA games because they were simply too buggy and felt unfinished to me. Needless to say, I won't be buying sequels and will think very hard before buying from the same developers again.
Completely agree. Even though I am quite new to coding, I was shocked to see how much more time I spend on debugging a piece of code than actually writing it. The ratio is closer to 3:1 and sometimes even higher. Writing tests has brought that down to roughly 1:1 (varies from project to project).
I still don't do as extensive testing as I want to (primarily because I am lazy) but I have seen the shift in the way I think about solving a problem. Writing tests forces you to assign structure to your code (in my case I put it down on paper). It helps you think in terms of "pipes" as in what is going in and what comes out.
But, I think a lot also depends on the nature of the project. Parsers and frameworks may need thorough tests while simple apps may do without many. And for some people, it may be too much of an overhead at times.
As a long time skeptic, but a new convert to TDD, it also occurred to me that I was also reaping the benefits of immediate feedback after each build analogous to Bret Victor's theme in his talk and demo at CUSEC. It definitely increased my iteration speed.
As a mostly self taught programmer who has been ignoring tests for too long, can anyone provide links to any books/primers/resources that can introduce me to writing tests and doing so effectively?
Anecdote: I rewrote an operating system (back when people did things like that) to repackage it as a library of modules (kernel, drivers, services) that self-configured on each boot.
It took 19(!) tries to get past the 1st line of code in the entry point module.
There is no such thing as a trivial change (tho that was sure not trivial). My mantra is: "If you haven't tried it, it doesn't work." We all know that, deep down. We tell stories over dinner of the time something worked on the first try. Why? Because that hardly ever happens.
Congrats on discovering testing! Getting "test-infected" yielded easily the biggest productivity gain in my career.
However, you've got another big win waiting for you if you'll try something subtly different: write the tests first. Moving to TDD was another large improvement in my overall productivity because it drastically improved the quality of the code I write.
It seemed silly when I first heard of it, but now I won't write code any other way (except for short, exploratory programs). Give it a try!
Wow, this guy guarantees that if it isn't tested, it doesn't work. Important projects that people depend on like the linux kernel must have really good test coverage.
Jacques is stating the obvious here. I see testing code and running code in an organic relationship, like the flesh and the shell of a lobster; I wrote more about it here: http://www.douban.com/note/205412385/
I find that my tests often need tests, and sometimes even these second-order tests need to be checked and verified by third-order tests. In experimental programming though, just running the code is a test...
I don't think it was a good read because I already thoroughly agree with you.
In fact, as with a lot of things about programming, even when programmers are widely accustomed to that line of thought, the real problem will be to convince management.