Slow database test fallacy (heinemeierhansson.com)
69 points by vitosartori on April 30, 2014 | 75 comments



DHH either is being disingenuous, or badly misunderstands unit testing.

He opens with this:

> The classical definition of a unit test in TDD lore is one that doesn't touch the database. Or any other external interface, like the file system. The justification is largely one of speed. Connecting to external services like that would be too slow to get the feedback cycle you need.

No, "unit tests" in TDD -- and long before, TDD didn't change anything about the definition -- are tests that, to the extent practical, test all and only the functionality of the specific unit under test, hence the name. That's the reason why external interactions are minimized in proper unit tests (whether or not TDD is being practiced). TDD observes that such tests are generally fast, and builds the red-green-refactor cycle around that fact, but speed isn't the justification for the isolation, isolation of the functionality being tested from other functionality is the point of unit testing (which is designed not only to identify errors, but to pinpoint them.)


He also seems to misunderstand TDD - which is more about design than testing. Unit testing and TDD are not synonymous. You can, and I certainly do, test-drive code over traditional unit test boundaries.

He also seems completely unaware that there is an entire school of TDD/BDD that doesn't really like mock objects (the Chicago vs London school http://programmers.stackexchange.com/questions/123627/what-a...).

I've just generally stopped listening to what he says on the topic. It doesn't seem to match the reality of what folk actually do.


> He also seems to misunderstand TDD - which is more about design than testing. Unit testing and TDD are not synonymous.

Actually I think it's more his critics who misunderstand this—he argues against TDD, and he doesn't think unit testing is sufficient, but he doesn't argue against testing, or unit testing.


> he argues against TDD, and he doesn't think unit testing is sufficient

But what he describes as "TDD" to argue against it isn't TDD (either in his description of its substance or his description of its rationale), and what he describes as "unit testing" to argue that it isn't sufficient isn't unit testing (again, either in substance or rationale.)

> but he doesn't argue against testing, or unit testing.

He specifically argues that unit testing should be deemphasized if not outright eliminated, and that the reason for this is the elimination of test-first as a design practice. So, yes, he does argue against unit testing (of course, the argument is nonsense, since unit testing was an important practice before test-first practices existed, and test-first practices are independent of the kind of testing -- sure, TDD emphasizes unit tests first, but ATDD/BDD focus on acceptance tests first; moving toward or away from test-first as a practice is completely orthogonal to the degree of focus on unit testing vs. other kinds of testing).


You argue that the database should be thought of as a separate unit, and that unit tests should minimize interactions outside their unit. But this argument proves too much; it would also argue that unit tests should avoid using the memory allocator.


Your parent said "to the extent practical". Limiting interaction with external units as fundamental as the memory allocator is far from practical. I actually think DHH would agree with the "to the extent practical", but would think limiting DB access in tests of fundamentally DB-dependent things like ActiveRecord models is impractical. I think the debate is fundamentally about that practicality, and neither side is obviously right.


I think it's impractical to test things that use ActiveRecord without testing the database.

ActiveRecord at its core is meant to generate and run SQL and give you object graphs back, and you need to test that the SQL it generates is correct and does the right thing.
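
For what it's worth, a minimal sketch of that kind of test (assuming a hypothetical Person model with an `adults` scope; ActiveRecord and Minitest as in a stock Rails app):

    # test/models/person_test.rb
    require "test_helper"

    class PersonTest < ActiveSupport::TestCase
      test "adults scope only returns people 18 or older" do
        Person.create!(name: "Kid", age: 12)
        Person.create!(name: "Grownup", age: 30)

        # This round-trips through the real database, so the SQL the
        # scope generates is what's actually being verified here.
        assert_equal ["Grownup"], Person.adults.pluck(:name)
      end
    end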


> I think it's impractical to test things that use ActiveRecord without testing the database.

I think one of the fundamental, if not really directly addressed, divides between the sides here is over the question of whether domain models "using" (being tightly coupled to) the persistence layer is sensible architecture even outside of consequences for testing.


I hope my comment that you replied to doesn't make it sound like I disagree with you. I don't. But I do think it is reasonable for other people to disagree with us on that point. But I also agree with dragonwriter that perhaps a more important question is the practicality of having fewer things that "use ActiveRecord", such that you can then test those things without testing the database. I think it's a pretty good idea, but I've had only limited success doing it in practice.


Thank you for this clarification. I don't know who came up with the idea that unit tests must not hit the database (it wasn't DHH). The result was that many 'mocked' unit tests merely tested their own mocks and stubs. I work on a server-side, db-centric code base, and many (including the most important) of my unit tests involve database round trips. Speed of execution isn't a real problem in this case.


Yes, in fact you can use TDD to test the database itself: http://dbfit.github.io/dbfit/index.html


Am I the only one that doesn't find 'All tests in 4 minutes, all model tests in 80 seconds' very impressive? It sounds like a really long time to me.

You know what could increase the speed dramatically.... decoupling.

I also think decoupling phrased in the context of the Rails 2 to Rails 3 upgrade, where pretty much everything changed, makes perfect sense. Imagine just having a few wrapper classes that spoke to Rails and only having to adapt them. Sounds good to me!

Bernhardt: Boundaries http://www.confreaks.com/videos/1314-rubyconf2012-boundaries

Weirich: Decoupling from Rails http://www.youtube.com/watch?v=tg5RFeSfBM4

Wynne: Hexagonal Rails http://www.youtube.com/watch?v=CGN4RFkhH2M


You know what has a high chance of introducing a crap ton more bugs? Unnecessary code. :)

Also, did you read the rest of the post? Typical test cycles are much faster than that. But a full test of the entire system takes some time. As it should.

This is also why the test suite for git takes a long time, but has a very good signal to noise ratio.


> You know what has a high chance of introducing a crap ton more bugs? Unnecessary code. :)

If only that "unnecessary code" had tests... Oh wait.


At some point, you will find you have bugs in the tests. It isn't like that code is magically immune to mistakes.

Not to mention you have increased the workload of everyone just to mitigate the impact of additional code. Let me know how that works out for you. :)


Slightly more complex code with tests beats code with no tests every time, and it works just fine. Having 1 more class or function call is not exactly what makes an impact on workload. If you are optimizing your application by reducing function calls you are doing it wrong. But tell me, how does not having tests work out for you? Can you modify old applications, or applications that you didn't develop alone, with a high degree of certainty that you didn't break things in the least expected places?


At no point was it declared there were no tests. So... different topic?

Seriously, straw men notwithstanding, this thread is specifically about tested code where a complaint was raised about how long the tests take. And the "how long" was only 4 friggin minutes for a full suite, or 4 seconds for a module.


Five minutes of tests really isn't such a big deal. Not long enough to bother re-architecting a bunch of stuff. But what frustrates me is that DHH seems to bullheadedly continue believing that problems they haven't had with Basecamp simply aren't real problems. Basecamp isn't all that big an app, so it isn't particularly surprising that the tests run that quickly.

But once you get up into the high tens of minutes for unit tests, and the low hours for a full suite including end-to-end tests, it becomes really, really annoying. Changing the architecture isn't going to help much or at all with those end-to-end tests, but it's easy to find yourself thinking "boy, I wish we could at least get a bit of confidence in the correct functioning of our application in the face of this little change in less than half an hour - I wonder how we could accomplish that?"

The three common answers once you find yourself wondering that are "let's parallelize our tests in the cloud!", "let's reduce external coupling in our tests!", and "let's make small services so any given change affects only a few tests!". All of those answers have fairly tough trade-offs.


> Am I the only one that doesn't find 'All tests in 4 minutes, all model tests in 80 seconds' very impressive? It sounds like a really long time to me.

I don't really mind if a given branch is marked red on CircleCI. master must always be green, but a branch, not so much.

Say I change a model. I'll probably run the unit test for that model (4 seconds), then just push up the branch and forget about it. Nine times out of ten, I didn't break any other tests and everything's fine. Once in a while, I did, and 3-4 minutes later Circle via HipChat lets me know I need to check it again.

It's a bit meta, but this process is just like actual Rails apps—when the task can happen asynchronously in the background, it usually doesn't matter if it takes a couple minutes.


In a 30 KLOC Rails project I find those numbers acceptable. Also, you are unlikely to run the full suite after every editor save. Even Jim Weirich runs a single test file and not the full suite.


It's not supposed to be impressive; it's supposed to be quick enough to provide a reasonable feedback cycle, and with the ability to run specific tests, that cycle is drastically reduced. You make a few changes and run the relevant tests (they can run automatically via watching); tests pass, you are happy. If it's an intrusive change you might want to run the full suite before pushing. Then, when you push, CI runs a far more exhaustive set of tests (possibly across platforms); these can happily take a lot more time.

It differs very much across projects: the full PouchDB test suite takes ~4 minutes, the full Firefox test suite takes hours. For Rails, ~4 minutes sounds entirely reasonable (kinda faster than I would expect) and wouldn't give me any reason to think about reducing the coverage and reliability of the tests by decoupling anything.


It kinda feels like you didn't read the whole way through.

He mentions that per-model runs, which is what you're doing during dev, take 4 seconds.


Yes, it is too slow. After having taken the time to build a decoupled Rails application, I can tell you that having tests run in under 2 seconds has been a great help in refactoring and allowing us to quickly add new features. I can't imagine how we'd add new features without it.


"Am I only the only one that doesn't find 'All tests in 4 minutes, all model tests in 80 seconds' very impressive? It sounds like a really long time to me."

No.


While reading this, I couldn't help but think of Alan Kay's biting assertion about the pop culture of programming.

I'm not interested in pop culture; I'm interested in being a better developer, and that requires a highly critical process of evaluating my practice. It's not enough if something works once, I want to know why it was effective there, and when I can use it. I want to try practices like TDD just to see how they affect the design, and then decide if I like that force. I'll use hexagonal architecture on side projects just to see how it helps, and if it's worthwhile. In short, I want to continue to study the art of software development rather than trusting emotion-laden blog posts with something as serious as my skill.

I don't believe Rails is so special it warrants revisiting all of the lessons from the past we've learned about modularity, small interfaces, and abstraction. It's just a framework.


Rails isn't special. The debate is about the price of decoupling.

Many codebases don't heavily decouple from their frameworks unless they have a good reason to do so, as they lose the productivity benefits of the framework in the process. The framework you choose to tightly couple against should be your flex point -- you don't have to design your own!

The level of tradeoff depends on the framework in question. I can recall a moderate-sized project where we bound against Hibernate ORM for a couple of years and eventually had to switch to MyBatis for a variety of reasons. But since we were mostly using JPA annotations, the coupling wasn't so tight as to make the switch all that hard or brittle.

There are times where hexagonal architecture makes total sense (immature frameworks, shifting dependencies, etc.), and times where it doesn't, at least for certain "ports" (if you're building a moderately complex Rails/AR app, why bother isolating AR?).


DHH and Uncle Bob are arguing past each other at this point.

Uncle Bob is saying that Rails is not your application, your business objects that contain all your logic shouldn't inherit from ActiveRecord::Base because that ties you to a specific version of a specific framework (have fun migrating to a new version of Rails!) and means you have to design and migrate your schema before you can run any tests on your model code. You should be able to test your logic in isolation and then plug it into the framework.

DHH is saying that if you're writing a Rails application, of course Rails is your application. Why waste hours adding layers of indirection that make your code harder to understand, just to make your tests run faster?

Of course if it's just a prototype, who cares? But I really agree with Uncle Bob that tightly coupling your application logic to (a specific version of) Rails/ActiveRecord is a bad idea if you want to make a long-lasting, maintainable application of any non-trivial size.


I'm working with code bases that have passed the decade mark now. They're not of trivial size. They're still eminently maintainable. They are proudly Rails Applications.


> your business objects that contain all your logic shouldn't inherit from ActiveRecord::Base because that ties you to a specific version of a specific framework

Any time you introduce an abstraction layer to decouple some code, you're making a prediction. You're saying, "I think it is likely that I will need to change the code on one side of this interface and don't want to have to touch the other side."

This is exactly like financial speculation. It takes time and effort, and increases your code's complexity, to add that abstraction. The idea is that that investment will pay off later if you do end up making significant changes there. If you don't, though, you end up in the hole.

From that perspective, trying to abstract your application away from your application framework seems like a wasted effort to me. It's very unlikely you'll be swapping out a different framework, and, even if you do, it's a virtual guarantee that will require massive work in your application too.

Sure, it's scary to be coupled to a third-party framework. But the reality is that if you build your app on top of one, that's a fundamental property of your entire program and is very unlikely to change any time soon. Given that, you may as well just accept it and allow the coupling.


I agree that in most cases, it's reasonable (and cost-effective) to assume that your Rails app will always be a Rails app.

However, it's not reasonable to assume that your Rails 3 app will always be a Rails 3 app. You will eventually have to upgrade--if not immediately for feature reasons, then eventually for security reasons. And upgrading a Rails 3 app to Rails 4 is a non-trivial effort; there are a lot of breaking changes, some of which affect the models (e.g. strong parameters, no more attr_accessible). If you skip versions you will just accumulate more and more technical debt.

I think that ideally, you would have your business logic in classes/modules that don't need to have code changes just because the app framework got a version bump.

But generally speaking you're right: the decision of whether or not to put in the up-front work to decouple your business logic from your application framework is like an investment decision, with costs and benefits. Uncle Bob is saying it's always worth it, DHH is saying it's never worth it, but I think the reality is that it's sometimes worth it, depending on you and your project.


And I can guarantee that the vast majority of long-lasting, maintainable applications out there (10+ years) are tightly coupled to a framework somewhere.

There is a mindset out there that all coupling is bad. Uncle Bob's point of view -- that coupling to a specific framework is unwise -- is one long held by OO design purists.

I'd prefer to think of tight coupling as a tradeoff. You are trading productivity today for future migration risk.

If you know your app is a Rails app, and will always be a Rails app, then there's little reason to decouple. The question is: do you really know, or how reliable are your guesses?


> The justification is largely one of speed.

Is it?

I was under the impression that you don't include them because a unit test is testing a very specific piece of code and not the dependencies around it. This is why you'll mock disk/db/network, just like you'll mock other pieces of code.
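
As a rough sketch of what that looks like in practice (hypothetical class and gateway, using Minitest's built-in mocks):

    require "minitest/autorun"
    require "minitest/mock"

    # Hypothetical unit under test: formats a display name, fetching
    # the record through an injected gateway rather than a real DB.
    class NameFormatter
      def initialize(gateway)
        @gateway = gateway
      end

      def display_name(id)
        user = @gateway.find_user(id)
        "#{user[:last]}, #{user[:first]}"
      end
    end

    class NameFormatterTest < Minitest::Test
      def test_display_name_without_touching_a_database
        gateway = Minitest::Mock.new
        # The dependency is stubbed out, so only the formatting logic
        # in this specific unit is exercised.
        gateway.expect(:find_user, { first: "Ada", last: "Lovelace" }, [42])

        assert_equal "Lovelace, Ada", NameFormatter.new(gateway).display_name(42)
        gateway.verify
      end
    end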


If something is broken in code that I'm interfacing with, I'd like for my tests to reflect that. I've never understood the "testing little things in isolation" strategy when it precludes a "test everything all working together" strategy.


The reason is that if you can write a test that specifies your unit in terms of its communication with collaborators, then you have achieved a rational and comprehensible unit of code, and so you can be more sure that your design is sound.

With test-driven design (in an ideal world), if a change to an important module can break its own tests only, that means that it has a shallow and clear interface, decoupling its internals from the rest of the application.

Given that the module's code is exhaustively tested, that means there is a full specification of what it's supposed to do, given the various possible scenarios.

If the tests are well-written, that means the specification is comprehensible, which means the interfaces make sense in the domain's discourse.

Good TDD involves working hard to keep the code base clean and comprehensible. If you're only testing that everything works together, there's a chance that you're also less focused on maintaining good architecture.


I don't think it does preclude that, it's just not the focus of unit tests. Integration tests are also vital.


Yeah, I get that. But it makes me wonder why "unit testing" became such a hot thing. I think it's just because it's simpler than integration testing. But integration testing is much more valuable.


True, it's much simpler. It's also much more interesting when you have a huge codebase and you want to refactor, add a feature, or fix a bug, because you don't need to set up and verify the whole input/output of your application, only that of the component that handles the logic you're interested in.

It's also simpler to use unit tests for verifying all possible inputs and outputs of a component.

But, as others have said, unit tests alone are necessary but not sufficient.


> But it makes me wonder why "unit testing" became such a hot thing. I think it's just because it's simpler than integration testing. But integration testing is much more valuable.

TDD focuses on a particular method of leveraging unit testing to improve code quality and avoid bloat, but it doesn't suggest that integration testing should be abandoned. Integration testing is just outside the scope of the part of the dev process TDD is focused on improving.


Unit testing has the benefit of telling you more specifically which bit of code is broken. That is more valuable when it finds the error. Integration testing will find more errors. As dragonwriter says, though, the focus of TDD is not the fact the tests find errors but the way they shape your development. For that end, it's not clear to me (at all) which is "more valuable".


I've seen integration tests that, because of the particular path through the code, trigger multiple bugs that more or less canceled each other out.


> I've never understood the "testing little things in isolation" strategy when it precludes a "test everything all working together" strategy.

It doesn't. TDD is largely about how to use unit tests to drive incremental development, but it certainly doesn't preclude any other form of testing being part of the lifecycle (it just doesn't have anything to say about them -- presuming that you have some integration and acceptance test practices, and that those are out of TDD's scope.)


It's external state that can change the outcome of unit testing.


I sense that these posts are written for a specific audience, rebutting a set of arguments familiar to that audience, and that's why they seem so reductive and narrowly-applicable, but I can't quite grasp how much of the argument translates to the rest of the world.


It is aimed at design architectures which seek to separate application level concerns from Rails. The theory being that Rails and your specific application get too tightly coupled. The aspect being talked about here is the testing.


I always assumed the point of mocking a database response was to ensure that you were testing just your code, and not also the existence of a database with the right schema, the ability to connect to it, as well as the correctness of the code that rolls back any side effects.


> and not also the existence of a database with the right schema, the ability to connect to it

I like testing those things on which my model depends. It gives me much more confidence. Why wouldn't I want to test them?

> as well as the correctness of the code that rolls back any side effects.

That's a drawback. No arguments from me on that one.


> I like testing those things on which my model depends. It gives me much more confidence. Why wouldn't I want to test them?

Those things all need to be tested, but if a single unit test fails, it's nice to know that it failed because the code was wrong, not because the database connection happened to die just then. If I have one test for the logic, and another that verifies that the database can be connected to, and a third that verifies the schema is right, then the specific combination of failing tests tells me a lot more about what's wrong and if my code even needs to be changed.
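
Something like this split, say (hypothetical Person model and normalize_email helper; the point is that each test can only fail for one reason):

    require "test_helper"

    class DiagnosticsTest < ActiveSupport::TestCase
      test "database is reachable" do
        assert ActiveRecord::Base.connection.active?
      end

      test "people table has the expected columns" do
        assert_includes Person.column_names, "email"
      end

      test "email normalization logic" do
        # Pure logic: if only this fails, the code is wrong,
        # not the connection or the schema.
        assert_equal "a@b.com", Person.normalize_email(" A@B.com ")
      end
    end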


111 assertions in 4 seconds? Why not 4 milliseconds, or 4 microseconds? These must be some pretty huge assertions. I guess I'm missing something about modern programming...


There's something to be said for DHH's point here, even though he's confused about what a unit test is. Integration and end-to-end tests are much, much more important than unit tests. They actually test the application, not a contrived, isolated scenario.

Much of the testing activity and literature of late has been complaining about how brittle end-to-end tests are, because all the focus is on pure unit tests. This leads to defect pile-up at release time or at the end of an iteration, whereas the smoother teams I've worked with did end-to-end and integration tests all the time. Unit tests existed too, but only when there was sufficiently complex logic or algorithms to warrant such a test, or if we used TDD to flesh out interfaces or interactions for a feature.

Many web applications don't have a lot of logic; they have a lot of database transactions with complex variations for updates or queries. So, especially if you have an ORM -- and ORMs are notoriously fiddly -- it makes sense to have the majority of tests (TDD or not) hit the database, since the code will only ever be executed WITH a database.

Mocking or decoupling the database can introduce wasteful assumptions and complexities that aren't needed in your code base. The only time it makes sense to decouple the database is if you expect you'll need polyglot persistence down the road and your chosen persistence framework won't help you.

I have worked with developers who prefer test cases that run in under 1 second on every save. To me it helps to have a set of unit tests that are in-memory and very fast, covering basic sanity checks like model integrity, input validation, and any in-memory algorithms. But the bulk of tests really need to test your code as it will be used, which often involves database queries. At worst, use an in-memory database that can load test data and execute tests in a couple of seconds.


"These days I can run the entire test suite for our Person model — 52 cases, 111 assertions — in just under 4 seconds from start to finish. Plenty fast enough for a great feedback cycle!"

4 seconds is really slow, actually, and enough to take you out of flow. With a PORO Person object, decoupled from the system, that number will easily be sub 500 ms and possibly much less.
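
A sketch of what that looks like (a hypothetical name-formatting rule pulled out of the AR model into a PORO; no Rails loaded, so the whole file runs in milliseconds):

    require "minitest/autorun"

    class PersonName
      def initialize(first, last)
        @first, @last = first, last
      end

      def full
        [@first, @last].compact.join(" ")
      end
    end

    class PersonNameTest < Minitest::Test
      def test_full_name_skips_missing_parts
        assert_equal "Ada", PersonName.new("Ada", nil).full
      end
    end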


I am really trying to understand these flow comments that keep coming up. Waiting 4 seconds for tests to run, or even a few seconds more, just seems like a silly thing to get caught up on. If we were talking minutes then I could see that, but single-digit seconds?


If you've developed a workflow around the kind of automatic test suites that run on every save to a source file and provide instant feedback, a several second wait would seem to be potentially a significant rhythm break.


Doubly so if that's synchronous (probably the case with vim).


A problem with running all your tests in a single transaction is that that's not actually what happens when your code is run. You will have multiple transactions (unless for some reason you wrap every single web request inside a transaction, which I think is a terrible idea).

There are slightly different things that happen: now() will always return the same time, deferrable constraints/triggers are useless, you can't have another database connection looking at the test results or modifying the database (say you are testing deadlocks or concurrent updates, or you have code that opens a new database connection to write data outside the current transaction), etc.

It's fine for simple, vanilla ActiveRecord use where you aren't using lots of database features, I suppose.


What I do for my Rails apps is:

* at the start of a test run, create a new database, load the schema into it, load all the 'global' data that all the tests need into the database.

* write that data to a SQL file using pg_dump --inserts -a

* before each test runs, I disable triggers (with set session_replication_role=replica), then delete all the data from each table, then load the data from the sql file back into the database.

This allows me to have data quickly cleaned out and restored on every test run and gives me real transactions during tests.
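
Roughly, the per-test part of that could look like this (a sketch, assuming Rails with the Postgres adapter and a seed file dumped via pg_dump --inserts -a):

    conn = ActiveRecord::Base.connection

    # Disable triggers and FK enforcement while we reset the data.
    conn.execute("SET session_replication_role = replica")
    conn.tables.each do |t|
      conn.execute("DELETE FROM #{conn.quote_table_name(t)}")
    end
    # Reload the known-good seed data dumped at suite start.
    conn.execute(File.read("test/seed.sql"))
    conn.execute("SET session_replication_role = DEFAULT")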


I'm curious: why do you think that wrapping every single web request in a single transaction is a terrible idea?


It prevents you from saving anything in the database if there are any errors (unless you start using savepoints). If there's a database error, you are prevented from running any more SQL queries until the transaction is rolled back. It keeps transactions open for longer than you want, which can increase blocking or deadlocks. And you have to hold one open database connection per web request.
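
For reference, the savepoint workaround looks something like this in Rails (Order and AuditLog are hypothetical models):

    ActiveRecord::Base.transaction do
      Order.create!(state: "pending")

      begin
        # requires_new: true issues a SAVEPOINT inside the open transaction.
        ActiveRecord::Base.transaction(requires_new: true) do
          AuditLog.create!(event: "order_created")
        end
      rescue ActiveRecord::StatementInvalid
        # Only the savepoint is rolled back; the outer transaction,
        # and the order, survive the failed audit insert.
      end
    end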


Mainly it's a performance issue; see this SO question: http://stackoverflow.com/questions/1103363


Sounds like this is pretty specific to some ORM tools!


> Oracle abomination

Okay... PostgreSQL is great but it still has a bit of catching up to do.

> ... run your MySQL

Wait, Oracle is an abomination but MySQL is okay?

> Before each test case, we do BEGIN TRANSACTION, and at the end of the case, we do ROLLBACK TRANSACTION. This is crazy fast, so there's no setup penalty.

You know what is just as easy? Making SQLite databases (aka files) for each test case. Copy a file, open it, delete it. It has the added benefit of allowing you to actually commit changes and not worry about rollback. There are some compatibility issues, and I'm not familiar with all those issues in a Rails context.
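
A sketch of the copy-a-file approach (hypothetical paths, using the sqlite3 gem directly):

    require "fileutils"
    require "sqlite3"

    # One pristine template database, cloned per test case; commits are
    # real, and cleanup is just deleting the file.
    FileUtils.cp("test/template.sqlite3", "tmp/case_123.sqlite3")
    db = SQLite3::Database.new("tmp/case_123.sqlite3")
    db.execute("INSERT INTO people (name) VALUES (?)", ["Ada"])
    db.close
    File.delete("tmp/case_123.sqlite3")   # no rollback bookkeeping needed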


> You know what is just as easy? Making SQLite databases (aka files) for each test case.

By doing this, you're breaking the dev-prod parity rule of 12factor apps [0]: you should make sure your dev and prod differ as little as possible. If you're using MySQL in production, you should also use it in dev.

[0] http://12factor.net/dev-prod-parity


Let's not conflate unit testing with integration testing. SQLite should, in most cases, work just fine for unit testing.

First off, many people would argue unit testing should never really require a database. I'm not of that opinion, but I'm not writing CRUD apps, and since there's not really a SQL unit testing framework, using SQLite is a nice compromise, especially as unit testing should necessarily be limited in scope.

Integration testing should definitely be done on a system that is in parity with prod.


"Wait, Oracle is an abomination but MySQL is okay?"

I think he was referring to administrative challenges and setup/startup overhead, not steady-state performance.


For local testing on Postgres where you don't care about database reliability, you can also speed things up a lot by setting `fsync = off` and `synchronous_commit = off`.

(Never do that on a production database, of course!)


You can also have Postgres running off /dev/shm.


Testing dependencies is not a bug. If there is a reason not to test them -- say you need to test an error condition, or your dependency is external (OAuth, etc.) -- then certainly mock them. But if there is no reason to mock a dependency other than the dogma of some definition of "unit test", then it usually isn't worth it.

With every test, the questions that should be answered are: what bugs is this going to catch, and which ones will it miss? If you mock a dependency, you are introducing cases in which the test will miss bugs, and there should be a justification to go along with that.


The classic justification is that mocking encourages modular code where each unit has shallow dependencies with well-defined interfaces.

This also means that a mistaken change to one important unit will not break the entire test suite. Sure, the entire program will break, but it's nice to get a single failing test.

Mocking also gives a very straightforward way to simulate interactions with collaborators. You just say "given that the HTTP module returned a 404, this request should be sent to the error log," instead of initializing those two modules and arranging for the desired case to happen.
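
In Minitest-mock terms, that example might read (hypothetical PageFetcher and collaborators):

    require "minitest/autorun"
    require "minitest/mock"

    class PageFetcher
      def initialize(http, error_log)
        @http, @error_log = http, error_log
      end

      def fetch(url)
        response = @http.get(url)
        @error_log.record("404 for #{url}") if response.status == 404
        response
      end
    end

    class PageFetcherTest < Minitest::Test
      def test_404_goes_to_the_error_log
        response = Struct.new(:status).new(404)
        http = Minitest::Mock.new
        http.expect(:get, response, ["/missing"])
        log = Minitest::Mock.new
        log.expect(:record, true, ["404 for /missing"])

        # No real HTTP module or logger is initialized; the mocks
        # arrange the 404 case directly.
        PageFetcher.new(http, log).fetch("/missing")
        http.verify
        log.verify
      end
    end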

There's a very old discussion about decreasing coupling and increasing cohesion that's super important to the whole motivation behind TDD and that nobody seems to be very interested in anymore...


4 seconds is a long time. I'm reminded of the SVN fans who say things like "I can commit in 2 seconds, that's plenty fast enough". Which it is, until you've experienced the alternative, and then you can't imagine going back.

Also, all that separation isn't free. Sure, I don't need to run all my unit tests every time I make a change - but if they're fast enough that I can, that's much less cognitive overhead than having to think about which tests are relevant and press the correct button.


If using MySQL and needing to run tests, the following option on our DEVELOPMENT server really sped things up:

innodb_flush_log_at_trx_commit = 0


Hitting the database or not, using fixtures introduces coupling into your test suite that's often more trouble than it's worth.

http://interblah.net/the-problem-with-using-fixtures-in-rail...


So here's how I summarise the whole essay: "Hardware is cheap. Instead of making your software perform well, why not just throw more hardware at the problem."

Well, I've tried this before and it didn't work.


"runs in 4 minutes and 30 seconds. That's for a 1:1 test:code ratio."

is this claiming 100% test coverage?


I think it's just a LOC comparison.


That seems like a not-very-useful metric.


today I learned that DHH doesn't know what a unit test is.



