It's amazing how easy it is to write tests that are slow. Taking >1 second per test is absolutely normal.
> Is that a fundamental limitation or an incredibly inefficient test?
That's the million dollar/month question. If an LLM can diffuse a patch in 3 seconds but it takes 3 hours to test then we have a problem, especially if the LLM needs more test feedback than a human would. But is it a fundamental problem or is it "just" a matter of effort?
I've mostly worked with JVM-based apps in recent years, and there's lots of low-hanging fruit in their tests. JIT compilation is both a blessing and a curse: you don't waste any time compiling the tests themselves to machine code, but the code that does get JIT-compiled is forgotten between runs, and build systems like to test different modules in separate processes. So every test run of every module starts with a slow warmup. There is a lot of work being done at the moment on improving that situation, but much of it boils down to poor build systems, and that's harder to fix (nobody agrees on what a good build system looks like...)
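One partial mitigation for the warmup cost, as a sketch (the jar path here is hypothetical, not from the comment): short-lived test JVMs rarely run long enough for the optimizing compiler to pay off, so you can cap tiered compilation at the fast C1 compiler.

```shell
# -XX:TieredStopAtLevel=1 stops HotSpot's tiered compilation at the C1
# (client) compiler, skipping the slower C2 optimizer. Startup-bound test
# runs often finish sooner even though peak throughput is lower.
java -XX:TieredStopAtLevel=1 -jar build/libs/my-tests.jar  # hypothetical jar
```

Whether this wins depends on how long the suite runs; a long-lived forked test process may still want full optimization.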
In one of my current projects, I've made the entire test suite run in parallel at the level of individual test classes. This took a bit of work to stop different tests messing with each other's state inside the database, and it revealed some genuine race conditions when apparently unrelated features interacted in buggy ways. But it was definitely worth it for local testing. Unfortunately the CI configuration was then written in such a way that it starts by compiling one of its dependencies, which blows up test time to the point where improvements to the actual tests are nearly irrelevant. This particular CI system is non-standard/in house, and I haven't figured out how to fix it yet.
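For what it's worth, per-test-class parallelism like the above can be expressed in JUnit 5 via a `junit-platform.properties` file on the test classpath (this assumes the suite uses Jupiter; the comment doesn't name a framework):

```properties
# Run test classes concurrently, but keep methods within a class on one
# thread, so per-class state isolation (e.g. a separate database schema
# per class) is sufficient.
junit.jupiter.execution.parallel.enabled = true
junit.jupiter.execution.parallel.mode.default = same_thread
junit.jupiter.execution.parallel.mode.classes.default = concurrent
```

Keeping methods on one thread per class is usually the gentler first step, since it only requires isolating shared state between classes, not between individual tests.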
This kind of story is typical. Many such cases.