
In my experience, many reasonably large projects suffer from the same issue. The "deficiency" is always in the test suite, but only because a particular code pattern or property wasn't envisioned and so isn't properly tested. The test suite is limited to the patterns the developers had in mind, and if you do something that's technically possible but outside what they were thinking of, a seemingly innocuous change can break your code. If you start from the docs every time and follow the recommendations, Django is pretty good.

I manage a modest project, and there are plenty of issues with corner cases that I knew were possible but could never reproduce. As people stumble upon them and report issues, I can update the test suite so we don't step on the same landmine again, but that doesn't change the fact that we've barely scratched the surface.




I usually work lower down the stack, and, frankly, that level of code quality would not be acceptable. (But what you describe matches my experience writing application-level code.)

Here is a strawman low-level test: randomly generate a sequence of nonsensical (but legal) API calls by randomly generating some data layout, then feed it into a state machine of legal API calls (e.g. CRUD), and check the return values. Once that runs for 10 minutes without crashing, extend the test to be multithreaded and run it with 1,000 threads for a few hours. Then dial the runtime back to 60 seconds and stick it in the regression suite.
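
Roughly, in Python, the shape of it is something like this (the Store class is a made-up stand-in for whatever API is actually under test, and a plain dict serves as the reference model):

    import random

    class Store:                       # stand-in for the real API under test
        def __init__(self):
            self._data = {}
        def create(self, key, value):
            self._data[key] = value
        def read(self, key):
            return self._data.get(key)
        def update(self, key, value):
            if key in self._data:
                self._data[key] = value
        def delete(self, key):
            self._data.pop(key, None)

    def fuzz_once(rng, steps=10_000):
        sut, model = Store(), {}       # model = trivially correct reference
        keys = [f"k{i}" for i in range(16)]
        for _ in range(steps):
            op, k, v = rng.choice("CRUD"), rng.choice(keys), rng.randrange(100)
            if op == "C":
                sut.create(k, v); model[k] = v
            elif op == "R":
                assert sut.read(k) == model.get(k)   # "check the return values"
            elif op == "U":
                sut.update(k, v)
                if k in model:
                    model[k] = v
            else:
                sut.delete(k); model.pop(k, None)

    if __name__ == "__main__":
        for seed in range(100):        # seeded, so any failure replays exactly
            fuzz_once(random.Random(seed))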

As long as there is a well defined API, this finds most bugs (and I also write targeted tests to exercise tricky / error prone paths).

Any thoughts on why it is so much harder to implement reliable high-level frameworks? Is it dynamically typed languages / lack of encapsulation, or something more fundamental?


You are finding crash bugs with your test. Most Django regressions are logic bugs: some library overrode a method, that method got a new param in Django, and the library broke. Good libraries with good tox suites catch this. Not all libraries are good.
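
A toy version of that failure mode (the class and method names here are invented, not real Django APIs):

    # Framework, new release: handle() grew an extra parameter.
    class FrameworkView:
        def handle(self, request, context=None):   # context added in the upgrade
            return f"default for {request}"

    # Third-party library, written against the old signature.
    class LibraryView(FrameworkView):
        def handle(self, request):                 # override predates the change
            return f"custom for {request}"

    view = LibraryView()
    view.handle("req")              # fine: nothing fails at import or startup
    view.handle("req", context={})  # TypeError, but only on the code path
                                    # where the framework passes the new arg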


The example you gave (wrong number of parameters in an override) would be caught by any sane statically typed language. (Though if you are distributing updates to .so files, you need to explicitly check for ABI compatibility.)

I had to search for Tox. It seems like analogous tools would be nice for statically typed "systems" languages too, though there is less need for them there (more bugs are caught at build time, instead of after deploy).
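
From what I can tell, the config amounts to a version matrix like this (the pins are illustrative) that tox runs the test suite against in fresh virtualenvs, which is what catches the "method grew a new param" breakage before a deploy does:

    [tox]
    envlist = py{310,311,312}-django{42,50,51}

    [testenv]
    deps =
        pytest
        django42: Django>=4.2,<5.0
        django50: Django>=5.0,<5.1
        django51: Django>=5.1,<5.2
    commands = pytest {posargs}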

Debian sort of does the same thing when it builds packages, but it only checks compatibility with current versions of dependencies. It would be nice if they also checked / tracked compatibility breakage (the "not all libraries are good" observation is language-independent, and an "n days since we broke users of this library" label would be great).


I put all sorts of asserts in the state machine logic (this is the 'check the return values' part). When sufficiently clever, such checks can confirm a surprising range of high- and low-level behavior.
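
Against the CRUD sketch above, the per-step check ends up looking something like this (the specific checks are just examples):

    def check_step(sut, model, op, key, ret):
        # Return-value check: the SUT must agree with the reference model.
        if op == "R":
            assert ret == model.get(key), f"read({key}) returned {ret}"
        # State checks that hold no matter which operation just ran.
        for k, v in model.items():
            assert sut.read(k) == v, f"value for {k} drifted from the model"
        assert sut.read("never-created") is None   # untouched keys stay absent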





