
In my experience, many reasonably large projects suffer from the same issue. The "deficiency" is always in the test suite, but only because a particular code pattern or property wasn't envisioned and so isn't properly tested. The test suite is limited to the patterns the developers had in mind, and if you do something that's technically possible but outside what they were thinking of, a seemingly innocuous change can break your code. If you start from the docs every time and follow the recommendations, Django is pretty good.

I manage a modest project, and there are plenty of issues with corner cases that I knew were possible but could never reproduce. As people stumble upon them and report issues, I can update the test suite so we don't step on the same landmine again, but that doesn't change the fact that we've barely scratched the surface.




I usually work lower down the stack, and, frankly, that level of code quality would not be acceptable. (But what you describe matches my experience writing application-level code.)

Here is a strawman low-level test: randomly generate a sequence of nonsensical (but legal) API calls by randomly generating some data layout, then feed it into a state machine of legal API calls (e.g. CRUD), and check the return values. Once that runs for 10 minutes without crashing, extend the test to be multithreaded and run it with 1,000 threads for a few hours. Then dial the runtime back to 60 seconds and stick it in the regression suite.
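
Roughly, in Python, the shape of it is something like this (the Store class is a made-up stand-in for whatever API is actually under test, and a plain dict serves as the reference model):

    import random

    class Store:                       # stand-in for the real API under test
        def __init__(self):
            self._data = {}
        def create(self, key, value):
            self._data[key] = value
        def read(self, key):
            return self._data.get(key)
        def update(self, key, value):
            if key in self._data:
                self._data[key] = value
        def delete(self, key):
            self._data.pop(key, None)

    def fuzz_once(rng, steps=10_000):
        sut, model = Store(), {}       # model = trivially correct reference
        keys = [f"k{i}" for i in range(16)]
        for _ in range(steps):
            op, k, v = rng.choice("CRUD"), rng.choice(keys), rng.randrange(100)
            if op == "C":
                sut.create(k, v); model[k] = v
            elif op == "R":
                assert sut.read(k) == model.get(k)   # "check the return values"
            elif op == "U":
                sut.update(k, v)
                if k in model:
                    model[k] = v
            else:
                sut.delete(k); model.pop(k, None)

    if __name__ == "__main__":
        for seed in range(100):        # seeded, so any failure replays exactly
            fuzz_once(random.Random(seed))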

As long as there is a well defined API, this finds most bugs (and I also write targeted tests to exercise tricky / error prone paths).

Any thoughts on why it is so much harder to implement reliable high-level frameworks? Is it dynamically typed languages / lack of encapsulation, or something more fundamental?


You are finding crash bugs with your test. Most Django regressions are logic bugs: some library overrode a method, that method got a new param in Django, and the library broke. Good libraries with good tox suites catch this. Not all libraries are good.
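
A toy version of that failure mode (the class and method names here are invented, not real Django APIs):

    # Framework, new release: handle() grew an extra parameter.
    class FrameworkView:
        def handle(self, request, context=None):   # context added in the upgrade
            return f"default for {request}"

    # Third-party library, written against the old signature.
    class LibraryView(FrameworkView):
        def handle(self, request):                 # override predates the change
            return f"custom for {request}"

    view = LibraryView()
    view.handle("req")              # fine: nothing fails at import or startup
    view.handle("req", context={})  # TypeError, but only on the code path
                                    # where the framework passes the new arg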


The example you gave (wrong number of parameters in an override) would be caught by any sane statically typed language. (Though if you are distributing updates to .so files, you need to explicitly check for ABI compatibility.)

I had to search for Tox. It seems like analogous tools would be nice for statically typed "systems" languages too, though there is less need for them there (more bugs are caught at build time, instead of after deploy).
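
From what I can tell, the config amounts to a version matrix like this (the pins are illustrative) that tox runs the test suite against in fresh virtualenvs, which is what catches the "method grew a new param" breakage before a deploy does:

    [tox]
    envlist = py{310,311,312}-django{42,50,51}

    [testenv]
    deps =
        pytest
        django42: Django>=4.2,<5.0
        django50: Django>=5.0,<5.1
        django51: Django>=5.1,<5.2
    commands = pytest {posargs}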

Debian sort of does the same thing when it builds packages, but it only checks compatibility with current versions of dependencies. It would be nice if they also checked / tracked compatibility breakage (the "not all libraries are good" observation is language-independent, and an "n days since we broke users of this library" label would be great).


I put all sorts of asserts in the state machine logic (this is the 'check the return values' part). When sufficiently clever, such checks can confirm a surprising range of high- and low-level behavior.
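
Against the CRUD sketch above, the per-step check ends up looking something like this (the specific checks are just examples):

    def check_step(sut, model, op, key, ret):
        # Return-value check: the SUT must agree with the reference model.
        if op == "R":
            assert ret == model.get(key), f"read({key}) returned {ret}"
        # State checks that hold no matter which operation just ran.
        for k, v in model.items():
            assert sut.read(k) == v, f"value for {k} drifted from the model"
        assert sut.read("never-created") is None   # untouched keys stay absent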





