Algorithms we develop software by (grantslatton.com)



Write everything (generally, new features) twice has turned out to be a really good strategy for me, but it doesn't sit well with bizdev or project managers and tends to be perceived as unnecessary slowness.

But if you plow through a feature just to get it "working," much of that work ends up being cleanup and refactoring during the first pass anyway. What rewriting allows you to do is crystallize the logic flow you developed the first time and cherry-pick from it in a more linear fashion to meet the blueprint. It also tends to reduce the urge (and need) for larger-scale refactorings later on.


A project manager or bizdev person writes, rewrites, and rewrites again the documents they produce, do they not? Or do they write the perfect document on the first go?


The worst kind of Jira ticket is just a link to a document that can be edited at any time. In that case I just replace the link with a new link pointing to the fixed version of the document and inform the author of the ticket/document about it.


Depending on what kind of document it is, I often resort to extracting its contents with pandoc, converting them to markdown, and then replacing the original description. That makes search work properly too.
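
Something like this, assuming the attachment is a Word doc (the filename and markdown flavor are just examples):

    pandoc design-doc.docx -t gfm -o design-doc.md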


If I'm writing something, I'm writing it once. That "prewriting" stuff they teach in schools just slows down the process and makes you overthink your choice of words. I take the same one-take approach with code: the idea is to write the best solution on the first try. Why waste time writing something bad just so you can fix it later? That doesn't make any sense.

As I'm writing, I do go back and make changes as they pop into my head. But once I'm done writing it, I'm done unless I notice an obvious mistake after the fact.


1. You don’t take any notes or make any bullet points?

2. > As I'm writing, I do go back and make changes as they pop into my head

This contradicts the idea that you only write once.

3. You don’t get feedback on your first finished version and make changes?


Don't think of the first iteration as "prewriting" and instead think of the subsequent iteration as "editing."


How would they know? Are they monitoring your code writing?


Depending on how the dev environment is set up, yes. Maybe you need something to go through a CI/CD pipeline. If you can't test it on your local machine, it can easily be visible to many people.


> it doesn't sit well with bizdev or project managers

To be fair, it makes everything twice as expensive. Managers are always going to reflexively push back against that, even if the new feature covers that cost and more.


> To be fair, it makes everything twice as expensive.

The article argues it makes it less expensive to reach any specific quality level (above some threshold).

The threshold isn’t really addressed in the article, but it is implied that for any realistic quality need, the write twice approach will be cheaper.

To conclude it makes everything twice as expensive, you have to ignore any cost except the initial write. That’s not realistic.


The second time doesn't cost as much as the first.


And the first is cheaper than a 'regular single version' because you build it with the thought that you will throw it away.


You throw it away but in truth much of it survives in a much better design.


From the article:

> Rewriting the solution only took 25% the time as the initial implementation

Seems reasonable.


> Seems reasonable.

Not to the manager, though, and that's the point: what matters is the manager's perception, not how long the rewrite actually takes.


What hits hard is realizing that most features are not worth it. They just shouldn’t be developed.

Unfortunately business wants features, more of them, and if possible for free.


I find half my job as a developer is writing code, and the other half is advising clients on which features not to implement.


> Write everything (generally, new features) twice has turned out to be a really good strategy for me, but it doesn't sit well with bizdev or project managers and tends to be perceived as unnecessary slowness.

Silo-isation compounds this. If the maintenance costs are borne by another team, or if any rework will be funded out of a different project, the managers are not going to care about quality beyond a basic "signed off by UAT".


I spent some time doing consulting with an engineering manager who would keep requesting different (correct) implementations of the same functionality until he had seen enough, and then he'd pick one. This did lead to some high-quality software for what needed to be a high-reliability product.

I should probably mention that I was doing consulting engineering here because no employees would work for the guy...


> "gun to your head, you have to finish in 24 hours, what do you do?"

PSA: if you are a project manager / owner or in some other similar position, you do not get to ask this. This is a personal educational exercise, not a way to get stuff done faster.


100%, this should never be an excuse to push for a faster outcome. I have to admit, though, that as a personal mental exercise this has saved me countless hours going down the rabbit hole of over-engineering. Some problems just need a simple solution, sometimes even without any changes to code.


So glad someone’s here saying it. I am absolutely dreading tomorrow morning now, with the thought that our manager has also read this article.


I mistakenly read your comment as "dreading tomorrow morning now that our thought-manager has also read this article", which I must have subconsciously decided was in line with the situation.


Sounds like you need a different manager.


"gun to your head" is maybe not appropriate for work, but the exercise is good for cutting to the core of a task when necessary. It's really the same question as what is the minimum viable product.


Yeah, I've certainly seen cases where something was overbuilt and 90% of the time was wasted.

But I've also worked at places where things were underbuilt (e.g. zero test environments whatsoever except prod). If there was a gun to my head to finish something in 1 hour, I'd test in prod.

So I think advice that is sometimes useful and sometimes damaging isn't really helpful, not unless there's an easy way to tell which situation is which.


I think the exercise is more about exposing you to other solution options (to 'break your anchoring bias'). You still have to exercise judgement as to which solution is right for the situation.


How about “building is on fire” or “company is dead tomorrow”?


"Here, you have a gun to your head, you have to finish in 24 hours. What do you do?"

"I set aside the slides for the pointless CEO presentation tomorrow and work exclusively on this."

"No, you can't cancel on the CEO. Let's say you have two guns to your head and 24 hours, what do you do?"

"I take lots of coffee, skip sleeping tonight, cancel the group status meeting for Wednesday and focus on these two things."

"If you do that we'll look bad in front of the whole group. Let's say you have three guns to your head..."


at work? just shoot me then


> In practice, none of the day-long plans are actually a day. The gun isn't actually to your head. You can go home and sleep.


Anyone have a reference to this technique? I’d like to learn more.


Good code, in my opinion, is written by appropriate selection of suitably contained abstractions. The problem with this, and the article does try to talk about it, is that for you to select appropriate abstractions, you need to know the "entire" thing. Which is to say, you need knowledge of something that isn't there yet.

In other engineering disciplines, like civil engineering or architecture, this problem is solved by using a good blueprinting paradigm such as CAD layouts, but I find a distinct lack of this in software [1]. Ergo this advice, which is a rephrasing of "know first and build later". But it is equally easy to lose oneself in what's called analysis paralysis, i.e. to get stuck finding the best design instead of implementing a modest one. In the end, this is what experience brings to the table, I suppose: balance.

[1] The closest I can think of are various design diagrams, like class diagrams etc.


Very interesting suggestions, all worth trying. Having a very capable coworker can help here, because they can show you what can be done in a short amount of time. Specifically, I've noticed that some devs get "winded" by a change and want to take a break before moving on; others simply continue. This ability can be improved with practice, both within and across contexts.

Doing things quickly is valuable for many intrinsic reasons that are often overlooked because we decry the poor extrinsic ones. As with car repair, the odds that you forget how to reassemble the car scale with how long the repair takes. Similarly, if you can execute a feature in a day (especially a complex one that requires changes to many parts of a repo, and/or more than one repo), that is much less risky than taking many days or weeks.

(To get there requires a firm command of your toolset, in the same way a mechanic understands his tools or a musician understands her instrument. It also requires that externalities be systematically smooth. I'm thinking particularly of a reliable, locally repeatable, fast CI/CD process.)

(The calculus here is a little different when you are doing something truly novel, as long periods of downtime are required for your brain to understand how the solution and the boundary conditions affect each other. But for creating variations of a known solution to known boundary conditions, speed is essential.)


> Write everything twice

There's an enhancement in some software I use/maintain that I wrote once and lost (the PC I wrote it on went kaput, and I was working offline, so I also had no backup). It was an entire weekend of coding where I got very in the zone and happily coded.

After I lost that piece of code I never could summon the will to write it again. Whenever I try to start that specific enhancement I get distracted and can't focus, because I can't remember the approach I took to get it working the first time and am too lazy to figure out again how it was done. It's been two years now.


That's a good point. Particularly good pieces of work are hard to rewrite.

I remember rewriting some piece of infrastructure once when I moved to another job, but I failed to summon the energy to rewrite it a second time at another job.


Every time I've pushed through that feeling and rewritten it anyway, the end result was better than the original. The memories eventually come back once I get into the problem and hindsight makes clear how much stuff past-me missed.


Especially in cases where it's a project that got derailed or interrupted and I have to start over, the biggest problem for me is an inability to concentrate the second time, largely from overwhelming and vertiginous deja vu.

Namely, at any given moment my memories of doing the same thing before interfere with my current attempt to do it again, like microphone feedback from intrusive thoughts.


This is one of the best "programming advice" posts I've ever read, right up there with the grug brained developer.


https://grugbrain.dev/ for those not familiar, by the creator of HTMX


Likely most people don't know what you're referring to and assumed you thought the advice was bad.


Not sure why this was downvoted?


Surprised me too.


I appreciated the comment (author here)


> If, after a few days, you can't actually implement the feature, think of what groundwork, infrastructure, or refactoring would need to be done to enable it. Use this method to implement that, then come back to the feature

really good, this is key. building a 'vocabulary' of tools and sticking to it will keep your velocity high. many big techs lose momentum because they don't


Agreed. I've also heard this stated as:

> for each desired change, make the change easy (warning: this may be hard), then make the easy change

(earliest source I could find is @KentBeck on X)

I love the idea of that vocabulary of tools and libraries, too. I strongly resist attempts to add to or complicate it unnecessarily.


I really like the footnote that indirectly says that sometimes you just need to spin up a background thread to figure something out. It resonates heavily with my experience, to the point where I feel like a lot of the value my experience brings is identifying this class of problems faster. You stumble onto it, recognize it's the think-about-it-passively type, and move on to other things in the meanwhile. It would be easy to bang your head against it and get nowhere; sometimes you just need to let it sit for a bit.


Dan Abramov talks about WET (write everything twice) [1] as generally a good approach, primarily because you often don’t know the right abstraction up front, and a wrong abstraction is way worse than a copy/paste.

He has some good visuals that illustrate how incorrectly dependent and impossible to unwind wrong abstractions can become.

[1] https://youtu.be/17KCHwOwgms


> Write everything twice

I’d say "write everything three times", because it usually takes three versions to get it right: the first is under-engineered, the second is over-engineered, and the third is hopefully just-right-engineered.


Damped oscillation, approaching the right value?


I remember seeing somewhere a popular list of the top 10 algorithms used in systems, and it's kinda depressing to realize that the most recent algorithm on the list, the skip list, was invented roughly 30 years ago, and that every single one of them is taught in an introductory data structures course. That is, we most likely do not need to study the internals of algorithms or implement them in production. For a long time now, smart and selfless engineers have encapsulated the algorithms into well-abstracted and highly optimized libraries and frameworks.

Of course, there are exceptions. ClickHouse implemented dozens of variations of hash tables just to squeeze out as much performance as possible. The algorithms used in ClickHouse came from many recent papers that are heavy and deep on math, which few people can even understand. That said, that's the exception, not the norm.

Don't get me wrong. Having a stable list of algorithms is arguably a hallmark of modern civilization, and everyone benefits from it. It's just that I started studying CS in the early 2000s, and at that time we still studied Knuth because knowing algorithms in depth was still a core advantage for ordinary programmers like me.


Did you read the article?


You're right. I rushed and assumed. Thanks



> start over each day

This reminds me of "spaced repetition" in learning theory. Drilling the same problem from scratch is a great way to get better at iterating through your rolodex of mental models, but so many people prioritize breadth because they think it is the only way to generalize to new problems.


I usually won't rewrite the whole thing twice, but I will rewrite parts of it multiple times. At the very least, the second time around I format things and add comments to make them easier to understand. Code should be written for comprehension.


I find the following approach quite useful.

1. First write down a bunch of ideas for how I might tackle the problem, including lists of stuff I might need to find out.

2. Look at ways to break the task down into chunks that are 'complete-able in a session'.

3. Implement, in such a way that the code is always 'working' at the end of a session.

4. Always do a brain dump into a comment/readme at the end of the session - to make it easy to get going again.


> Another heuristic I've used is to ask someone to come up with a solution to a problem. Maybe they say it'll take 4 weeks to implement. Then I say "gun to your head, you have to finish in 24 hours, what do you do?"

Pretend to be capable of doing this, and in a short moment when the other person is not paying attention, get the gun and kill him/her. This satisfies the stated criteria:

> The purpose here is to break their frame and their anchoring bias. If you've just said something will take a month, doing it in a day must require a radically different solution.

> The purpose of the thought experiment isn't to generate the real solution.

:-)

---

Lesson learned from this: if you can't solve the problem that the manager asks you for, a solution is to kill the manager (of course you should plan this murder carefully so that you don't become a suspect).

:-) :-) :-)


"you have 24 hrs" and "write everything twice" ......they go hand in hand don't they? You're definitely going to rewrite it if you slap code out there.


I like the "gun to the head" heuristic but I would probably rephrase it to be something like "If you only had 24hrs to solve this or the world would come to an end".


Most software has a finite lifetime of a few years. You rewrite everything eventually.

What you should be worried about is the code that hasn't been rewritten in ten years.


My blogging engine [1] is almost 25 years old now. Have I rewritten it? If by "rewritten" you mean "from scratch", then no, I haven't. It has, however, seen several serious reworkings and refactorings over the years (the last great one was the removal of all global variables [2] a few years ago). Starting over would have been just too much work.

[1] https://github.com/spc476/mod_blog

[2] As therapy for stuff going on at work.


Weird, I would actually draw the opposite conclusion. Can you say more?

>What you should be worried about is the code that hasn't been rewritten in ten years.

Why would I worry? It's been running for 10 years without significant changes. Isn't that a sign it's more or less accomplishing its purpose?


Well, there's bitrot.

Needs shift. Expectations shift. The foundations that the code relies upon shift.

And familiarity with how things actually work inside the black box evaporates, leaving things distressingly fragile when the foundation finally gives way.

It's like when an old dam has "stood the test of time". More and more people (and business practices) wind up naively circling their wagons around the presumption that it will remain in operation forever, and the consequences of what will happen when it finally does fail add up faster than unchecked credit card debt.


The people who wrote it have probably moved on, so ownership and fit must be weak.


"Write everything twice" is a great heuristic. Extreme programming and unit tests is a dumb and wasteful technique. You end up cornering yourself.


"Write everything twice" is sometimes called a "spike."[0]

> A spike is a product development method originating from extreme programming that uses the simplest possible program to explore potential solutions.

In my career, I have often spiked a solution, thrown it away, and then written a test to drive out a worthy implementation.

0. https://en.wikipedia.org/wiki/Spike_(software_development)


Sorry, are you saying unit testing is dumb? Not that you'd be the first to say such a thing, but I've never really understood it, given how many people find unit tests valuable. 100% test coverage is one thing, but having some interdependent functions that each do one small thing is a perfect use case for unit tests.


Unit tests are a waste of time.

Design by Contract + system tests are a far superior technique that take less time and find more bugs.


The last comprehensive study I read indicates that they improve internal and external code quality by 76% and 88% respectively, while reducing productivity somewhat [1]. If you have papers supporting your claim, I'd be interested in reading them, or in ones that refute the metastudy linked below.

1. https://doi.org/10.1016/j.infsof.2016.02.004


TDD and using unit tests are not the same thing, though.

I'm very much a TDD sceptic, but believe unit tests have a place.


Confused, how do you ensure that a change to your implementation of some function isn't going to break clients after deployment if you don't have a unit test?


The client doesn't run your unit tests, but will run your contracts (because contracts deploy with your software while unit tests don't).

You might have realized that users of software do things the engineers don't expect, which are not covered in unit tests.


> The client doesn't run your unit tests

That's kinda the whole point... you run them to catch bugs and fix them so your clients never see the bugs you caught in the first place.

> but will run your contracts (because contracts deploy with your software while unit tests don't)

So you'd prefer your contracts to blow up your bugs in your clients' faces, rather than catch bugs yourself prior to releasing the code to them?!


You would obviously test the software before it goes to the users. With contracts you would be able to get information from your users that makes it MUCH easier to fix. Without contracts you will have a much harder time fixing the bugs. Contracts catch bugs unit tests don't.


> You would obviously test the software before it goes to the users

That... is literally what unit tests do. Test the software before you give it to users.

You seem confused what the debate is over. Nobody is arguing against contracts. They're awesome. They're just not substitutes for unit tests (or vice versa for that matter). You guys have been arguing against unit tests, which is an absurd position to take, hence this whole debate.


It's not absurd. Unit tests are expensive and slow you down while at the same time not targeting the part of the software where bugs occur. It's usually HOW functions are composed that causes bugs, not the functions themselves. Contracts test the interaction of components while unit tests don't.


To be fair, "functions themselves" seldom form a valid unit, so in practice you'd end up having to test the composition of said functions.


This is definitely absurd... but okay, riddle me this then. Say you're reimplementing string.indexOf(string) with your fancy new algorithm so you can deliver a nice speed boost to users. How do you do your best to minimize the chances of it blowing up on everyone without writing unit tests?


First, users don't use string.indexOf(string); users use software that uses string.indexOf(string). Second, I would write a test that generates a variety of inputs to cover the domain of string.indexOf(string) and have it call the old version and the new version. I would then collect stats on the performance of each call and make sure the new implementation is faster by the threshold I intended, since the PURPOSE OF THE REFACTOR WAS PERFORMANCE. Then I would delete the test after I'm satisfied it worked well on the target hardware, etc.
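
A rough sketch of that kind of throwaway comparison test, in Python for brevity (old_index_of and new_index_of are hypothetical stand-ins for the two implementations; the input generator and counts are made up):

    import random
    import time

    def old_index_of(haystack, needle):
        return haystack.find(needle)  # stand-in for the existing implementation

    def new_index_of(haystack, needle):
        return haystack.find(needle)  # stand-in for the optimized rewrite

    # Generate inputs that cover the domain: empty strings, missing needles, etc.
    def random_case():
        haystack = "".join(random.choices("ab", k=random.randint(0, 50)))
        needle = "".join(random.choices("ab", k=random.randint(0, 3)))
        return haystack, needle

    cases = [random_case() for _ in range(100_000)]

    # Correctness: the new implementation must agree with the old one everywhere.
    for h, n in cases:
        assert new_index_of(h, n) == old_index_of(h, n), (h, n)

    # Performance: the actual purpose of the refactor, so measure it directly.
    start = time.perf_counter()
    for h, n in cases:
        old_index_of(h, n)
    t_old = time.perf_counter() - start

    start = time.perf_counter()
    for h, n in cases:
        new_index_of(h, n)
    t_new = time.perf_counter() - start

    # In the real test you'd check t_new beats t_old by your target threshold,
    # then delete the whole file once you're satisfied.
    print(f"old: {t_old:.3f}s  new: {t_new:.3f}s  speedup: {t_old / t_new:.2f}x")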

Let's look at the chrome unit tests.

https://chromium.googlesource.com/v8/v8/+/3.0.12.1/test/mjsu...

This test checks one case and then loops to make sure various sizes work. This is a POOR test because it doesn't cover the domain of the function or the exceptional cases. It is pointless to keep and run on every build. Why keep it? It does nothing useful.

My point is that if you are refactoring a function, you are going to have to write a NEW TEST anyway, because the purpose of a refactor is different each time. The old tests are going to be useless. This is why contracts are so useful: they will ALERT YOU when the software does something you didn't assume. Unit tests don't do this.

From an information theory perspective, the old tests would not provide you with new information. If you want to make sure your refactor works like the old function, then compare the outputs of your new function with the old one over the domain of the function. Then if it's good, delete the tests because why keep them.

Experience shows that users will do things to your program you did NOT expect, and therefore your tests will not cover them. And more likely than not, refactoring requires rewriting tests anyway, so the old tests are almost always useless.

Now SYSTEM tests ARE useful. Tests that wire up many components and test them are useful. But a UNIT test, which tests one function is often just a waste of time to keep.


Honestly... I can see it's going to be genuinely exhausting for me to rebut point-by-point here, so I'll just let others read this and continue if they're interested. I really appreciate the response though, at least it explains your thought process well.

I have one question for you that I feel I have to ask: have you actually practiced commercial software development on a team (say, 5+ developers on the same codebase) in this manner that you describe for a significant period of time (say, 2+ years)? and if so, do you feel the resulting software has been robust and successful?


I've been doing this style of software development for 16 years (out of 28 years programming) across various projects.

If you want a small project that I wrote to look at, see henhouse https://github.com/mempko/henhouse. I wrote an article talking about design by contract and why it's better than TDD here https://mempko.wordpress.com/2016/08/23/the-fox-in-the-henho...

I've built a computer vision system that processed petabytes of data with only one or two bugs in production for any given year. At any given time we kept a bug list of zero. For the last five years I built a trading system using this same process. Again, we don't keep a bug list because if there were bugs, the system wouldn't even run. And if there is a bug we have to fix it immediately. We do have tests, but they are system tests. The worst bug we had that took too long to catch was in a part of the system that DIDN'T use contracts.

Design by Contract is a secret weapon people don't seem to know about.

Also see https://github.com/mempko/firestr, which I used Design by Contract extensively. It's a GUI program so the user can do crazy things.

Learn Design by Contract. Do system testing. Do exploratory testing. Do fuzz testing. Keep a bug list of zero. Don't waste time on unit testing.

If you are looking for popular software built using Design by Contract to various degree.

See SQLite (uses assertions heavily), .NET framework from Microsoft, Many of the C++ Boost libraries, Parts of Rust, Parts of Qt. The Vala programming language, Ada GNAT... and many others.

Here is a research paper from Microsoft that shows the advantage of contracts and code quality. https://www.microsoft.com/en-us/research/wp-content/uploads/...


I feel there's something missing in what you describe, and I'm trying to pinpoint what it is...

I can see it working if you have very few developers touching each piece of code, or if you get to exert control over the final application. But I don't see how it can work for large codebases or teams over long periods of time (read: large businesses)... especially not for library development, where your team is only responsible for providing library functionality (like string.indexOf(string) in my example, or matrix multiplication, or regexes, or whatever libraries do) and you don't necessarily even know whom the users are. There is no "system" or "integration" at that point, you're just developing one layer of code there, which is the library -- a piece of code responsible for doing just one thing. How the heck do you make sure arbitrary team members touching code don't end up introducing silly bugs over time, if not with unit tests?

Have you built any commercial libraries in this manner, rather than applications? i.e. where everyone on your team is jointly responsible for your library's development (like the implementation of string.indexOf(string) in my example), and other folks (whether inside or outside the company) are the ones who piece together the libraries to create their final application(s)?


I updated my original post with examples. I also included this research paper from Microsoft that shows a clear advantage of contracts and code quality. https://www.microsoft.com/en-us/research/wp-content/uploads/...

Note also types are a contract. TypeScript is basically introducing contracts at a high level to javascript.


> I updated my original post with examples.

Awesome, I'll dissect them and show you exactly where you're drawing the wrong conclusion ;)

> I wrote an article talking about design by contract and why it's better than TDD here

Nobody is advocating for TDD or disagreeing with that. Having unit tests != TDD.

> https://github.com/mempko/henhouse

So far as I can see, you're literally the only developer here -- which exactly illustrates my point. You can ignore a LOT of good engineering practices and be sloppy about a ton of things if you're the only developer (or one of a handful of developers), because hardly anything ever changes underneath you or without your knowledge. (Source: I've done it too.)

> See SQLite (uses assertions heavily)

SQLite has like... 3 developers? The vast majority, again, being 1 person. (I didn't even bother verifying your claim that they don't have unit tests, FWIW.)

> .NET framework from Microsoft

.NET absolutely has unit tests, here's one example: https://github.com/dotnet/runtime/blob/ebabdf94973a90b833925...

> Many of the C++ Boost libraries

All the ones I recall ever seeing have unit tests... which "many" don't? Here's one that does: https://github.com/boostorg/regex/blob/develop/test/regress/...

> Parts of Rust

Again, you're gonna have to cite what you're talking about because Rust definitely has unit tests: https://github.com/rust-lang/rust/blob/master/library/core/t...

> Parts of Qt. The Vala programming language, Ada GNAT... and many others.

I'm not gonna keep digging up their unit tests, you (hopefully) get the point above.

> I also included this research paper from Microsoft that shows a clear advantage of contracts and code quality

As I said above, nobody is arguing against contracts! A paper showing they're awesome doesn't mean they're substitutes for unit tests in every situation. Your paper only mentions the phrase "unit test" twice, and neither of them is saying DbC substitutes for them.


Replying to your last post here because HN won't let me reply to it (maybe too nested?). You say nobody is arguing against contracts.

> So you'd prefer your contracts to blow up your bugs in your clients' faces, rather than catch bugs yourself prior to releasing the code to them?!

That's you arguing against contracts. Contracts need to blow up when users use the software (including you and your testers, before you ship). You should ship if you find no contracts blowing up. But you need to let them blow up in users' faces too. They provide valuable information AFTER SHIPPING. Otherwise they lose a lot of their value.

Saying contracts shouldn't run in shipped code misses the whole point about what contracts are.

> That... is literally what unit tests do. Test the software before you give it to users.

No, unit tests test a portion of the software before shipping. My argument is they aren't worth the cost and provide little value. Most of the value comes from SYSTEM tests, integration tests, exploratory testing, fuzz testing, etc. Unit tests are the weakest form of testing.

Here is a great argument against them called Why Most Unit Testing is Waste by Coplien. It's a dense argument which I agree with.

https://wikileaks.org/ciav7p1/cms/files/Why-Most-Unit-Testin...


>> So you'd prefer your contracts to blow up your bugs in your clients' faces, rather than catch bugs yourself prior to releasing the code to them?!

> That's you arguing against contracts.

No. Notice what I wrote earlier? Where I very specifically said "contracts are awesome but not substitutes for unit tests"?

That's exactly the same thing I was saying here. I was arguing against relying on contracts to catch the bugs unit tests would've caught. Nobody was ever telling you to avoid contracts anywhere. Like I said, they're awesome, and both are valuable. I'm just saying they don't substitute for your unit tests. Just like how screwdrivers don't substitute for hammers, as awesome as both are.

> Saying contracts shouldn't run in shipped code misses the whole point about what contracts are.

I never said that, you're putting words in my mouth.

> No, unit tests test a portion of the software before shipping. My argument is they aren't worth the cost and provide little value. [...]

I just gave you a detailed, point-by-point explanation of what you've been missing in the other thread with your own purported counterexamples: https://news.ycombinator.com/item?id=41287473

Repeating your stance doesn't make it more correct.


I'm not mempko, but I can answer that question---yes. For over a decade on the same code base with a team that ranged from 3 to 7 over the years. We did have what would be considered end-to-end tests, or maybe integration tests that tested the entire system from ingress (requests coming in) to egress (results going out) that ensured the existing "business logic" didn't break.

There were no unit tests, as a) the code wasn't written in a style to be "unit" tested, and b) what the @#$@Q#$ is a unit anyway? Individual functions in the code base (a mixture of C89, C99, C++98 and C++03) more or less enforced "design by contract" by using calls to assert() to check various conditions. That caught bugs, as it prevented wrong use of the code when modifying it.

Things only got worse when new management (we were bought out) came in, and started enforcing tests to the point where I swear upper management believed that tests were more important than the product itself. Oh, the bug count shot up, deployments got worse, and we went from "favorite vendor" to "WTF is up with that vendor?" within a year.


Thanks for the reply. See my reply here: https://news.ycombinator.com/item?id=41287310

It sounds like you, too, were doing application development rather than library development. By which I mean that -- even if you were developing a "library" -- you more or less knew where & how that library was going to be used in the overall system/application.

That's all fine and dandy for your case, but not all software development has the luxury of being so limited in scope. Testing the whole application fundamentally misses edge cases that a unit test would catch. And setup/teardown takes so much longer when every single change in some part of the codebase requires you to re-test the entire application.

When your project gets bigger or the scope becomes open-ended (think: you're writing a library for arbitrary users, like Boost.Regex), you literally have no application or higher-level code to test the "integration" against -- unit tests are your only option. How else are you going to test something like regex_match?

> what the @#$@Q#$ is a unit anyway?

https://res.cloudinary.com/practicaldev/image/fetch/s--S_Bl5...

P.S. I have to also wonder, how much bigger was the entire engineering team compared to the 3-7 people you mention? And if it was significantly bigger, how often were they allowed to make changes to your team's code? It seems to me you probably had tight control over your code and it didn't see much flux from other engineers. Which, again, is quite a luxury and not scalable.


I learned early on to automate the test system. It went from a 30-minute setup to run a 5-hour test (which I wrote) to one command that took maybe a minute to run all tests (which I also wrote, and by the end, you could specify just what tests you wanted to run). And yes, that one command generated all the test data and ran all the processes required to test.

Towards the end, management was asking us to test for negatives ("Write tests to make sure that component T doesn't get a request when it isn't supposed to," when component T was a networked component that queried a DB not under our control). Oh, and our main business logic made concurrent requests to two different DBs, and again, I had to write code to test all possible combinations of replies, timeouts and dropped traffic to ensure we did The Right Thing. Not an easy thing to unit test, as the picture you linked to elegantly showed (and you sidestepped my question, I see).

The entire engineering team for the project was maybe 20, 25 people, but each team (five total) had full control over their particular realm, but all were required for the project as a whole. Our team did C and C++ on Solaris; three teams used Java (one for Android, and two on the server side) and the final team did the whole Javascript/HTML/CSS thang.

You're right that we didn't see much flux from the other teams, nor from our customer (singular---one of the Oligarchic Cell Phone Companies), but that's because the Oligarchic Cell Phone Company doesn't move fast, nor did any of the other teams want to deal with phone call flows (our code was connected to the Phone Network). We perhaps saw the least churn over the decade simply due to being part of the Phone Network---certainly the other teams had to deal with more churn than us (especially the Android and JS/HTML teams).

Also, each team (until new management took over) handled development differently; some teams used Agile, some scrum, some none. Each team had control. Until we didn't. And then things fell apart.

If I was developing a library, the only tests I might have would be to test the public API and nothing more. No testing of private (or internal) code as that would possibly churn too much to be useful. Also, as bugs are discovered, I would probably keep the code that proves the error to prevent further regressions if the API doesn't change.

One thing I did learn at that job is never underestimate what crap will be sent your way. I thought that the Oligarchic Cell Phone Company would send good data; yes for SS7 (the old telephony protocol stack), but not at all for SIP (the new, shiny protocol for the Intarweb age).


Unit tests are only worthwhile when your internal machinery is part of the contract.


How do you know the contracts are being followed?


Either through assertions, through fuzz testing, or by formal verification.


Maybe I’m confused on the jargon here, but aren’t unit tests assertions?


No, a unit test is a standalone bit of code that sets up the environment, runs a bit of code (the unit under test), and checks whether the code worked as expected. This does tend to use assertions.

The idea of using assertions is to put the assertions inside the 'unit under test' so they are checked every time the code is run (sometimes with a way to disable them for performance). Then you can run the code normally and don't need to write a separate bit of code that has to set up a proper environment (usually with a lot of 'mock' objects).

This style can probably test less, but it still works well for 'design by contract': you can confirm the caller stuck to any requirements on the input, and you can confirm the code stuck to any post-conditions on its results.
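
A minimal Python sketch of what that looks like (the function and its contract are invented for illustration):

    import math

    def sqrt_floor(n):
        # Precondition: the caller must pass a non-negative int.
        # Checked on every real call, not just in a test file.
        assert isinstance(n, int) and n >= 0, "sqrt_floor requires a non-negative int"
        r = math.isqrt(n)
        # Postcondition: verified every time the code runs, in any environment,
        # with no separate test harness or mocks.
        assert r * r <= n < (r + 1) * (r + 1)
        return r

Any ordinary run of the program then doubles as a check that both sides kept the contract.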


Assertions?


But assertions only tell you that you broke someone after the fact (and only if they're enabled)? They don't do anything to help prevent you from breaking them in the first place.


Of course they don't, how could they? How could anything, for that matter? (Apart from guarantees you could ensure statically, natch.)


Assertions triggering in integration tests are different from unit tests. They don't involve writing mocks. They are generally closer to the code, especially with design by contract, so they are more likely to be fixed during a refactor. They encode what you believe your code should do more directly. And they can be used as a basis for formal verification.


Hey, I'm arguing in favour of contracts and assertions here... XD

I even considered formal verification, if under another name.

Ed.

I think I hit a posting limit. After reading dataflow's answer to gp I think I interpreted ggp differently from both of you.

I meant quite literally neither assertions nor unit tests prevent you from breaking things. Of course they'll help you to catch those mistakes before release. I didn't think that counts as prevention.


> Of course they don't, how could they? How could anything, for that matter?

Uh, that's literally what unit tests do? They tell you if you broke any of the cases they test for, before you release your implementation to other people...


Well, that's a mutual misunderstanding then. I meant quite literally that neither assertions nor unit tests prevent you from breaking things. Of course they'll help you to catch those mistakes before release. I didn't think that counts as prevention, my bad. Assertions should trigger during system testing if any contracts are broken, so under your interpretation they do prevent breakage as much as unit tests do (meaning before shipping).


System tests prevent breakage as much as unit tests? Really?

How do you "system test", say, string.indexOf(string)? Could you write a system test for this example to show us what you mean and help us see how they're superior to unit tests?

And how long and complicated are your system tests to catch breakages "as much as" a unit test would, for this example?


You add an assertion in the indexOf method that checks that the substring is indeed present at that index. Then you run your system tests, in which the indexOf function gets called a bunch of times.
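
Sketched in Python, with index_of standing in for the real method (illustrative only):

    def index_of(haystack, needle):
        i = haystack.find(needle)  # stand-in for the actual search algorithm
        # Contract: if we report a match at index i, the needle really is there.
        # Every system test that exercises callers of index_of checks this for free.
        assert i == -1 or haystack[i:i + len(needle)] == needle
        return i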

The unit tests might be better at testing more edge cases. But they are a lot of cumbersome work (not in this example, but this is a rather toy example). Unit tests are also less self-documenting than assertions: they say what the code should do far away from the code itself. An assertion is right there in the code, it says what should be true, and unlike comments, you will be told if it stops being true.


Tests have the same effect, right?



