
I definitely agree on the false dichotomy between performance and readable code.

> 1) Most code, i.e. at least 80% of the code in a codebase, will never be a performance hotspot and does not need specific optimizations (i.e. as long as the code does not do stupidly inefficient things, it's probably good enough).

The hard part is knowing what "stupidly inefficient things" are. If you never do "excessive" optimization, it's easy to drift towards less efficient code over time because your baseline for what performance is possible slows down. Knowing how to do performance optimization, knowing when it isn't worth it, and knowing what level of efficiency to shoot for is the mark of a good engineer.




Yeah, absent an effort to speed up a code base, it will tend to get slower, unless the team has a culture of performance: everyone looking for easy performance wins and taking them, and noticing common performance mistakes and getting rid of them.

To do that the team needs experience with performance, and most of internet programmer culture is just to lecture people about how programmers are cheaper than hardware if anyone asks or talks about performance. So many people don't get that experience.


> So many people don't get that experience.

This is very true. The misuse of the "premature optimization" mantra and the idea that valuable programmer time should never be spent on performance have done a disservice to a generation or two of developers. To the point that so many developers don't even believe performance has any significance and just assume the cloud magically makes things fast.

Spending a week optimizing some function to shave off a few seconds may be totally worth it; it just depends on how often it runs. If it is a code path being used by millions of people, it can quickly become very valuable.

Cost is another factor. Sure, you might be able to effortlessly spin up dozens or hundreds more instances on demand to handle the load because the code is so slow that one instance can barely handle a few dozen requests per second, but you're paying for that. Make an instance capable of handling a few thousand rps and suddenly you've saved very real money.

And battery life on all portable devices also benefits from more efficient CPU usage.


The sad part is the cloud means web based UI, and web based UI is rarely fast and lightweight.


> Yeah, absent an effort to speed up a code base, it will tend to get slower, unless the team has a culture of performance

My experience with the JVM over the last few years says otherwise. Most teams I have come across try too hard to achieve "performance" where very simple idiomatic code would have sufficed. Idiomatic code improves in performance with every release. Instead they've picked up dogmas based on JVMs of the past, and their "performance culture" is a farce.


Funnily enough, what I've seen most frequently with the JVM is that most optimizations are on the order of "You should have used a HashSet here, not a List that you keep sorting and removing duplicates from!"

The majority of performance issues I've run into aren't "Oh, I need just the right structure of code to make things go fast" but rather "I used a Map<String, Object> instead of a POJO, why is my code slow?"
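
To make that concrete, here's a rough sketch (class and field names invented, not from any real codebase): the Map-as-a-record version boxes every field, hashes every lookup, and needs a cast at each use site, while the plain class is just typed field reads.

  import java.util.HashMap;
  import java.util.Map;

  class MapVsPojo {
    // "Map as a record": every access is a hash lookup, an unbox, and a cast.
    static double totalFromMap(Map<String, Object> order) {
      return (int) order.get("quantity") * (double) order.get("unitPrice");
    }

    // Plain old Java object: typed fields, direct reads, JIT-friendly.
    static final class Order {
      final int quantity;
      final double unitPrice;
      Order(int quantity, double unitPrice) {
        this.quantity = quantity;
        this.unitPrice = unitPrice;
      }
    }

    static double totalFromPojo(Order order) {
      return order.quantity * order.unitPrice;
    }

    public static void main(String[] args) {
      Map<String, Object> m = new HashMap<>();
      m.put("quantity", 3);
      m.put("unitPrice", 10.5);
      System.out.println(totalFromMap(m));                     // 31.5
      System.out.println(totalFromPojo(new Order(3, 10.5)));   // 31.5
    }
  }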

I think a big problem is that "don't prematurely optimize" has been taken to mean "never think about how your algorithm will perform". It lets someone write an n^2 or n^3 algorithm when the n algorithm is just as readable (usually more so!).


The problem with all this is that performance optimization is nothing without measurement. How many items are in your list / map in the regular case? Do you know? And if the number is small, how does a Map actually compare to a list with that many elements?
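
For illustration, a throwaway sketch like the one below (not a proper benchmark: no JMH, no warmup discipline, and the numbers will vary by JVM and hardware; the same exercise works for a map versus a list of pairs) is usually enough to find where the crossover actually is for your sizes.

  import java.util.ArrayList;
  import java.util.HashSet;
  import java.util.List;
  import java.util.Set;

  class SmallNLookup {
    public static void main(String[] args) {
      for (int n : new int[] {4, 64, 1_024, 16_384}) {
        List<Integer> list = new ArrayList<>();
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < n; i++) { list.add(i); set.add(i); }

        long hits = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) if (list.contains(i % n)) hits++;   // linear scan
        long t1 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) if (set.contains(i % n)) hits++;    // hash lookup
        long t2 = System.nanoTime();

        System.out.printf("n=%d  list=%d us  set=%d us  (hits=%d)%n",
            n, (t1 - t0) / 1_000, (t2 - t1) / 1_000, hits);
      }
    }
  }

Typically the list is competitive at the smallest sizes and falls off a cliff at the larger ones, but the point is to know that for your data rather than guess.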

Oh this function is slow? How often is it called? If it’s only called once, maybe it’s fine.

Performance optimization work should be guided by real data, and knowledge of your users. What features matter the most? Are your users on network constrained devices? Are they running out of disk space? Are GC pauses a problem? What is the normal / maximum size of the input to your software?

Without knowing this stuff, you can’t focus your attention on the right things re: performance. You’ll end up spending time and energy making the wrong things fast with marginal benefit.


I'd say significantly reducing big-O complexity (e.g. N^2 to N log N) is worth it without measurement. Even if the difference is not significant during your tests, that doesn't mean it won't be a catastrophe later as N increases, maybe even a vulnerability. Using the best algorithm in terms of big-O removes a potential problem.

Keeping the suboptimal algorithm is actually the hard optimization path, because it requires you to characterize your data and maybe keep a note somewhere explaining why you chose that algorithm and what its limitations are.
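
To make the kind of reduction I mean concrete, here's an invented example: both versions detect duplicates and read about the same, but only one of them survives N getting large.

  import java.util.Arrays;

  class DupCheck {
    // O(N^2): compare every pair.
    static boolean hasDupQuadratic(int[] a) {
      for (int i = 0; i < a.length; i++)
        for (int j = i + 1; j < a.length; j++)
          if (a[i] == a[j]) return true;
      return false;
    }

    // O(N log N): sort a copy, then any duplicates are adjacent.
    static boolean hasDupSorted(int[] a) {
      int[] copy = a.clone();
      Arrays.sort(copy);
      for (int i = 1; i < copy.length; i++)
        if (copy[i] == copy[i - 1]) return true;
      return false;
    }

    public static void main(String[] args) {
      int[] data = {3, 1, 4, 1, 5};
      System.out.println(hasDupQuadratic(data)); // true
      System.out.println(hasDupSorted(data));    // true
    }
  }

Same readability, but the second one stays usable when someone feeds it a million entries.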


I loathe big-O arguments, and so will happily disagree with this advice.

Big-O is by definition only useful when N is large/very large. In many problem spaces, N is bounded, or ought to be bounded at e.g. the input validation level.

That doesn't even capture all the details that big-O glosses over, like how many ops are actually executed, cache coherency, etc etc.

Prefer simpler code until profiled; that's it. Once a bottleneck is identified, big-O might be a useful tool for discussing improvements, far behind just measuring it.


This all presupposes that reducing big-O complexity increases code complexity. In my experience, that's simply not true. Unless you are writing C without libraries, you are almost certainly dealing with a language rich in data structures that easily reduce big-O complexity while clearly communicating intent.

A dictionary will generally beat storing values as key/value in a list, de-duplicating and linearly searching on the key. If you don't believe that, then go ahead and pollute your codebase with List<Pair<Key, Value>>. See how fast that will be.

Heck, go ahead and use and manage a Pair<Key, Value>[]. After all, we wouldn't want to prematurely optimize by using a data structure that automatically manages growth.

Maybe just store everything as a giant String. You can just keep concatenating new values and searching for others in it. Why let those pesky types get in the way of easy to understand string manipulation and regexs? After all, don't prematurely optimize.

Hey, perhaps you should always just bundle and distribute perl with your applications and fork off new instances of perl since it has such nice regex capabilities. After all, don't prematurely optimize.

Have you measured that performance? Halt before you say it's a bad idea, because obviously we need to profile everything and make sure that starting new perl processes is actually slow. Can't be prematurely optimizing.
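
And to spell out the original comparison non-sarcastically (illustrative code, not from any real project; Map.Entry stands in for Pair since Java has no standard one): the list-of-pairs version re-scans the whole list on every insert and lookup, while the map does the same job in one line per operation and says what it means.

  import java.util.AbstractMap;
  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  class PairsVsMap {
    // Key/value kept in a list: O(n) de-duplication and O(n) lookup.
    static void putPair(List<Map.Entry<String, Integer>> kv, String k, Integer v) {
      kv.removeIf(e -> e.getKey().equals(k));           // linear scan to de-duplicate
      kv.add(new AbstractMap.SimpleEntry<>(k, v));
    }

    static Integer getPair(List<Map.Entry<String, Integer>> kv, String k) {
      for (Map.Entry<String, Integer> e : kv)           // linear scan to find the key
        if (e.getKey().equals(k)) return e.getValue();
      return null;
    }

    public static void main(String[] args) {
      List<Map.Entry<String, Integer>> kv = new ArrayList<>();
      putPair(kv, "a", 1);
      putPair(kv, "a", 2);
      System.out.println(getPair(kv, "a"));     // 2

      Map<String, Integer> m = new HashMap<>(); // same behavior, O(1) expected per op
      m.put("a", 1);
      m.put("a", 2);
      System.out.println(m.get("a"));           // 2
    }
  }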


> Unless you are writing C without libraries, you are almost certainly dealing with a language rich in data structures that easily reduce big-O complexity while clearly communicating intent.

Just to name it, I’m increasingly finding myself in positions where off the shelf data structures don’t solve the problems I have. I’m writing a novel CRDT for collaborative text editing. Some data structures I’ve needed:

- A generic run-length encoded vector that I can append to and binary search. (There's a simplified sketch at the end of this comment.)

- A generic run-length encoded b-tree which indexes by the current total length of the sub tree.

- A gap buffer which stores run-length encoded items. Items can be split by a freshly inserted item. And each item has an array of states with a size set when the data structure is created. I might end up moving to a skip list of gap buffers.

Needing a lot of custom data structures is pretty rare. And if I was forced to, I could probably cobble together something that would perform well enough on top of the built in container classes. But doing it all custom is fun. My performance results justify the approach - my system performs orders of magnitude better than any other implementation.
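
To illustrate the first of those, here is a heavily simplified sketch (integer IDs only, nowhere near the real thing): a vector of runs of consecutive IDs, where an append merges into the last run when possible and a positional lookup binary searches the run offsets.

  import java.util.ArrayList;

  class RleVector {
    // Each run is {globalStartIndex, firstId, length}; ids inside a run are consecutive.
    private final ArrayList<long[]> runs = new ArrayList<>();
    private long totalLen = 0;

    void append(long id) {
      if (!runs.isEmpty()) {
        long[] last = runs.get(runs.size() - 1);
        if (id == last[1] + last[2]) { last[2]++; totalLen++; return; } // extend the last run
      }
      runs.add(new long[] {totalLen, id, 1});
      totalLen++;
    }

    // Find the id at a global position by binary searching the run start offsets.
    long get(long pos) {
      int lo = 0, hi = runs.size() - 1;
      while (lo < hi) {
        int mid = (lo + hi + 1) >>> 1;
        if (runs.get(mid)[0] <= pos) lo = mid; else hi = mid - 1;
      }
      long[] r = runs.get(lo);
      return r[1] + (pos - r[0]);
    }

    public static void main(String[] args) {
      RleVector v = new RleVector();
      for (long id = 100; id < 100_000; id++) v.append(id); // collapses into a single run
      System.out.println(v.get(12_345));                    // 12445
    }
  }

Long runs of sequential inserts collapse into a handful of entries, which is where the memory and lookup wins come from.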


> A dictionary will generally beat storing values as key/value in a list, de-duplicating and linearly searching on the key. If you don't believe that

I don't believe that, although "generally" is doing heavy lifting here. It depends on the key, the value, the number of key/value pairs, how good the hash algorithm is, etc. Yeah, standard implementations are pretty good nowadays. If you don't want to bother understanding the problem space fully (which, like you said, is fine, otherwise we'd get nowhere), use a standard tool to solve the problem.

So yes, by all means use a hash map if it makes it easier to read/reason about. Hash maps after all are a common tool for this. Just don't start with the big-O crap.


the examples you are responding to are not trading off large-n performance against small-n performance or performance against some other virtue such as clarity

rather, they are unnecessarily slow for all n and also unnecessarily complicated to understand and maintain


> The problem with all this is that performance optimization is nothing without measurement.

Not what I'm arguing. I agree. When making code go fast, you should do it in terms of what the profiler tells you to fix.

However, when you are writing code, it is not wasted time to use the right data structures for the problem.

I've literally seen code like this

  public void addItem(List<Foo> foo, Foo item) {
    if (!foo.contains(item)) {
      foo.add(item);
    }
  }
Is that code easier to read or understand than

  Set<Foo> foo = new HashSet<>();
  foo.add(item);
No (and that's the change I made).

I contend that the earlier version was bad code. The dev wasn't thinking about runtime efficiency, they just liked lists and kept using them. This kind of thinking is fairly common (particularly with young devs that aren't familiar with what's available).

And this is where the "No, never ever think about performance while writing code" that you are espousing is, IMO, detrimental to the craft. "Is there a data structure that meets the needs of this code?" is a basic question, and shouting "PREMATURE OPTIMIZATION EVIL!!!!!" gets in the way of that sort of consideration.

It's sloppy development work.

The only time you SHOULD use the first solution is when you've found that, in a tight loop where this method is called, the number of items is small and cache locality ends up trumping algorithmic complexity. That is, data should back the choice of an n^2 algorithm over an n algorithm, not the other way around. Especially when the only difference between the two solutions is the data structure.

But further, the code's intent is clearer in the second solution. So it's not only (generally) faster, it also does a better job of communicating intent to future developers. Win-win.

I hate that so many devs cargo cult here.

The full quote is

> Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Knuth wrote that in an era when premature optimization meant breaking out some hand-crafted assembly that obscured the intent of the code. He was not talking about optimizations which improve debugging and maintenance, like the improvements the OP posted in TFA.

The optimizations I'm talking about I do not call "premature". Just like it's not premature to use the SDK's sort function instead of your own bubble sort implementation, no matter how loudly someone chants "oh, don't prematurely optimize!". An optimization that makes code easier to maintain and debug is never premature.


> And this is where the "No, never ever think about performance while writing code" that you are espousing is, IMO, detrimental to the craft.

That’s not my position. I think we broadly agree. I believe most software that is fast was designed to be fast. And the way you get there is by understanding your problem space well, and benchmarking with real-world data as you build. There are almost always ways to take advantage of knowledge of the data to make your code faster.

As an example, in some performance sensitive rust code I’m working on, I have a lot of sets of numbers. The sets are tiny - most of them only contain 1 element. 95% have 1 or 2 elements. I could use HashSet, but instead I’m using SmallVec<[usize; 2]> - which inlines up to 2 items in a list. It’s uglier and more complicated, but in this case it is much faster than the alternative. I know to use this approach because I know my input data.
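
In Java terms, the same trick looks roughly like this (a rough analogue for illustration only, not the actual Rust code): keep up to two elements in plain fields and only allocate a real HashSet when a third element shows up.

  import java.util.HashSet;
  import java.util.Set;

  class SmallLongSet {
    private long a, b;
    private int size;            // 0, 1, or 2 while still inline
    private Set<Long> spill;     // allocated only when a third element arrives

    boolean add(long x) {
      if (spill != null) return spill.add(x);
      if (size > 0 && a == x) return false;
      if (size > 1 && b == x) return false;
      if (size == 0) { a = x; size = 1; return true; }
      if (size == 1) { b = x; size = 2; return true; }
      spill = new HashSet<>();   // rare path: fall back to a real set
      spill.add(a); spill.add(b); spill.add(x);
      return true;
    }

    boolean contains(long x) {
      if (spill != null) return spill.contains(x);
      return (size > 0 && a == x) || (size > 1 && b == x);
    }
  }

Since 95% of the sets never grow past two elements, almost every lookup is a couple of field compares with no allocation or hashing at all.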

I agree with other commenters in this thread. Big-O analysis is often misleading. It leads you to pick a good algorithm, but often at the expense of optimizations that could give you 1-2 orders of magnitude better performance if you go looking for them. Things like cache aware memory layouts, reducing allocations via arena allocators and object pools, SIMD, and fast paths in complex functions for common calls.


> To do that the team needs experience with performance, and most of internet programmer culture is just to lecture people about how programmers are cheaper than hardware if anyone asks or talks about performance.

But this is an absolute truth, isn't it?

I mean, what's more important to your business: avoiding the need to launch an EC2 instance, or implementing that user flow in one day instead of one week? Do you care if it takes 50ms to show your dialog box instead of 40ms? Do you really care if you pass all your objects by value, with a few deep copies, instead of passing everything by reference?

The truth of the matter is that, in general, performance bottlenecks are caused by software architecture and algorithms, not "clean code" implementations.


> programmers are cheaper than hardware

Did you mean the reverse of this, I assume?


Depends on your install base/target.

At Apple scale, paying a few performance engineers $1M/year is much cheaper than shipping a bigger and more powerful CPU in each iPhone.

Same thing for AWS or Azure.


Modern macOS is pretty unusably slow on old Apple hardware. The same hardware ran earlier versions of the OS just fine. I’m not sure what’s going wrong at Apple, but I don’t think they have enough performance engineers working on the Mac.

I run a CPU monitor in my dash (and have for years). The efficiency cores are very often busy doing some rubbish tasks I didn’t ask for while my computer is idle. Things like reindexing files, or re-scanning my photos for faces, or something like that. I’ve started to think of them as the CPU cores Apple put in my computer to keep their useless internal projects from ruining their hardware.

Linux annoys me regularly, but at least my cores are almost totally idle whenever my hands aren’t on the keyboard.


WindowServer pegs the CPU on older hardware doing any sort of trackpad gesture, etc. That wasn't the case in Mojave and adjacent releases. Big Sur onwards is noticeably worse than the earlier releases. Disabling animations/transparency saves a bit, but it's still remarkably poor.


> your baseline for what performance is possible slows down.

This is a problem that absolutely plagues our field. If your project isn't some wrapper around incredibly complicated math or handling files well into the GB range, there's no excuse for a desktop application to take noticeable time to do anything. Yet here we are, with it taking about half a second just to open the search bar on Windows. What is it doing? Why does it take so long? Who knows?


IO on the UI thread.


> The hard part is knowing what "stupidly inefficient things" are.

There are things that are quite obviously non-critical in terms of performance.

One time I had a junior engineer insisting in a PR that we should use a few low-level performance tricks in the code because they were faster, and the code in question opened a dialog box.

Also, in general, performance bottlenecks are caused by architectural and algorithmic choices, not readability choices. If we're in a domain where, say, inheritance is an unacceptable performance bottleneck, and we need to unroll loops, and we can't extract functions because the cost of pushing a stack frame is prohibitive... we are in an extremely performance-sensitive part of the code.

But most of the time we're far from any of that.


If inheritance and function calls are a bottleneck, it is probably time to change programming languages.


LLVM had that problem. But if you are already writing in C++, where do you switch to? (Spoiler: it's code generation).


I can't fathom how inheritance could be the issue. Virtual inheritance maybe?


Yep, it's entirely possible to get into a scenario where you can't determine where things are inefficient, because the inefficiency is everywhere. Once you're in that spot, you're in for a bad time.



