
The idea that "memory doesn't matter" and "we're mostly waiting on the network anyway" seems like red herring.

It's not a "micro-optimization" to think of your data first. A program is a sequence of transformations on a stream of data. Figure out what your data is and how you want to transform it, and your program will become evident.

FP is a fine tool for programmers to have, but you still have to think about the data regardless. If you don't understand how your VM's allocator works, you might end up fragmenting your pools and over-allocating virtual memory for a simple process. For some applications that might matter.
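
A contrived sketch of the kind of fragmentation pattern I mean (sizes and counts are made up, and real allocator behavior varies):

    // Interleave long-lived small blocks with short-lived large ones.
    #include <cstdlib>
    #include <vector>

    int main() {
        std::vector<void*> small, large;
        for (int i = 0; i < 100000; ++i) {
            small.push_back(std::malloc(64));    // long-lived
            large.push_back(std::malloc(4096));  // short-lived
        }
        // Freeing the large blocks leaves 4 KiB holes pinned between
        // live 64-byte blocks...
        for (void* p : large) std::free(p);
        // ...which can't satisfy bigger requests, so the allocator may
        // grow the heap instead of reusing the holes.
        for (int i = 0; i < 1000; ++i) small.push_back(std::malloc(65536));
        for (void* p : small) std::free(p);
    }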




Memory didn't matter in the past. At some point caches weren't important, because CPUs weren't running that much faster than main memory. Then CPUs got much faster and caches started to become very important.

Then at some point networking was very slow, and it didn't matter if you wrote your code in a scripting language if all you did was wait for network packets. That was around Pentium 4 days or so: CPU speed was doubling quickly, and 1Gbps cards and switches were still kind of fancy and expensive.

But then it all kind of changed. Caches are important now: thrash your cache around and you can take a serious performance hit. Even kernel code can't keep up with network wire speed in the 10G range.
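
The classic demonstration is scanning the same matrix with and against its layout (a toy sketch, sizes arbitrary):

    #include <cstddef>
    #include <vector>

    long long sum_in_layout_order(const std::vector<int>& m, std::size_t n) {
        long long s = 0;
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                s += m[i * n + j];   // sequential: streams cache lines
        return s;
    }

    long long sum_against_layout(const std::vector<int>& m, std::size_t n) {
        long long s = 0;
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t i = 0; i < n; ++i)
                s += m[i * n + j];   // stride of n ints: a miss on
        return s;                    // nearly every access for large n
    }

    int main() {
        const std::size_t n = 4096;
        std::vector<int> m(n * n, 1);
        // Same answer, wildly different cache behavior.
        return static_cast<int>(sum_in_layout_order(m, n) -
                                sum_against_layout(m, n));  // 0
    }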

Long story short, a lot of performance heuristics and folk knowledge have to be re-evaluated periodically.


What you need really depends much more on the particular workloads you have. It is still often true that code is waiting on network packets or disk I/O; that didn't just change with the decade. This kind of folk wisdom is pretty useless on the whole, and we should push people much harder to measure and find their specific problems rather than operating by rule of thumb.


Totally agree.

Which is why I find glib comments that FP languages make thinking about memory management unnecessary to be disingenuous at best.

You should still be aware of these things even if you're freezing a thunk off to a queue while your thread processes some other stack until that network message comes back. The hardware is your friend! Feed it right and it will reward you.
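
Roughly this pattern, for the record (a toy sketch; a real version would hang the continuation off an event loop or scheduler):

    #include <functional>
    #include <iostream>
    #include <queue>
    #include <string>

    int main() {
        std::queue<std::function<void(const std::string&)>> pending;

        // "Freeze the thunk": capture what to do with the eventual reply.
        pending.push([](const std::string& reply) {
            std::cout << "got reply: " << reply << "\n";
        });

        // ...the thread is free to process some other stack here...

        // The network message comes back; run the stored continuation.
        const std::string reply = "200 OK";  // stand-in for a real recv()
        pending.front()(reply);
        pending.pop();
    }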


At what point in the past did memory not matter?


Memory speed didn't matter back in the early 486/586 days. You just didn't think about cache misses as much, because the speed disparity wasn't that great.


I question whether memory was literally not a concern, though. Was the CPU as likely to outrun memory? No. But did you still try to minimize the amount of data that went through memory for overall speed? I would think so.

And this is ignoring the fact that hard drives were still ridiculously slow. So, really, the concern has always been that there are large chunks of memory that are not fast. Over time, the definition of "not fast" has changed. But the practical consideration has remained that keeping a small data set will be faster than a larger one.


But "real memory" is neither what C presents or the copy semantics that is used in FP.

The CPU keeps memory in 64-byte cache lines. There is a complex bus protocol to shuffle cache lines, and subparts of cache lines, to and from main memory.

There are additional complex protocols for cache coherence.

The cost of reading 64 bytes from memory into a cache line and then, on write-back, storing it at a different location in main memory is essentially zero.

Memory is always being copied into our L1 and L2 cache.

Copying data eliminates most of the need for the cache coherence protocols, which are complex and costly.
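
False sharing gives a concrete taste of that cost: two cores mutating the same line pay for coherence; two cores with their own lines (their own copies, in effect) don't. A sketch, with arbitrary counts (build with -pthread):

    #include <atomic>
    #include <thread>

    struct SharedLine {                 // both counters on one cache line
        std::atomic<long> a{0};
        std::atomic<long> b{0};
    };

    struct OwnLines {                   // one 64-byte line per counter
        alignas(64) std::atomic<long> a{0};
        alignas(64) std::atomic<long> b{0};
    };

    template <typename Counters>
    void bump(Counters& c) {
        std::thread t1([&c] { for (int i = 0; i < 10000000; ++i) c.a++; });
        std::thread t2([&c] { for (int i = 0; i < 10000000; ++i) c.b++; });
        t1.join();
        t2.join();
    }

    int main() {
        SharedLine s;
        OwnLines o;
        bump(s);  // the line ping-pongs between cores: coherence traffic
        bump(o);  // typically much faster: no line is actually shared
    }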

Yes, we get a lot of this for free in the CPU implementation, but a lot of complexity goes into imposing what is really starting to be an unnatural model (mutable memory) on a hierarchical memory system.

FP uses a log-based model: you write to fresh memory, so there is no aliasing, no coherence, no conflicts. Then, during GC, you remap memory.

Current hardware can't "remap memory" efficiently, but it seems like the FP approach, in one form or another, is the better one for dealing with high scalability and deep memory hierarchies: write to fresh memory, then expose a remap operation at the hardware level.

SSDs are a bit like that internally: the flash translation layer writes to fresh blocks and remaps logical addresses to them.


On the other hand, pointer chasing is decidedly not memory friendly. And mutable memory has a great deal of mechanical sympathy! A function's entire stack frame can live in the L1 or L2 cache. Imagine something like determining the length of a linked list, and then GCing all of the intermediate values.
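
To sketch the contrast (toy types, not a benchmark):

    #include <cstddef>
    #include <vector>

    struct Node {
        int value;
        Node* next;
    };

    // One dependent load per node: the next address isn't known until
    // the current load completes, and each node may be on a cold line.
    std::size_t list_length(const Node* head) {
        std::size_t n = 0;
        for (const Node* p = head; p != nullptr; p = p->next)
            ++n;
        return n;
    }

    // Sequential scan of contiguous memory: the prefetcher stays ahead.
    long long vec_sum(const std::vector<int>& v) {
        long long s = 0;
        for (int x : v)
            s += x;
        return s;
    }

    int main() {
        Node c{3, nullptr}, b{2, &c}, a{1, &b};
        std::vector<int> v{1, 2, 3};
        return static_cast<int>(list_length(&a) + vec_sum(v)) - 9;  // 0
    }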

I am not sure what the cost is of the cache coherency hardware, but if it were high, then presumably single-core CPUs would have a big advantage on single-threaded workloads. That doesn't seem to be the case.


I agree that it's foolish to ignore memory issues. We've seen horrific performance and scaling in a lot of software that did exactly that. Anyone thinking otherwise can feel free to disable their processor or app cache to see how unimportant memory concerns are. ;)

Any HLL should give performance-conscious designers a clear mental model of the performance aspects of their code. C and C++ programmers, for instance, understand the costs of their abstractions, how to code in a cache-efficient way, and so on. I read on the old LISP web sites and mailing lists that their users could similarly estimate the cost of certain constructions and had tricks to squeeze out extra performance, which they hid behind macros, etc. So I'm sure these other functional languages can do something similar.


I found it correct: what the article says is that we should focus first on the actual bottlenecks before worrying about other optimizations, not that we should simply look the other way.

If your bottleneck is at that point (copying data structures), then there are functional data structures for that as well. And if performance there is still critical, you can use a thread-safe mutable data structure at that point, separate data from code in the rest of your program, and still keep a big chunk of functional code that is easy to reason about.
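
For instance, a bare-bones persistent list (illustrative, not any particular library): an "update" conses a new head onto a shared tail instead of copying the whole structure.

    #include <memory>

    struct PNode;
    using PList = std::shared_ptr<const PNode>;

    struct PNode {
        int value;
        PList next;
    };

    PList cons(int v, PList tail) {
        return std::make_shared<const PNode>(PNode{v, std::move(tail)});
    }

    int main() {
        PList xs = cons(1, cons(2, cons(3, nullptr)));
        PList ys = cons(0, xs);  // a new version; the tail 1->2->3 is
        (void)ys;                // shared with xs, never copied
    }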


I am referring to data-oriented design[0], not optimization: thinking about your data, access patterns, and transformations is not a premature optimization in the Knuthian sense. It's just plain old engineering.
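
A standard illustration of that mindset (the entity fields are invented for the example): if the hot loop only touches positions and velocities, lay them out contiguously so it loads only the bytes it uses.

    #include <cstddef>
    #include <vector>

    struct Entity {                    // array-of-structs: health rides
        float x, y, z;                 // along in every cache line the
        float vx, vy, vz;              // movement pass loads
        int health;
    };

    struct Entities {                  // struct-of-arrays: the movement
        std::vector<float> x, y, z;    // pass streams through position
        std::vector<float> vx, vy, vz; // and velocity data, nothing else
        std::vector<int> health;
    };

    void advance(Entities& e, float dt) {
        for (std::size_t i = 0; i < e.x.size(); ++i) {
            e.x[i] += e.vx[i] * dt;
            e.y[i] += e.vy[i] * dt;
            e.z[i] += e.vz[i] * dt;
        }
    }

    int main() {
        Entities e;
        e.x = e.y = e.z = std::vector<float>(1024, 0.0f);
        e.vx = e.vy = e.vz = std::vector<float>(1024, 1.0f);
        advance(e, 0.016f);
    }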

It's not incorrect per se, just misleading in my opinion.

[0] http://dataorienteddesign.com/site.php



