> As such, you can have good memory access patterns even in a high language like Java or C#.
CPU and data intensive heavy lifting is rarely done in such programs, it is delegated either to specialised libraries or some middleware in the form of a RDBMS. Most of these programs spend most of their time waiting for some IO event, so the few microseconds gained from the vector with a few hundred elements are negligible.
> But you can read a file from beginning to end.
That's what most programs actually do most of the time because files are essentially a stream abstraction. Programs that jump around a file would map it into memory, then the CPU and the kernel would do their best to cache the hot regions, even if the access to these regions is temporally or spatially distant.
> CPU and data intensive heavy lifting is rarely done in such programs
That's absurd. They aren't as high-performance as C or C++, but Java and C# both have screamingly fast JIT compilers and plenty of high-performance code is written in them. We're not talking about Prolog here. And memory access patterns absolutely makes a huge difference in performance in these languages.
Sure, you CAN ignore that kind of stuff if you want to, but good programmers don't.
Other posters have discussed the first part of your post.
But the second...
> That's what most programs actually do most of the time because files are essentially a stream abstraction. Programs that jump around a file would map it into memory, then the CPU and the kernel would do their best to cache the hot regions, even if the access to these regions is temporally or spatially distant.
Just an FYI: you should always use mmap (Linux / POSIX), or File-based Section Objects (Windows). I don't think streams have any performance benefit aside from backwards compatibility, and maybe clarity in some cases.
MMap and the Windows equivalent allows the kernel to share the virtual memory associated with that file across processes. So if the user opens a 2nd, or 3rd version of your program, more of it will be stored "hot" in the RAM.
Since mmap and section objects only use "virtual memory" (of which we have 48-bits worth on Intel / AMD x64 platforms), we are at virtually no risk of running out of ~256TB of virtual RAM available to our platforms.
CPU and data intensive heavy lifting is rarely done in such programs, it is delegated either to specialised libraries or some middleware in the form of a RDBMS. Most of these programs spend most of their time waiting for some IO event, so the few microseconds gained from the vector with a few hundred elements are negligible.
> But you can read a file from beginning to end.
That's what most programs actually do most of the time because files are essentially a stream abstraction. Programs that jump around a file would map it into memory, then the CPU and the kernel would do their best to cache the hot regions, even if the access to these regions is temporally or spatially distant.