Interesting. I've used this same file for a test on: - I7-7920HQ, MacOS Rust ver...

cb321 · on Oct 18, 2020

Cool. And, yeah, mmap()ing the whole file would also auto-reject non-regular files at the OS layer, fixing that TOCTTOU issue. I almost did that, but I'm sure there are various other tricks, too.
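For readers unfamiliar with the TOCTTOU point, here is a minimal Python sketch of one of those "other tricks". It is not the code from any of the versions in this thread. The idea is to open first and then fstat() the descriptor, so the check applies to the object that was actually opened rather than to whatever the path named a moment earlier. The function name read_regular_file is hypothetical, and the sketch assumes a POSIX system where os.O_NONBLOCK is available.

```python
import os
import stat


def read_regular_file(path):
    """Return the bytes of `path`, or raise ValueError if it is not a regular file."""
    # O_NONBLOCK keeps open() from blocking on a FIFO that has no writer.
    # It has no effect on reads from a regular file.
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # fstat() inspects the already-opened descriptor, so the path cannot
        # be swapped for something else between the check and the read.
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise ValueError(f"{path!r} is not a regular file")
        chunks = []
        while True:
            chunk = os.read(fd, 1 << 16)
            if not chunk:  # empty read means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Calling read_regular_file on a path that is a named pipe should raise ValueError right away instead of waiting for a writer.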

I mostly thought Nim deserved to be seen, and then it happened to also be faster, perhaps giving folks a slight Bayesian update on presumptions of performance. :-)

Measter · on Oct 19, 2020

Would you be willing to test my implementation[0] also? It'd be interesting to see the overhead of the (overkill) string normalization I do, along with the much reduced pressure on the allocator compared to yours.

[0] https://gist.github.com/Measter/e2e287ee21311d34ea8eb8cd9d57...

cb321 · on Oct 19, 2020

Well, I got 93 ms on the same Tale Of Two Cities file, 2x slower than dga's version and 18x slower than my last 5.0 ms Nim version (done to properly avoid hanging forever if someone does a "mknod foo.txt p") mentioned elsewhere in this thread (https://news.ycombinator.com/item?id=24822429).

Really, though, all the code for all 6 versions (two Nim, one C, two Rust, one C++) as well as the input file is available to all. So you could (and should) double-check for yourself. As dga mentions in his updated blog, there is a lot of compiler/CPU sensitivity.

Measter · on Oct 19, 2020

Huh... that's actually faster than I expected. I figured that the normalization processing overhead would be higher than that.