Thank you for stating the obvious.
Instruction byte count is 100% the wrong metric here.
Instruction count (given reasonable decoding/timing constraints) is the thing to optimize for, and indeed variable-length encoding is very bad.
Instruction byte count matters quite a lot when you're buying ROM in volume. And today, the main commercial battleground for RISC-V is the microcontroller space, where people care about these things.
For those of us without the expertise, could you elaborate on why that is?
On the one hand we have byte count, with its obvious effect on cache space used.
But to those of us who don't know, why is instruction count so important?
There's macro-op fusion, which admittedly would burn transistors that could be used for other things. Could you elaborate on why it's not sufficient?
And then there's the fact that modern x86 does the opposite of macro-op fusion, actually splitting up CISC instructions into micro-ops. Why would it be so bad if there were more micro-ops to start with, given that Intel chooses to do this anyway?
For those missing the context of the parent comment: this HN post originally linked to @damageboy's tweet at https://twitter.com/damageboy/status/1194751035136450560 showing a 20% performance hit, but was later changed by mods to link to the phoronix.com article.
Fault? He is getting free publicity to the point that he's even on the front page of HN (not that he cares about this specifically). Show me the last time that happened with a French book up for a prize.
Would be interesting to see how xsv compares to miller (https://johnkerl.org/miller/doc/index.html) in terms of perf; this tool arrives just as I'm about to munge 1TB of gzipped csv files.
Unfortunately, the main operation I need is not supported by xsv...
What a load of crap...
You just wasted 2 GB of address space, not of DRAM...
You'll "waste" exactly the amount of stack each thread actually touches, rounded up to PAGE_SIZE, which is usually 4 KB.
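A minimal sketch of what I mean, assuming Linux/glibc with default overcommit and demand paging (the 8 MB stack size and the 64 KB the worker touches are just illustrative numbers I picked):

    /* stack_demo.c -- build with: gcc stack_demo.c -pthread */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *worker(void *arg)
    {
        (void)arg;
        /* Touch only ~64 KB of the reserved 8 MB stack; the kernel
           commits physical pages (PAGE_SIZE granularity) only for
           what is actually touched. */
        char buf[64 * 1024];
        for (long i = 0; i < (long)sizeof buf; i += 4096)
            *(volatile char *)&buf[i] = 1;
        sleep(30); /* keep the thread alive so you can inspect /proc */
        return NULL;
    }

    int main(void)
    {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        /* Reserve an 8 MB stack: that's 8 MB of virtual address space,
           but resident memory (VmRSS) grows only as the stack is used. */
        pthread_attr_setstacksize(&attr, 8 * 1024 * 1024);

        pthread_t t;
        pthread_create(&t, &attr, worker, NULL);
        printf("compare VmSize vs VmRSS in /proc/%d/status\n", (int)getpid());
        pthread_join(t, NULL);
        pthread_attr_destroy(&attr);
        return 0;
    }

Spawn a few hundred of these threads and you'd expect VmSize to jump by hundreds of MB while VmRSS barely moves; that's the address-space-vs-DRAM distinction.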