Cool. You could probably simultaneously speed it up and simplify it by changing from linked list collision to open addressing with linear probes (but you may need a larger TSZ). Besides being more cache friendly for larger files, at the end you can then just qsort your table instead of first building a flat copy. :-) You could also use the unreduced hash value as a comparison prefix (only calling strcmp on different values). But it's not bad for potato code. ;-)
I didn't do open addressing because I didn't want to write code to resize the whole table. :)
I thought about also keeping the top 10 as I go instead of copying the whole table. But I'm guessing that virtually all the time this program spends is in I/O.