I believe this is because the code `counts[word]++` basically works by:
1. Hash word, retrieve the value from the counts map
2. Increment the value
3. Hash word, store the value into the counts map
By change counts to a map of strings to pointers of ints, you only need to perform #3 once per unique word. Once you've read the int pointer, you need only write to the pointer and not re-hash the word and write that to the map, which can be expensive.
I am surprised this doesn't get optimized by the compiler. I assume this is necessary in the general case (map could be modified concurrently, causing the bucket for the key to move) but it obviously isn't here.
https://github.com/benhoyt/countwords/blob/master/optimized....