Me too. Finally I understand how the leakage works: no actual reading of kernel memory is taking place.
Instead, the speculative-execution logic causes one of two user-space addresses to be read and thus placed in the cache. By then reading both of them and timing each access, the exploit can indirectly determine one bit (0 or 1) of kernel memory. Scary!
Well, that value still has to be read into the cache, since the "kern_mem[address]&0x100" calculation is speculatively carried out. I don't think the MMU can do any bit-level computations.
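
To make the mechanism discussed above concrete, here is a rough toy model in Python. It is only a sketch under heavy assumptions: a set stands in for the CPU cache, made-up latencies stand in for a real timer, and names like ToyCache, speculative_gadget and leak_bit are hypothetical. It reads no real kernel memory and is not an exploit. It only shows why the value has to be read and tested for the timing difference to appear.

```python
# Toy model only: a Python set stands in for the CPU cache and fixed numbers
# stand in for measured access times. Nothing here touches real memory.

FAST, SLOW = 1, 100  # hypothetical latencies: cached vs. uncached access


class ToyCache:
    def __init__(self):
        self.lines = set()

    def flush(self):
        """Evict everything, as the attacker does before each attempt."""
        self.lines.clear()

    def access(self, addr):
        """Return the simulated latency and leave addr cached afterwards."""
        latency = FAST if addr in self.lines else SLOW
        self.lines.add(addr)
        return latency


def speculative_gadget(cache, kern_mem, address, mask, probe0, probe1):
    # The value is read and the bit test is computed here. In this model the
    # architectural result is discarded; only the cache side effect survives.
    value = kern_mem[address]
    cache.access(probe1 if value & mask else probe0)


def leak_bit(cache, kern_mem, address, mask):
    probe0, probe1 = "user_addr_0", "user_addr_1"  # two user-space lines
    cache.flush()
    speculative_gadget(cache, kern_mem, address, mask, probe0, probe1)
    # Time both probe lines; the faster one was touched speculatively.
    t1 = cache.access(probe1)
    t0 = cache.access(probe0)
    return 1 if t1 < t0 else 0


# Stand-in "kernel" words; the mask matches the 0x100 test quoted above.
kern_mem = [0x0100, 0x0001, 0x01FF]
bits = [leak_bit(ToyCache(), kern_mem, i, 0x100) for i in range(len(kern_mem))]
print(bits)  # [1, 0, 1]
```

On real hardware the two probe addresses would sit on separate cache lines and the timing would come from something like a cycle counter, with noise to filter out; the model skips all of that.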