The inner loop of those comparisons is indeed the spot where you can still speed...

JoachimSchipper · on Jan 21, 2013

I concur that doing this directly in your code is extremely ugly, but note that you can get this speedup by simply calling glibc's memchr(), which hides the ugliness behind a well-known interface.
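
For readers who want to see the shape of that swap, here is a minimal sketch in C. It assumes the inner loop in question is a plain scan for a single byte, which may not match the blog post's code exactly. The function names find_byte_naive and find_byte_memchr are hypothetical and only there for illustration. memchr() itself is standard C from <string.h>, and library implementations such as glibc's are typically optimized well beyond a byte-at-a-time loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hand-written scan: checks one byte per iteration.
 * Returns a pointer to the first occurrence of c, or NULL if absent. */
static const unsigned char *find_byte_naive(const unsigned char *buf,
                                            size_t len, unsigned char c)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == c)
            return buf + i;
    }
    return NULL;
}

/* Same contract, delegated to the C library's memchr().
 * Any speedup comes from the library's implementation, not from this code. */
static const unsigned char *find_byte_memchr(const unsigned char *buf,
                                             size_t len, unsigned char c)
{
    assert(buf != NULL); /* memchr() requires a valid pointer */
    return memchr(buf, c, len);
}
```

Both functions return the same pointer for the same input, so the memchr() version can replace the loop without touching the callers. How much faster it is depends on the platform's libc and the buffer sizes involved, so it is worth measuring.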

jacquesm · on Jan 21, 2013

Good point about memchr, I missed an opportunity there.

Edit: since it's both clearer and faster, that's a no-brainer. I'm updating the blog post with a remark to that effect and a link to the parent comment.