If you're interested in a rough benchmark, I compared musl's “two way” against a variation on this algorithm in my SIMD-optimized libc, which I linked in a sister comment.
I'd say it's a decent improvement, if not a spectacular one. musl does particularly well with known-length strings (memmem), whereas I'm allowed to cheat for unknown-length strings (strstr) because the Wasm environment lets me skirt some UB.
The NUL-terminated string curve ball wrecks many good algorithms.
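To make the memmem vs. strstr difference concrete, here's a rough sketch in plain portable C (not musl's code and not mine; `find_known_len` and `find_nul_terminated` are made-up names). With known lengths you can compute the last valid start position once and never look past it. With NUL-terminated input, every step also has to watch for the terminator, which is exactly what gets in the way of reading ahead in big blocks.

```c
#include <stddef.h>
#include <string.h>

/* Known lengths (memmem-style): the bounds are available up front. */
static const char *find_known_len(const char *hay, size_t hay_len,
                                  const char *needle, size_t needle_len)
{
    if (needle_len == 0) return hay;
    if (needle_len > hay_len) return NULL;
    /* One past the last position where the needle can still fit. */
    const char *end = hay + (hay_len - needle_len) + 1;
    const char *p = hay;
    while (p < end) {
        p = memchr(p, needle[0], (size_t)(end - p));
        if (p == NULL) return NULL;
        if (memcmp(p, needle, needle_len) == 0) return p;
        p++;
    }
    return NULL;
}

/* Unknown haystack length (strstr-style): every scan must stop at NUL. */
static const char *find_nul_terminated(const char *hay, const char *needle)
{
    size_t needle_len = strlen(needle);
    if (needle_len == 0) return hay;
    /* strchr stops at the terminator, so we never run off the haystack. */
    for (const char *p = hay; (p = strchr(p, needle[0])) != NULL; p++) {
        /* strncmp also stops at a NUL in the haystack, unlike memcmp. */
        if (strncmp(p, needle, needle_len) == 0) return p;
    }
    return NULL;
}
```

Both return a pointer to the first match or NULL. The second one is safe only because `strchr` and `strncmp` check for NUL byte by byte (conceptually), which is the cost a block-at-a-time algorithm has to work around.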
Looking at them, they would need a little work to be safe (no reads past the end of the needle or haystack). Probably not too much effort; the leftover data at the end (length mod block size) may need a different search.
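Something like this is what I have in mind for the safe version, as a sketch only. It is not the algorithm from the post: an 8-byte word with the classic "has zero byte" bit trick stands in for a SIMD register, and `block_search`, `block_has_byte` and `BLOCK` are names I made up. The point is just the structure: the block loop runs only while a full block of candidate starts fits, and whatever is left over goes through a plain scalar search.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 8 /* stand-in for a SIMD register width */

/* Nonzero if any of the BLOCK bytes at p equals c (exact, no false positives). */
static int block_has_byte(const unsigned char *p, unsigned char c)
{
    uint64_t v;
    memcpy(&v, p, BLOCK);                 /* unaligned-safe load */
    v ^= 0x0101010101010101ULL * c;       /* matching bytes become zero */
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

static const void *block_search(const void *hay, size_t hay_len,
                                const void *needle, size_t needle_len)
{
    const unsigned char *h = (const unsigned char *)hay;
    const unsigned char *n = (const unsigned char *)needle;
    if (needle_len == 0) return hay;
    if (needle_len > hay_len) return NULL;

    size_t last = hay_len - needle_len;   /* last valid start offset */
    size_t i = 0;

    /* Block loop: all BLOCK start offsets i..i+BLOCK-1 must be <= last,
       so the load and every memcmp below stay inside the haystack. */
    while (i + BLOCK <= last + 1) {
        if (block_has_byte(h + i, n[0])) {
            for (size_t j = i; j < i + BLOCK; j++)
                if (h[j] == n[0] && memcmp(h + j, n, needle_len) == 0)
                    return h + j;
        }
        i += BLOCK;
    }

    /* Tail: fewer than BLOCK candidate starts remain, search them one by one. */
    for (; i <= last; i++)
        if (h[i] == n[0] && memcmp(h + i, n, needle_len) == 0)
            return h + i;
    return NULL;
}
```

A real SIMD version would compare more than the first byte per block, and might handle the tail with an overlapping final block instead of a scalar loop, but the bounds logic would look about the same.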
Now we just need a name, and we can add it to the smart shootout. I wonder how it compares to the best SIMD algos.