
I built Hyperscan with my team in Australia (we've since moved on, or been moved on, from Intel). There are no shenanigans to prevent Hyperscan from using AVX2 on non-IA processors, although there weren't any high-performance non-IA examples to test against when we were last at Intel (2017).

The mechanism is trivial to control IIRC and would not be difficult to patch. It is, after all and unlike MKL, open source.
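For anyone wondering what that mechanism looks like: here's a minimal sketch (illustrative only, not Hyperscan's actual dispatch code; assumes GCC or Clang on x86) of keying runtime dispatch on the CPUID AVX2 feature bit rather than the vendor string, which any x86 CPU reporting the feature will pass regardless of who made it:

    #include <stdio.h>

    /* Illustrative only, not Hyperscan's actual dispatch code: test the
     * AVX2 CPUID feature bit via GCC/Clang builtins. Any x86 CPU that
     * reports the feature passes, regardless of vendor string. */
    static int have_avx2(void) {
        __builtin_cpu_init();                  /* populate feature flags */
        return __builtin_cpu_supports("avx2");
    }

    int main(void) {
        puts(have_avx2() ? "taking the AVX2 code path"
                         : "falling back to the SSE code path");
        return 0;
    }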



Thanks for responding. Please excuse my wariness, then. (I think it's fair to say it's pretty well earned at this point, but it's certainly no fault of you or your team.)


Kudos for making one of the most impressive pieces of software ever. The level of optimization you achieved is truly rare these days.


I don't know anything about Hyperscan, but it surprises me that it has a truly rare level of optimization. How does the optimization compare with typical tuned linear algebra, for instance?


I don't know why you're being downvoted either, for what seems like a legitimate (if slightly oddly framed) question.

Linear algebra libraries have a ridiculously simple task (by comparison) that can be optimized incredibly well - given how massively BLAS is used, there seems to be no amount of intellectual effort that isn't worth spending to get 1% more out of it.

By contrast, Hyperscan has many different subsystems (string search of various kinds, NFA, DFA, various quick SIMD 'lookarounds', different acceleration schemes for NFA and DFA, etc.). It's enormously complicated already (and a bit unpredictable in terms of the different tradeoffs that can happen) and the optimizations have to be best-effort. We couldn't put nearly as much optimization into any given hot loop as the linear algebra guys can.

That being said, we've done quite a lot, and many people have picked up useful techniques from our codebase. For example, the string search in Hacker News user burntsushi's regex scanner is an entirely legit (i.e. he freely credits where it comes from) clone of our "Teddy" literal matcher.
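For the curious, here's a heavily simplified, single-byte sketch of the nibble-shuffle idea at the heart of Teddy. The real matcher buckets many patterns and examines several leading bytes at once; the names below are invented for illustration, SSSE3 is assumed, and a real matcher must verify every candidate. Compile with -mssse3.

    #include <stdint.h>
    #include <stdio.h>
    #include <immintrin.h>  /* SSSE3 _mm_shuffle_epi8 */

    /* Bit p of lo[n] is set when pattern p's first byte has low nibble n
       (likewise hi[] for high nibbles). Supports up to 8 patterns here. */
    static void build_tables(const char *first_bytes, int npat,
                             uint8_t lo[16], uint8_t hi[16]) {
        for (int p = 0; p < npat; p++) {
            uint8_t b = (uint8_t)first_bytes[p];
            lo[b & 0xF] |= (uint8_t)(1 << p);
            hi[b >> 4]  |= (uint8_t)(1 << p);
        }
    }

    /* For 16 input bytes at once: lo[byte & 0xF] & hi[byte >> 4].
       A nonzero lane marks a position where some pattern may start;
       a real matcher then verifies the full literal at that offset. */
    static __m128i candidates(__m128i in, __m128i lo_tbl, __m128i hi_tbl) {
        __m128i nib_mask = _mm_set1_epi8(0x0F);
        __m128i lo_nib = _mm_and_si128(in, nib_mask);
        __m128i hi_nib = _mm_and_si128(_mm_srli_epi16(in, 4), nib_mask);
        return _mm_and_si128(_mm_shuffle_epi8(lo_tbl, lo_nib),
                             _mm_shuffle_epi8(hi_tbl, hi_nib));
    }

    int main(void) {
        uint8_t lo[16] = {0}, hi[16] = {0};
        build_tables("fbq", 3, lo, hi); /* first bytes of foo, bar, qux */
        const char buf[17] = "xxbarxxxfooxxxqx";
        __m128i cand = candidates(_mm_loadu_si128((const __m128i *)buf),
                                  _mm_loadu_si128((const __m128i *)lo),
                                  _mm_loadu_si128((const __m128i *)hi));
        uint8_t out[16];
        _mm_storeu_si128((__m128i *)out, cand);
        for (int i = 0; i < 16; i++)
            if (out[i]) printf("offset %2d: candidate mask 0x%02x\n", i, out[i]);
        return 0;
    }

Two shuffles and an AND test 16 positions against all the patterns' leading bytes simultaneously; false positives are cheap because verification only runs on the rare nonzero lanes.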


Rather than just downvoting, could someone answer the genuine question about how the optimization that's been done compares with the work on linear algebra which underpins so much? As a specific example, there's libxsmm for small matrices.

There are no aspersions cast on Hyperscan at all, just a query about what makes it "truly rare" for the benefit of people who don't have time to study it. It would also be interesting to know how it compares with hardware regex implementations, which I haven't heard anything about recently in connexion with bioinformatics, where the interest was.


I might be the only other person to solve multipattern regex searching. It's easy conceptually, and fiendishly tricksy when you try to get the tests to pass. You want to take advantage of commonality between patterns, but the matches must still be independent. The engine I wrote uses a classical interpreter backend, optimized well enough to beat TRE and the old PCRE handily. I'd read about the bit-parallel algorithms and played with them a bit, but the pattern sets we'd throw at any given BP algorithm would break it, as any given BP algorithm comes with several caveats.
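To make those caveats concrete: the classic bit-parallel Shift-Or matcher (Baeza-Yates/Gonnet) looks like the sketch below (the function name is mine). The limitation is visible right in the types: the whole match state lives in one machine word, so the pattern can't exceed 64 bytes, and the multipattern and bounded-repeat variants each carry similar restrictions.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Classic single-pattern Shift-Or: state bit i is 0 iff the last
       i+1 text bytes equal the first i+1 pattern bytes. Returns the
       leftmost match offset, or -1. Note the caveat in the types: the
       pattern must fit in one 64-bit word. */
    static int shift_or_find(const char *text, size_t n, const char *pat) {
        size_t m = strlen(pat);
        if (m == 0 || m > 64) return -1;   /* the classic BP limitation */
        uint64_t mask[256];
        memset(mask, 0xFF, sizeof(mask));
        for (size_t i = 0; i < m; i++)
            mask[(uint8_t)pat[i]] &= ~(1ULL << i);
        uint64_t state = ~0ULL;
        for (size_t j = 0; j < n; j++) {
            state = (state << 1) | mask[(uint8_t)text[j]];
            if (!(state & (1ULL << (m - 1))))
                return (int)(j - m + 1);
        }
        return -1;
    }

    int main(void) {
        const char *text = "the quick brown fox";
        printf("%d\n", shift_or_find(text, strlen(text), "brown")); /* 10 */
        return 0;
    }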

The genius of the HyperScan team is that they threw damn near every algorithm at their pattern sets, including some they invented, and divvied up the patterns into subsets that fit the particular algorithms well. Usually getting the tests to pass with one backend is the last act of any regex author—once it ain’t broke, you’re too exhausted to fix it. Contemplating the complexity that comes with a multi-algorithm backend makes me want to cry. But HyperScan did it.
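For a sense of what "multipattern" means at the API level: Hyperscan hides all of that subsetting behind one multi-pattern compile call. Here's a minimal block-mode example against the real C API (the pattern strings are just illustrative):

    #include <stdio.h>
    #include <string.h>
    #include <hs.h>   /* Hyperscan; sometimes installed as <hs/hs.h> */

    /* Called once per match; returning nonzero would stop the scan. */
    static int on_match(unsigned int id, unsigned long long from,
                        unsigned long long to, unsigned int flags, void *ctx) {
        (void)from; (void)flags; (void)ctx;
        printf("pattern %u matched, ending at offset %llu\n", id, to);
        return 0;
    }

    int main(void) {
        const char *patterns[] = { "foo[0-9]+", "bar.{1,3}baz", "qu+x" };
        unsigned int flags[] = { HS_FLAG_DOTALL, HS_FLAG_DOTALL, HS_FLAG_DOTALL };
        unsigned int ids[] = { 0, 1, 2 };
        hs_database_t *db = NULL;
        hs_compile_error_t *err = NULL;
        if (hs_compile_multi(patterns, flags, ids, 3, HS_MODE_BLOCK,
                             NULL, &db, &err) != HS_SUCCESS) {
            fprintf(stderr, "compile failed: %s\n", err->message);
            hs_free_compile_error(err);
            return 1;
        }
        hs_scratch_t *scratch = NULL;
        if (hs_alloc_scratch(db, &scratch) != HS_SUCCESS) {
            hs_free_database(db);
            return 1;
        }
        const char *data = "xxbarZZbazxxfoo42";
        hs_scan(db, data, (unsigned int)strlen(data), 0, scratch,
                on_match, NULL);
        hs_free_scratch(scratch);
        hs_free_database(db);
        return 0;
    }

The compiler is free to route each pattern to whichever subsystem suits it; the caller just sees per-pattern ids coming back through the callback.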

So, to put it in perspective with linear algebra, imagine a library that first spent time analyzing given matrices, and then split them into submatrices and operated on them with different algorithms according to the particular patterns detected therein. That's kind of insane to contemplate in linear algebra (it's really not a domain that compares well at all), but it's how HyperScan works... and that ignores all the crazy ISA tricks it used.


The vast, vast majority of code is not optimized to the level of some linear algebra code. Pointing at MKL or comparable linear algebra packages as an example of why that level of optimization is not "rare" is silly.



