High school is out so I am learning SIMD instruction sets, like AVX2 and SSE, and using these to speed up Hamming/Levenshtein distance calculations in Rust. Preliminary testing shows a 20x speedup using vectorized SIMD operations! The end goal is a full Rust library for edit distance routines.
Though I probably won't implement different weighting schemes, I currently have alignment traceback and searching (which allows "free shifts" for the pattern string).
I took a look at the code and read the paper. It seems that they directly calculate the entire 2D DP array, but use SIMD so that each cell can hold multiple values, one for each query string. My approach uses anti-diagonals instead, so it is fast for one-vs-one comparisons rather than handling multiple query strings at once.
Regardless, my goal was to learn some SIMD and Rust (first time for both), so I did not read many background papers.
One thing to keep in mind is that memory locality is very important for SIMD; gathering a diagonal vector out of a standard row-major 2D DP grid is gonna lead to a lot of cache misses. Just something else to learn about.
Since I am storing the entire DP matrix as flattened diagonal vectors, I don't think there will be many cache misses. Each diagonal depends only on the previous two diagonals, and each diagonal is stored contiguously in memory.
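To make the layout concrete, here is a minimal scalar sketch (illustrative only, not the actual library code): cell (i, j) sits on anti-diagonal d = i + j, each diagonal gets its own contiguous buffer, and diagonal d is filled using only diagonals d - 1 and d - 2. For simplicity each diagonal buffer here is full-length and indexed by the row i; a tighter layout would pack just the valid cells.

    fn levenshtein_antidiagonal(a: &[u8], b: &[u8]) -> u32 {
        let (n, m) = (a.len(), b.len());
        // prev2 = diagonal d - 2, prev = diagonal d - 1, cur = diagonal d.
        let mut prev2 = vec![0u32; n + 1];
        let mut prev = vec![0u32; n + 1];
        let mut cur = vec![0u32; n + 1];

        // Diagonal 0 holds only DP[0][0] = 0 (already zero from initialization).
        // Diagonal 1 holds DP[0][1] = 1 and DP[1][0] = 1, when those cells exist.
        if m >= 1 {
            prev[0] = 1;
        }
        if n >= 1 {
            prev[1] = 1;
        }

        for d in 2..=(n + m) {
            // Rows that actually exist on diagonal d.
            let i_lo = d.saturating_sub(m);
            let i_hi = d.min(n);
            for i in i_lo..=i_hi {
                let j = d - i;
                cur[i] = if i == 0 {
                    j as u32 // first row: DP[0][j] = j
                } else if j == 0 {
                    i as u32 // first column: DP[i][0] = i
                } else {
                    // DP[i-1][j-1] is on diagonal d - 2; the other two are on d - 1.
                    let sub = prev2[i - 1] + (a[i - 1] != b[j - 1]) as u32;
                    let del = prev[i - 1] + 1;
                    let ins = prev[i] + 1;
                    sub.min(del).min(ins)
                };
            }
            // Rotate the three buffers: cur becomes prev, prev becomes prev2.
            std::mem::swap(&mut prev2, &mut prev);
            std::mem::swap(&mut prev, &mut cur);
        }
        // DP[n][m] ends up in `prev[n]` after the final rotation.
        prev[n]
    }

For example, levenshtein_antidiagonal(b"kitten", b"sitting") gives 3.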
The problem with handling diagonals is that indexing cells and comparing characters along the diagonal becomes complex. Dealing with this without many branches (to avoid branch mispredictions) is the hard part.
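The rough idea for avoiding branches (a hedged AVX2 sketch with illustrative names, not the library's actual code, and assuming the distance values fit in unsigned byte lanes): the character comparison becomes a vector compare that yields a per-lane 0/1 cost, and the three-way minimum is done with vector min, so 32 diagonal cells get updated with no branching at all.

    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx2")]
    unsafe fn diag_step_avx2(
        sub: __m256i,     // diagonal d - 2, aligned so lane k holds DP[i-1][j-1]
        left: __m256i,    // diagonal d - 1, aligned so lane k holds DP[i][j-1]
        up: __m256i,      // diagonal d - 1, aligned so lane k holds DP[i-1][j]
        a_chars: __m256i, // 32 bytes of one string along the diagonal
        b_chars: __m256i, // 32 bytes of the other string, reversed so lanes line up
    ) -> __m256i {
        let ones = _mm256_set1_epi8(1);
        // 0xFF where characters match, 0x00 where they differ -- no branch.
        let eq = _mm256_cmpeq_epi8(a_chars, b_chars);
        // Substitution cost: sub where they match, sub + 1 where they don't.
        let sub_cost = _mm256_add_epi8(sub, _mm256_andnot_si256(eq, ones));
        // Insertion/deletion cost: min(left, up) + 1.
        let indel_cost = _mm256_add_epi8(_mm256_min_epu8(left, up), ones);
        // New cells on diagonal d: three-way minimum, still branch-free.
        _mm256_min_epu8(sub_cost, indel_cost)
    }

The caller still has to deal with the ends of each diagonal and the shifting alignment between the two strings, which is where the fiddly indexing lives.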
Sneak peek of the code: https://twitter.com/daniel_c0deb0t/status/124224838155819008...