Our top_k search in our own product is in Go and it's ~25 lines of code. Super simple stuff: it does the job of finding the top_k in 1ms, no index, no framework, nothing. We benefit from the corpus we embed being relatively static, so we can just store it in Redis; it's not a hard problem to solve. Anyway, you often don't need a turnkey solution for this stuff.
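For reference, a minimal sketch of that kind of scan in Go (not the parent's actual code; assumes the query and the stored embeddings are pre-normalized, so a plain dot product is the cosine similarity):

    package search

    import "sort"

    type hit struct {
        id    int
        score float32
    }

    // topK scores every vector against the query with a dot product and
    // returns the k best matches. With pre-normalized vectors the dot
    // product equals the cosine similarity.
    func topK(query []float32, corpus [][]float32, k int) []hit {
        hits := make([]hit, 0, len(corpus))
        for i, v := range corpus {
            var s float32
            for j := range v {
                s += query[j] * v[j]
            }
            hits = append(hits, hit{id: i, score: s})
        }
        sort.Slice(hits, func(a, b int) bool { return hits[a].score > hits[b].score })
        if k > len(hits) {
            k = len(hits)
        }
        return hits[:k]
    }

Score, sort, slice: for a corpus that fits comfortably in memory (or, as above, in Redis), that's the whole search path.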
Same here. I just asked ChatGPT to generate Go code to do cosine similarity, and I was surprised that the code it generated worked; the worst performance I get is like 10ms.
Yep. If you parallelize with goroutines it'll be faster, but we're talking about shaving a few ms. The effectiveness of a for loop crunching floats in a single thread is not to be underestimated.
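For what it's worth, sharding the scan across goroutines is only a few more lines. A sketch, reusing the hit type and topK function from the sibling sketch above (needs "sort" and "sync"):

    // topKParallel splits the corpus into per-worker shards, computes a
    // partial top-k for each shard concurrently, then merges the partials
    // and keeps the global top k.
    func topKParallel(query []float32, corpus [][]float32, k, workers int) []hit {
        chunk := (len(corpus) + workers - 1) / workers
        partials := make([][]hit, workers)
        var wg sync.WaitGroup
        for w := 0; w < workers; w++ {
            lo, hi := w*chunk, (w+1)*chunk
            if lo >= len(corpus) {
                break
            }
            if hi > len(corpus) {
                hi = len(corpus)
            }
            wg.Add(1)
            go func(w, lo, hi int) {
                defer wg.Done()
                part := topK(query, corpus[lo:hi], k)
                for i := range part {
                    part[i].id += lo // shift shard-local ids back to corpus ids
                }
                partials[w] = part
            }(w, lo, hi)
        }
        wg.Wait()
        var all []hit
        for _, p := range partials {
            all = append(all, p...)
        }
        sort.Slice(all, func(a, b int) bool { return all[a].score > all[b].score })
        if k > len(all) {
            k = len(all)
        }
        return all[:k]
    }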
Size can be up to ~140MB of vectors, so not exactly enormous. But it's nowhere near enough to stress a single node running a for loop on the data and doing a dot product, then sorting by relevancy.
It shouldn't be too hard to extend this to a fairly efficient exact k-nearest neighbor search (at least for Euclidean distance).
The classic approach is to keep your k candidates in a max-heap ordered by distance, replacing the farthest candidate whenever the heap is full and you find a nearer item in the leaves. After visiting the near side of the splitting hyperplane, you visit the far side only if the distance to the hyperplane is less than the distance to the farthest item in the heap, or if you don't yet have k candidates.
A way to visualize it is that your k candidates define a ball (with radius defined by the worst candidate) around the search point that shrinks as you find better candidates. Then you recursively visit all the nodes in the tree that the ball still touches. Any nodes outside the ball can't possibly improve on the candidates within it and so can be excluded.
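A rough Go sketch of that traversal, assuming the kd-tree is already built (the Node layout and names are illustrative, not from any particular library; distances are kept squared so no square roots are needed):

    package knn

    import "container/heap"

    // Node is one kd-tree node: the stored point, the axis it splits on,
    // and its two subtrees.
    type Node struct {
        Point       []float64
        Axis        int
        Left, Right *Node
    }

    type candidate struct {
        point []float64
        dist  float64 // squared Euclidean distance to the query
    }

    // maxHeap keeps the current candidates with the *farthest* one on top,
    // so it can be evicted cheaply when a closer point turns up.
    type maxHeap []candidate

    func (h maxHeap) Len() int           { return len(h) }
    func (h maxHeap) Less(i, j int) bool { return h[i].dist > h[j].dist }
    func (h maxHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
    func (h *maxHeap) Push(x any)        { *h = append(*h, x.(candidate)) }
    func (h *maxHeap) Pop() any {
        old := *h
        x := old[len(old)-1]
        *h = old[:len(old)-1]
        return x
    }

    func sqDist(a, b []float64) float64 {
        var s float64
        for i := range a {
            d := a[i] - b[i]
            s += d * d
        }
        return s
    }

    // Search returns the exact k nearest neighbors of query, nearest first.
    func Search(root *Node, query []float64, k int) [][]float64 {
        h := &maxHeap{}
        visit(root, query, k, h)
        out := make([][]float64, h.Len())
        for i := h.Len() - 1; i >= 0; i-- {
            out[i] = heap.Pop(h).(candidate).point
        }
        return out
    }

    func visit(n *Node, query []float64, k int, h *maxHeap) {
        if n == nil {
            return
        }
        // Consider the point stored at this node.
        if d := sqDist(query, n.Point); h.Len() < k {
            heap.Push(h, candidate{n.Point, d})
        } else if d < (*h)[0].dist {
            heap.Pop(h)
            heap.Push(h, candidate{n.Point, d})
        }
        // Go down the near side first, then decide whether the far side
        // can be pruned.
        diff := query[n.Axis] - n.Point[n.Axis]
        near, far := n.Left, n.Right
        if diff > 0 {
            near, far = n.Right, n.Left
        }
        visit(near, query, k, h)
        // The far side can only help if the ball around the query (radius =
        // distance to the worst candidate) still crosses the splitting
        // hyperplane, or if we don't yet have k candidates.
        if h.Len() < k || diff*diff < (*h)[0].dist {
            visit(far, query, k, h)
        }
    }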
In graphics rendering, this sort of thing was the approach used by photon mapping [0] to find the closest photons for density estimation.
Does that scale to high dimensions? IIUC the photon mapping procedure typically uses a kd-tree to find closer/farther candidates, and if you consider something like ada embeddings (1536 dim), that would be very computationally intense. Photon mapping feels like a 3D problem.
This random hyperplane algorithm is actually pretty elegant. It has the following amazing property: as your dimension increases, the quality of the random hyperplanes actually gets better!
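For illustration, the signature step is tiny; a sketch assuming hyperplanes through the origin and at most 64 bits per signature (needs "math/rand"):

    // randomPlanes draws nBits random hyperplane normals. Gaussian
    // components give uniformly random orientations.
    func randomPlanes(nBits, dim int) [][]float64 {
        planes := make([][]float64, nBits)
        for i := range planes {
            planes[i] = make([]float64, dim)
            for j := range planes[i] {
                planes[i][j] = rand.NormFloat64()
            }
        }
        return planes
    }

    // signature sets bit i if v falls on the positive side of hyperplane i.
    // Vectors with a small angle between them agree on most bits, so the
    // Hamming distance between signatures approximates the angle.
    func signature(v []float64, planes [][]float64) uint64 {
        var sig uint64
        for i, p := range planes {
            var dot float64
            for j := range v {
                dot += v[j] * p[j]
            }
            if dot >= 0 {
                sig |= uint64(1) << uint(i)
            }
        }
        return sig
    }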
It is inefficient to build (unless on GPU), but good for search. Moreover, the search naturally benefits from data locality after ordering - this allows using MergeTree tables in ClickHouse. See https://news.ycombinator.com/item?id=36341754
That's the thing though, it's very easy to perform approximate nearest neighbor search in an efficient way. There are plenty of simple solutions for that. The real challenge is to design an algorithm that can scale & that remains robust even when there are additions and deletions to the set.
> The approach we use here is based on a family of algorithms called "Locality Sensitive Hashing" used in the popular library annoy. This article's goal isn't to introduce a new fancy algorithm/library but to describe how vector search works using real code snippets.
Always a pleasure to see short implementations, but I'd still take HNSW over such an approach. Its core implementation can also be quite compact. In our case with USearch [1] it's about 2k LOC, even with a custom implementation of priority queues and custom numeric types like uint40_t for indexes beyond 4B points.
Obviously, together with bindings for Rust, Python, JavaScript, Java, Objective-C, Swift, and Wolfram it gets slightly larger, but on the bright side, you can pass JIT-compiled functions and store/search elements of different dimensionality, which allows much broader applicability [2].
Why? Sorry, but your comment is not very insightful without more explanation, and the rest of your comment is just an advertisement for your (admittedly impressive looking) work.
Because (A) it allows similarity search between auxiliary objects, not just equidimensional vectors, and (B) can be implemented in a similarly short fashion, while providing great performance.
FWIW exhaustive search is still probably good enough for most use cases. IMO the exhaustive search should use a heap (https://github.com/fennel-ai/fann/blob/main/src/main.rs#L12); since you're only looking for the top k, it reduces time and memory complexity greatly. On a relatively unoptimized Golang implementation (though much beefier hardware) I get ~100ms to process 1M vectors of 300 dim. Still quite a bit slower than approximate, of course, but in absolute terms probably good enough for most use cases, especially because many use cases don't have 1M vectors.
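To sketch the heap variant in Go (illustrative, not the linked Rust code; needs "container/heap"): instead of scoring everything and sorting, keep a size-k min-heap of the best scores seen so far, which is O(n log k) time and O(k) extra memory rather than O(n log n) and O(n):

    type scored struct {
        id    int
        score float32
    }

    // minHeap holds the k best hits seen so far, with the *worst* of them
    // on top so it can be evicted in O(log k).
    type minHeap []scored

    func (h minHeap) Len() int           { return len(h) }
    func (h minHeap) Less(i, j int) bool { return h[i].score < h[j].score }
    func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
    func (h *minHeap) Push(x any)        { *h = append(*h, x.(scored)) }
    func (h *minHeap) Pop() any {
        old := *h
        x := old[len(old)-1]
        *h = old[:len(old)-1]
        return x
    }

    func topKHeap(query []float32, corpus [][]float32, k int) []scored {
        h := &minHeap{}
        for i, v := range corpus {
            var s float32
            for j := range v {
                s += query[j] * v[j]
            }
            if h.Len() < k {
                heap.Push(h, scored{i, s})
            } else if s > (*h)[0].score {
                heap.Pop(h)
                heap.Push(h, scored{i, s})
            }
        }
        // Pop worst-first and fill from the back to return best-first.
        out := make([]scored, h.Len())
        for i := h.Len() - 1; i >= 0; i-- {
            out[i] = heap.Pop(h).(scored)
        }
        return out
    }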
I recently presented this approach, exhaustive search with a heap, then iteratively added simple indices, using ClickHouse. It works pretty well given how simple it is to use.
Nice. If I'm reading this right, you do random projection as a first-stage filter, then exact search over the filtered subset? That's clever. Did you test precision/recall at all?
Yes, this is the idea. I set the threshold for approximate search with a margin so that on a few test queries, I get 100% recall, and then it is filtered by precise search.
But I didn't evaluate it thoroughly. The threshold for Hamming distance over bit masks was set almost arbitrarily, and it will be interesting to test further.
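For illustration, that kind of first-stage filter can be as small as this (hypothetical names; needs "math/bits"), with the exact scan then run only over the ids that survive:

    // prefilter keeps the ids whose bit signature is within `threshold`
    // bits of the query's signature. The threshold is the margin the
    // parent describes tuning so that recall stays at 100% on test queries.
    func prefilter(querySig uint64, sigs []uint64, threshold int) []int {
        var ids []int
        for i, s := range sigs {
            // Hamming distance = number of differing bits.
            if bits.OnesCount64(querySig^s) <= threshold {
                ids = append(ids, i)
            }
        }
        return ids
    }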
The hyperplanes are randomized to ensure they are probabilistically independent and to cope with high dimensionality, so the data are less likely to be clustered into the same bucket.
Nice, this is a cool version of ANN search. I like that at the end there is commentary on what's needed for production as well: things like parallelization, RAM considerations, and balancing the trees. It's really the production-level considerations that would steer you toward a vector database like Milvus/Zilliz.
No, this is only true if that hyperplane contains the origin. Imagine the hyperplanes that contain A and B; there are infinitely many of them for dimensions higher than 2. Now imagine the same for hyperplanes connecting O and C. Most of those AB hyperplanes are not orthogonal to those OC hyperplanes; it's only a coincidence if they are, though for dimensions higher than 2 you can always find a point C that happens to lie along the orthogonal line from the AB plane to the origin.
No, it's more restricted than that. Think of 2D planes in 3-space: there are infinitely many planes you can construct through a line from A to C (they're all rotations of a plane extending out from that line).
Okay I'm trying to map this visually in my brain. So, in 3 space, say I extrude a random plane out from a line segment. I keep going, now I imagine I've extruded every possible plane approaching infinity. This eventually forms... a cylinder, no? So a hyperplane is what, a cylinder? Or is it shorthand for "every possible plane..."
In an n-dimensional space, an infinite (n-1)-dimensional thing is a hyperplane. Or put more simply: it's something that divides the space in two. In 1 dimension that's a point, in 2 dimensions it's a line, in 3 dimensions it's a plane (hence the name), for more dimensions we just call it a hyperplane.
Now there are various ways to define which hyperplane you speak about. The easiest to reason about is to give a couple points that are on the hyperplane (1 point defines a point, a line can be defined by any two points on that line, a plane by three points on that plane etc.). A slightly weirder but equally useful way is to say "it's orthogonal to this line, and intersects this point". That's easy to reason about in three dimensions, but apparently works in higher dimensions as well.
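Concretely, the "orthogonal to this line, intersects this point" definition is also the cheapest one to compute with, which is why the hyperplane splits in LSH/kd-tree code boil down to a single dot product. A tiny sketch (illustrative names):

    // side reports which half-space x falls in, for the hyperplane that is
    // orthogonal to `normal` and passes through the point `p`.
    // Positive: one side; negative: the other; zero: on the hyperplane.
    // Works in any number of dimensions.
    func side(normal, p, x []float64) float64 {
        var s float64
        for i := range normal {
            s += normal[i] * (x[i] - p[i])
        }
        return s
    }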
I couldn't find this anywhere in the article: how long does it take to add all 10,000 embeddings to the index? That seems quite important for many applications.
We did some benchmarking and found that an index of 1M vectors using 15 trees can be built up in ~115 seconds (8 CPU, 8 GB RAM). https://github.com/fennel-ai/fann/pull/8.
Given this, around a second to build an index of 10K items sounds about right.
The article isn't meant to present a new library competing with Annoy or other state of the art approaches, but instead, describe how vector search works using short code snippets.
I've now updated the article to clearly mention this and also cite annoy.
My favorite kind of pedantry is pedantry that's naive - on like day 1 of topology 101 you learn that the "dot" product is an inner product and every inner product naturally induces a metric so every inner product space is a metric space (with the induced metric).
It's not pedantry. Just because an inner product induces a metric doesn't make it the same thing as that metric. The performance of similarity searches can depend very much on whether you use a dot product or a distance.
A distance is a function d(x, y) with d(x, y) = 0 whenever x = y (among other properties). If we let d(x, y) = x*y, then for x = y with x != 0 we get d(x, y) = |x|^2 != 0, so the dot product is certainly not a distance. Your comment isn't relevant, but thanks for calling me a pedant. ;-)
Given normalized vectors, cosine similarity is also a monotone (decreasing) function of the squared Euclidean distance: ||u - v||^2 = 2 - 2*(u*v). Meaning for ranking (and thus k-nearest-neighbors) it doesn't matter whether you use Euclidean distance or cosine similarity; you get the same result.
Unit vectors have a normalized length, so the only difference between two unit vectors is the "heading". The angle between them can be interpreted as the "difference", or distance, between them.
Since you can compose any vector in the space out of unit vectors, you can extend the concept. See the comment above: inner product -> metric space.
A proper distance function would be (1 - |a*b|). That's proportional to the distance between those points as projected onto the plane spanned along the circumference of the circle/sphere/hypersphere.
The dot product of two unit vectors is the cosine of the angle between them. cos^-1(u*v) is that angle, i.e. the geodesic distance between u and v regarded as points on the sphere, and that is a distance. But the dot product itself is not.
If u = v, then u*v = 1, but the distance from a point to itself should always be zero. Remember also that u*v = 0 if u and v are perpendicular; if neither u nor v is 0, then for them to be perpendicular they certainly can't be equal. The dot product does not behave like a distance.
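To make the distinction concrete, a small sketch of the geodesic (angular) distance mentioned above (needs "math"); unlike the raw dot product, it is zero exactly when the two unit vectors are equal:

    // angularDist returns the geodesic distance between two unit vectors,
    // i.e. the angle between them: arccos(u . v).
    func angularDist(u, v []float64) float64 {
        var dot float64
        for i := range u {
            dot += u[i] * v[i]
        }
        // Clamp against floating-point drift outside [-1, 1].
        if dot > 1 {
            dot = 1
        } else if dot < -1 {
            dot = -1
        }
        return math.Acos(dot)
    }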