Hacker News | fzliu's comments

Both are methods to reduce the overall size of your embeddings, but from what I understand, quantization is generally better than dimensionality reduction, especially if training is quantization-aware.
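As a toy illustration (numpy-only, not quantization-aware; real pipelines would calibrate the quantizer and learn the projection rather than use a plain SVD), both approaches below shrink a float32 embedding matrix 4x, one by narrowing the dtype and one by dropping dimensions:

```python
# Toy comparison of two ways to shrink embeddings 4x. Illustrative only:
# real systems calibrate the quantizer (ideally with quantization-aware
# training) and learn the projection instead of using a plain SVD.
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 768)).astype(np.float32)  # toy float32 embeddings

# Scalar quantization: float32 -> int8, same dimensionality, 4x smaller.
scale = np.abs(emb).max() / 127.0
q = np.round(emb / scale).astype(np.int8)
deq = q.astype(np.float32) * scale           # dequantize to measure error

# Dimensionality reduction: keep 768/4 = 192 principal directions, 4x smaller.
centered = emb - emb.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
reduced = (centered @ vt[:192].T).astype(np.float32)

print(emb.nbytes, q.nbytes, reduced.nbytes)  # 3072000 768000 768000
```

Same memory budget either way; the question is which one loses less of the geometry your retrieval task cares about.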


"AGI" will never be achieved without building a model that a) _continually_ learns, and b) learns from not just text, but from combined auditory and visual (multimodal) sensory information as well.

The reason a 16-year-old can learn how to drive much quicker than existing self-driving models is because the 16-year-old already has built up 16 years worth of prior knowledge about the physical world.


Don't discount the millions of years of evolution to provide the "blank slate" human learner with perceptual systems, physics-based reasoning, and motor systems ready to be fine-tuned for this slightly different variant of goal-forming, planning, and locomotion.


Though it does seem like robots will reach this baseline level soon.


(imo) c) is made to be aware of its own death

The 16-year-old has plenty of motivation to learn how to drive, including the pursuit of reproduction (a cope for mortality).


d) thinks, unprompted and unstimulated. Decides for itself what's important to think about, makes new connections by that process alone, and understands the implications of those new connections and how to use them.

Here's also where I see it ending. It will need energy--likely a LOT, paid by someone, to do this. Who is going to pay that bill for it to maybe, maybe not, come up with something useful, likely mixed with mostly noise and distraction, over undefined timescales, of largely non-measurable value, when there's far greater value, less cost, less risk, in simply training it deterministically?


Various religious types think they'll live on in heaven. I'm not sure that stuff correlates much with learning to drive.


That would be like a bird saying humans aren't a Natural General Intelligence because they can't fly. How much vision and audio is required to be intelligent? There's a lot of electromagnetic radiation we can't see and audio bands we can't hear. Would you say that Helen Keller wasn't generally intelligent?


Okay, but if that's the case, we are no more than a decade away from integrating those into a newer and bigger model.


I think you are underestimating just how many challenges there are in self-driving: https://www.youtube.com/watch?v=kcKchbfn1VY


Nobody will accept self-driving cars that are as dangerous as a teenage driver.


In my mind, what's more crucial here is the code for downloading/scraping and labeling the data, not the model architecture or training script.

As much as I appreciate Mis(x)tral, I would've loved it even more if they'd released the code for gathering the data.


I'm speculating they're attempting to avoid controversy about their data sources. That, and a possible competitive edge, depending on what specific sets/filtering they're using.


To avoid controversy AND potential lawsuits.


Yup.

I think many countries will allow copyrighted (IP) material to be used as training data (Japan already does).

They just need to buy time until then.


It’s common for third-party model testers not to disclose what they mean by the “Refusal” parameter as well, for obvious reasons. The world is full of witch-hunting maniacs now and will stay so for an indefinite amount of time. Just wait until the whole thing becomes more widely known and people realize. All AI companies have to hurry up before the doors shut.


IMHO much of the key training data can't simply be downloaded/scraped/labeled, no matter what code you had - it's not like it's freely accessible to everyone and just needs some code to get it and process it. You can't scrape all of Google Books archive or all of Twitter, and quite a few things that could be scraped at one point may actively prevent you from scraping them now.


I don't mind having ready-to-use datasets instead of the code for downloading/scraping and labeling. It saves a lot of time. It's not complicated to write some code for gathering the data, but it can be impossible to replicate the datasets after the fact if some parts of the data you'd have to scrape are already gone (removed for various reasons).


You need different indexing algorithms for different use cases - brute-force indexing, for example, is "SOTA" when it comes to recall (100%). If you have multiple use cases or if you might have domain shift, you'll want a vector database that supports multiple indexes.

Here's my 2¢:

- If you're just playing around with vector search locally and have a very small dataset, use brute-force search. Don't worry about indexes until later.

- If you have plenty of RAM and CPU cores and would like to squeeze out the most performance, use ScaNN or HNSW plus some form of quantization (product quantization or scalar quantization).

- If you have limited RAM, use IVF plus PQ or SQ.

- If you want to maintain reasonable latency but aren't very concerned about throughput, use a disk-based index such as DiskANN or Starling. https://arxiv.org/pdf/2401.02116.pdf

- If you have a GPU, use GPU-specific indexes. CAGRA (supported in Milvus!) seems to be one of the best. https://arxiv.org/abs/2308.15136

All of these indexes are supported in Milvus (https://milvus.io/docs/index.md), so you can pick and choose the right one for your application. Tree-based indexes such as Annoy don't seem to have a sweet spot just yet, but I think there's room for improvement in this subvertical.
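For reference, the brute-force baseline from the first bullet is only a few lines with numpy (illustrative sketch, using cosine similarity):

```python
# Brute-force ("flat") search: exact top-k by cosine similarity, 100% recall.
# numpy-only sketch; fine as-is for small local datasets.
import numpy as np

def brute_force_search(query, vectors, k=5):
    """Indices of the k most cosine-similar rows of `vectors`."""
    qn = query / np.linalg.norm(query)
    vn = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = vn @ qn                    # similarity to every stored vector
    return np.argsort(-sims)[:k]      # exact top-k, no index needed

rng = np.random.default_rng(42)
data = rng.normal(size=(10_000, 128)).astype(np.float32)
top = brute_force_search(data[0], data, k=3)
print(top[0])  # 0 -- a vector is always its own nearest neighbor
```

Once the matrix multiply stops fitting your latency budget, that's the signal to reach for one of the indexes above.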


Throwing a few more on here (mix of beginner and advanced):

- Wikipedia article: https://en.wikipedia.org/wiki/Vector_database

- Vector Database 101: https://zilliz.com/learn/introduction-to-unstructured-data

- ANN & Similarity search: https://vinija.ai/concepts/ann-similarity-search/

- Distributed database: https://15445.courses.cs.cmu.edu/fall2021/notes/21-distribut...



Like IVF, Annoy partitions the entire embedding space into high-dimensional polygons. The difference is how the two algorithms do it - IVF (https://zilliz.com/learn/vector-index) uses centroids, while Annoy (https://zilliz.com/learn/approximate-nearest-neighbor-oh-yea...) is basically just one big binary tree.
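A minimal, hypothetical IVF-style sketch (crude k-means centroids plus inverted lists; real implementations like the ones linked above train centroids far more carefully and tune nprobe):

```python
# Toy IVF: k-means-ish centroids partition the space into inverted lists;
# a query scans only the nprobe partitions with the closest centroids.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 64)).astype(np.float32)

k = 16
centroids = data[rng.choice(len(data), k, replace=False)].copy()
for _ in range(5):  # a few crude k-means iterations
    assign = np.argmin(((data[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
    for c in range(k):
        members = data[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)
assign = np.argmin(((data[:, None] - centroids[None]) ** 2).sum(-1), axis=1)

# Inverted lists: centroid id -> ids of the vectors assigned to it.
lists = {c: np.where(assign == c)[0] for c in range(k)}

def ivf_search(query, nprobe=2):
    """Return the nearest vector id, scanning only nprobe partitions."""
    order = np.argsort(((centroids - query) ** 2).sum(-1))
    cands = np.concatenate([lists[c] for c in order[:nprobe]])
    return cands[np.argmin(((data[cands] - query) ** 2).sum(-1))]

print(ivf_search(data[123]))  # 123 -- the vector finds itself in its own list
```

Annoy replaces the centroid step with recursive random splits, so the partitions come from a binary tree instead of a flat codebook.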


Most breakthroughs are discovered accidentally and retroactively, so I'd think that having multiple breakthrough papers is fairly uncommon.


Regarding "future-proof-ness": we've been building production-grade vector search since 2018 and have a number of organizations running it at billion+ scale in production environments. It's all open source too.

https://milvus.io


I was at Yahoo almost a decade ago, when vector search within Vespa was first being rolled out in production use cases. It was already serving similarity search requests for Flickr back then.

Even though I'm with Zilliz/Milvus now, I wholeheartedly support and recommend folks check out and try Vespa. Congrats to the Vespa team!

EDIT: For folks on Twitter, you should follow Jo (https://twitter.com/jobergum) from Vespa if you aren't already. Great combo of technical content, hot takes, and vector database memes!


Huge congratulations to JKB, Frode, Kim, and the rest of the Vespa team! We are infinitely grateful for all of your help and advice. We are lucky enough to have Vespa as the foundation for our developer-focused enterprise search product.

Having worked with both Solr and Elastic in past search companies, it’s incredible to be able to deploy into Fortune 50 enterprises without any doubts about stability and with all the benefits of a cutting-edge hybrid engine.

Can’t wait to watch you on the next leg of your journey!

The Atolio team


Thank you for the shout-out Frank!


Off topic... looking forward to more engineers moving to Mastodon. I have Twitter/X blocked at DNS level and still fairly frequently encounter interesting accounts that I can't check out.


Every time I see Twitter/X here, I want to say this.


How is it other people’s problem that you block a site that they’re on?

(Fwiw I have Twitter blocked too, though not for moral reasons)


Nobody's saying it's anybody else's problem - they're just looking forward to more content being on Mastodon (or elsewhere) as people move away from X.


It was said in response to a thread sharing the Twitter profile of a key person mentioned in the article, who also comments here.

The subtext is clearly “jkb, please move to Mastodon cause I blocked Twitter”. Not spelling that out doesn’t make it a lot less weird IMO.


It's not that weird, you are reading too much into the subtext. The web is in a transitory state, platforms change, people move. Wishing for more content to be available on a specific platform without blaming the author is an acceptable comment in my view.


Is there a particular mastodon server or set of servers that the engineering community is favoring? I know it technically doesn’t matter in a federated network, but curious anyway.


I'm on fosstodon. floss.social is also big.


any resource for one to learn on vector search? any textbook or whitepaper recommendations?

I am learning lsh right now and find it fascinating
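The core of random-hyperplane LSH (SimHash) fits in a few lines - a toy sketch, assuming cosine similarity is the metric (sizes here are made up):

```python
# Toy random-hyperplane LSH (SimHash) for cosine similarity: each random
# hyperplane contributes one signature bit, so vectors pointing in similar
# directions tend to collide in the same bucket.
import numpy as np

rng = np.random.default_rng(0)
dim, n_bits = 64, 16
planes = rng.normal(size=(n_bits, dim))  # one random hyperplane per bit

def signature(v):
    """n_bits-long 0/1 signature: which side of each hyperplane v falls on."""
    return ((planes @ v) > 0).astype(np.uint8)

v = rng.normal(size=dim)
near = v + 0.01 * rng.normal(size=dim)   # tiny perturbation of v
opposite = -v                            # maximally dissimilar direction

print((signature(v) == signature(near)).sum())      # almost always all 16 bits
print((signature(v) == signature(opposite)).sum())  # 0 -- every bit flips
```

The probability two vectors agree on a given bit is 1 - θ/π for angle θ between them, which is what makes the signature a usable proxy for cosine similarity.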


We have a "Vector Database 101" series that covers vector search and vector indexes as well: https://zilliz.com/learn/what-is-vector-database

I've been meaning to dive into ScaNN and DiskANN as well, but haven't gotten around to it yet.


> I've been meaning to dive into ScaNN and DiskANN as well, but haven't gotten around to it yet.

quick TLDR on both vs HNSW?


Any idea what cloud providers like Pinecone use underneath?


I'm from Pinecone. We use proprietary indexes. We could've used HNSW but decided the high memory consumption (i.e., costly at scale) and slow index updates (i.e., data gets stale) wouldn't cut it for production use cases.


Oh, I read a lot of HNSW stuff in your/Pinecone's blog series. (Great learning resource btw, well done!) So I assumed you were using HNSW already. It's news to me that you don't.


Shameless self-plug for milvus-lite:

   $ pip install milvus
   $ python
   >>> import milvus
   >>> milvus.start()


Gonna add some information here since this isn't very descriptive.

milvus-lite is a bit like sqlite where it runs in-process. Here are some scenarios you'd want to use it in:

- You want to use Milvus directly without having it installed using Milvus Operator, Helm, or Docker Compose, etc.

- You do not want to launch any virtual machines or containers while you are using Milvus.

- You want to embed Milvus features in your Python applications.

