Coming at this from a different angle, does anyone have any links to tutorials for use cases? I'd love to see what the vector DB hype is about, but as a regular engineer I'm unable to even grasp how to use a vector DB
I'll give you an example of something I did with a vector database.
I was playing around with making my own UI for interfacing with ChatGPT. I saved the chat transcripts in a normal Postgres DB, and stored the OpenAI embedding for each message in a vector DB, with a pointer to the Postgres message id in the vector DB metadata.
Then as you chatted, I had ChatGPT continuously summarize the current conversation in the background and use that summary to search the vector DB for previous messages about whatever we were discussing, injecting the results into the chat context invisibly. So you can say something like, "Hey, do you remember when we talked about baseball?" and it will pull a previous conversation about so-and-so hitting a home run into the context, and the bot will have access to that, even though you never mentioned the word "baseball" in the earlier conversation -- "home run" is semantically similar enough that the search finds it.
If you're using OpenAI embeddings as your vectors, it's _extremely_ impressive how well it finds similar topics, even when the actual words used are completely different.
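To make the lookup step concrete, here's a minimal sketch of the retrieval side, using tiny hand-made vectors in place of real OpenAI embeddings -- the message ids, vector values, and function names are all hypothetical, and a real vector DB does this search approximately at scale rather than brute-force:

```python
import math

# Toy stand-ins for embeddings -- in the real setup each vector comes from
# the OpenAI embeddings API and lives in the vector DB, with the Postgres
# message id stored as metadata alongside it. Ids and values are made up.
stored = {
    101: [0.9, 0.1, 0.0],  # embedding of "so-and-so hit a home run"
    102: [0.0, 0.2, 0.9],  # embedding of an unrelated message
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, near 0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest_message_id(query_vec):
    # The vector DB performs this search; the winning id is then used to
    # fetch the full message text back out of Postgres.
    return max(stored, key=lambda mid: cosine(query_vec, stored[mid]))

query = [0.8, 0.3, 0.1]  # toy embedding of "remember when we talked about baseball"
print(nearest_message_id(query))  # 101 -- the home-run message is closest
```

The point of the pointer-in-metadata design is that the vector DB only has to answer "which ids are nearby", while Postgres stays the source of truth for the actual transcript text.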
Not a tutorial, but TL;DR: vector DBs are specialized DBs that store embeddings. Embeddings are vector representations of data (e.g. text or images), which means you can compare them in a quantifiable way.
This enables use cases like semantic search and Retrieval-Augmented Generation (RAG) as mentioned in the article.
Semantic search is: I search for "royal" and I get results that mention "king" or "queen" because they are semantically similar.
RAG is: I make a query asking, "tell me about the English royal family", semantically similar information is fetched using semantic search and provided as context to an LLM to generate an answer.
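A minimal sketch of both of those steps, using made-up two-dimensional word vectors in place of a real embedding model (the `embed()` helper, the toy vectors, and the document texts are all hypothetical):

```python
import math

# Toy word vectors standing in for a real embedding model -- the values are
# invented so the example is self-contained and runnable.
TOY_VECTORS = {
    "royal": [0.9, 0.1],
    "king": [0.85, 0.2],
    "queen": [0.8, 0.25],
    "bread": [0.1, 0.9],
}

docs = [
    "The king and queen attended the ceremony.",
    "How to bake bread at home.",
]

def embed(text):
    # Average the toy vectors of any known words. A real system would call
    # an embedding model here instead of a lookup table.
    words = [w.strip(".,?").lower() for w in text.split()]
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def retrieve(query, k=1):
    # Semantic search: rank documents by similarity to the query embedding.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

question = "Tell me about the English royal family"
context = retrieve(question)
# RAG: the retrieved context plus the question is what gets sent to the LLM.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(context[0])  # the king/queen sentence, even though it never says "royal"
```

The "royal" query finds the king/queen document because their vectors point in nearly the same direction, which is the whole trick behind semantic search beating keyword matching.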
How anyone could read that article, even just skim it, and come out thinking that it was presenting evidence that this is a well-researched topic is beyond me. The entire point of the article is that they couldn't find studies on the topic...
Those are the private loans not backed by the government. The government ones are basically at the prime rate or less to fund the administration of the program.
Perhaps the above comment means that the secondary factors affect the denominator -- i.e. more people go hiking on Saturday, so entering the Saturday lottery is worse than entering the Wednesday one
Exactly, and it impacts the denominator by a great margin.
The value of the lottery ticket to me is a function of the cost, odds, timing and interest I have in the destination.
If the odds at one trailhead are 1000x lower than at another that is comparable across the other variables, then I'm needlessly overpaying, and Booz Allen pockets that inefficiency.
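To put rough numbers on the overpaying argument -- every figure below is made up purely for illustration -- the expected value of an entry is just win probability times the trip's value to you, minus the ticket fee:

```python
# All numbers are hypothetical, just to illustrate the point: at the same
# ticket cost, a 1000x difference in odds swamps the other variables.
def expected_value(trip_value, win_probability, ticket_cost):
    return trip_value * win_probability - ticket_cost

TICKET_COST = 6.0   # hypothetical per-entry fee
TRIP_VALUE = 400.0  # hypothetical value I place on the trip

popular = expected_value(TRIP_VALUE, 0.0005, TICKET_COST)  # long-shot trailhead
quiet = expected_value(TRIP_VALUE, 0.5, TICKET_COST)       # comparable trail, 1000x the odds

print(popular)  # -5.8: each entry is nearly pure overpayment
print(quiet)    # 194.0: same fee, vastly better deal
```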
Including the exchange with Schmidt about a "paper trail" and the resulting preference to "do it verbally" -- though that part of the discussion seems to be Google-internal, not between Apple and Google.