Coming at this from a different angle, does anyone have any links to tutorials for use cases? I'd love to see what the vector DB hype is about, but as a regular engineer I'm unable to even grasp how to use a vector DB
I'll give you an example of something I did with a vector database.
I was playing around with making my own UI for interfacing with ChatGPT. I saved the chat transcripts in a normal Postgres DB, and stored the OpenAI embedding for each message in a vector DB, with a pointer to the Postgres message id in the vector DB metadata.
Then as you chatted, I had ChatGPT continuously summarize the current conversation in the background and use that summary to search the vector DB for previous messages about whatever we were discussing, injecting the results into the chat context invisibly. So you can say something like, "Hey, do you remember when we talked about baseball?" and it will pull a previous conversation about so-and-so hitting a home run into the context, and the bot will have access to that, even though you never mentioned the word "baseball" in the earlier conversation -- "home run" is semantically similar enough that the search finds it.
If you're using OpenAI embeddings as your vectors, it's _extremely_ impressive how well it finds similar topics, even when the actual words used are completely different.
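To make the lookup step concrete, here's a minimal sketch of the retrieval side, using tiny hand-made vectors in place of real OpenAI embeddings -- the message ids, vector values, and function names are all hypothetical, and a real vector DB does this search approximately at scale rather than brute-force:

```python
import math

# Toy stand-ins for embeddings -- in the real setup each vector comes from
# the OpenAI embeddings API and lives in the vector DB, with the Postgres
# message id stored as metadata alongside it. Ids and values are made up.
stored = {
    101: [0.9, 0.1, 0.0],  # embedding of "so-and-so hit a home run"
    102: [0.0, 0.2, 0.9],  # embedding of an unrelated message
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, near 0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest_message_id(query_vec):
    # The vector DB performs this search; the winning id is then used to
    # fetch the full message text back out of Postgres.
    return max(stored, key=lambda mid: cosine(query_vec, stored[mid]))

query = [0.8, 0.3, 0.1]  # toy embedding of "remember when we talked about baseball"
print(nearest_message_id(query))  # 101 -- the home-run message is closest
```

The point of the pointer-in-metadata design is that the vector DB only has to answer "which ids are nearby", while Postgres stays the source of truth for the actual transcript text.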
Not a tutorial, but TL;DR: vector DBs are specialized DBs that store embeddings. Embeddings are vector representations of data (e.g. text or images), which means you can compare them in a quantifiable way.
This enables use cases like semantic search and Retrieval-Augmented Generation (RAG) as mentioned in the article.
Semantic search is: I search for "royal" and I get results that mention "king" or "queen" because they are semantically similar.
RAG is: I make a query asking, "tell me about the English royal family", semantically similar information is fetched using semantic search and provided as context to an LLM to generate an answer.
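A minimal sketch of both of those steps, using made-up two-dimensional word vectors in place of a real embedding model (the `embed()` helper, the toy vectors, and the document texts are all hypothetical):

```python
import math

# Toy word vectors standing in for a real embedding model -- the values are
# invented so the example is self-contained and runnable.
TOY_VECTORS = {
    "royal": [0.9, 0.1],
    "king": [0.85, 0.2],
    "queen": [0.8, 0.25],
    "bread": [0.1, 0.9],
}

docs = [
    "The king and queen attended the ceremony.",
    "How to bake bread at home.",
]

def embed(text):
    # Average the toy vectors of any known words. A real system would call
    # an embedding model here instead of a lookup table.
    words = [w.strip(".,?").lower() for w in text.split()]
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def retrieve(query, k=1):
    # Semantic search: rank documents by similarity to the query embedding.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

question = "Tell me about the English royal family"
context = retrieve(question)
# RAG: the retrieved context plus the question is what gets sent to the LLM.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(context[0])  # the king/queen sentence, even though it never says "royal"
```

The "royal" query finds the king/queen document because their vectors point in nearly the same direction, which is the whole trick behind semantic search beating keyword matching.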
How anyone could read that article, even just skim it, and come out thinking that it was presenting evidence that this is a well-researched topic is beyond me. The entire point of the article is that they couldn't find studies on the topic...
Those are the private loans not backed by the government. The government ones are basically at the prime rate or less to fund the administration of the program.
Perhaps the above comment means that the secondary factors affect the denominator -- i.e. more people go hiking on Saturday, so entering the Saturday lottery is worse than entering the Wednesday one
Exactly, and it impacts the denominator by a great margin.
The value of the lottery ticket to me is a function of the cost, odds, timing and interest I have in the destination.
If the odds at one trailhead are 1000x lower than at another that is comparable across the other variables, then I'm needlessly overpaying, and Booz Allen pockets that inefficiency.
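To put rough numbers on the overpaying argument -- every figure below is made up purely for illustration -- the expected value of an entry is just win probability times the trip's value to you, minus the ticket fee:

```python
# All numbers are hypothetical, just to illustrate the point: at the same
# ticket cost, a 1000x difference in odds swamps the other variables.
def expected_value(trip_value, win_probability, ticket_cost):
    return trip_value * win_probability - ticket_cost

TICKET_COST = 6.0   # hypothetical per-entry fee
TRIP_VALUE = 400.0  # hypothetical value I place on the trip

popular = expected_value(TRIP_VALUE, 0.0005, TICKET_COST)  # long-shot trailhead
quiet = expected_value(TRIP_VALUE, 0.5, TICKET_COST)       # comparable trail, 1000x the odds

print(popular)  # -5.8: each entry is nearly pure overpayment
print(quiet)    # 194.0: same fee, vastly better deal
```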
Including the exchange with Schmidt about a "paper trail" and the resulting preference to "do it verbally" -- though that part of the discussion seems to be Google-internal, not between Apple and Google.