Hacker News | jeadie's comments

We’re building vector indexes into DataFusion for search (starting with S3 Vectors).

Open source at https://github.com/spiceai/spiceai


This is one of the ideas behind using DuckDB in github.com/spiceai/spiceai


That looks like an amazing "swiss army knife"...!


Looks very cool! I will take a look, tysm!



This is a common feature now. If anything, for being so early to vector databases, Pinecone was rather late to integrating embeddings.

Timescale added it most recently, but yes, a bunch of others have it too: Weaviate, Spice AI, Marqo, etc.


A difference between Pinecone and many of the others you listed is that we host both embedding and reranking models in a serverless fashion. You pay for what you use while we manage the entire stack.


Do any of the others also handle reranking?


Qdrant does with its ‘Query API’.

https://qdrant.tech/documentation/concepts/hybrid-queries/

It also handles embedding creation with its fastembed package.

https://github.com/qdrant/fastembed



I don't know about them, but Manticore does.

https://manticoresearch.com/use-case/vector-search/


Why not just federate Postgres and Parquet files? That way the query planner can push down as much of the query as possible and reduce how much data has to move about.
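To make the pushdown point concrete, here is a toy sketch in plain Python. It is not how any real planner (DataFusion, Spice, or otherwise) is implemented: `Source`, `make_sources`, and both query functions are hypothetical names made up for illustration, and the two in-memory lists only stand in for a Postgres table and a Parquet file. It just counts how many rows leave each source with and without the filter pushed down.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Row = dict
Predicate = Callable[[Row], bool]


@dataclass
class Source:
    """Hypothetical stand-in for a remote table (e.g. Postgres or a Parquet file)."""
    name: str
    rows: list
    rows_sent: int = 0  # rows that had to leave the source

    def scan(self, predicate: Optional[Predicate] = None) -> list:
        # If a predicate is supplied, the filter runs at the source (pushdown).
        out = [r for r in self.rows if predicate is None or predicate(r)]
        self.rows_sent += len(out)
        return out


def make_sources() -> list:
    # Made-up data: 100 rows per source, years cycling through 2020..2024.
    pg = [{"id": i, "year": 2020 + i % 5} for i in range(100)]
    pq = [{"id": i, "year": 2020 + i % 5} for i in range(100, 200)]
    return [Source("postgres", pg), Source("parquet", pq)]


def query_without_pushdown(sources: list, predicate: Predicate) -> list:
    # Pull every row into the federating engine, then filter there.
    fetched = [r for s in sources for r in s.scan()]
    return [r for r in fetched if predicate(r)]


def query_with_pushdown(sources: list, predicate: Predicate) -> list:
    # Hand the filter to each source so only matching rows move.
    return [r for s in sources for r in s.scan(predicate)]


if __name__ == "__main__":
    def is_2024(row: Row) -> bool:
        return row["year"] == 2024

    naive, pushed = make_sources(), make_sources()
    query_without_pushdown(naive, is_2024)
    query_with_pushdown(pushed, is_2024)
    print("rows moved without pushdown:", sum(s.rows_sent for s in naive))
    print("rows moved with pushdown:", sum(s.rows_sent for s in pushed))
```

Both queries return the same rows; the only difference is how much data crosses the wire, which is the whole argument for federating with a planner that can push filters down.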


This looks functionally similar to using http://github.com/spiceai/spiceai with a PostgreSQL data accelerator.


Spice AI | Senior Software Engineer | GMT+10 (e.g. Australia) through GMT-7 (e.g. Seattle/SF/LA) | Remote | Full Time

Spice AI provides building blocks for data and AI-driven applications by composing real-time and historical time-series data, high-performance SQL query, and machine learning training and inferencing in a single, interconnected AI backend-as-a-service.

We just launched github.com/spiceai/spiceai, a unified SQL query interface and portable runtime to locally materialize, accelerate, and query data tables sourced from any database, data warehouse, or data lake.

We're hiring experienced software engineers, ideally with Rust and/or Golang production experience. We're focused on large data and distributed systems, so experience in these is important too. More details: https://spice.ai/careers#section-open-positions


it says remote but the open positions are mostly hybrid


And yes, Iceberg is very high up on our list


Yes! It can connect to FlightSQL-compatible servers (see https://docs.spiceai.org/data-connectors/flightsql ) and it's also a FlightSQL-compatible server.


We also have a Grafana plugin we'll continue to improve to make it super easy to connect to Grafana, and Spice has a metrics endpoint and example Grafana dashboard for monitoring itself https://github.com/spiceai/spiceai/blob/trunk/monitoring/gra...


Have you seen github.com/marqo-ai/marqo? It does all this wrapping, and you don't even need to pay for OpenAI or Pinecone.


+1 to Marqo. It's documents in, documents out rather than vectors in, vectors out. Much easier end to end.


Sounds cool, will take a look.

Are there other embedding APIs that you like or have used?

