Hacker News | ibains's comments

Prophecy.io | Think Cursor for Data Analysts | US (onsite SF, remote)

Prophecy is used by data engineers in the largest enterprises (Series B, $120M raised). Our new product for data analysts is agentic and multi-modal (docs, low-code, code).

- AI Research Engineer
- AI Agentic Software Engineer

Looking for top-tier AI engineers who are excited to ship products at a tremendous pace. Email me: raj at prophecy dot io


Prophecy.io will soon be the way everyone does data transformation: a zero-compromise solution that beats legacy ETL and hand-written code by miles.


Yes, but in addition to being ACID-compliant (ANSI serializable isolation), you need to support a million transactions per second; it’s not just about data size.


What’s the cloud infra like?


Why is this not just ETL? Why do you need anything new here? There is no new category or product needed.


Just saw the video you shared in the other comment using Prophecy, very cool.

Generally I don’t care much about the embeddings, retrieval, connectors, etc. for playing with LLMs; I imagined much more robust tools were already available for that. My focus was more on prompt development: connecting many prompts together for a better chain-of-thought kind of thing, working out the memory and stateful parts, and so on. I think there might be a case for an “LLM framework” there, and also a case for a small library to solve it instead of an ETL cannon.

However, I am indeed not experienced with ETLs; I’ll have to play more with the available tools to see if and how I can do the things I was building using them.
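The prompt-chaining idea above, sketched minimally (all names here are made up for illustration, not from any real framework; `call_llm` is a stand-in for an actual model API call):

```python
# Minimal sketch of chaining prompts with shared state.
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a model API here.
    return f"answer({prompt})"

def run_chain(steps, user_input):
    """Run prompt templates in order; each step sees prior outputs via `memory`."""
    memory = {"input": user_input}
    for name, template in steps:
        prompt = template.format(**memory)  # fill in earlier results
        memory[name] = call_llm(prompt)     # stateful part: store this step's output
    return memory

steps = [
    ("outline", "Outline the topic: {input}"),
    ("draft",   "Expand this outline: {outline}"),
    ("final",   "Polish this draft: {draft}"),
]
result = run_chain(steps, "vector databases")
```

The point is just that the chaining and memory logic fits in a dozen lines, which is why a small library (rather than an ETL cannon) feels plausible.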


Hmmm, just had a chat with GPT-4; it didn't seem convinced that ETLs would do the same things LiteChain is trying to achieve well: https://chat.openai.com/share/88961bd1-8250-45f0-b814-0680ba...

I'd be happy to see some more examples of LLM application building on ETLs like the video you shared though


Totally! As a person driving a project like https://github.com/DAGWorks-Inc/hamilton I couldn't agree more!


It is pointless: LlamaIndex and LangChain are re-inventing ETL. Why use them when robust technology already exists?

1. You ETL your documents into a vector database, and you run this pipeline every day to keep it up to date. You can run scalable, robust pipelines on Spark for this.

2. You have a streaming inference pipeline whose components make API calls (agents) and transform data between them. This is Spark Streaming.
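Step 1 can be sketched in a few lines of plain Python standing in for the Spark job (the character-count `embed` and the in-memory index are illustrative stand-ins for a real embedding model and vector database):

```python
import math

def embed(text: str) -> list[float]:
    # Stand-in embedding: normalized bag-of-letters vector.
    # A real pipeline would call an embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

class VectorIndex:
    """In-memory stand-in for a vector database."""
    def __init__(self):
        self.rows = {}
    def upsert(self, doc_id, text):
        # Idempotent write: rerunning the daily pipeline refreshes entries.
        self.rows[doc_id] = (embed(text), text)
    def query(self, text, k=1):
        qv = embed(text)
        scored = sorted(self.rows.items(),
                        key=lambda kv: cosine(qv, kv[1][0]), reverse=True)
        return [(doc_id, t) for doc_id, (_, t) in scored[:k]]

# The daily batch job: extract documents, embed, upsert.
index = VectorIndex()
index.upsert("a", "spark streaming pipelines")
index.upsert("b", "quarterly financial report")
```

Because `upsert` is idempotent, re-running the batch daily keeps the index fresh; the real version just swaps in Spark for the loop and a managed vector store for the dict.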

Prophecy is working with large enterprises to implement generative AI use cases, but we don’t talk about it much on HN.

Here’s our talk from Data+AI Summit: Build a Generative AI App on Enterprise Data in 13 Minutes https://www.youtube.com/watch?v=1exLfT-b-GM

Here’s a blog/demo https://www.prophecy.io/blog/prophecy-generative-ai-platform...


We also do platform & customer work there (cool pipelines to feed louie.ai, or real-time headless versions), and agreed, those pipelines have simple uses of LLMs where LangChain is mostly useful just for vendor neutrality. Think BYO LLM, since it is now a zoo. Basically Apache NiFi or Spark Streaming with simple LLM and vector DB call-outs. Our harder work here is more at the data engineering level.

But... a lot of our louie.ai work happens in less trivial scenarios, where it isn't just the ETL "NLP 2.0" tier. That logic is much more complicated, so structured programming abstractions matter a LOT more for AI-style business logic. Think talk-to-your-data that generates on-the-fly analytics pushdowns with an interactive data viz UI. That's... a lot of code.


I agree that it's a little silly, but I mostly use it to abstract over BYO LLMs and to extract information from documents. It's nice to be able to quickly prototype something and swap out the underlying language model, rather than setting up a whole pipeline with Apache Tika, ETL, etc. Once the idea is feasible, then sure.

That said, LangChain is really inefficient, and I often find I can re-implement the pieces I need faster than I can deal with LangChain's bugs and performance issues.
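The swap-the-model abstraction that makes this cheap to re-implement can be as thin as a protocol (a sketch; the provider class here is an illustrative stand-in, not a real SDK):

```python
from typing import Protocol

class LLM(Protocol):
    """Anything with a `complete` method can be plugged in."""
    def complete(self, prompt: str) -> str: ...

class FakeLocalModel:
    # Stand-in provider; a real one would wrap an actual vendor SDK.
    def complete(self, prompt: str) -> str:
        return prompt.upper()

def extract_title(llm: LLM, document: str) -> str:
    # Application logic depends only on the protocol,
    # so swapping providers means passing a different `llm` object.
    return llm.complete(f"Title for: {document[:40]}")

title = extract_title(FakeLocalModel(), "annual report 2023")
```

That structural-typing trick is most of the vendor neutrality a framework would otherwise provide.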


That’s assuming you’re not using low-code. There are built-in connectors to read data, transform it, read/write to Pinecone, and make API calls to LLMs. It is much faster to prototype with Prophecy.io.


Yes, but this is just ETL: LlamaIndex and LangChain are re-inventing it. Why use them when robust technology already exists?

1. You ETL your documents into a vector database, and you run this pipeline every day to keep it up to date. You can run scalable, robust pipelines on Spark for this.

2. You have a streaming inference pipeline whose components make API calls (agents) and transform data between them. This is Spark Streaming.

Prophecy is working with large enterprises to implement generative AI use cases, but we don’t talk about it much on HN. Here’s our talk from Data+AI Summit:

Build a Generative AI App on Enterprise Data in 13 Minutes

https://www.youtube.com/watch?v=1exLfT-b-GM

Here’s a blog/demo

https://www.prophecy.io/blog/prophecy-generative-ai-platform...


Cool! Let’s say I have thousands of documents that I want questions and answers for. Would your solution work for this? I wouldn’t know which documents to send with the prompts, though, as I want info on the aggregate (like trends and the most-mentioned phrases or words).


A lot of B2B startups can technically use the cloud API to provide value-added applications to enterprises, but often banks and healthcare companies will not want their data running through a startup’s pipes to OpenAI’s pipes.

We provide a low-code data transformation product (prophecy.io), and we’ll never close sales at any volume if we have to get an MSA that approves this. Might get easier if we become large :)


Prophecy (prophecy.io) | ML Engineer | Palo Alto, CA

Work on LLMs and knowledge graphs for English -> ETL data pipelines.

You should have previous experience in NLP and solid programming skills. Having shipped a project from scratch is important.

If you have only worked at large companies tuning a small cog, it will not be a fit.

Reach out to raj@



Are you serious? They invested in US Treasuries, the highest-rated and most conservative asset class.

Banks are in the business of lending money to home buyers and others; how is lending to the US government criminal?

They have a cash crunch due to the interest rate rise and the VC slowdown. Bad risk management, but everything is above board.

Startup lines of credit should not be funded from deposits; that’s too risky an asset class.


You didn’t answer my question. If I had an account at SVB and put $100k in it, why would SVB take my $100k and invest it? That was the whole issue with FTX. Whatever the asset class, my money shouldn’t have been invested unless I gave SVB permission, period.


It’s how literally every bank has worked for all of time. They have to cover the interest you are paid for keeping your money at the bank. Where do you think that money comes from? They take the money people deposit and invest it, either through loans to others or through other investment vehicles, like Treasury bonds in this case. When you put your money in a bank, you are giving them permission to reinvest it somehow. If you just want your money to sit there, your only option is stuffing wads of bills under your mattress.
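The arithmetic of that spread, with made-up numbers:

```python
# Illustrative only: how a bank covers deposit interest from investments.
deposits = 100_000          # customer money on deposit
deposit_rate = 0.005        # 0.5% paid to the depositor
treasury_yield = 0.02       # 2% earned on Treasuries bought with deposits

interest_owed = deposits * deposit_rate      # owed to the customer
interest_earned = deposits * treasury_yield  # earned by the bank
net_margin = interest_earned - interest_owed # the bank's spread
```

The spread is what funds the interest on your account; SVB's problem was that rising rates crushed the market value of those Treasuries, not that investing deposits was itself improper.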


That's how every bank works?

