> Along the way, we’ve learned some useful lessons on how to build these kinds of applications, which we’ve formulated in terms of patterns.
* Use a text template to enrich a prompt with context and structure
* Tell the LLM to respond in a structured data format
* Stream the response to the UI so users can monitor progress
* Capture and add relevant context information to subsequent actions
* Allow direct conversation with the LLM within a context
* Tell the LLM to generate intermediate results while answering
* Provide affordances for the user to have a back-and-forth interaction with the co-pilot
* Combine LLM with other information sources to access data beyond the LLM's training set
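As a rough sketch of the first two patterns in that list, a prompt can be assembled from a text template that injects context and pins down the response format. Everything here (the template wording, the `build_prompt` helper) is made up for illustration, not from the article's codebase:

```python
# Sketch of the "text template" + "structured data format" patterns.
# Template text and helper names are illustrative only.
from string import Template

PROMPT = Template("""You are a brainstorming co-pilot.

Context:
$context

Task: $task

Respond ONLY with a JSON array of objects, each with keys
"title" (string) and "summary" (string).""")

def build_prompt(context: str, task: str) -> str:
    # Enrich the prompt with caller-supplied context and task
    return PROMPT.substitute(context=context, task=task)

prompt = build_prompt(
    context="We are redesigning the onboarding flow.",
    task="Suggest three user scenarios.",
)
```

Asking for a fixed JSON shape up front is what makes the later streaming and parsing tricks in this thread possible at all.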
Really? I feel like in most uses the stream is close to, if not slightly faster than my reading. I actually prefer that over an instant full-page response. It helps me keep my place in the text and feels like reduced cognitive load.
Also, the vibrations when LLM “types” in OpenAI iOS app are so satisfying for some reason. It may be a stupid subjective thing, but from my experience in game dev, that's exactly the kind of detail that creates the overall user experience feeling.
Actually, it's annoying, because as you start reading the first lines, the content keeps scrolling (often with jagged movements). I always have to scroll up as soon as the stream begins to stop the auto-scrolling.
That looks really nice. I do wish you would consider one-time pricing for those bringing their own API key on the “Dev” plan, though :) I’d pay ~$20-30 for a nice desktop app like this but won’t enter into another $48/year sub.
Anyone know of any good “tolerant” JSON parsers? I’d love to be able to stream a JSON response down to the client and have it be able to parse the JSON as it goes and handle the formatting errors that we sometimes see.
There's no bulletproof solution to this. JSON5 (https://www.npmjs.com/package/json5) gets you slightly more leniency, as does plugging the currently streamed content into another, smaller LLM. I also wrote a deterministic parser more tailored towards these partially complete LM outputs. Certainly not perfect, but it handles 99% of cases well: https://github.com/piercefreeman/gpt-json. In particular, the "streaming" functionality there might be of interest to you.
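For the common truncation cases, a hand-rolled repair pass along these lines can get surprisingly far: scan the partial output for unclosed strings, objects, and arrays, then append the missing closers before parsing. This is a deterministic sketch of that idea, not gpt-json's actual implementation:

```python
import json

def parse_partial_json(text: str):
    """Best-effort parse of possibly truncated JSON by appending
    the closers for any unterminated strings, objects, or arrays.
    A sketch, not any library's API."""
    stack = []             # open '{' / '[' delimiters, innermost last
    in_string = False
    escaped = False
    string_is_key = False  # is the currently open string an object key?
    last_sig = ""          # last significant char seen outside strings
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
                last_sig = '"'
            continue
        if ch.isspace():
            continue
        if ch == '"':
            in_string = True
            # Inside an object, a string right after '{' or ',' is a key
            string_is_key = bool(stack) and stack[-1] == "{" and last_sig in "{,"
        elif ch in "{[":
            stack.append(ch)
        elif ch in "}]" and stack:
            stack.pop()
        last_sig = ch
    repaired = text.rstrip()
    if in_string:
        repaired += '"'
        if string_is_key:
            repaired += ": null"    # bare key: give it a placeholder value
    if repaired.endswith(","):
        repaired = repaired[:-1]    # drop a dangling comma
    elif repaired.endswith(":"):
        repaired += " null"         # key with no value yet
    for opener in reversed(stack):  # close innermost-first
        repaired += "}" if opener == "{" else "]"
    return json.loads(repaired)
```

Calling this on every accumulated chunk gives the UI a usable (if incomplete) object at each step; it won't fix genuinely malformed output, only truncation.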
This looks really cool, thanks for open-sourcing this. I’ve been similarly parsing and validating output from OpenAI’s new functions using a schema defined on a custom Pydantic class, but I can see that your code has a lot of niceties coming from proper battle-testing, including elegant error handling, transformations and good docs.
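The validation step described above might look roughly like this. The `TestRecord` schema and `validate_payload` helper are hypothetical stand-ins, not OpenAI's or gpt-json's API:

```python
# Sketch: validate an LLM's JSON output against a Pydantic schema.
# Schema fields and helper names are made up for illustration.
import json
from typing import Optional
from pydantic import BaseModel, ValidationError

class TestRecord(BaseModel):
    # Hypothetical schema for generated test data
    name: str
    email: str
    age: int

def validate_payload(raw: str) -> Optional[TestRecord]:
    """Parse the model's JSON output and validate it against the schema;
    return None rather than raising on malformed or non-conforming data."""
    try:
        return TestRecord(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError, TypeError):
        return None
```

Rejecting non-conforming records cheaply here is what makes the output safe to feed back as few-shot examples.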
I’d like to incorporate this in a production workflow for generating schema-compliant test data for use in few-shot prompting - would you mind saying a few words about your medium-term plans for the library? The LangChain API is changing all the time at the moment, so we’re trying to figure out where it’s safe to stand. No expectations, of course, just curious.
Sure - I'm using it in a few different internal tools and know others are using it in production. The API should be relatively stable at this point since I intentionally kept the scope pretty limited. The main changes over time will be improved robustness and error correction, as issue reports surface different JSON schema breaks that we can fix automatically. Let me know if you see more cases that can be addressed here - would love to collaborate on it.
Thanks! Absolutely, will do. I’ll have a play with it today and reach out with a PR any time it makes sense to do so.
I noticed some occasional funkiness from GPT-4 around sending back properly formatted dates yesterday but haven’t yet dug into it properly. Might be a good candidate for a transformation.
In a non-chat setting where the LLM is performing some reasoning or data extraction, it allows you to get JSON directly from the model and stream it to the UI (updating the associated UI fields as new keys come in) while caching the response server-side in the exact same JSON format. It has really simplified our stream + cache setup!
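A minimal sketch of that stream-plus-cache shape: the server relays raw chunks to the client as they arrive, while accumulating the full document for its cache, so both sides end up holding the same JSON. All the names here (`relay_stream`, `emit_to_ui`, the cache dict) are illustrative:

```python
# Sketch of the stream + cache pattern described above.
# relay_stream, emit_to_ui, and cache are illustrative names.
import json

def relay_stream(chunks, emit_to_ui, cache, cache_key):
    """Forward each chunk to the UI as it arrives, then cache the
    complete JSON document once the stream finishes."""
    buffer = []
    for chunk in chunks:
        buffer.append(chunk)
        emit_to_ui(chunk)                # UI updates fields as keys complete
    full = "".join(buffer)
    cache[cache_key] = json.loads(full)  # cached in the same JSON format
    return cache[cache_key]
```

Because the streamed bytes and the cached object share one format, a cache hit can replay the exact payload the UI already knows how to render.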
It's still not reasonable if you're expecting structured data in the response, like JSON or anything else you're required to parse before showing it to the user.