*> Along the way, we’ve learned some useful lessons on how to build these kinds ...

manojlds · on June 29, 2023

The short courses from dl.ai are better at driving these points - https://www.deeplearning.ai/short-courses/

frankgrecojr · on June 29, 2023

> Stream the response to the UI so users can monitor progress

This is a game changer to the UX

senko · on June 29, 2023

It's a crutch to minimize the user annoyance at having to wait up to a minute for the response. It sure beats the spinner but it's still a crutch.

thelittleone · on June 30, 2023

Really? I feel like in most uses the stream is close to, if not slightly faster than my reading. I actually prefer that over an instant full-page response. It helps me keep my place in the text and feels like reduced cognitive load.

golergka · on June 30, 2023

Also, the vibrations when LLM “types” in OpenAI iOS app are so satisfying for some reason. It may be a stupid subjective thing, but from my experience in game dev, that's exactly the kind of detail that creates the overall user experience feeling.

behnamoh · on June 29, 2023

Actually, it's annoying because as you start reading the first lines, the content keeps scrolling (often with jagged movements). I always have to scroll up immediately after the stream begins to disable this behavior.

tobr · on June 29, 2023

That’s totally fixable, though. ReadyRunner handles it simply by scrolling all the way from the start, leaving space for the message to grow.

trafnar · on June 29, 2023

Hey, that's my app! https://www.readyrunner.ai

darkteflon · on June 30, 2023

That looks really nice. I do wish you would consider one-time pricing for those bringing their own API key on the “Dev” plan, though :) I’d pay ~$20-30 for a nice desktop app like this but won’t enter into another $48/year sub.

jamifsud · on June 29, 2023

Anyone know of any good “tolerant” JSON parsers? I’d love to be able to stream a JSON response down to the client and have it be able to parse the JSON as it goes and handle the formatting errors that we sometimes see.

icyfox · on June 29, 2023

There's no bulletproof solution to this. JSON5 (https://www.npmjs.com/package/json5) gets you slightly more leniency, as does plugging the currently streamed content into another smaller LLM. I also wrote a deterministic parser more tailored towards these partially-complete LM outputs. Not perfect certainly but handles the 99% of cases well: https://github.com/piercefreeman/gpt-json. In particular the "streaming" functionality here might be of interest to you.

darkteflon · on June 30, 2023

This looks really cool, thanks for open-sourcing this. I’ve been similarly parsing and validating output from OpenAI’s new functions using a schema defined on a custom Pydantic class, but I can see that your code has a lot of niceties coming from proper battle-testing, including elegant error handling, transformations and good docs.

I’d like to incorporate this in a production workflow for generating schema-compliant test data for use in few-shot promoting - would you mind saying a few words about your medium term plans for the library? The LangChain API is changing all the time at the moment so we’re trying to figure out where it’s safe to stand. No expectations, of course, just curious.

icyfox · on June 30, 2023

Sure - I'm using it in a few different internal tools and know others are using it in production. The API should be relatively stable at this point since I intentionally kept the scope pretty limited. The main changes over time will be improved robustness and error correction as issues report different JSON schema breaks that we can fix automatically. Let me know if you see more cases that can be addressed here, would love to collaborate on it.

darkteflon · on June 30, 2023

Thanks! Absolutely, will do. I’ll have a play with it today and reach out with a PR any time it makes sense to do so.

I noticed some occasional funkiness from GPT-4 around sending back properly formatted dates yesterday but haven’t yet dug into it properly. Might be a good candidate for a transformation.

anentropic · on June 30, 2023

> I’d love to be able to stream a JSON response down to the client and have it be able to parse the JSON as it goes

why though?

jamifsud · on June 30, 2023

In a non chat setting where the LLM is performing some reasoning or data extraction it allows you to get JSON directly from the model and stream it to the UI (updating the associated UI fields as new keys come in) while caching the response server side in the exact same JSON format. It’s really simplified our stream + cache setup!

n4te · on June 30, 2023

My JsonReader in libgdx and JsonBeans can parse a more relaxed version of JSON. It uses Ragel.

huydotnet · on June 29, 2023

Still not a reasonable way if you're expecting a structured data in the response, like JSON or something that you're required to parse before showing to the user.

jelled · on June 30, 2023

On one of my apps I ask for a plain text response first and then make a second call to parse the original response to json.