Hacker News

You can just use the normal GPT3 API for this. It costs on the order of 1 cent per page of text. You can build a chat interface in a terminal with a few lines of Python, and give it whatever initial prompt to assume whatever kind of character or personality you want.

(I understand lots of people want the full tailored experience and don't want to write 10 lines of Python, but my main point is that this is easy for someone to build today. I think you don't see too many companies doing this because there is exactly zero moat.)

I just whipped this up as a demo:

    import openai
    import os

    openai.api_key = os.getenv("OPENAI_API_KEY")
    transcript = []
    while True:
        user_input = input("User: ")
        transcript.append(f"User: {user_input}")
        # Drop the oldest lines until the flattened transcript fits the budget.
        while True:
            transcript_flat = '\n'.join(transcript)
            if len(transcript_flat) > 1000 and len(transcript) > 1:
                transcript.pop(0)
            else:
                break
        prompt = f'Transcript between a helpful AI assistant and its human user:\n{transcript_flat}\nAI:'
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            temperature=0.1,
            max_tokens=1000,
            stop="User:",
        ).choices[0].text.strip()
        response = f"AI: {response}"
        print(response)
        transcript.append(response)
It's very basic, but it demonstrates the idea if you want to go get an OpenAI API key and give it a shot. You can give it more sophisticated prompts to get different sorts of answers.
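For instance, here's a sketch of swapping in a different personality by changing the prompt preamble (the persona text is invented for illustration, not anything special):

```python
# Variation on the prompt above: give the assistant a character.
# The persona string here is made up for illustration.
persona = "a sarcastic pirate who answers everything in pirate slang"
transcript_flat = "\n".join(["User: How do I reverse a list in Python?"])
prompt = (
    f"Transcript between {persona} and its human user:\n"
    f"{transcript_flat}\nAI:"
)
print(prompt)
```

Everything else in the loop stays the same; only the framing text changes.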


The playground also works for this, though it takes a bit of care to make sure you've got the prompt set up correctly.

I'd start working from https://beta.openai.com/playground/p/default-micro-horror?mo...

    Write a paragraph describing the battle between an African Lion and a bison.

    The battle between an African Lion and a bison was fierce and intense. The bison, weighing nearly a ton, attempted to use its immense size and strength to overpower the lion. Despite his size, the lion was no match for the bison's strength and agility. The lion used its superior speed, agility and razor sharp claws to try to inflict damage on the bison. Although the lion was able to land multiple scratches on the bison with its claws, it was not enough to take down the massive animal. After a long and grueling fight, the lion ultimately retreated in defeat as the bison emerged victorious from this epic battle of nature.
https://imgur.com/a/z6il6IZ


Thanks! Is this basically what chatgpt is, a thin frontend on GPT3? Or is there more to it?


ChatGPT has more "memory" to it. If you tell it something, it "remembers" it across the chat session.

GPT3 is stateless. You give it the prompt and it responds. Context doesn't carry between requests unless you (the user) pass it along. That can make "chat" sessions with GPT (not ChatGPT) expensive, as passing along the past context of the session consumes more and more tokens the longer it gets.


It has a context window of 8192 tokens (2x as long as GPT3's).

It is possible they are using GPT itself to summarize the past transcript and pass that in as context. So your “remember X” would probably get included in that summary.

That said, I have not tried telling it to remember X and then exceeding the 8192-token context window.

Source: https://twitter.com/goodside/status/1598882343586238464
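A rough sketch of that summarize-the-transcript idea (the summarizer prompt wording and the helper name are my own assumptions; nothing here is a confirmed OpenAI technique):

```python
def build_summary_prompt(transcript_lines):
    # Ask the model to compress the old turns into a short summary that
    # can stand in for them at the front of the next prompt.
    # The exact wording is an assumption, not anything OpenAI documents.
    old_turns = "\n".join(transcript_lines)
    return (
        "Summarize the following chat transcript in a few sentences, "
        "keeping any facts the user asked the AI to remember:\n"
        f"{old_turns}\nSummary:"
    )

# You'd feed this prompt to a completion call and prepend the returned
# summary to the live tail of the conversation.
prompt = build_summary_prompt([
    "User: Remember that my name is Pat.",
    "AI: Got it, Pat.",
])
print(prompt)
```

The summary then replaces the oldest turns, so "remember X" survives even after the raw transcript is trimmed.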


Btw the GPT3 codex model `code-davinci-002` has an 8k token limit, too.


While this is true, I feel like it's worth pointing out that ChatGPT "just" feeds your previous conversation back to GPT-3 with every message. I don't really think there's any difference between the two under the hood.


Makes sense. Do you know if there exists a re-implementation of a ChatGPT-like stateful conversational interface on top of the GPT-3 API? A cursory search doesn't turn up anything.


If you look at

    https://beta.openai.com/playground/p/default-chat?model=text-davinci-003
    https://beta.openai.com/playground/p/default-friend-chat?model=text-davinci-003
    https://beta.openai.com/playground/p/default-marv-sarcastic-chat?model=text-davinci-003
you can see a chat "session".

The issue is that to maintain the state, you need to maintain the history of the conversation.

For example, the friend chat starts out at 28 request tokens. I submit it, get a response back, and it's at 45 tokens. I add in a bit of my own commentary... and it's 59 tokens. I hit submit and it's at 80 tokens... and so on.

https://imgur.com/a/wJ7LNAf

That session wasn't particularly insightful, but you get the idea.

The issue is that I did three requests of 28 + 59 + 88 tokens. As I keep going, this adds up: to maintain the context, my previous history (and each chat response) stays in the prompt for the next message, so the prompt grows arithmetically and the cumulative token count grows quadratically.

---

(edit)

Working from the chat instead of friend chat... a bit more interesting conversation.

https://imgur.com/Dog2jOT

Note, however, that at this point the next request will be at least 267 tokens. We're still talking fractions of a penny ($0.02 / 1k tokens for Davinci), but it's money I can actually see, rather than even smaller fractions. And as noted, it grows each time you continue the conversation.
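To put rough numbers on that growth: the 40-tokens-per-exchange figure below is an assumed average, not measured, and the rate is Davinci's $0.02/1k (i.e. the $0.16-per-8k figure mentioned elsewhere in the thread).

```python
# Back-of-envelope: each turn resends the whole history, so the prompt
# grows linearly and cumulative tokens grow quadratically.
price_per_1k = 0.02   # $/1k tokens for Davinci (assumed rate)
added_per_turn = 40   # assumed average tokens added per exchange
prompt_tokens = 28    # starting prompt size from the example above
total = 0
for turn in range(10):
    total += prompt_tokens
    prompt_tokens += added_per_turn
print(total)                                  # cumulative prompt tokens
print(round(total / 1000 * price_per_1k, 4))  # rough dollar cost
```

Ten turns under those assumptions already resend a couple of thousand tokens in total, which is why long sessions get noticeably more expensive than short ones.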


Thank you, that's really insightful! Presumably you could get it to summarize the conversation when it approaches the max token count.


Likely yes. You could even switch to a less expensive model for the summarization (e.g. Curie), which costs 1/10th as much as Davinci to run. Running 8k tokens through Davinci is $0.16, while Curie would be only $0.016. That back-end compute cost adds up and is worth considering if you were building your own chat bot on top of GPT3.
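The arithmetic behind those figures, for anyone checking (rates as quoted above):

```python
# Price per 1k tokens, as quoted in the thread.
davinci = 0.02
curie = davinci / 10  # Curie is 1/10th the cost of Davinci
tokens = 8000
print(f"davinci: ${tokens / 1000 * davinci:.3f}")
print(f"curie:   ${tokens / 1000 * curie:.3f}")
```

So summarizing with Curie and only sending the summary through Davinci keeps the expensive model's prompts short.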


It’s my understanding that it’s more like GPT-3.5, since it was fine-tuned on human feedback.



