It’s how LLMs work: they are effectively recursive (autoregressive) at inference time; after each token is sampled, you feed it back in. You end up with the same model state (floating-point noise aside) as if that extended sequence had been the original input prompt.
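For anyone unfamiliar, here's a minimal sketch of that loop, assuming a Hugging Face causal LM (gpt2 is used purely as an illustration, not whatever model sits behind the API in question) and greedy decoding so the run is deterministic. Each chosen token is appended and the concatenated sequence is run forward again, exactly as if it had been the original prompt:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; any causal LM works the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(5):
        logits = model(input_ids).logits                                 # forward pass over the whole sequence
        next_id = torch.argmax(logits[:, -1, :], dim=-1, keepdim=True)   # greedy pick of the next token
        input_ids = torch.cat([input_ids, next_id], dim=1)               # feed it back in as input

print(tokenizer.decode(input_ids[0]))
```

In real serving stacks the earlier tokens' activations are usually KV-cached rather than recomputed each step, but the result is the same as running the extended prompt from scratch, modulo floating-point noise.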
For LLMs in general, sure. My question is whether it works out the same in practice for LLMs behind said API. As far as I can tell, there is no official documentation saying we will get exactly the same result.
And no one here has touched on how high a multiple the cost is, so I assume it's pretty high.