
This is a strange document. They don't mention supervised instruction finetuning anywhere. You can (and need to) only really apply "prompt engineering" of this kind to a foundation model, which just completes text. An instruction-tuned model is no longer a text completer; it models an agent and understands what you ask it to do, so there is neither a need nor a possibility for prompt engineering of this kind. (The foundation model for GPT-4 was not made publicly available, by the way, and the one for GPT-3.5 was removed from the API a few weeks ago.)

It is worth mentioning that instruction-tuned models are not necessarily better, since they can exhibit "mode collapse", a loss of entropy where they tend, e.g., to produce content that is very similar in style.
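Roughly, the "loss of entropy" can be pictured as the next-token distribution collapsing onto a few favored tokens. A minimal sketch using Shannon entropy over two made-up distributions (the numbers are illustrative, not measured from any model):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-token probabilities for the same prompt.
base_model  = [0.30, 0.25, 0.20, 0.15, 0.10]  # spread over several plausible tokens
tuned_model = [0.90, 0.05, 0.03, 0.01, 0.01]  # collapsed onto one favored token

print(shannon_entropy(base_model))   # higher entropy: varied completions
print(shannon_entropy(tuned_model))  # lower entropy: same-ish output every time
```

Lower entropy at every step compounds over a long generation, which is why the collapsed model's outputs converge on one recognizable style.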



I know I can't be the only one that finds text completion much more useful and powerful than an agent that wants to chat with me.


No, you're not. I too enjoy working with the text completion that LLMs have been able to do for some time. The issue with text completion is that most people don't want to be forced to think up plausible document headers when all they want is an inferred answer.
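To make the "document headers" point concrete, here is a hypothetical sketch of how a request has to be reframed for a pure text completer (the header text and wording are invented for illustration, not from any real prompt library):

```python
# With a chat model you just ask the question directly.
instruction = "Summarize the causes of the 1929 stock market crash."

# With a base (completion) model, you instead invent a document whose
# natural continuation is the answer you want, e.g. a fake article header
# plus the opening words of the summary itself.
completion_prompt = (
    "Economic History Review, Vol. 12\n"
    "A Summary of the Causes of the 1929 Stock Market Crash\n\n"
    "The crash of October 1929 was driven by"
)

print(completion_prompt)
```

The burden of inventing a plausible framing document for every query is exactly what instruction tuning removes.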


Another problem is that OpenAI no longer wants their customers to access them. They may be considered too dangerous, since they are not just un-instruction-tuned but also uncensored (not RLHF'd). So people have to use less powerful base models, which cancels out their increased flexibility.


> since they can exhibit "mode collapse", a loss in entropy, where they e.g. tend to produce content which is very similar in style.

so wait, is this why all these chatgpt answers in HN comments sound so similar and are thus easy to detect?


I guess so; at least that is what people with a lot of experience with language models, like janus (see link in sibling), are reporting.

Though I should mention that mode collapse doesn't just come from supervised instruction tuning (which lets the model reply to requests instead of treating them as completion prompts), but also from things like RLHF, which bias the model toward certain replies rather than others.


Very interested in this topic but haven’t experimented much with it yet, do you know of any good resources or writeups?



> GPT-3.5 it was removed from the API a few weeks ago.

context/what are you referring to?


See here: https://news.ycombinator.com/item?id=35242069

What the commenters there didn't realize at the time is that code-davinci-002 has nothing to do with the "Codex API" specifically. It is simply the GPT-3.5 foundation model without fine-tuning applied to it. See

https://platform.openai.com/docs/model-index-for-researchers



