The entire point of RLHF training is to do this. Every model since GPT-3.0 has been trained specifically for this purpose.
But of course the model can only generate text in one direction and can't take time to "think" or undo anything it's generated.
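To make the "one direction" point concrete, here is a rough sketch of an append-only generation loop. It is only a toy: `TOY_TABLE`, `next_token`, and `generate` are made-up names for illustration, and the lookup table stands in for a real model, which would compute a probability distribution over its vocabulary instead.

```python
# Hypothetical toy "model": a fixed lookup from the last token to the next one.
# A real LLM would score every vocabulary token given the whole prefix.
TOY_TABLE = {
    "<start>": "the",
    "the": "model",
    "model": "writes",
    "writes": "text",
    "text": "<end>",
}


def next_token(prefix):
    """Pick the next token using only what has been generated so far."""
    return TOY_TABLE[prefix[-1]]


def generate(max_tokens=10):
    """Generate left to right, one token at a time."""
    tokens = ["<start>"]
    for _ in range(max_tokens):
        tok = next_token(tokens)
        tokens.append(tok)  # append-only: earlier tokens are never revised
        if tok == "<end>":
            break
    return tokens


print(generate())
```

Each step sees only the prefix and adds one token to the end of it. Nothing in the loop goes back to rewrite an earlier token, and there is no step that runs without emitting something.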