It’s going to be very difficult to come up with any rigorous structure for automatically assessing the outputs of these models. They’re built using what is effectively human grading of the answers.
Hmm, if we take the reinforcement learning part of reinforcement learning from human feedback, isn't there a reward model that takes a question/answer pair and rates the quality of the answer? It's sort of grading itself; it's like a training loss, but it still tells us something?
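For concreteness, here's a rough sketch of what that reward-model grading looks like in practice: you feed a question/answer pair into a sequence-classification model and read off a single scalar score, where higher roughly means "a human labeler would likely prefer this answer." The model name below is just an assumption (a publicly released reward model), not whatever the big labs actually use internally.

```python
# Minimal sketch: scoring a question/answer pair with an RLHF-style reward model.
# Model name is an assumption; any reward model exposed as a sequence-classification
# head with a single scalar output would be used the same way.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "OpenAssistant/reward-model-deberta-v3-large-v2"  # assumed example
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

question = "How do I reverse a list in Python?"
answer = "Use my_list[::-1], or my_list.reverse() to do it in place."

# The reward model reads the (question, answer) pair and emits one scalar score.
inputs = tokenizer(question, answer, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits[0].item()

print(f"reward score: {score:.3f}")
```

That score is exactly the "grading itself" part: during RLHF the policy is tuned to push it up, so it's closer to a training signal than an independent evaluation, but you can still read it as a noisy proxy for how a human rater might judge the answer.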