"In our experiments a second random trainer preferred critiques from the Human+CriticGPT team over those from an unassisted person more than 60% of the time."
Of course the second trainer could be wrong, but when the outcome tilts 60% to 40% in favour of the *combination of a human + CriticGPT that's pretty significant.
From experience doing contract work in this space, it's common to use multiple layers of reviewers to generate additional data for RLHF, and if you can improve the output from the first layer that much it'll have a fairly massive effect on the amount of training data you can produce at the same cost.
From the article:
"In our experiments a second random trainer preferred critiques from the Human+CriticGPT team over those from an unassisted person more than 60% of the time."
Of course the second trainer could be wrong, but when the outcome tilts 60% to 40% in favour of the *combination* of a human + CriticGPT, that's pretty significant.
From experience doing contract work in this space, it's common to use multiple layers of reviewers to generate additional data for RLHF. If you can improve the output from the first layer that much, it has a fairly massive effect on the amount of training data you can produce at the same cost.
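
As a rough illustration of why that matters (all numbers and names here are made up, not from the article): in a two-layer pipeline where every first-layer critique still costs a second-layer review, raising the fraction of first-layer output that survives review translates directly into more usable examples for the same budget.

  # Toy model of a layered RLHF review pipeline (all numbers hypothetical).
  # Layer-1 annotators produce critiques; a fraction `usable_rate` survive
  # review by layer 2. Improving layer-1 quality raises that fraction, so
  # the same review budget yields more accepted training examples.

  def usable_per_budget(budget, cost_l1, cost_l2, usable_rate):
      # Each candidate example costs one layer-1 pass plus one layer-2 review.
      per_item = cost_l1 + cost_l2
      reviewed = budget / per_item
      return reviewed * usable_rate

  baseline = usable_per_budget(10_000, cost_l1=5.0, cost_l2=3.0, usable_rate=0.50)
  assisted = usable_per_budget(10_000, cost_l1=5.0, cost_l2=3.0, usable_rate=0.60)
  print(f"baseline: {baseline:.0f} usable examples")
  print(f"assisted: {assisted:.0f} usable examples ({assisted / baseline - 1:.0%} more)")

In this toy setup a 10-point bump in the first layer's usable-output rate is a 20% increase in training data per dollar, before counting any savings from lighter-touch second-layer reviews.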