Kuinox on Nov 9, 2023 | on: Benchmarking GPT-4 Turbo – A Cautionary Tale

tldr: GPT-4 Turbo scores worse on the synthetic benchmark on the first attempt. The authors speculate that this is because it's a smaller model and isn't able to memorize the responses as well.