> That’s testing whether the model can emit a huge structured output without error, under a context window limit.
Agreed. But to be fair, 1) a relatively simple algorithm can do it, and more importantly, 2) a lot of people are trying to build products around exactly this: emitting large structured output without error.