
sebzim4500 on June 20, 2024 | on: Claude 3.5 Sonnet

I'm not asking for actual examples, but what kind of thing is in your internal reasoning benchmark?

freediver on June 20, 2024

Things like “summarize this text in exactly 14 words”, programming questions, unstructured data to structured data transformations and so on…
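For illustration only, here is a minimal sketch of how two of these task types could be scored automatically. Nothing here comes from the actual benchmark: the function names, the whitespace-based definition of a "word", and the choice of JSON as the structured format are all assumptions.

```python
import json


def has_exact_word_count(summary, target=14):
    """Check the "exactly N words" constraint.

    Assumption: a word is any whitespace-separated token, so
    punctuation attached to a word does not count separately.
    """
    return len(summary.split()) == target


def is_valid_record(raw_output, required_keys):
    """Check an unstructured-to-structured answer.

    Assumption: the model is asked for a JSON object, and the answer
    passes if it parses and contains every required key.
    """
    try:
        record = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return isinstance(record, dict) and set(required_keys) <= set(record)


if __name__ == "__main__":
    candidate = (
        "The new model scores higher on coding and reasoning "
        "benchmarks while costing much less."
    )
    print(has_exact_word_count(candidate))
    print(is_valid_record('{"name": "Ada", "year": 1843}', {"name", "year"}))
```

A check like this only verifies the format constraint, not whether the summary is any good, so it would presumably be paired with some separate quality judgment.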

sebzim4500 on June 20, 2024

Do you let it use CoT? I think that first one is pretty hard if you have to produce it directly one token at a time, but I guess that's kind of the point.
