Hacker News
sebzim4500 on June 20, 2024 | on: Claude 3.5 Sonnet
I'm not asking for actual examples, but what kind of thing is in your internal reasoning benchmark?
freediver on June 20, 2024
Things like “summarize this text in exactly 14 words”, programming questions, unstructured data to structured data transformations and so on…
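For illustration only, a check for the "exactly 14 words" kind of task might look like the sketch below. It is not taken from the benchmark described here: the function name `has_exact_word_count` is hypothetical, and it assumes a "word" is any run of non-whitespace characters.

```python
def has_exact_word_count(summary: str, target: int = 14) -> bool:
    """Return True if `summary` contains exactly `target` words.

    Assumption: words are whitespace-separated tokens, so punctuation
    attached to a word ("done.") counts as part of that word.
    """
    return len(summary.split()) == target


if __name__ == "__main__":
    candidate = "The quick brown fox jumps over the lazy dog near the quiet river bank"
    print(has_exact_word_count(candidate))
```

A real grader would probably also need to decide how to treat hyphenated words, numbers, and stray formatting in the model output.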
sebzim4500 on June 20, 2024
Do you let it use CoT (chain of thought)? I think that first one is pretty hard if the model has to produce the answer directly, one token at a time, but I guess that's kind of the point.