Hacker News

Interesting. I was skeptical about some of their claims regarding longer context, since it's been my experience that these models just get lost after enough of it.



Yeah, degraded performance on long contexts has been observed in plenty of other models [https://arxiv.org/abs/2307.03172] so I was cautious too. Unfortunately I don't have access to 4-32k. I would have liked to test that out too.



