They use an LLM to summarize the chats, which IMO makes the results as fundamentally unreliable as LLMs are. Maybe for an aggregate statistical analysis (for the purpose of...vibe-based product direction?) this is good enough, but if you were to use this to try to inform impactful policies, caveat emptor.


For example, it's fashionable in math education these days to ask students to generate problems as a different mode of probing understanding of a topic. And from the article: "We found that students primarily use Claude to create and improve educational content across disciplines (39.3% of conversations). This often entailed designing practice questions, ..." That last part smells fishy: even if you saw a prompt like "design a practice question..." you couldn't tell whether the student was cheating, since writing a problem may itself be the assignment.



