
It doesn’t seem that surprising; compared to entire NYT articles, poems are short, structured and more likely to be shared in multiple places across the web.

I’m more surprised that it can repeat 100 articles; if that behaviour holds in larger samples and beyond just the NYT dataset (which might be repeated on the web more than other sources, causing overfitting), that would be impressive.

You could imagine that at some point a large enough GPT-5 or 6 or 7 will be able to memorize every corner of the web verbatim.


