Whenever I read something like this I do definitely think "you're using it wrong...

Whenever I read something like this I do definitely think "you're using it wrong". This question would've certainly tripped up earlier models but new ones have absolutely no issue making this with sources for each question. Example:

https://chatgpt.com/share/69160c9e-b2ac-8001-ad39-966975971a...

(the 7 minutes thinking is because ChatGPT is unusually slow right now for any question)

These days I'd trust it to accurately give 100 questions only about Homer. LLMs really are quite a lot better than they used to be by a large margin if you use them right.