AI Search Has a Citation Problem (cjr.org)
25 points by nobody9999 4 months ago | hide | past | favorite | 9 comments


Microsoft Copilot falls down completely when it comes to citing things: even when the answer is correct, the cited documents frequently don't say what it says they do.


If GenAI search results or summarizations can't be verified by citing their essential sources, how can they ever be deemed trustworthy and free of legal liability for fabricating falsehoods? Somehow this Achilles heel must be eliminated or they won't be usable in a great many domains where validation is required (like medical diagnosis, citing legal evidence, investment advice, etc).

In fact, it's hard to imagine all that many uses for GenAI that won't _eventually_ require some sort of calibratable measure for accuracy, and the capacity for validation thereof, before they can be widely adopted.


Even though it’s in project instructions to use verifiable sources, Claude still makes things up for me.

So after it spits something out I paste: “Are you 100% sure that every story, quote, fact, and source is accurate and verifiable?”

It then fesses up and asks if I want to rewrite with only verifiable things. I say yes, it does, and so far that seems to work!
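That two-pass flow can be sketched as a small wrapper. Everything here is hypothetical: `ask` stands in for whatever chat call you use, and the verification question is the one quoted above.

```python
VERIFY = ("Are you 100% sure that every story, quote, fact, "
          "and source is accurate and verifiable?")

def verified_answer(ask, prompt):
    """Two-pass flow: draft, challenge, then ask for a rewrite
    restricted to verifiable material. `ask(history)` is a
    hypothetical chat call taking a list of (role, text) turns
    and returning the assistant's reply as a string."""
    history = [("user", prompt)]
    history.append(("assistant", ask(history)))   # first draft
    history.append(("user", VERIFY))              # challenge it
    history.append(("assistant", ask(history)))   # model "fesses up"
    history.append(("user", "Yes, rewrite it using only "
                            "verifiable sources."))
    return ask(history)                           # final, cleaned pass
```

No guarantee the model's self-audit is accurate, of course; this just automates the manual paste-and-retry loop described above.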


The only time I've seen extensive citing from an AI is when I use Deep Research with the ChatGPT Pro plan. Otherwise, citations are few and far between. Perplexity was one of the early services I used that seemed to make an effort to get this right, but I haven't used it in quite some time.


100% my greatest issue with these tools; I really need a "source".


I was going to say Perplexity usually seems good, but reading the article it was interesting to learn they're being naughty: using forbidden sources while pretending not to.


This is a feature enabling our content-derived species to enter the age of full-spectrum, story-driven history fiction, as proposed by parkerblubtan at the third matzmum tecdot convention in 2017.


Problem with LLMs is that you'd need a bucket/hash list for each and every 8/16/32-bit token.

And each URL is quite lengthy.

And a good number of citations can be found hanging off a single token.
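If I'm reading this right, the idea is an index mapping each token to the citations that support it. A minimal sketch (all names and URLs here are mine, purely illustrative) of why that balloons:

```python
from collections import defaultdict

# Hypothetical per-token citation index: token id -> set of URLs
# claimed to support that token.
citation_index = defaultdict(set)

def attach(token_ids, url):
    """Record that `url` backs every token in this span."""
    for t in token_ids:
        citation_index[t].add(url)

attach([101, 202, 303], "https://example.com/some/quite/long/article-path")
attach([202, 303], "https://example.com/another/lengthy/source")

# Each bucket can accumulate many lengthy URLs, and an 8/16/32-bit
# token space means 256 to ~4 billion possible buckets, so the
# index grows quickly.
```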


I've caught AI outright making up sources (like, fabricating URLs and names) to satisfy me in the past.



