> Google search absolutely does hallucinate completely fictitious results No, it...

throwaway09223 · on Dec 4, 2022

> "it is not Google hallucating it. It really exists on the internet,"

The exact same thing is true for chatGPT, or any other computer system. It is providing information and associations based on the input dataset.

> "GPT is doing it."

And google is "doing it," when google decides there is an association between my query and a bad response. Both systems are analyzing a corpus, drawing associations, and returning parts of that corpus. The output is deterministic based on the input.

The types of associations differ in their depth, but there is no fundamental difference in terms of agency or outcome.

lossolo · on Dec 4, 2022

> The exact same thing is true for chatGPT, or any other computer system. It is providing information and associations based on the input dataset.

No, it's not, for example if you ask google to show you papers about some topic with words in quotes you think you remember from the paper it will show you the proper link IF it exists and language model will just generate you a result that doesn't exist.

If I search something on Google that doesn't exist or that it have no answer I can see looking at list of search results that probably either what I look for doesn't exist or my assumption is false but language model will generate you plausible explanation/answer that can be 100% false and it doesn't know or understand that it's false and you will have no way to know if it's false or true and no point of reference because ALL the results you will receive could be hallucinated.

throwaway09223 · on Dec 4, 2022

> "if you ask google to show you papers about some topic with words in quotes you think you remember from the paper it will show you the proper link IF it exists and language model will just generate you a result that doesn't exist."

Not really.

What will actually happen is Google may link me the actual research paper, along with thousands of other associated pages which may or may not have all kinds of fake, false information. For example, if I search for "study essential oils treat cancer" I get an estimated 190 millon matching documents. A huge percentage of these have false and misleading information about using oils to treat cancer.

> "language model will generate you plausible explanation/answer that can be 100% false and it doesn't know or understand that it's false and you will have no way to know if it's false or true and no point of reference because ALL the results you will receive could be hallucinated"

With google it is third parties "hallucinating" the wrong answers (or worse: intentionally answering wrong in order to exploit and profit). The overall dynamic is not different. Google is providing these wrong answers, written by others.

The overall dynamic of providing information of questionable veracity is generally the same - because the question of who creates the associations and incorrect content is not particularly germane.

lossolo · on Dec 4, 2022

You are mistaking again model with dataset. To be on the same knowledge depth they both have to use the same dataset, the difference is that in search engine you have points of references, rankings and reputation of sites, human discussion and the most important source of the answer, so a lot of signals on which you can also rank the answer yourself. In language model you have none of that, ZERO signals and not only it can return answer that was NOT in the dataset but it can make it plausible looking.

throwaway09223 · on Dec 5, 2022

> "You are mistaking again model with dataset. "

No, what I'm doing is considering the web content (incentivized by the ad systems Google provides) with the web search systems.

> "and not only it can return answer that was NOT in the dataset but it can make it plausible looking."

This is exactly what SEO spammers do. It's the same.

Pxtl · on Dec 4, 2022

This is a distinction without a difference. Both Google and the AI can give you clever but fake results. So at this point the question is which produces fewer fakes?