
You must have missed the ridiculous graphs, or the Bernoulli error, while these corpo techno fascists were buying your dinner.

https://news.ycombinator.com/item?id=44830684

https://news.ycombinator.com/item?id=44829144


The graphs had nothing to do with model hallucination; that was a crap design decision by a human being.

The Bernoulli error was a case of a model spitting out widely believed existing misinformation. That doesn't fit my mental model of a "hallucination" either - I see a hallucination as a model inventing something that's not true with no basis in information it has been exposed to before.

Here's an example of a hallucination in a demo: that time when Google Bard claimed that the James Webb Space Telescope was the first to take pictures of a planet outside Earth's solar system. That's plain not true, and I doubt they had trained on text that said it was true.


I don't care what you call each failure mode. I want something that doesn't give me incorrect outputs a third to half of the time.

Forget AI/AGI/ASI, forget "hallucinations", forget "scaling laws". Just give me software that does what it says it does, like writing code to spec.


Along those lines, I also want something that will correct me if I am wrong, the same way a human would, or even the same way Google does: typing in something wrong usually includes enough terms to get me to the right thing, though it usually takes a bit longer. I definitely don't want something that will just go along with me when I'm wrong and reinforce a misconception. When I'm wrong I want to be corrected sooner rather than later; that's the only way to be less wrong.


You might find this updated section of the Claude system prompt interesting: https://gist.github.com/simonw/49dc0123209932fdda70e0425ab01...

> Claude critically evaluates any theories, claims, and ideas presented to it rather than automatically agreeing or praising them. When presented with dubious, incorrect, ambiguous, or unverifiable theories, claims, or ideas, Claude respectfully points out flaws, factual errors, lack of evidence, or lack of clarity rather than validating them. Claude prioritizes truthfulness and accuracy over agreeability, and does not tell people that incorrect theories are true just to be polite.

No idea how well that actually works though!
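
For what it's worth, if you want to try that kind of instruction against the API yourself, here's a rough sketch using the Anthropic Python SDK, with a cut-down version of the same wording passed as a custom system prompt. The model id and exact prompt text here are my own guesses, not taken from the gist:

    # Rough sketch: a truth-over-agreeability system prompt via the
    # Anthropic Python SDK. Model id and wording are assumptions.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    SYSTEM = (
        "Critically evaluate any theories, claims, and ideas presented to you. "
        "When presented with dubious, incorrect, ambiguous, or unverifiable "
        "claims, respectfully point out flaws, factual errors, lack of "
        "evidence, or lack of clarity rather than validating them. Prioritize "
        "truthfulness and accuracy over agreeability."
    )

    message = client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumed model id
        max_tokens=512,
        system=SYSTEM,
        messages=[
            {
                "role": "user",
                "content": "The JWST took the first photo of an exoplanet, right?",
            }
        ],
    )
    print(message.content[0].text)  # ideally pushes back rather than agreeing

Whether it actually pushes back probably depends as much on the model as on the prompt.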


Shill harder, Simon!

Otherwise they may refuse to ask you back for their next PR chucklefest.


Wasn't expecting a "you're a shill" accusation to show up on a comment where I say that LLMs used to suck at spell check but now they can just about do it.


So 2 trillion dollars to do what Word could do in 1995... and trying to promote that as an advancement is not propaganda? Sure, let's double the amount of resources a couple more times; who knows what it will be able to take on after mastering spelling.

