
The problem is that LLM output is so incredibly confident in tone. It really sounds like you're talking to an expert who has years of experience and has done the research for you - and tech companies push this angle quite hard.

That's bad when their output can be complete garbage at times.





I think eventually humans are going to need to disregard confidence as any kind of indicator of quality, which will be very difficult. Something about us is hardwired to believe a confident delivery of words.

I wish they would. But ultimately it's built into our DNA as a social group. When uncertain, many want some sense of authority, even if said authority is completely making it up.

Exactly, you are absolutely right!

Kidding aside, it's a good rule of thumb, even with people, to distrust anyone who appears very confident, especially if they can't explain their reasoning. There are plenty of experts who will confidently tell you how it is based on three-decade-old knowledge of very questionable accuracy.


It makes me really sad how Google pushes this technology when it's simply flat-out wrong sometimes. I forget exactly what I searched for, but it was a color model that Krita supports, and I was hoping to get the online documentation as the first result. Instead, under several YouTube thumbnails, the AI overview told me that Krita doesn't support that color model and that you need a plugin for it. Right under the AI overview was the search result I was actually looking for, about that color model in Krita.

And worst of all, it's not even consistent. I tried the same searches again and couldn't get the same answer, so it just randomly decides to assert complete nonsense sometimes, while other times it gives the right answer or says something completely unrelated.

It's really been a major negative in my search experience. Every time I search for something, I can't be sure it's actually quoting anything verbatim, so I need to check the sources anyway. Except it's much harder to find the link to the source with these AIs than it is to just browse the verbatim snippets in a simple list of search results. So it's just occupying space with something that is simply less convenient.


The AI is also indiscriminate about which "sources" it chooses. Even deep research mode in Gemini.

You can go through and look at the websites it checked, and it's 80% blogspam that doesn't cite any sources of its own.

When I'm manually doing a Google search, I'm not just randomly picking the first few links; I'm deliberately filtering for credible domains or articles, not whatever random marketing blog SEO'd its way to the top (roughly the kind of allowlist filtering sketched below).

Sorry Gemini, an advertorial from the Times of India is not a reliable source for what I'm looking for. Nor is some xyz affiliate-marketing blog stuffed to the brim with ads and product placement.
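
Roughly what I mean by "filtering", as a toy Python sketch. The domains, spam markers, and result shape are made up for illustration; a human's filter is obviously fuzzier than a hardcoded allowlist, and this isn't how any search engine actually ranks things.

    from dataclasses import dataclass
    from urllib.parse import urlparse

    # Hypothetical result shape; real search APIs return richer objects.
    @dataclass
    class SearchResult:
        title: str
        url: str

    # Illustrative lists only; a reader's mental "allowlist" is informal and contextual.
    TRUSTED_DOMAINS = {"docs.python.org", "en.wikipedia.org", "developer.mozilla.org"}
    SPAM_HINTS = ("advertorial", "affiliate", "top10")

    def keep(result: SearchResult) -> bool:
        """Mimic a human skimming a results page: prefer known-good domains,
        drop anything that smells like SEO or affiliate spam."""
        host = urlparse(result.url).netloc.lower()
        if host in TRUSTED_DOMAINS:
            return True
        return not any(hint in result.url.lower() for hint in SPAM_HINTS)

    results = [
        SearchResult("Reference documentation", "https://docs.python.org/3/library/urllib.html"),
        SearchResult("10 BEST Tools (2024)", "https://gadget-advertorial.example.com/top10"),
    ]
    print([r.title for r in results if keep(r)])  # only the documentation survives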

Some of that is because that's probably 90% of the internet, but weren't these things trained on huge amounts of books and published peer-reviewed works? Where are those in the sources?


It's trained on them, yes. But is it trained to prefer them as sources when doing web search?

The distinction is rather important.

We have a lot of data that teaches LLMs useful knowledge, but data that teaches LLMs complex and useful behaviors? Far less represented in the natural datasets.

It's why we have to do SFT, RLHF, and RLVR. It's why AI contamination in real-world text datasets, counterintuitively, improves downstream AI performance.
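
For anyone who hasn't run into those acronyms, roughly what each kind of training example looks like, as a Python sketch with made-up field names (not any particular framework's actual schema):

    # Pretraining: raw text, abundant on the web. Teaches knowledge and style.
    pretraining_example = {"text": "The boiling point of water at sea level is 100 degrees Celsius..."}

    # SFT (supervised fine-tuning): prompt/response pairs written or curated by
    # people to demonstrate the behavior you want. Comparatively scarce.
    sft_example = {
        "prompt": "Summarize this article in two sentences.",
        "response": "The article argues that ... In short, ...",
    }

    # RLHF: human preference labels over candidate responses; a reward model is
    # trained on these and the LLM is then optimized against that reward model.
    rlhf_example = {
        "prompt": "Explain DNS to a beginner.",
        "chosen": "DNS maps names like example.com to IP addresses...",
        "rejected": "DNS is a database. Next question.",
    }

    # RLVR (RL with verifiable rewards): no per-example human label; a programmatic
    # check scores the model's answer (math answers, unit tests for code, etc.).
    def verifier(answer: str) -> float:
        """Toy check: full reward only for the correct arithmetic result."""
        return 1.0 if answer.strip() == "4" else 0.0

    rlvr_example = {"prompt": "What is 2 + 2?", "reward_fn": verifier}
    print(rlvr_example["reward_fn"]("4"))  # 1.0

The natural web barely contains data in the last three shapes, which is the point: that kind of behavioral data mostly has to be constructed on purpose.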


The next time you're working on your car, google bolt torque specs and cross-reference the shit their "AI" says with the factory shop manual. Hilarity ensues.


