
What do you ask them then?


I'll respond to this bait in the hope that it clicks for someone how _not_ to use an LLM.

Asking "them"... your perspective is already warped. It's not your fault: all the text we've ever seen before is associated with a human being.

Language models are mathematical, statistical beasts. The beast generally doesn't do well with open ended questions (known as "zero-shot"). It shines when you give it something to work off of ("one-shot").

Some may quibble with how precisely I'm using "zero-shot" and "one-shot" here, but I use the terms merely to contrast open-ended questions with providing some context and work to be done.

Some examples...

- summarize the following

- given this code, break down each part

- give alternatives of this code and trade-offs

- given this error, how to fix or begin troubleshooting
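A minimal sketch of the contrast, assuming Python. The `TASKS` table and `build_prompt` helper are hypothetical names made up for illustration, and nothing here calls a real LLM API; it only assembles the prompt text so that the instruction always travels with the material to work from.

```python
# Hypothetical helper for illustration only: it builds prompt text,
# it does not talk to any model or vendor API.
TASKS = {
    "summarize": "Summarize the following text.",
    "explain_code": "Given this code, break down each part.",
    "alternatives": "Give alternatives to this code and their trade-offs.",
    "troubleshoot": "Given this error, explain how to fix it or where to begin troubleshooting.",
}


def build_prompt(task: str, material: str) -> str:
    """Pair an instruction with the material the model should work from."""
    if not material.strip():
        # With nothing to work from, this degenerates into an open-ended question.
        raise ValueError("no material supplied")
    return f"{TASKS[task]}\n\n---\n{material.strip()}\n---"


prompt = build_prompt(
    "troubleshoot",
    "TypeError: unsupported operand type(s) for +: 'int' and 'str'",
)
print(prompt)
```

The point of refusing empty material is just to make the distinction explicit: the useful requests above all come with something concrete attached.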

I mainly use them for technical things I can then verify myself.

While extremely useful, I consider them extremely dangerous. They provide a false sense of "knowing things"/"learning"/"productivity". It's too easy to begin to rely on them as a crutch.

When learning new programming languages, I go back to writing by hand and compiling in my head. I need that mechanical muscle memory, same as when learning calculus, physics, chemistry, etc.


> Language models are mathematical, statistical beasts. The beast generally doesn't do well with open ended questions (known as "zero-shot"). It shines when you give it something to work off of ("one-shot").

Asking open-ended questions is the usage that is advertised to the general public, so I think it's fair to critique the technology on that usage.


Yeah, the "you're using it wrong" argument falls flat on its face when the technology is presented as an all-in-one magic answer box. Why give these companies the benefit of the doubt instead of holding them accountable for what they claim this tech to be? https://www.youtube.com/watch?v=9bBfYX8X5aU

I like to ask these chatbots to generate 25 trivia questions and answers from "golden age" Simpsons. It fabricates complete BS for a noticeable number of them. If I can't rely on it for something as low-stakes as TV trivia, it seems absurd to rely on it for anything else.


Whenever I read something like this I definitely do think "you're using it wrong". This question would certainly have tripped up earlier models, but new ones have absolutely no issue generating these questions with sources for each one. Example:

https://chatgpt.com/share/69160c9e-b2ac-8001-ad39-966975971a...

(The 7 minutes of thinking is because ChatGPT is unusually slow right now for any question.)

These days I'd trust it to accurately give 100 questions only about Homer. LLMs really are better than they used to be by a large margin, if you use them right.


I was not trolling actually, thanks for your detailed answer. I don't use LLMs much, so I didn't know they work better the way you describe.


Fwiw, if you can use a thinking model, you can get them to do useful things, like finding specific webpages (menus, online government forms - visa applications or addresses, etc.).

The best thing about the latter is that search results for these forms carry extremely unfriendly ads for sites that might charge you 2x the actual fee, so using Google is a good way to get scammed.

If I'm walking somewhere (common in NYC) I often don't mind issuing a query ("what's the Salt & Straw menu at location X today?") and then checking back in a minute. (Or "who is playing at x concert right now?" if I overhear music. It will sometimes require extra encouragement, like "keep trying", to get the right one.)


I have a lot of fun creating stories with Gemini and Claude. It feels like what Tom Hanks's character imagined comic books could be in Big (1988).

I play once or twice a week and it's definitely worth $20/mo to me.


You either give them the option to search the web for facts or you ask them things where the utility/validity of the answer is defined by you (e.g. 'summarize the following text...') instead of the external world.



