I think most tech folks struggle with it because they treat LLMs as computer programs, and their experience is that SW should be extremely reliable - imagine using a calculator that was wrong 5% of the time - no one would accept that!
Instead, think of an LLM as the equivalent of giving a human a menial task. You know that they're not 100% reliable, and so you give them only tasks that you can quickly verify and correct.
Abstract that out a bit further, and realize that most managers don't expect their reports to be 100% reliable.
Don't use LLMs where accuracy is paramount. Use them to automate away tedious stuff. Examples for me:
Cleaning up speech recognition. I use a traditional voice recognition tool to transcribe, and then have GPT clean it up. I've tried voice recognition tools for dictation on and off for over a decade, and always gave up because even 95% accuracy is a pain to clean up. But now, I route the output to GPT automatically. It still has issues, but I now often go paragraphs before I have to correct anything. For personal notes, I mostly don't even bother checking its accuracy - I do it only when dictating things others will look at.
And then add embellishments to that. I was dictating out a recipe I needed to send to someone. I told GPT up front to write any number that appears next to an ingredient as a numeral (e.g. 3 instead of "three"). It did a great job - I didn't need to correct anything.
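The plumbing for this is tiny, by the way. Here's roughly what my cleanup step looks like - a sketch assuming the OpenAI Python SDK, with an illustrative model name (use whatever chat model you have access to):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def clean_up_dictation(raw_text: str) -> str:
    """Send raw speech-recognition output to a chat model for cleanup."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name, not a recommendation
        messages=[
            {
                "role": "system",
                "content": (
                    "Clean up this dictated text: fix obvious speech-recognition "
                    "errors, punctuation, and casing. Write any number next to an "
                    "ingredient as a numeral (3, not 'three'). Do not add, remove, "
                    "or reorder content."
                ),
            },
            {"role": "user", "content": raw_text},
        ],
    )
    return response.choices[0].message.content
```

The traditional recognizer's output goes in, cleaned-up text comes out, and I only proofread when the result is going to someone else.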
And then there are always the "I could do this myself but I didn't have time so I gave it to GPT" category. I was giving a presentation that involved graphs (nodes, edges, etc.). I was on a tight deadline and didn't want to figure out how to draw graphs. So I made a tabular representation of my graph, gave it to GPT, and asked it to write Graphviz code to make that graph. It did it perfectly (correct nodes and edges, too!)
Sure, if I had time, I'd go learn Graphviz myself. But I wouldn't have. The chances I'll need Graphviz again in the next few years are virtually zero.
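To give a sense of how trivially verifiable that output is: the tabular form is just an edge list, and the Graphviz DOT it maps to is nearly one line per edge. A made-up sketch (hypothetical node names, and GPT wrote the DOT directly from my table rather than via a script like this):

```python
# Hypothetical edge list, in the same spirit as the table I gave GPT.
edges = [
    ("ingest", "parse"),
    ("parse", "validate"),
    ("validate", "store"),
]

# The equivalent Graphviz DOT - the kind of thing I asked the model to produce.
dot_lines = ["digraph G {"]
for src, dst in edges:
    dot_lines.append(f'    "{src}" -> "{dst}";')
dot_lines.append("}")

print("\n".join(dot_lines))
# Render with: dot -Tpng graph.dot -o graph.png
```

Checking the result is just reading the edges back off, which is why handing it to an LLM felt safe.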
I've actually used LLMs to do quick reformatting of data a few times. You just have to be careful that you can verify the output quickly. If it's a long table, then don't use LLMs for this.
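For those small reformatting jobs, my "careful" usually amounts to a dumb mechanical check rather than rereading everything. Something like this (a sketch, not a real tool I ship):

```python
import re

def tokens(text: str) -> list[str]:
    # Crude tokenization - good enough for a "did anything go missing?" check.
    return sorted(re.findall(r"[\w.@-]+", text))

original = "alice, 42, alice@example.com\nbob, 17, bob@example.com"
reformatted = "| alice | 42 | alice@example.com |\n| bob | 17 | bob@example.com |"

# If the LLM dropped, duplicated, or invented a value, this fails loudly.
assert tokens(original) == tokens(reformatted), "mismatch - check by hand"
```

If the data is too big for a check like that to be meaningful, it's too big for me to trust an LLM with it anyway.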
Another example: I have a custom note-taking tool. It's just for me. For convenience, I also made an HTML export. Wouldn't it be great if it automatically made alt text for each image I have in my notes? I would just need to send each image to the LLM and get the text back. It's fractions of a cent per image! The current services are a lot more accurate at image recognition than I need them to be for this purpose!
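The whole thing is one small call per image. A sketch of what I mean, assuming the OpenAI Python SDK (the model name is illustrative, and another vision provider would look slightly different):

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def alt_text_for(image_path: str) -> str:
    """One cheap vision call per image - plenty accurate for a personal HTML export."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative vision-capable model
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Write one short sentence of alt text for this image."},
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
                ],
            }
        ],
    )
    return response.choices[0].message.content.strip()
```

The export step just loops over the images it finds and drops the returned sentence into the alt attribute.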
Oh, and then of course, having it write Bash scripts and CSS for me :-) (I'm not a frontend developer - I've learned CSS in the past, but it's quicker to verify whatever it throws at me than to Google it).
Any time you have a task and lament "Oh, this is likely easy, but I just don't have the time," consider how you could make an LLM do it.
Code is the best possible application of LLMs because you can TEST the output.
If the LLM hallucinates something, the code won't compile or run.
If the LLM makes a logic error, you'll catch it in the manual QA process.
(If you don't have good personal manual QA habits, don't try using LLMs to write your code. And maybe don't hit "accept" on other developers' code reviews either?)
> Code is the best possible application of LLMs because you can TEST the output.
This is an overly simplistic view of software development.
Poorly made abstractions and functions will have knock-on effects on future code that can be hard to predict.
Not to mention that code can have side effects that may not affect a given test case, or the code could be poorly optimized, etc.
Just because code compiles or passes a test does not mean it’s entirely correct. If it did, we wouldn’t have bugs anymore.
The usual response to this is something like “we can use the LLM to refactor LLM code if we need” but, in my experience, this leads to very complex, hard to reason about codebases.
Especially if the stack isn’t Python or JavaScript.
> Then why do people keep pushing it for code related tasks?
They don't. You are likely experiencing selection bias. My guess is you work in SW, and so it makes sense that you're the target of those campaigns. The bulk of ChatGPT subscribers are not doing SW, and no one is bugging them to use it for code related tasks.
Because there are other ways to validate the output, types being one of them, tests another. Or simply running the code. It's easy enough to validate the output given the right approach that code generated by an LLM (usually as the result of a conversation/discussion about what should be accomplished) is a net positive.
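Concretely, "validate" for me usually means a handful of asserts plus the type checker before anything gets committed. A toy example (the function and cases are hypothetical, just to show the shape of the check):

```python
# Suppose the LLM wrote this helper. Type hints let mypy/pyright catch one class
# of mistakes, and a few asserts catch the obvious logic errors before review.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

assert slugify("Hello World") == "hello-world"
assert slugify("  spaced   out  ") == "spaced-out"
print("ok")
```

It's not a proof of correctness, but it's usually enough to make the generated code a net positive instead of a liability.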
If you zero-prompt and copy-paste the first result into your codebase, yeah, the accuracy problem will rear its ugly head real quick.
A similar use case for me - I wrote some technical documentation for our wiki about a somewhat complicated relationship between ids in some database tables. I copied my text explanation into an LLM and asked it to make a diagram and it did so. Took very little time from me and it was fast/easy to verify that the quality was good.
I think there’s the added reason that a lot of folks went into tech because (consciously or unconsciously) they prefer dealing with predictable machines than with unreliable humans. And now that career choice begins to look like a bait and switch. ;)
> Instead, think of an LLM as the equivalent of giving a human a menial task. You know that they're not 100% reliable, and so you give them only tasks that you can quickly verify and correct.
The problem is: for the tasks that I can give the LLM (or a human) that I can easily verify and correct, the LLM fails at the majority of them. For example:
- programming tasks in my area of expertise (which is more "mathematical" than what is common in SV startups), where I know what a high-level solution has to look like, and where I can ask the LLM to explain the gory details to me. Yes, these gory details are subtle (which is why the task can be menial), but the code has to be right. I can verify this, and the code is not correct.
- getting literature references about more obscure scientific (in particular mathematical) topics. I can easily check whether these literature references (or summaries of these references) are hallucinations - they typically are.
LLMs on their own are effectively useless for references or citations. They need to be plugged into other systems for that - search-enabled ones like https://gemini.google.com or ChatGPT with search enabled or Perplexity can do this, although at that point they are mostly running the exact same searches you would.