The thing about AI is that it doesn't work, you can't build on top of it, and it won't get better.

It doesn't work: even for the tiny slice of human work that is so well defined and easily assessed that it gets sent out to freelancers on sites like Fiverr, AI mostly can't do it. We've had years to try this now; the lack of any compelling AI work is proof that it can't be done with current technology.

You can't build on top of it: unlike foundational technologies like the internet, AI can only be used to build one product, a chatbot. The output of an AI is natural language, and it's not reliable. How are you going to meaningfully process that output? The only computer system that can process natural language is another AI, so all you can do is feed one AI into the next. And how do you assess accuracy? Again, your only tool is an AI, so your only option is to ask AI 2 whether AI 1 is hallucinating, and AI 2 will happily hallucinate its own answer. It's like The Cat in the Hat Comes Back: Cat E trying to clean up the mess Cat D made trying to clean up the mess Cat C made, and so on.

And it won't get any better. LLMs can't meaningfully assess their training data; they are statistical constructions. We've already squeezed about all we can from the training corpora we have, and more GPUs and parameters won't make a meaningful difference. We've succeeded at creating a near-perfect statistical model of Wikipedia and Reddit and so on; it's just not very useful, even if it is endlessly amusing for some people.



> [LLMs] won't get any better.

Can you pinpoint the date on which LLMs stagnated?

More broadly, it appears to me that LLMs have improved up to and including this year.

If you consider LLMs not to have improved in the last year, I can see your point. But then one must consider GPT-4.5, Claude 3.5, DeepSeek, and Gemini 2.5 not to be improvements.


September 2024 was when OpenAI announced its new model: not a plain LLM but a "chain of thought" model built on LLMs. This represented a turn away from the "scale is all you need to reach AGI" idea by its top proponent.


If September 2024 is when, in your mind, stagnation became obvious, then surely the last improvement must have come before that?

Whatever the case, there are open platforms that let users compare two anonymous LLMs side by side and rank the models based on the resulting votes [1].

What I observe when I look at these rankings is that none of the top-ranked models date from before your stagnation cutoff of September 2024 [2].
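For context on how arena-style leaderboards turn anonymous pairwise votes into a ranking: the paper at [1] fits a Bradley-Terry model, but the idea can be sketched with a simpler Elo-style update. This is a toy illustration with made-up model names and a fabricated battle log, not the arena's actual implementation:

```python
# Toy Elo-style ranking from anonymous pairwise votes (illustrative only;
# the real arena fits a Bradley-Terry model over all battles at once).

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift ratings toward the observed outcome of one vote."""
    e_w = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * (1.0 - e_w)
    ratings[loser] -= k * (1.0 - e_w)

# Hypothetical battle log: (winner, loser) pairs from user votes.
battles = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
ratings = {"model_a": 1000.0, "model_b": 1000.0, "model_c": 1000.0}
for w, l in battles:
    update(ratings, w, l)

leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)  # model_a ranks first in this toy log
```

The point being: the leaderboard at [2] is built from human preferences, not from one AI grading another.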

[1] https://arxiv.org/abs/2403.04132

[2] https://lmarena.ai/



