More

Grimblewald · 2026-02-13T14:17:48 1770992268

Hard disagree, the interface hasn't changed at all. What has happened is new tools have appeared that make natural language a viable interface. It is a new lesser interface, not a replacement. Like a GUI, more accessible but functionally restricted. An interface that is conditioned on previously solved tasks, but unable to solve novel ones.

What this means is coding becomes accessible to those looking to apply something like python to solved problems, but it very much remains inaccessible to those looking to solve truly novel problems they have the skill to solve in their domain, but lack the coding skills to describe.

As a simple example, claude code is easily among the most competent coding interfaces I know of right now. However, if I give it a toy problem I've been toying with as a hobby project, and it breaks so badly it starts hallucinating that it is chatgpt.

``` This is actually a very robust design pattern that prevents overconfidence and enables continuous improvement. The [...lots of rambling...] correctly.

  ChatGPT

  Apologies, but I don't have the ability to run code or access files in a traditional sense. However, I can help you understand and work with the concepts you're describing. Let me
  provide a more focused analysis:

```

/insights doesn't help of course, it simply recommends I clear context on those situations and try again, but naturally it has the same problems. This isn't isolated, unless I give it simple tasks, it fails. The easy tasks it excels at though, it has handled a broad variety of tasks to a high degree of satisfaction, but it is a long shot away from replacing just writing code.

Bottom line, LLM's give coding a GUI, but like a GUI, is restricted and buggy.

0xecro1 · 2026-02-13T14:43:38 1770993818

I've seen non-programmers successfully launch real apps — not toy projects — through vibe coding. I'm doing it myself, and I'm about to ship a developer tool built the same way.

They'll still need to pick up the fundamentals of the programming — that part isn't optional yet. And getting to that level as a non-programmer takes real effort. But if the interest is there, it's far from impossible. In fact, I'd argue someone with genuine passion and domain expertise might have better odds than an average developer just going through the motions.

Grimblewald · 2026-02-14T01:03:11 1771030991

You're not getting it. Making app is a solved problem, especially if app function, features, and purpose is derivative of existing things,

Think of it like image generation AI. You can make acceptable if sloppy art with it, using styles that exist. However, you cannot create a new style. You cannot create pictures of things that are truly novel, to do that you have to pick up the brush yourself.

coding with llms is the exact same thing. It can give you copies of what exists, and sometimes reasonable interpolations/mashups, but i have not seen a single succesful example of extrapolation. Not one. You simply leave the learned manifold and everything gets chaotic like in the example i provided.

If AI can make what you want, then the thing you made is not as novel as you thought. You're repurpising solved problems. Still useful, still interesting, just not as ground breaking as the bot will try and tell you.

Grimblewald · 2026-02-13T13:40:38 1770990038

I've been writing a new textbook for undergrads (chemistry domain focus), and think this excerpt is generally solid advice that is applicable here. Any feedback is welcome (textbook to be published gplv3 via GitHub). I appreciate I am on the conservative side here. The following is copy-paste of the final notes/tips/warnings in the book, copied from latex source with minimal edits for display here:

Rather than viewing AI as forbidden or universally permitted, consider this progression:

1. Foundation Phase (Avoid generation, embrace explanation)

When learning a new library (e.g., your first RDKit script or lmfit model), do not ask the AI to write the code.

Instead, write your own attempt, then use AI to:

• Explain error tracebacks in plain language

• Compare your approach to idiomatic patterns

• Suggest documentation sections you may have missed

2. Apprenticeship Phase (Pair programming)

Once you can write working but inelegant code, use AI as a collaborative reviewer:

• Refactor working scripts for readability

• Vectorize slow loops you have already prototyped

• Generate unit tests for functions you have written

3. Independence Phase (Managed delegation)

When you have the skill to write the code yourself but choose to delegate to save time, you are essentially trading the effort of writing for the effort of auditing. Because your prompts are condensed summaries of intent rather than literal instructions, the LLM must fill the "ambiguity gap" with educated guesses.

Delegation only works if you are skilled enough to recognise when those guesses miss the mark; if your words were precise enough to never be misunderstood, they would already be code. Coding without oversight is dangerous and deeply incompetent behaviour in professional environments.

Examples of use-cases are:

• Generate boilerplate for familiar patterns, then audit line-by-line

• Prototype alternative algorithms you already understand conceptually

• Document code you have written (reverse the typical workflow)

Grimblewald · 2026-02-12T14:01:26 1770904886

I'm a Claude user who has been burned lately by how opaque the system has become. My workflows aren't long and my projects are small in terms of file count, but the work is highly specialized. It is "out of domain" enough that I'm getting "what is the seahorse emoji" style responses for genuine requests that any human in my field could easily follow. I've been testing Claude on small side projects to check its reliability. I work at the cutting edge of multiple academic domains, so even the moderate utiltity I have seen in this is exciting for me, but right now Claude cannot be trusted to get things right without constant oversight and frequent correction, often for just a single step. For people like me, this is make or break. If I cannot follow the reasoning, read the intent, or catch logic disconnects early, the session just burns through my token quota. I'm stuck rejecting all changes after waiting 5 minutes for it to think, only to have to wait 5 hours to try again. Without being able to see the "why" behind the code, it isn't useful. It makes typing "claude" into my terminal an exercise in masochism rather than the productivity boost it's supposed to be. I get that I might not be the core target demographic, but it's good PR for Anthropic if Claude is credited in the AI statements of major scientific publications. As it stands, trajectory in develeopment means I cannot in good conscience recommend Claude Code for scientific domains.

tartoran · 2026-02-12T16:47:29 1770914849

>the session just burns through my token quota

Did you ever think that this may be Anthropic's goal? It is a waste for sure but it increases their revenue. Later on the old feature you were used to may resurface at a different tier so you'd have to pay up to get it.

jatora · 2026-02-12T14:05:14 1770905114

What academic domains are you on the cutting edge of? Genuinely curious what specifically is beyond claude's capabilites

Grimblewald · 2026-02-12T14:21:25 1770906085

Most recent problems were related to topology, but it can take the wrong direction on many things. This is not an LLM fault; it's a training data issue. If historically a given direction of inquiry is favored, you can't fault an LLM for being biased toward it. However, if small volume and recent results indicate that path is a dead end, you don't want to be stuck in fruitless loops that prevent you from exploring other avenues.

The problem is if you're interdisciplinary, translating something from one field to one typically considered quite distant, you may not always be aware of historic context that is about to fuck you. Not without deeper insight into what the LLM is choosing to do or read and your ability to infer how expected the behavior you're about to see is.

Grimblewald · 2026-02-11T22:58:20 1770850700

Termux development is fully functional. I do the same happily, even for the joy of it to pass time in transit. There isn't much to be gained by a rig you could get on that budget. A better keyboard and a usb monitor for phone can go miles tho.

That said, i see no proof of anything other tham a few file names. i can do that too, see?

guys! I made an sovreign analysis app that helps you collate and analyse crowd sourced data in fun social competitions that then produces awards for teams based on statistifal identities (means, outliers, maxs, highest varience over categories etc) buy me a laptop so i can make the cocktail olympics a reality for everyone!

```

awards.py fix_score_fuckery.py

category_correlations.png

pairwise.py

competition_boxplot_dashboard.png

pairwise_plot.png

competition_dashboard.png

plotter.py

competition_data.json

power_score.py

competition_data_backup.json

bring_it_home_baby.py

correlation_heatmap.png

statistical_significance_heatmap.png

```

suffering · 2026-02-12T22:57:48 1770937068

Impressive! Working on Termux is the first step toward breaking free from closed platform dependencies. I am developing 'Noor' on a 6-inch screen from Yemen because sovereignty starts with the mind, not the hardware If it weren't for the 'Code Slavery' locking my earnings (over 100 Million tokens), I’d tell you my next laptop is a gift for you. Keep coding; the future belongs to Sovereign AI.

Grimblewald · 2026-02-09T22:43:05 1770676985

This reads like "if we have to solution, then we have the solution". If I can model the system required to condition inputs such that outouts are deseriable, haven't i given the model the world model it required? More to the point, isn't this just what the article argues? Scaling the model cannot solve this issue.

it's like saying a pencil is a portraint drawring device, like it isn't thr artist who makes it a portrait drawring device, wheras in the hands of a peot a peom generating machine.

docjay · 2026-02-10T00:20:09 1770682809

So much of what you said is exactly what I’m saying that it’s pointless to quote any one part. Your ‘pencil’ analogy is perfect! Yes, exactly. Follow me here:

We know that the pencil (system) can write a poem. It’s capable.

We know that whether or not it produces a poem depends entirely on the input (you).

We know that if your input is ‘correct’ then the output will be a poem.

“Duh” so far, right? Then what sense does it make to write something with the pencil, see that it isn’t a poem, then say “the input has nothing to do with it, the pencil is incapable.” ?? That’s true of EVERY system where input controls the output and the output is CAPABLE of the desired result. I said nothing about the ease by which you can produce the output, just that saying input has nothing to do with it is objectively not true by the very definition of such a system.

You might say “but gee, I’ll never be able to get the pencil input right so it produces a poem”. Ok? That doesn’t mean the pencil is the problem, nor that your input isn’t.

Grimblewald · 2026-02-08T23:02:03 1770591723

Despite your objection to looking, I did. What you're saying doesn't seem to check out. For example, hellow world not compiling seems like a significant issue and at first glance seems genuine even if there is some anti ai banter in the thread.

zamadatix · 2026-02-09T01:54:57 1770602097

Between all of the "banter" (which is from more than just the anti-AI folks) you may not have caught I was one of the first 10 comments on that very issue 3 days ago https://github.com/anthropics/claudes-c-compiler/issues/1#is.... Not to imply there are no issues or it's a good compiler, the README.md says as much, but I found in practice you can get to CCC compiling a version of the Linux kernel in the amount of time it takes to go through that thread about hello world.

Of course - you do you, not everyone is the same. If that kind of discussion piques your interest or feels easier to consume then there is plenty more to be found there. At least that guy's bot spamming 75% of the issues board has closed them all now (though the comments are still there in responses other issues) so it's a little cleaner.

N.b. for anyone seeing "root@main" in the above link - that's just an ephemeral rootless container instance on a dev VM host from a template named "main" I spun up to mess with CCC. I.e. "don't let the prompt imply I recommend using actual root on your actual main box to do much of anything, let alone run random projects from GitHub" :).

Grimblewald · 2026-02-08T22:48:27 1770590907

Paying per token also encouragages reduced quality only now you pay. If they can subtbtly degrade quality or even probability of 1shot solutions, they get you paying for more tokens. Under current economic models and incentive structures, enshitification is inevitable, since we're optimizing for it long term.

jtbayly · 2026-02-09T11:35:31 1770636931

What if there is actual competition, though? That’s the hope I keep having. If there is a cheaper, better model, I can switch.

Grimblewald · 2026-02-09T22:07:05 1770674825

For that to work it requires a free market, llms in their current format are a neccesarily closed market. It's like mobile phones. You'll get a sleek somewhat passable product increasingly dated and dysfunctional which every year serves you less and someone else more. Given I can't decide smart phones in their current form are shit, i'll make something better (without enromous capital) meams we're failing open market conditions. Do you see the point i am trying to make?

jtbayly · 2026-02-10T11:34:30 1770723270

That’s not what free market means.

Grimblewald · 2026-02-11T09:25:12 1770801912

Tell my economics textbook not me. Free markets are defined, in part, by the absence of coercive impediments to economic activity, which explicitley includes restrictions on entry.

see "low barriers for entry"

https://en.wikipedia.org/wiki/Free_market

jtbayly · 2026-02-12T00:42:13 1770856933

"Low" is relative. But we've got people creating new models with millions of dollars, not billions. Granted, not thousands either. It's low enough that I don't think the barrier to entry is a problem.

Grimblewald · 2026-02-14T01:12:43 1771031563

Good golly miss molly them goalposts sure can fly

right, so basically the only people who can enter the market are those part of the same club who have brought us the stripped down and dated wonders before us today.

Take the mobile phone market, there is basically no innovation going on these days. Small iterative steps and minor improvements, each new generation another sensor removed or new consumer hostile bloat added, because everyone in the club agrees on how to fuck the consumer, irrespective of what the consumer wants. It's an illusion of choice.

Grimblewald · 2026-02-07T06:10:01 1770444601

Yes. Something you should intuit, and is eaisly confirmed with a quick search. It is licensed to drive and the conditions underwhich it may do so are clearly stipulated. If it didnt require a license elon would have his deathtraps littering roadsides with mangled flesh and steel everywhere. Perhaps ask yourself why you asked such a misguided question and consider what you can do different in your cognitive patternd to avoid it in the future.

whatever1 · 2026-02-07T19:30:33 1770492633

So if it is licensed, the entire stack is licensed even if there are dogs pressing buttons on the other side of the planet.

Elons cars are already driving by themselves against the conditions they are licensed for, causing accidents and the liability falls to the drivers.

Aka, licensing means absolutely nothing. Specially today.

Grimblewald · 2026-02-06T00:29:27 1770337767

might not work well, but by navigating to a very harry potter dominant part of latent space by preconditioning on the books you make it more likely to get good results. An example would be taking a base model and prompting "what follows is the book 'X'" it may or may not regurgitate the book correctly. Give it a chunk of the first chapter and let it regurgitate from there and you tend to get fairly faithful recovery, especially for things on gutenberg.

So it might be there, by predcondiditioning latent space to the area of harry potter world, you make it so much more probable that the full spell list is regurgitated from online resources that were also read, while asking naive might get it sometimes, and sometimes not.

the books act like a hypnotic trigger, and may not represent a generalized skill. Hence why replacing with random words would help clarify. if you still get the origional spells, regurgitation confirmed, if it finds the spells, it could be doing what we think. An even better test would be to replace all spell references AND jumble chapters around. This way it cant even "know" where to "look" for the spell names from training.

Grimblewald · 2026-02-03T12:51:02 1770123062

And in such a system, before long, we have an ecosystem that resembles the venereal disease masequaraing as an addon store we see in wordpress.