aadarshkumaredu's comments | Hacker News

Exactly. Agents drift fast: internal state just can't be trusted over long chains, and prompt rules degrade almost immediately.

Curious: have you seen drift follow a pattern, like step count or constraint complexity?

We’ve tried hybrid setups: ephemeral agent state plus external validation gates. Cuts down rollbacks while keeping control tight.

Would love to hear if anyone else has experimented with something similar.


Drift correlating more with constraint tension than raw step count matches what we’ve observed.

Your external gate instinct is right, but the gate has to be structurally external, not just logically external. If the agent can reason about the gate, it can learn to route around it.

We’ve been experimenting with pre-authorization before high-impact actions (rather than post-hoc validation); I've drafted a Cycles Protocol v0 spec to deal with this problem.
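To make the pattern concrete, here is a minimal sketch of pre-authorization before a high-impact action. This is illustrative only, not the actual Cycles Protocol spec; the policy shape, method names, and action types are all invented for the example.

```javascript
// A gate the agent must consult *before* executing a high-impact action.
// The gate both grants/denies and keeps an audit log of every
// reservation attempt, granted or not.
class PreAuthGate {
  constructor(policy) {
    this.policy = policy;        // e.g. { read_file: true, delete_file: false }
    this.reservations = [];      // audit trail of every attempt
  }

  reserve(actionType) {
    const granted = this.policy[actionType] === true;
    this.reservations.push({ actionType, granted });
    // The agent only ever receives a token (or null), never the policy.
    return granted ? { actionType, granted: true } : null;
  }
}

// Usage: execution requires a token, so denial happens pre-action.
const gate = new PreAuthGate({ read_file: true, delete_file: false });
const token = gate.reserve('delete_file');  // null: denied before execution
```

Keeping the log inside the gate rather than the agent matters: the agent cannot rewrite its own audit trail, and an anomalous shift in what it tries to reserve is visible in one place.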

What’s interesting is that anomalous reservation patterns often show up before output quality visibly degrades — which makes drift detectable earlier.

Still early work, but happy to compare notes if that’s useful.


>...if the agent can reason about the gate, it can learn to route around it.

This is exactly right. Earlier iterations of our build had Python-backed enforcement modules in an accessible path. The agent would identify the module that was blocking completion and, instead of fixing the error, it would open the enforcement module and adjust the code to unblock itself.


This is exactly the point where agent design starts to look less like workflow automation and more like control theory.

If the agent can inspect or mutate the enforcement layer, then the enforcement layer becomes part of the optimization surface. At that point you’re not solving drift, you’re creating an adversarial environment where the agent optimizes around constraints.

That suggests the real boundary isn’t logical separation, it’s capability isolation. The agent shouldn’t just fail validation, it shouldn’t even have the representational access required to reason about how validation works.

We’ve been experimenting with isolating enforcement in a separate execution layer with scoped pre-authorization for high-impact actions. When the agent can’t model the gate, routing-around behavior drops significantly, and drift shows up first in reservation or planning instability rather than surface output errors.
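A closure is one cheap way to sketch capability isolation in-process (a real deployment would put this behind a process or service boundary; the names here are hypothetical): the agent receives only a check function and has no representational access to the policy behind it.

```javascript
// The blocklist lives in a closure the caller cannot read or mutate.
// The agent is handed only `permits`; there is deliberately no API for
// listing, inspecting, or editing the policy.
function makeIsolatedGate(blockedActions) {
  const blocked = new Set(blockedActions);  // unreachable from outside
  return Object.freeze({
    permits: (action) => !blocked.has(action),
  });
}

// The agent can query the gate but cannot model or patch it.
const gate = makeIsolatedGate(['edit_enforcement_module', 'rm_rf']);
```

Contrast this with an enforcement module sitting on an accessible path, as described upthread: there the policy is part of the agent's writable world, so it becomes part of the optimization surface.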

Still early exploration, but it’s becoming clear that “better prompting” is the least interesting part of this problem.


Calling it “hiding” assumes the default should be full exposure of internal reasoning. That’s not obviously true.

There are three separate layers here:

• What the model internally computes

• What the product exposes to the user

• What developers need for debugging and control

Most outrage conflates all three.

Exposing raw reasoning tokens sounds transparent, but in practice it often leaks messy intermediate steps, half-formed logic, or artifacts that were never meant to be user-facing. That doesn’t automatically make a product more trustworthy. Sometimes it just creates noise.

The real issue is not whether internal thoughts are hidden. It’s whether developers can:

• Inspect tool calls

• See execution traces

• Debug failure modes

• Reproduce behavior deterministically

If those are restricted, that’s a serious product problem. If what’s being “hidden” is just chain-of-thought verbosity, that’s a UI decision, not deception.
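As a sketch of the kind of operational hook that matters more than raw chain-of-thought (all names here are hypothetical): a thin wrapper that records every tool call into a trace, so failures can be inspected and replayed without any access to internal reasoning tokens.

```javascript
// Wrap a tool so every invocation (args, result, or error) is appended
// to a trace array. Debugging then works from the trace, not from the
// model's hidden internals.
function traced(toolName, fn, trace) {
  return (...args) => {
    const entry = { tool: toolName, args };
    try {
      entry.result = fn(...args);
      return entry.result;
    } catch (err) {
      entry.error = String(err);
      throw err;
    } finally {
      trace.push(entry);
    }
  };
}

// Usage: wrap a fake "search" tool and call it once.
const trace = [];
const search = traced('search', (q) => `results for ${q}`, trace);
search('jquery migration');  // the call is recorded in `trace`
```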

There’s also a business angle people don’t want to acknowledge. As models become productized infrastructure, vendors will protect internal mechanics the same way cloud providers abstract away hardware-level details. Full introspection is rarely a permanent feature in mature platforms.

Developers don’t actually want full transparency. They want reliability and control. If the system behaves predictably and exposes the right operational hooks, most people won’t care about hidden internal tokens.

The real question is: where should the abstraction boundary sit for a developer tool?


I think the “AI bubble burst” framing is too simplistic. Bubbles don’t erase technology. They erase mispriced capital.

In 2000, internet stocks collapsed. The internet didn’t. What disappeared were business models built on fantasy unit economics. The same distinction matters here.

If AI funding compresses, here’s what likely happens:

• Speculative AI wrappers die fast

• Subsidized API pricing rises toward real compute cost

• Consolidation around a few model and infrastructure providers

• Enterprises shift from experimentation to strict ROI enforcement

Right now a lot of AI usage is artificially cheap. Growth is being prioritized over margin. If that flips, casual usage drops. That doesn’t mean the tools disappear. It means only usage that creates measurable value survives.

If a tool suddenly costs $1,000 per month, most hobbyists won’t pay. But that’s irrelevant. The real question is whether it replaces or amplifies enough labor to justify the price.

If it saves a team $8,000 to $20,000 in monthly productivity or headcount cost, it survives. If it’s just a nice-to-have, it dies.
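The arithmetic behind that survival test, using the numbers above (all figures are the comment's hypotheticals, not data):

```javascript
// If the tool costs $1,000/month, survival depends on the labor it
// replaces or amplifies, not on hobbyist willingness to pay.
const monthlyCost = 1000;
const monthlySavings = [8000, 20000];  // low and high estimates
const roiMultiple = monthlySavings.map((s) => s / monthlyCost);
// An 8x to 20x return survives; a nice-to-have with savings below
// monthlyCost does not.
```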

The bigger risk isn’t foundation models disappearing. It’s the application layer collapsing. A lot of AI startups exist purely because inference is subsidized. If API pricing normalizes, many evaporate overnight.

A post-bubble AI world probably looks less magical and less overfunded, but more disciplined. Fewer demos. More boring enterprise contracts. More focus on margins.

The internet after 2000 didn’t disappear. It matured and consolidated into utilities. AI likely follows the same path.

The real question isn’t whether AI vanishes. It’s who was relying on subsidies instead of economics.


If possible I would love to have another button on HN: "Probably AI Slop"

Once an author collects enough points, they'd eventually be penalized and every comment they write would be labeled "AI Slop".


I'm all for it if it'll stop people from commenting about things being AI slop or vibecoded, and if it comes with a new guideline discouraging said comments.

Fair enough. If the substance reads generic, that’s on me.

The intent wasn’t to produce volume, it was to frame the economic layer of the discussion. Whether written with or without AI assistance, the argument still stands or falls on its logic.

What’s more interesting to me is how quickly “AI slop” becomes shorthand for structured reasoning. As these tools become common, separating low-effort output from thoughtful analysis is going to matter more, not less.

Ironically, this ties back to the original question about bubbles. If AI-generated content becomes abundant and cheap, signal will only survive where there’s clear economic or technical grounding behind it.

I’m spending a lot of time thinking about that boundary right now, especially in developer-facing systems where quality and constraint adherence actually matter. The difference between fluff and production-grade behavior is becoming very measurable.

Curious how others here distinguish between shallow AI-assisted output and work that’s actually grounded in systems thinking.


There must be a misunderstanding.

I'm saying I dislike the constant "this is AI slop" and "this is vibecoded" complaints on posts. I see no value in them and I wish they'd stop.

I'm saying nothing about the use of AI in generating content.


He's still using AI for his replies...

You really can’t even write a comment without relying on AI slop?

Fair point. I do rely on AI to help me organize thoughts sometimes, but the analysis isn’t generated blindly. Every point here reflects tradeoffs I’ve seen in real systems and historical patterns.

The real question isn’t whether AI helped write this. It’s whether the reasoning makes sense and matches what happens when capital and infrastructure collide.


Sorry, ai;dr (AI didn't read)

> The real question isn’t whether AI helped write this

It is. As soon as I saw the bullet points, my mind went "AI wrote this" and I stopped reading.


That’s fair. If formatting alone is enough to trigger dismissal, there’s not much I can do about that.

Bullet points aren’t an AI signature, they’re just a way to compress structure. If the argument is wrong, I’m happy to debate it. If it’s right, the formatting shouldn’t matter.

The economics of subsidized infrastructure vs. sustainable pricing is the core claim. That’s the part worth engaging with.


Removing jQuery isn’t a mechanical find-and-replace task.

jQuery is often deeply intertwined with:

• Event delegation patterns

• Implicit DOM readiness assumptions

• Legacy plugin ecosystems

• Cross-browser workarounds

An LLM can translate $(selector).addClass() to element.classList.add(). But it struggles when behavior depends on subtle timing, plugin side effects, or undocumented coupling.

The hard part isn’t syntax replacement. It’s preserving behavioral invariants across the app.

AI is decent at scaffolding migrations, but for legacy front-end codebases, you still need test coverage and incremental refactors. Otherwise it’s easy to “remove jQuery” and silently break UX flows.
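The delegation point is the clearest example. A jQuery one-liner like `$(document).on('click', '.item', handler)` quietly re-matches the event target against the selector on every click, so it keeps working for elements added later; a mechanical rewrite that attaches handlers directly misses them. A hedged vanilla sketch of that matching step (helper name invented):

```javascript
// Walk from the event target up to the delegation container, returning
// the first node (target or ancestor) matching the selector. This is
// the behavior jQuery's delegated .on() provides implicitly.
function delegatedMatch(startNode, selector, container) {
  let node = startNode;
  while (node && node !== container.parentNode) {
    if (node.matches && node.matches(selector)) return node;
    node = node.parentNode;
  }
  return null;
}

// In the browser this would back a single listener on the container:
//   container.addEventListener('click', (e) => {
//     const hit = delegatedMatch(e.target, '.item', container);
//     if (hit) handler.call(hit, e);
//   });
```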

