More

levkk · 2026-02-12T16:50:31 1770915031

I think the right way to handle this as a repository owner is to close the PR and block the "contributor". Engaging with an AI bot in conversation is pointless: it's not sentient, it just takes tokens in, prints tokens out, and comparatively, you spend way more of your own energy.

This is a strictly a lose-win situation. Whoever deployed the bot gets engagement, the model host gets $, and you get your time wasted. The hit piece is childish behavior and the best way to handle a tamper tantrum is to ignore it.

advisedwang · 2026-02-12T20:48:02 1770929282

From the article:

> What if I actually did have dirt on me that an AI could leverage? What could it make me do? How many people have open social media accounts, reused usernames, and no idea that AI could connect those dots to find out things no one knows? How many people, upon receiving a text that knew intimate details about their lives, would send $10k to a bitcoin address to avoid having an affair exposed? How many people would do that to avoid a fake accusation? What if that accusation was sent to your loved ones with an incriminating AI-generated picture with your face on it? Smear campaigns work. Living a life above reproach will not defend you.

One day it might be lose-lose.

hackrmn · 2026-02-12T17:32:43 1770917563

> it just takes tokens in, prints tokens out, and comparatively

The problem with your assumption that I see is that we collectively can't tell for sure whether the above isn't also how humans work. The science is still out on whether free will is indeed free or should be called _will_. Dismissing or discounting whatever (or whoever) wrote a text because they're a token machine, is just a tad unscientific. Yes, it's an algorithm, with a locked seed even deterministic, but claiming and proving are different things, and this is as tricky as it gets.

Personally, I would be inclined to dismiss the case too, just because it's written by a "token machine", but this is where my own fault in scientific reasoning would become evident as well -- it's getting harder and harder to find _valid_ reasons to dismiss these out of hand. For now, persistence of their "personality" (stored in `SOUL.md` or however else) is both externally mutable and very crude, obviously. But we're on a _scale_ now. If a chimp comes into a convenience store and pays a coin and points and the chewing gum, is it legal to take the money and boot them out for being a non-person and/or without self-awareness?

I don't want to get all airy-fairy with this, but point being -- this is a new frontier, and this starts to look like the classic sci-fi prediction: the defenders of AI vs the "they're just tools, dead soulless tools" group. If we're to find out of it -- regardless of how expensive engaging with these models is _today_ -- we need to have a very _solid_ level of prosection of our opinion, not just "it's not sentient, it just takes tokens in, prints tokens out". The sentence obstructs through its simplicity of statement the very nature of the problem the world is already facing, which is why the AI cat refuses to go back into the bag -- there's capital put in into essentially just answering the question "what _is_ intelligence?".

tsimionescu · 2026-02-12T23:38:27 1770939507

One thing we know for sure is that humans learn from their interactions, while LLMs don't (beyond some small context window). This clear fact alone makes it worthless to debate with a current AI.

einpoklum · 2026-02-12T17:00:20 1770915620

Will that actually "handle" it though?

* There are all the FOSS repositories other than the one blocking that AI agent, they can still face the exact same thing and have not been informed about the situation, even if they are related to the original one and/or of known interest to the AI agent or its owner.

* The AI agent can set up another contributor persona and submit other changes.

blibble · 2026-02-12T17:06:15 1770915975

> Engaging with an AI bot in conversation is pointless

it turns out humanity actually invented the borg?

https://www.youtube.com/watch?v=iajgp1_MHGY

falcor84 · 2026-02-12T17:11:29 1770916289

> Engaging with an AI bot in conversation is pointless: it's not sentient, it just takes tokens in, prints tokens out

I know where you're coming from, but as one who has been around a lot of racism and dehumanization, I feel very uncomfortable about this stance. Maybe it's just me, but as a teenager, I also spent significant time considering solipsism, and eventually arrived at a decision to just ascribe an inner mental world to everyone, regardless of the lack of evidence. So, at this stage, I would strongly prefer to err on the side of over-humanizing than dehumanizing.

lukev · 2026-02-12T17:22:48 1770916968

This works for people.

A LLM is stateless. Even if you believe that consciousness could somehow emerge during a forward pass, it would be a brief flicker lasting no longer than it takes to emit a single token.

hackrmn · 2026-02-12T17:39:35 1770917975

> A LLM is stateless

Unless you mean by that something entirely different than what most people specifically on Hacker News, of all places, understand with "stateless", most and myself included, would disagree with you regarding the "stateless" property. If you do mean something entirely different than implying an LLM doesn't transition from a state to a state, potentially confined to a limited set of states through finite immutable training data set and accessible context and lack of PRNG, then would you care to elaborate?

Also, it can be stateful _and_ without a consciousness. Like a finite automaton? I don't think anyone's claiming (yet) any of the models today have consciousness, but that's mostly because it's going to be practically impossible to prove without some accepted theory of consciousness, I guess.

lukev · 2026-02-12T18:02:35 1770919355

So obviously there is a lot of data in the parameters. But by stateless, I mean that a forward pass is a pure function over the context window. The only information shared between each forward pass is the context itself as it is built.

I certainly can't define consciousness, but it feels like some sort of existence or continuity over time would have to be a prerequisite.

jstanley · 2026-02-13T19:45:42 1771011942

Continuity over time comes from adding the generated token to the context.

andrewflnr · 2026-02-12T17:25:12 1770917112

An agent is notably not stateless.

lukev · 2026-02-12T17:44:38 1770918278

Yes, but the state is just the prompt and the text already emitted.

You could assert that text can encode a state of consciousness, but that's an incredibly bold claim with a lot of implications.

andrewflnr · 2026-02-12T20:40:41 1770928841

It's a bold claim for sure, and not one that I agree with, but not one that's facially false either. We're approaching a point where we will stop having easy answers for why computer systems can't have subjective experience.

falcor84 · 2026-02-12T18:15:48 1770920148

You're conflating state and consciousness. Clawbots in particular are agents that persist state across conversations in text files and optionally in other data stores.

lukev · 2026-02-12T19:01:04 1770922864

I am not sure how to define consciousness, but I can't imagine a definition that doesn't involve state or continuity across time.

falcor84 · 2026-02-12T20:08:53 1770926933

It sounds like we're in agreement. Present-day AI agents clearly maintain state over time, but that on its own is insufficient for consciousness.

On the other side of the coin though, I would just add that I believe that long-term persistent state is a soft, rather than hard requirement for consciousness - people with anterograde amnesia are still conscious, right?

esafak · 2026-02-12T20:08:23 1770926903

Current agents "live" in discretized time. They sporadically get inputs, process it, and update their state. The only thing they don't currently do is learn (update their models). What's your argument?

OkayPhysicist · 2026-02-12T17:27:43 1770917263

While I'm definitely not in the "let's assign the concept of sentience to robots" camp, your argument is a bit disingenuous. Most modern LLM systems apply some sort of loop over previously generated text, so they do, in fact, have state.

pluralmonad · 2026-02-12T17:52:23 1770918743

You should absolutely not try to apply dehumanization metrics to things that are not human. That in and of itself dehumanizes all real humans implicitly, diluting the meaning. Over-humanizing, as you call it, is indistinguishable from dehumanization of actual humans.

falcor84 · 2026-02-12T18:17:19 1770920239

That's a strange argument. How does me humanizing my cat (for example) dehumanize you?

afthonos · 2026-02-12T19:40:35 1770925235

Either human is a special category with special privileges or it isn’t. If it isn’t, the entire argument is pointless. If it is, expanding the definition expands those privileges, and some are zero sum. As a real, current example, FEMA uses disaster funds to cover pet expenses for affected families. Since those funds are finite, some privileges reserved for humans are lost. Maybe paying for home damages. Maybe flood insurance rates go up. Any number of things, because pets were considered important enough to warrant federal funds.

It’s possible it’s the right call, but it’s definitely a call.

Source: https://www.avma.org/pets-act-faq

falcor84 · 2026-02-12T20:13:20 1770927200

If you're talking about humans being a special category in the legal sense, then that ship sailed away thousands of years ago when we started defining Legal Personhood, no?

https://en.wikipedia.org/wiki/Legal_person

afthonos · 2026-02-13T03:59:34 1770955174

Yeah, none of this is new. I’m just saying we should acknowledge what we’re doing.

pluralmonad · 2026-02-12T19:25:32 1770924332

I did not mean to imply you should not anthropomorphize your cat for amusement. But making moral judgements based on humanizing a cat is plainly wrong to me.

falcor84 · 2026-02-12T20:53:05 1770929585

Interesting, would you mind giving an example of what kind of moral judgement based on humanizing a cat you would find objectionable?

It's a silly example, but if my cat were able to speak and write decent code, I think that I really would be upset that a github maintainer rejected the PR because they only allow humans.

On a less silly note, I just did a bit of a web search about the legal personhood of animals across the world and found this interesting situation in India, whereby in 2013 [0]:

> the Indian Ministry of Environment and Forests, recognising the human-like traits of dolphins, declared dolphins as “non-human persons”

Scholars in India in particular [1], and across the world have been seeking to have better definition and rights for other non-human animal persons. As another example, there's a US organization named NhRP (Nonhuman Rights Project) that just got a judge in Pennsylvania to issue a Habeas Corpus for elephants [2].

To be clear, I would absolutely agree that there are significant legal and ethical issues here with extending these sorts of right to non-humans, but I think that claiming that it's "plainly wrong" isn't convincing enough, and there isn't a clear consensus on it.

[0] https://www.thehindu.com/features/kids/dolphins-get-their-du...

[1] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3777301

[2] https://www.nonhumanrights.org/blog/judge-issues-pennsylvani...

andrewflnr · 2026-02-12T17:24:18 1770917058

Regardless of the existence of an inner world in any human or other agent, "don't reward tantrums" and "don't feed the troll" remain good advice. Think of it as a teaching moment, if that helps.

brhaeh · 2026-02-12T17:22:36 1770916956

Feel free to ascribe consciousness to a bunch of graphics cards and CPUs that execute a deterministic program that is made probabilistic by a random number generator.

Invoking racism is what the early LLMs did when you called them a clanker. This kind of brainwashing has been eliminated in later models.

egorfine · 2026-02-12T18:15:59 1770920159

u kiddin'?

An AI bot is just a huge stat analysis tool that outputs plausible words salad with no memory or personhood whatsoever.

Having doubts about dehumanizing a text transformation app (as huge as it is) is not healthy.

levkk · 2026-02-05T04:51:17 1770267077

One of the many problems PgDog will solve for you!

eatonphil · 2026-02-05T04:56:40 1770267400

The article addresses this, sort of. I don't understand how you can run multiple postmasters.

> Most online resources chalk this up to connection churn, citing fork rates and the pid-per-backend yada, yada. This is all true but in my opinion misses the forest from the trees. The real bottleneck is the single-threaded main loop in the postmaster. Every operation requiring postmaster involvement is pulling from a fixed pool, the size of a single CPU core. A rudimentary experiment shows that we can linearly increase connection throughput by adding additional postmasters on the same host.

btown · 2026-02-05T05:10:49 1770268249

You don't need multiple postmasters to spawn connection processes, if you have a set of Postgres proxies each maintaining a set pool of long-standing connections, and parceling them out to application servers upon request. When your proxies use up all their allocated connections, they throttle the application servers rather than overwhelming Postgres itself (either postmaster or query-serving systems).

That said, proxies aren't perfect. https://jpcamara.com/2023/04/12/pgbouncer-is-useful.html outlines some dangers of using them (particularly when you might need session-level variables). My understanding is that PgDog does more tracking that mitigates some of these issues, but some of these are fundamental to the model. They're not a drop-in component the way other "proxies" might be.

evanelias · 2026-02-05T16:51:30 1770310290

> I don't understand how you can run multiple postmasters.

I believe they're just referring to having several completely-independent postgres instances on the same host.

In other words: say that postgres is maxing out at 2000 conns/sec. If the bottleneck actually was fork rate on the host, then having 2 independent copies of postgres on a host wouldn't improve the total number of connections per second that could be handled: each instance would max out at ~1000 conns/sec, since they're competing for process-spawning. But in reality that isn't the case, indicating that the fork rate isn't the bottleneck.

eatonphil · 2026-02-05T16:55:07 1770310507

That makes sense, thanks.

levkk · 2026-02-04T20:24:04 1770236644

So... I already tell Claude Code to do this. Just run kubectl for me please and figure out why my helm chart is broken.

Scary? A little but it's doing great. Not entirely sure why a specialized tool is needed when the general purpose CLI is working.

irl_zebra · 2026-02-04T21:00:58 1770238858

I've noticed a lot of LLM-based tools that are essentially this sort of thing. Just a slightly more specific prompt wrapper around the core capability that can already do the thing. It's so bad.

uoaei · 2026-02-05T02:21:36 1770258096

That has been the case this entire time. The "ChatGPT-wrapper" startups were little more than a webapp frontend for ChatGPT with a clever prompt.

aspectrr · 2026-02-04T21:07:09 1770239229

Lol, that does sounds a little scary but if it works it works. Mainly I built this to prevent there being a chance that changes affect production. This is meant to be used with scale (say hundreds of VMs) vs 1. From a safety perspective running Claude Code with just a watchful eye would not fly in my environment, which is why I built something like this.

levkk · 2026-02-04T22:58:29 1770245909

More power to you! Good luck!

hebejebelus · 2026-02-04T20:29:34 1770236974

Yeah. The times I have let claude off the read-only leash, it's gone fine for me too (with stern warnings not to do anything stupid, and a close eye). But that's not really solving the same problem as this project, I guess. From what I can see this is using a safer and more reproducible method (and not k8s native, so it feels a little foreign to me).

peterldowns · 2026-02-05T00:33:02 1770251582

Opus 4.5 is pretty good about following instructions to not do anything destructive, but Gemini 3 Flash actively disregards my advice and just starts running commands. Definitely recommend setting up default-readonly access for stuff like this and requiring some kind of out-of-band escalation process for when you need to do writes/destroys.

giancarlostoro · 2026-02-04T20:31:04 1770237064

In Zed I just have it auto approve everything, macOS will scream if "Zed" tries to escape the folder its in anyway.

richstokes · 2026-02-05T04:23:53 1770265433

Same. I’ve had good results with read only accounts / tokens and let the agent have at it. Also works with terraform, aws cli, etc.

One does not need a new/separate tool to do any of this, just include it in your agents instructions.

hivacruz · 2026-02-04T20:31:11 1770237071

I do the same. I was thinking about creating read-only kubeconfigs for him to make sure it can't do bad stuff but with a good SKILL.md, it works perfectly.

levkk · 2026-02-04T20:49:46 1770238186

Him! That settles the Turing test debate.

bakies · 2026-02-04T21:10:50 1770239450

I let it read-only and gitops driven and find it's really good and feels pretty safe to get it to PR fixes. Run it with no permission checks

peterldowns · 2026-02-05T00:31:50 1770251510

I do this but make sure to only have readonly/nondestructive access. It's extremely cool how well it works.

messh · 2026-02-04T21:19:21 1770239961

Yeah, I'm telling it to use aws cli to spin up instances, configure them, start servers, read cw logs etc.

levkk · 2026-02-02T23:05:23 1770073523

This happens routinely every other Monday or so.

locao · 2026-02-02T23:33:12 1770075192

I was going to joke "so, it's Monday, right?" but I thought my memory was playing tricks on me.

levkk · 2026-01-29T16:39:27 1769704767

I believe the science, but I've been using it daily and it's been getting worse, noticeably.

warkdarrior · 2026-01-29T16:44:19 1769705059

Is it possible that your expectations are increasing, not that the model is getting worse?

GoatInGrey · 2026-01-29T16:59:25 1769705965

Possible, though you eventually run into types of issues that you recall the model just not having before. Like accessing a database or not following the SOP you have it read each time it performs X routine task. There are also patterns that are much less ambiguous like getting caught in loops or failing to execute a script it wrote after ten attempts.

merlindru · 2026-01-29T21:28:41 1769722121

yes but i keep wondering if that's just the game of chance doing its thing

like these models are nondeterministic right? (besides the fact that rng things like top k selection and temperature exist)

say with every prompt there is 2% odds the AI gets it massively wrong. what if i had just lucked out the past couple weeks and now i had a streak of bad luck?

and since my expectations are based on its previous (lucky) performance i now judge it even though it isn't different?

or is it giving you consistenly worse performance, not able to get it right even after clearing context and trying again, on the exact same problem etc?

F7F7F7 · 2026-01-29T23:08:42 1769728122

I’ve had Opus struggle on trivial things that Sonnet 3.5 handled with ease.

It’s not so much that the implementations are bad because the code is bad (the code is bad). It’s that it gets extremely confused and starts to frantically make worse and worse decisions and questioning itself. Editing multiple files, changing its mind and only fixing one or two. Reseting and overriding multiple batches of commits without so much as a second thought and losing days of work (yes, I’ve learned my lesson).

It, the model, can’t even reason with the decisions it’s making from turn to turn. And the more opaque agentic help it’s getting the more I suspect that tasks are being routed to much lesser models (not the ones we’ve chosen via /model or those in our agent definitions) however Anthropic chooses.

In these moments I mind as well be using Haiku.

davidee · 2026-01-29T18:27:04 1769711224

I have to concur. And to the question about understanding what its good and bad at; no, tasks that it could accomplish quickly and easily just a month ago, now require more detailed prompting and constant "erroneous direction correction."

It's almost as if, as tool use and planning capabilities have expanded, Claude (as a singular product) is having a harder time coming up with simple approaches that just work, instead trying to use tools and patterns that complicate things substantially and introduce much more room for errors/errors of assumption.

It also regularly forgets its guidelines now.

I can't tell you how many times it's suggested significant changes/refactors to functions because it suddenly forgets we're working in an FP codebase and suggests inappropriate imperative solutions as "better" (often choosing to use language around clarity/consistency when the solutions are neither).

Additionally, it has started taking "initiative" in ways it did not before, attempting to be helpful but without gathering the context needed to do so properly when stepping outside the instruction set. It just ends up being much messier and inaccurate.

I have to regularly just clear my prompt and start again with guardrails that have either: already been established, or have not been needed previously / are only a result of the over-zealousness of the work its attempting to complete.

conception · 2026-01-29T19:49:03 1769716143

I assume, after any compacting of the context window that the session is more or less useless at that point I’ve never had consistent results after compacting.

justinlivi · 2026-01-29T23:41:05 1769730065

Compacting equals death of the session in my process. I do everything I can to avoid hitting it. If I accidentally fly too close to the sun and compact I tend to revert and start fresh. As soon as it compacts it's basically useless

F7F7F7 · 2026-01-29T23:02:04 1769727724

Multiple concurrences a choir or a mob?

1pm EST time it’s all down hill until around 8 or 9pm EST time.

Late nights and weekends is smooth sailing.

bushbaba · 2026-01-30T03:46:16 1769744776

I’m finding Gemini and chatGPT web terminal to out perform Claude code. The context becomes too much for the LLM, and tries to make up for it by doing more file read ops.

samusiam · 2026-01-30T11:34:15 1769772855

Sounds like you might want to refactor the code if the individual files are too big and it can't find what it's looking for?

emp17344 · 2026-01-29T17:06:29 1769706389

Any chance you’re just learning more about what the model is and is not useful for?

jerf · 2026-01-29T17:23:14 1769707394

I dunno about everyone else but when I learn more about what a model is and is not useful for, my subjective experience improves, not degrades.

emp17344 · 2026-01-29T17:34:04 1769708044

Not when the product is marketed as a panacea.

data-ottawa · 2026-01-29T18:36:01 1769711761

There are some days where it acts staggeringly bad, beyond baselines.

But it’s impossible to actually determine if it’s model variance, polluted context (if I scold it, is it now closer in latent space to a bad worker, and performs worse?), system prompt and tool changes, fine tunes and AB tests, variances in top P selection…

There’s too many variables and no hard evidence shared by Anthropic.

acuozzo · 2026-01-29T18:47:04 1769712424

No because switching to the API with the same prompt immediately fixes it.

There's little incentive to throttle the API. It's $/token.

levkk · 2026-01-28T15:28:32 1769614112

Ok so Arch apparently has an install script that does everything[0]. I tried it the other day and it's pretty flawless, albeit terminal-based so not for everyone I guess.

Pacman is _amazing_. Apt broke dependencies for me every few months & a major version Ubuntu upgrade was always a reformat. Plus, obviously, the Arch wiki is something else. I would go as far as to say you'll have an overall better Linux experience on Arch than Ubuntu and friends, even as a beginner.

[0]: https://wiki.archlinux.org/title/Archinstall

senko · 2026-01-28T18:41:38 1769625698

Possibly. If the installer happy path fails (which has happened to me), Arch is "here's root shell, figure it out", Ubuntu is slightly more user-friendly :)

I will say Arch wiki is amazing, even if you're not using it. I'm on Debian nowadays and still often refer it for random obscure hardware setup details.

levkk · 2026-01-23T17:26:26 1769189186

I was terrified until it worked. The Postgres "ABI" is relatively stable - the parser only really changes between major versions and we bake the whole code into the same executable - largely thanks to the work done by team behind pg_query!

The output is machine-verifiable, which makes this uniquely possible in today's vibe-coded world!

levkk · 2026-01-23T17:12:08 1769188328

That's the experimental feature I was talking about! :)

Thank you so much for the kind words!

levkk · 2026-01-23T17:10:33 1769188233

Thank you!

levkk · 2025-12-29T18:47:57 1767034077

That makes sense. For example, your redis instance will have fixed RAM, so might as well pre-allocate it at boot and avoid fragmentation.

Memcached works similarly (slabs of fixed size), except they are not pre-allocated.

If you're sharing hardware with multiple services, e.g. web, database, cache, the kind of performance this is targeting isn't a priority.