More

alexbecker · 2025-11-09T22:42:31 1762728151

I'm working on _prompt injection_, the problem where LLMs can't reliably distinguish between the user's instructions and untrusted content like web search results.

Just published a blog post a few minutes ago: https://alexcbecker.net/blog/prompt-injection-benchmark.html

sbinnee · 2025-11-10T01:12:05 1762737125

Good post. Thanks for sharing. I enjoyed it as much as I enjoyed your anime list. I agree on many.

alexbecker · 2025-08-24T16:32:45 1756053165

I doubt Comet was using any protections beyond some tuned instructions, but one thing I learned at USENIX Security a couple weeks ago is that nobody has any idea how to deal with prompt injection in a multi-turn/agentic setting.

hoppp · 2025-08-24T16:47:42 1756054062

Maybe treat prompts like it was SQL strings, they need to be sanitized and preferably never exposed to external dynamic user input

Terr_ · 2025-08-24T17:22:34 1756056154

The LLM is basically an iterative function going guess_next_text(entire_document). There is no algorithm-level distinction at all between "system prompt" or "user prompt" or user input... or even between its own prior output. Everything is concatenated into one big equally-untrustworthy stream.

I suspect a lot of techies operate with a subconscious good-faith assumption: "That can't be how X works, nobody would ever built it that way, that would be insecure and naive and error-prone, surely those bajillions of dollars went into a much better architecture."

Alas, when it comes to day's the AI craze, the answer is typically: "Nope, the situation really is that dumb."

__________

P.S.: I would also like to emphasize that even if we somehow color-coded or delineated all text based on origin, that's nowhere close to securing the system. An attacker doesn't need to type $EVIL themselves, they just need to trick the generator into mentioning $EVIL.

alexbecker · 2025-08-24T19:30:02 1756063802

There have been attempts like https://arxiv.org/pdf/2410.09102 to do this kind of color-coding but none of them work in a multi-turn context since as you note you can't trust the previous turn's output

Terr_ · 2025-08-24T19:48:14 1756064894

Yeah, the functionality+security everyone is dreaming about requires much more than "where did the the words come from." As we keep following the thread of "one more required improvement", I think it'll lead to: "Crap, we need to invent a real AI just to keep the LLM in line."

Even just the first step on the list is a doozy: The LLM has no authorial ego to separate itself from the human user, everything is just The Document. Any entities we perceive are human cognitive illusions, the same way that the "people" we "see" inside a dice-rolled mad-libs story don't really exist.

That's not even beginning to get into things like "I am not You" or "I have goals, You have goals" or "goals can conflict" or "I'm just quoting what You said, saying these words doesn't mean I believe them", etc.

prisenco · 2025-08-24T17:50:47 1756057847

Sanitizing free-form inputs in a natural language is a logistical nightmare, so it's likely there isn't any safe way to do that.

hoppp · 2025-08-24T18:08:09 1756058889

Maybe an LLM should do it.

1st run: check and sanitize

2nd run: give to agent with privileges to do stuff

prisenco · 2025-08-24T18:19:09 1756059549

Problems created by using LLMs generally can't be solved using LLMS.

Your best case scenario is reducing risk by some % but you could also make it less reliable or even open up new attack vectors.

Security issues like these need deterministic solutions, and that's exceedingly difficult (if not impossible) with LLMs.

OtherShrezzing · 2025-08-24T21:43:51 1756071831

What stops someone prompt injecting the first LLM into passing unsanitised data to the second?

gmerc · 2025-08-24T19:10:50 1756062650

Now you have 2 vulnerable LLMs. Congratulations.

alexbecker · 2025-08-24T17:49:49 1756057789

The problem is there is no real way to separate "data" and "instructions" in LLMs like there is for SQL

gmerc · 2025-08-24T19:10:35 1756062635

There's only one input into the LLM. You can't fix that https://www.linkedin.com/pulse/prompt-injection-visual-prime...

internet_points · 2025-08-25T09:05:48 1756112748

SQL strings can be reliably escaped by well-known mechanical procedures.

There is no generally safe way of escaping LLM input, all you can do is pray, cajole, threaten or hope.

chasd00 · 2025-08-24T22:13:00 1756073580

Can’t the connections and APIs that an LLM are given to answer queries be authenticated/authorized by the user entering the query? Then the LLM can’t do anything the asking user can’t do at least. Unless you have launch the icbm permissions yourself there’s no way to get the LLM to actually launch the icbm.

alexbecker · 2025-08-25T06:27:01 1756103221

Generally the threat model is that a trusted user is trying to get untrusted data into the system. E.g. you have an email monitor that reads your emails and takes certain actions for you, but that means it's exposed to all your emails which may trick the bot into doing things like forwarding password resets to a hacker.

thebytefairy · 2025-08-26T03:30:34 1756179034

I think it depends what kind of system and attack we're talking about. For corporate environments this approach absolutely makes sense. But say in a user's personal pc where the LLM can act as them, they have permission to do many things they shouldn't - send passwords to attackers, send money to attackers, rm -rf etc

lelanthran · 2025-08-24T21:25:56 1756070756

You cannot sanitize prompt strings.

This is not SQL.

alexbecker · 2025-06-29T23:22:27 1751239347

Lately I've been trying to detect/mitigate prompt injection attacks. Wrote a blog post about why it's hard: https://alexcbecker.net/blog/prompt-injection.html

alexbecker · 2025-03-07T17:35:40 1741368940

After reading Judith Butler for a class in college, reading "Professor of Parody" was such a breath of fresh air. Nussbaum is a clear thinker who doesn't take BS kindly.

scythe · 2025-03-07T17:52:40 1741369960

Context:

https://newrepublic.com/article/150687/professor-parody

tehjoker · 2025-03-08T00:58:36 1741395516

Really good article. There is a whole cast of philosophical characters that have been supported because they provide a non-materialist subversive, yet directionless, politics that go no-where, great for capitalism. Some of them were even directly supported by the CIA. This is in part why politics is so empty today, the post-modernists won and call you names for wanting peace, real life choices, control over your work life, health care, etc while also not actually advancing the cause of equality among the different divisions of the working class beyond spoken words.

sevensor · 2025-03-08T15:33:36 1741448016

I thought this bit was particularly perceptive and foreshadows the situation in which we now find ourselves:

> Indeed, Butler’s naively empty politics is especially dangerous for the very causes she holds dear. For every friend of Butler, eager to engage in subversive performances that proclaim the repressiveness of heterosexual gender norms, there are dozens who would like to engage in subversive performances that flout the norms of tax compliance, of non-discrimination, of decent treatment of one’s fellow students. To such people we should say, you cannot simply resist as you please, for there are norms of fairness, decency, and dignity that entail that this is bad behavior.

alexbecker · on Oct 11, 2019

The argument as I understand it is that group 2 mostly does not care about being protected from their government or even agrees with the government's actions, and they get to benefit from continued access to Apple's superior products. I'm not endorsing this argument, but it's not prima facie crazy.

alexbecker · on Sept 16, 2019

I run a similar, maybe even more boring stack for my less-than-one-person company [PyDist](https://pydist.com):

- PostgreSQL database

- Nginx proxy in front of Django apps for UI and API servers (I use gunicorn instead of uWSGI though)

- Cron jobs which invoke django-admin commands to keep the PyPI mirror in sync

Perhaps the only place I'm any fancier than OP is that my deploy script is in Python, not shell, since any time I try to write a shell script with even slightly nontrivial logic it falls over and catches fire :)

welder · on Sept 16, 2019

What's your experience with gunicorn instead of uWSGI? I'm using haproxy + nginx + uWSGI but I'm wondering if gunicorn scales network more than uWSGI. My bottleneck isn't CPU, it's the amount of open connections uWSGI can handle at once.

Here's a trimmed down version of my web configs: https://wakatime.com/blog/23-how-to-scale-ssl-with-haproxy-a...

alexbecker · on Aug 20, 2019

Running a python package registry has some unique challenges, so it makes sense not to start with it (I run such a registry: https://pydist.com).

For example, Python has a distinction between distributions (the actual file downloaded, e.g. a tarfile or a manylinux1 wheel) and versions that doesn't exist in most other languages.

takeda · on Aug 21, 2019

All of these concerns are handled on client side, in the end all python needs is an http server, it can be actually hosted on S3.

alexbecker · on July 24, 2019

DevPi is a good solution if you want to self-host a Python package index. PyDist has some additional features like API keys and download statistics which I think are nice, but the main selling point is that you don't have to set up and maintain it yourself.

alexbecker · on July 24, 2019

Why would you not expect someone to charge for this? There are many services that charge for hosting private packages (rather than making them public to the world); I'm not aware of _any_ service that does so for free.

philipov · on July 24, 2019

I read private, and didn't notice the word hosting, so I thought it was an on-prem package indexing. Uploading their proprietary code to some random hosting provider isn't something that would fly with any of my clients, so I didn't expect that. Whenever I see a product landing page with pricing, they title it Pricing, so calling it something else sounds like someone playing coy with the fact they're a payed product. Compare with Artifactory, which is up-front about it, and offers much more than just a package index.

alexbecker · on July 24, 2019

Yes, you can use --index-url.

I assume you're referring to how --extra-index-url means that pip will randomly choose which index to try to install from, potentially installing a public package by the same name instead of your private package?