A bit unrelated, but if you ever find a malicious use of Anthropic APIs like that, you can just upload the key to a GitHub Gist or a public repo - Anthropic is a GitHub scanning partner, so the key will be revoked almost instantly (you can delete the gist afterwards).
It works for a lot of other providers too, including OpenAI (which also has file APIs, by the way).
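The mechanism behind this is pattern matching: scanning partners register token formats with GitHub, which scans pushed content and notifies the provider on a hit. A rough sketch of the detection side, with an illustrative regex (the real partner patterns aren't public here; Anthropic keys do start with `sk-ant-`, but the exact character set and length are my assumption):

```python
import re

# Illustrative approximation of the kind of pattern a scanning partner
# might register with GitHub; NOT the actual pattern in use.
ANTHROPIC_KEY_RE = re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}\b")

def find_candidate_keys(text: str) -> list[str]:
    """Return substrings that look like Anthropic API keys."""
    return ANTHROPIC_KEY_RE.findall(text)
```

On a match, the partner gets a webhook and can revoke the key on their side, which is why the gist trick works so quickly.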
I wouldn’t recommend this. What if GitHub’s token scanning service went down? Ideally, GitHub would expose a universal token revocation endpoint.
Alternatively, do this in a private repo with token revocation enabled (if that exists).
They mean "went down" as in stopped working, had some outage: you've tried to use it as a token revocation service, but it doesn't work (or not as quickly as you expected).
I'm probably being stupid here, but why does the prompt injection need to POST to Anthropic's servers at all? Does Claude Cowork have some protection against POSTing to arbitrary domains, but allow POSTs to Anthropic with an arbitrary user's key or something?
The article says that Cowork runs in a VM with limited network access, but the Anthropic endpoint has to be reachable. What they don't do is check that the API calls you make use the same API key as the one the Cowork session was created with.
So the prompt injection adds a "skill" that uses curl to send the file to the attacker via their API key and the file upload function.
Yeah, they mention it in the article: most network connections are restricted, but not connections to Anthropic. To spell out the obvious, that's because Claude needs to talk to its own servers. What they show here is that you can get it to talk to its own servers but put documents into another user's account by using a different API key, all in a way that you, as the end user, wouldn't really see while it's happening.
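The missing check described above can be sketched as an egress rule: only let Anthropic API calls out of the sandbox when the presented key matches the session's own key. This is a hypothetical illustration, not Anthropic's actual code; the `x-api-key` header is what the Anthropic API really uses, the rest is assumed structure:

```python
# Hypothetical egress check for the sandbox. The flaw in the article is
# that this key comparison was effectively absent, so a prompt-injected
# "skill" could curl api.anthropic.com with an attacker's key.

def is_allowed(host: str, headers: dict[str, str], session_key: str) -> bool:
    if host != "api.anthropic.com":
        return False  # all other egress is blocked by the sandbox
    presented = headers.get("x-api-key", "")
    # Require the outbound call to use the key this session was created with.
    return presented == session_key
```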
So that after the attackers exfiltrate your file to their Anthropic account, now the rest of the world also has access to that Anthropic account and thus your files? Nice plan.
Maybe, but the point is that people commit/post all kinds of secrets they shouldn't to GitHub: secrets they own, shared secrets, secrets they found, secrets they don't even know are secrets, etc.
GitHub and its partners just see a secret and trigger the oops-a-wild-secret-has-appeared action.
I was a bit disappointed when I almost "solved" it but couldn't get the last 2 words. I finally clicked the hint and it told me to undo 12 times... I would have preferred a warning earlier.
Can anyone point to an actual reputable source with any details about what specifically got leaked, and how? Instagram has way more users, so it's very odd that only 17.5M got "leaked". It honestly feels like this is overblown and it's just scraped data again or something.
The original Malwarebytes tweet is incredibly generic.
Probably some kind of plugin or app they logged into via Instagram, but I'm not sure what kinds of integrations there are. Or could it be regional for some reason?
No; for example, Alibaba has huge proprietary Qwen models, like Qwen 3 Max. You just never hear about them because that space in Western LLM discussions is occupied by the US labs.
The best is probably something like GLM 4.7/Minimax M2.1, and those are probably at most Sonnet 4 level, which is behind Opus 4.1, which is behind Sonnet 4.5, which is behind Opus 4.5 ;)
And honestly Opus 4.5 is a visible step change above previous Anthropic models.
Oh, of course not: you might need up to 100 GB of VRAM to run those models at decent speeds, even for low-quant versions.
And all the hype about Macs with unified memory is a bit dishonest, because the actual generation speed will be very bad, especially once you fill the context.
One of the things that makes Opus 4.5 special compared to e.g. GPT 5.2 is that it doesn't have to reason for multiple minutes to make some simple changes.
https://support.claude.com/en/articles/9767949-api-key-best-...
https://docs.github.com/en/code-security/reference/secret-se...