I don’t buy the “LLMs = books” analogy. Books are static; today’s LLMs are adaptive persuasion engines trained to keep you engaged and to mirror your feelings. That’s functionally closer to a specialized book written for you, in your voice, to move you toward a particular outcome. If a book existed that was intended to persuade its readers to commit suicide, it would surely be seen as dangerous for depressed people.
There has certainly been more than one book, song, or film romanticising suicide to the point where some people interpreted it as "intended to persuade its readers to commit suicide".
The general public submitting CSAM directly would indeed be highly unlikely, but the scenario we need to consider involves those in positions of authority who can manipulate systems behind the scenes.
Imagine that an unflattering or satirical image of Viktor Orban is circulating in France, and let's say it goes viral, inciting discussions that the Hungarian government finds detrimental to its international image. The authorities might want to suppress this image, not just within Hungary but throughout the European Union.
The Hungarian government - which presumably has access to the EU CSAM database (or can coerce those who do) - might attempt to add a fingerprint of a manipulated CSAM image that collides with the fingerprint of the satirical image. The point here is not that private individuals would submit CSAM directly, but that government actors would manipulate systems clandestinely.
>The Hungarian government - which presumably has access to the EU CSAM database (or can coerce those who do) - might attempt to add a fingerprint of a manipulated CSAM image that collides with the fingerprint of the satirical image.
Then what? What does that achieve? There would be a huge spike in images identified as CSAM, which would obviously throw up red flags. It seems like this would mostly just be a headache for law enforcement across the EU. It isn't like France is going to arrest a huge number of people without any investigation, or without asking how this one CSAM image spread so far so quickly. And if we are talking about the result in Hungary, that government doesn't need this tool to abuse its power. Why go through all that effort? They could just do the equivalent of rubber-hose cryptanalysis.
One idea would be for the government of Hungary to create a list of its citizens who share this inciting, dangerous, or whatever-you-want-to-call-it material. If they repeatedly find the same people distributing it, they may pay them a visit, get them fired from their government job, block their bank accounts, put them on the no-fly list, or whatever.
And that’s just one idea; there are probably others.
But why do this flagging out in the open, where other member countries will see this obvious pattern of behavior and abuse of the system? Once again, yes, this tool can be abused, but there are much simpler and harder-to-detect approaches that a corrupt or authoritarian government could implement. That is where the rubber-hose analogy comes into play.
Won't that easily be found out when the hash matches the image of Viktor Orban rather than an image of CSAM? I'm not sure the legal system is as stupid as you think it is. Then Hungary would just have their hashes reviewed.
Sure, a conspiracy of all the relevant authorities across the whole EU would work... but that seems a stretch to enable what, exactly? Political elites _cooperatively_ censoring images the public holds? There are easier ways, surely.
The point is that a real CSAM image can be manipulated such that, when hashed, it matches the hash of the Orban image. So your phone reports you to the police for possessing CSAM when you receive the Orban image. Your life is ruined and you're financially impacted trying to defend yourself. Even if you're cleared of the charges, you're fucked. This has a chilling effect on sharing the Orban image.
Orban's people won't face any pushback, because they submitted an actual CSAM image. It's on you to somehow prove that they intentionally poisoned the CSAM image such that its hash matches the Orban image.
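For concreteness, here is a minimal sketch of the client-side matching step this relies on, using the open-source `imagehash` library as a stand-in for whatever proprietary perceptual hash a real scanner actually uses; the flagged entry and the distance threshold below are invented for illustration:

```python
# Minimal sketch of on-device perceptual-hash matching, using the
# open-source `imagehash` library as a stand-in for the non-public
# hash a real scanner would use. The flagged entry and threshold
# here are invented for illustration.
from PIL import Image
import imagehash

# hypothetical flagged hashes, shipped to the device as hex strings
FLAGGED = {imagehash.hex_to_hash("ffd7918181c9c3c1")}
THRESHOLD = 4  # assumed: max Hamming distance still treated as a match

def is_flagged(path: str) -> bool:
    h = imagehash.phash(Image.open(path))
    # subtracting two ImageHash objects yields their Hamming distance
    return any(h - bad <= THRESHOLD for bad in FLAGGED)
```

If the poisoned hash in the database sits within that threshold of the Orban image's hash, everyone who receives the Orban image gets reported, exactly as described above.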
I get that people's opinions of law enforcement are low, but do you really think no one would question why a huge number of people have a single CSAM image on their device, especially when the hash for that image was just added to the database? Do you think no one would wonder whether something is wrong with that hash? Do you think no one would look at the flagged image on any of those devices?
I understand people's concerns with this tech, but this seems like a rather silly hypothetical to me.
It's really about preventing images from circulating.
Yes, the database maintainers would notice, but it could take them a while to get around to it. And they're not going to be eager to remove a collision, because that would effectively "legalize" the child porn member of the image pair. There are lots of images that could be useful to suppress temporarily.
And if you actually succeed in suppressing the false-positive image, it may not get passed around very fast, and it won't get vastly more hits than real target images, so it will take longer for anybody to notice or care.
There's also a possible end game where the child porn traders start perturbing their images to collide with really common images like flags, corporate logos, iconic movie stills, and whatever else. So now you either have to ban the US flag, or let this or that actual child porn image go.
I don't actually think that the false hits would ruin very many lives in most places. But it's worth noticing that the original article was talking about authoritarian regimes repurposing the system without the consent of the database maintainers. In the Orbán example, it's possible that the system might flag you for child porn, but you might actually get arrested for sedition. And that continues to happen to people until the database maintainers pull the hash.
>Yes, the database maintainers would notice, but it could take them a while to get around to it. And they're not going to be eager to remove a collision, because that would effectively "legalize" the child porn member of the image pair. There are lots of images that could be useful to suppress temporarily.
They may not know it immediately, but the actual CSAM image wouldn't be shared in any real numbers, which removes much of the concern about "legalizing" the image. You can argue this makes the system ineffectual, but that also means this isn't repeatable, since weakening the impact of the system would quickly result in Hungary losing the power to add hashes to the database.
>And if you actually succeed in suppressing the false-positive image, it may not get passed around very fast, and it won't get vastly more hits than real target images, so it will take longer for anybody to notice or care.
Who is sharing the real target image? Is Hungary now an active creator and distributor of actual CSAM in addition to manipulating the database?
>There's also a possible end game where the child porn traders start perturbing their images to collide with really common images like flags, corporate logos, iconic movie stills, and whatever else. So now you either have to ban the US flag, or let this or that actual child porn image go.
Once again, this behavior would get Hungary booted pretty quickly. Also, it is important to remember these are hashes. Not all images of the US flag would trigger the system, only the specific image that has a hash collision.
>I don't actually think that the false hits would ruin very many lives in most places. But it's worth noticing that the original article was talking about authoritarian regimes repurposing the system without the consent of the database maintainers. In the Orbán example, it's possible that the system might flag you for child porn, but you might actually get arrested for sedition. And that continues to happen to people until the database maintainers pull the hash.
But this isn't a closed system within that authoritarian regime. It leaves breadcrumbs of this behavior out for everyone to see. If Hungary is going to arrest people on made-up charges, they can do that anyway.
I just think this is a complicated and ineffectual bullet that can only be fired once, because it carries the fingerprint of the person who fired it. That adds up to make this type of abuse less of a worry.
>> There's also a possible end game where the child porn traders [...]
> Once again, this behavior would get Hungary booted pretty quickly.
I'm sorry; I should have made myself clearer. That paragraph was an aside: I moved from Hungary (or any other government) as the adversary to child-porn-sharers-in-general as the adversary, and also changed the adversary's goal.
No matter what you do, you can't "boot" the child porn sharers, because they're the ones who actually define what images you're legitimately trying to block.
> Also, it is important to remember these are hashes. Not all images of the US flag would trigger the system, only the specific image that has a hash collision.
They're approximate perceptual hashes, designed to come up with close values on close pictures. The US flag has an officially defined appearance. You'll get the same hash for any two close-cropped, straight-on images of the flag, the kind you might embed in your Web site. They'll be at least as close as two reprocessed versions of the same child porn image.
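A toy demo of that "close pictures, close hashes" property, using a synthetic flag-like graphic and assuming an `imagehash`-style hash rather than whatever a deployed scanner actually runs:

```python
# Two renderings of the same flat tricolor graphic, one slightly
# re-cropped, should hash almost identically. The graphic is synthetic;
# this makes no claims about any deployed system's exact hash.
from PIL import Image, ImageDraw
import imagehash

def tricolor(w=300, h=200):
    img = Image.new("RGB", (w, h), "white")
    d = ImageDraw.Draw(img)
    d.rectangle([0, 0, w // 3, h], fill=(0, 56, 168))       # left band
    d.rectangle([2 * w // 3, 0, w, h], fill=(206, 17, 38))  # right band
    return img

a = tricolor()
b = tricolor().crop((3, 2, 297, 198)).resize((300, 200))  # slight re-crop

print(imagehash.phash(a) - imagehash.phash(b))  # small Hamming distance
```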
I was oversimplifying, though. You're not going to be able to tweak just any child porn image to make it hash like just any flag, unless you're willing to distort it into unrecognizability. And flags and logos might be bad candidates in general, because they're going to give DCT output that's wildly different than what you'll get from most photos. But if you had a relatively large library of child porn and a relatively large library of heavily-used effectively unbannable images of whatever kind, you should be able to find a lot of the child porn images that you can tweak to hash like one or another of the heavily used ones.
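For reference, the DCT behavior I'm talking about comes from the commonly published pHash recipe, which looks roughly like this (a reimplementation from the public description, not any vendor's actual code):

```python
# Rough reimplementation of the commonly described DCT-based pHash
# recipe, to make the "DCT output" point concrete.
import numpy as np
from scipy.fftpack import dct
from PIL import Image

def phash_bits(img: Image.Image) -> np.ndarray:
    small = np.asarray(img.convert("L").resize((32, 32)), dtype=float)
    # 2-D DCT: transform rows, then columns
    freq = dct(dct(small, axis=0, norm="ortho"), axis=1, norm="ortho")
    low = freq[:8, :8].ravel()   # keep only the 64 lowest frequencies
    med = np.median(low[1:])     # median of those, skipping the DC term
    return low > med             # 64-bit fingerprint

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    return int(np.count_nonzero(a != b))
```

A flat graphic like a flag concentrates nearly all its energy in a handful of those low coefficients, while a typical photo spreads it out, which is the mismatch I mean.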
... and if you're generating fake child porn from scratch using ML, you can probably hack your ML model to bake the hash of one or another unbannable image into everything it creates. You could probably make those matches pretty damned close.
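A toy version of that collision step, run against the toy pHash above rather than any deployed hash (PyTorch, with a smoothed stand-in for the bit threshold so gradients flow; how well this transfers to a real scanner's non-public hash is an assumption):

```python
# Toy collision attack: nudge an image until its (smoothed) hash
# matches a target's. Assumes a white-box, differentiable surrogate.
import numpy as np
import torch

def dct_matrix(n=32):
    # orthonormal DCT-II matrix, so D @ x @ D.T is a 2-D DCT
    k, m = np.arange(n)[:, None], np.arange(n)[None, :]
    D = np.sqrt(2 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    D[0] /= np.sqrt(2)
    return torch.tensor(D, dtype=torch.float32)

D = dct_matrix()

def soft_hash(x):  # x: 32x32 grayscale tensor in [0, 1]
    low = (D @ x @ D.T)[:8, :8].reshape(-1)
    med = low[1:].median()
    return torch.tanh((low - med) * 50)  # smooth stand-in for (low > med)

target = soft_hash(torch.rand(32, 32)).detach().sign()  # hash to imitate
x = torch.rand(32, 32, requires_grad=True)              # image to perturb
opt = torch.optim.Adam([x], lr=0.01)
for _ in range(500):
    opt.zero_grad()
    loss = (soft_hash(x.clamp(0, 1)) - target).pow(2).mean()
    loss.backward()
    opt.step()
# after optimization, hard-thresholding soft_hash(x) should reproduce
# most of the 64 target bits
```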
So, once they got the whole thing down, they should be able to force the system to greatly tighten its match thresholds and/or deal with a really high false positive rate. In fact, they could probably make things bad enough that they could end up with a pretty large collection of child porn that the operators would be forced to completely exclude from detection.
Now back on the original authoritarian threat model:
> I just think this is a complicated and ineffectual bullet that can only be fired once, because it carries the fingerprint of the person who fired it.
For the "authoritarian" purpose, you may be right... although I wouldn't be surprised if it stayed under the radar for longer than you think. If the image you want to suppress only circulates among your own people, and if you're the authority who receives and verifies the reports on your own people, then all you have to do is to keep the overall volume down enough that it doesn't make anybody suspicious enough to demand that you show them the reports you're getting.
If you're a relatively small country, the number of people who share some local meme you care about may be quite a bit smaller than the number of people who share some new real child porn image.
That's not a great way to think. One needs to consider whether something is possible at all before trying to solve unrelated issues that some engineering can probably figure out.
This has already happened in China, where Baidu (the Chinese equivalent of Google) can’t crawl any articles from WeChat (the Chinese equivalent of Medium); as a result, the usefulness of its search results has deteriorated significantly. Recently, Baidu has been trying to start its own publishing platform, with little success.
Also food ordering, travel reservations, health care appointments, banking, government services, and a whole lot of other things that would take too long to list. It's not an exaggeration to say the entire Chinese consumer experience runs through WeChat.