
My take on this is that the system is by and large useless.

It won't catch anything but the dumbest of dumb criminals, because those who care about CSAM can surely figure out a better way to share images, or find a way to obfuscate their images enough to bypass the system (the lower the false positive rate, the easier it must be to trick the system).

So what's left when all the criminals this is supposed to catch have figured it out?

False positives. Only false positives.

Is it really worth turning personal devices into snitches that don't even do a good job of protecting children?

Also, numbers about false positives must be taken with a grain of salt because of the non-uniform distribution of perceptual hashes. It might be that your random vacation photos and kitty pics have a 1-in-a-million chance of a false positive, but someone who happens to (say) live in an apartment that has been laid out very similarly to a scene in pictures appearing in the CSAM database may have a massively higher chance of false positives for photos taken in their home.




> It won't catch anything but the dumbest of dumb criminals

Dumb is a pretty accurate description of a large fraction of criminals. For the most part you only get smart criminals when you are talking about crimes where you have to be smart to even plan and carry out the crime.


Given that the average user doesn't know what a URL is and a pedophile can use the darknet, I'd say criminals are not all dumb.


It is not so easily comparable. It's all about your interests and expectations.

The darknet might sound a bit complex, but as a darknet user, you literally just install a different browser.


Something like 40 percent of people use the default browser installed by the OS.

We're just in an echo chamber of people who know what JavaScript is, and that distorts our perception of the world.


Can you give a source for that number? Regardless, a browser is like any other app. If that many people don't know how to install apps on their computers, then we have either really dumb people (or just a lack of motivation) or a great UX design failure in general.


Users are on average incompetent, not dumb.

It's a daunting realisation that sinks in once you have to do support for a web site or app catering to the general population instead of a niche.

They don't read, don't know the difference between an app and a web site, don't know right click or drag and drop, think Google or Chrome is the internet, and overall their strategy to solve any problem is as follows:

- look for something obvious that seems like the answer but is not scary

- click

- wait for it

- repeat 3 times until ok or give up and call someone or get angry or both

Working on a streaming video site really opened my eyes on this one. Most tickets we received were insults, some were incomprehensible garbage, and a few were actionable requests from someone who didn't understand anything about their computer.

This is nothing like your GitHub tickets. Your parent's numbers are generous IMO.


Apple seems to have completely botched this PR stunt/feature.

Reading your comment, I realize how these… ‘criminals’ could use phone number networks to share illegal sexual content peer to peer.

In other words, Apple doesn’t need to analyze your images to find these criminals. They only need to analyze the frequency or quantity of flagged images.

In other words, not one image correctly/falsely tagged, but individuals and networks of individuals who are *collecting* and *storing* mass quantities of these images. And they're using Apple's privacy and security to hide from law enforcement.

Racketeering?


Yes, but when you admit that the target is just the dumb criminals, then why adopt a scheme that has false positives?

Decompress and downsample. Drop the least significant bit or two, maybe do it in the DCT domain instead. SHA-256. It'll preserve matching for at least some cases of recompression and downsampling. But finding an unrelated image that matches is as hard as attacking SHA-256; the only false positives that could be found would be from erroneous database entries.
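
A minimal sketch of that normalize-then-hash idea, assuming Pillow; the 64x64 size and 2-bit mask are arbitrary illustrative choices, not anything Apple has proposed:

    # Sketch: normalize an image, then hash it cryptographically.
    # Recompressed or mildly downsampled copies normalize to the same
    # bytes, but forging an unrelated match means breaking SHA-256.
    import hashlib
    from PIL import Image

    def normalized_sha256(path, size=(64, 64), drop_bits=2):
        # Decompress and downsample to a fixed size and color space.
        img = Image.open(path).convert("L").resize(size, Image.BILINEAR)
        # Drop the least significant bits so mild noise disappears.
        mask = 0xFF & ~((1 << drop_bits) - 1)
        pixels = bytes(p & mask for p in img.tobytes())
        return hashlib.sha256(pixels).hexdigest()

The DCT-domain variant is the same idea with quantized frequency coefficients in place of raw pixels.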


> Dumb is a pretty accurate description of a large fraction of criminals.

Is there any reading on that? I'd love it to be true.


Selection bias? It _might_ be a pretty accurate description of a large fraction of those who _get caught_.


So we'd have to compare against the statistics for unsolved crimes to find the true correlation.


For the best criminals, you wouldn't even know that there's an unsolved mystery to be solved.


Is there even a problem in such a scenario?


With the caveat already noted about selection bias (maybe the smart criminals never get caught?), there's definitely some evidence to support this. https://law.jrank.org/pages/1363/Intelligence-Crime-Measurin...


You only hear about the criminals who get caught and crimes that go unsolved get blamed on these kinds of criminals.


> Is it really worth turning personal devices into snitches that don't even do a good job of protecting children?

Yes, because the point is not to protect children. It's to get everyone used to the idea that their content is being monitored. Once that is accomplished, other forms of monitoring can and will be added.


Exactly. It's a Trojan Horse (https://en.wikipedia.org/wiki/Trojan_Horse) to make more pervasive individual control the new normality. The current motivations are just a pretext.


Perceptual hashes are only used to reduce the search space for human review. Apple doesn't have the images in the CSAM database to do a comparison, but if it's just a picture of a door they're going to reject it. Also, because human review is an expense, Apple's incentive is to minimize the number of times it happens, thus the requirement for multiple collisions.


I don't really want my family photos reviewed by strangers. "Reducing the search space" of photos on my phone isn't an outcome I want to live with. By the time someone is looking at photos of me, my wife/husband/girlfriend/boyfriend, and my kids, they'd better have a darned good reason (e.g. a search warrant).

I'd also appreciate it if Apple let me know whether my false positives were reviewed and found not to be CSAM.


Don’t upload an image anywhere, else it can be reviewed.


I saw a story on here yesterday about iPhones resetting to default settings after restarting. So people were turning off backups to the cloud, and then finding that their device turned the feature back on after some time.


The whole point of Apple's system is that I don't need to upload an image anywhere.

Images from my phone can be stolen and reviewed with no due process, based on proprietary Apple technology.


The system as described only submits its safety vouchers when photos are uploaded to iCloud.

Not saying it will stay that way, but there are three distinct realms of objection to this system, and it's probably useful to separate them:

1. Objections that in the future, something different will happen with the technology, system, or companies; so that even if the system is unobjectionable now, we should object because of what it might be used for in the future, or how it might change.

2. Objections that Apple can't be trusted to do what they say they are doing; so that even if they say they will only refer cases after careful manual review, or that they will only submit images for review that were uploaded to iCloud, we can't believe them, so we should object.

3. Objections that hold for the system as designed and promised; in other words, even if all the actors do what they say they are doing in good faith and this monitoring never expands, it's still bad.

People who have the third kind of objection need to deal with the fact that Apple is basically putting in a system with more careful safeguards than are already in place in many Internet services, even for their "private" media storage or exchange. You likely don't know how the services you use are scanning for CSAM but if the service is at all sizeable (chat, mail, cloud storage) it's likely using PhotoDNA or something similar.

I think there are valid objections on all three bases. But there's a difference in saying "this is bad because of something that might happen" and "this is bad because of what is actually happening".


I think the issue is that the content review is happening on the phone, and it would be a small change to go from scanning uploaded photos to all photos.


Oh yes, I agree. We will see a change in privacy policy before that happens. And Apple will lose a lot of us if that comes to pass.

For many years, it happened in the cloud. Soon it will happen on device and send a message about which item in the cloud is an issue.

I think it's all about Apple moving ML jobs (like Siri) to the device to lighten the load on their data centers.


> Apple’s incentives are to minimize the number of times it happens, thus the requirement for multiple collisions.

How can we be sure they won’t cut costs by increasing worker load? I could see them giving each reviewer less time to review individual pictures before passing it on to law enforcement.


We can't, and they probably will; everyone else already seems to be doing so. There's a Swiss federal police report saying that only about 10% of NCMEC reports are actually relevant (https://fedpol.report/en/fedpol-in-figures/fight-against-pae...)


If they pass false positives to authorities that will open them up to legal action.


Apple's human review is largely useless.

Trolls will easily be able to use tools to slightly modify ambiguous adult porn so that it collides with a "known CP hash".

A human reviewer will see a blurry grayscale derivative of adult pornographic content and hit "report" every time.
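
To make that concrete, here is a toy second-preimage attack against a deliberately weak perceptual hash (an aHash-style fixed-grid hash, not NeuralHash, which attackers instead hit with gradient descent): nudge each cell of a cover image across the brightness threshold until its hash equals the target hash. Perceptual hashes trade collision resistance for robustness, and that trade is exactly what a troll exploits.

    # Toy demo, NOT NeuralHash: force a 64x64 grayscale image to match a
    # target 64-bit hash computed from an 8x8 grid of cell means.
    import numpy as np

    THRESH = 128

    def toy_hash(img64):                        # img64: (64, 64) uint8
        cells = img64.reshape(8, 8, 8, 8).mean(axis=(1, 3))
        return (cells > THRESH).astype(np.uint8).flatten()

    def force_collision(img64, target_bits, step=0.05):
        out = img64.astype(np.float64)
        for idx, bit in enumerate(target_bits):
            r, c = divmod(idx, 8)
            cell = out[r*8:(r+1)*8, c*8:(c+1)*8]
            if (cell.mean() > THRESH) == bool(bit):
                continue                        # cell already matches
            # Blend the cell toward a flat value just past the threshold;
            # visually this is only a slight brightening or darkening.
            goal = THRESH + 8 if bit else THRESH - 8
            while abs(cell.mean() - goal) > 6:
                cell += step * (goal - cell)
        return np.round(out).clip(0, 255).astype(np.uint8)

After this, toy_hash(force_collision(cover, target_bits)) equals target_bits while the cover image still looks like the cover image.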


This is the threat model I am looking at. It is number one with a bullet. We have already had a court case where an adult actress had to show up in court and prove that she was an adult when experts testified that the images were of a non-adult woman.

Baby in the sink? No. But a bunch of the aforementioned? Yeah.


> Perceptual hashes are only used to reduce the search space for human review.

False. Apple's proposed system leaks the cryptographic keys needed to decode the images, conditional on a threshold number of matches from the faulty NeuralHash perceptual hash.

Matching these hashes results in otherwise encrypted, highly confidential data being decodable by Apple, accessible on their servers to the relevant staff, along with anyone who compromises or coerces them.
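
For what it's worth, the threshold mechanism in Apple's technical summary is essentially threshold secret sharing: each matching voucher carries one share of a per-account key, and only once enough shares arrive can the key (and with it the matched images' derivatives) be reconstructed. A toy sketch of the underlying primitive, plain Shamir sharing rather than Apple's actual construction:

    # Toy Shamir threshold secret sharing: split a key so that any
    # `threshold` shares reconstruct it and fewer reveal nothing.
    # Illustrative only; Apple layers this with per-image encryption
    # and private set intersection.
    import random

    PRIME = 2**127 - 1  # field modulus; the secret must be below this

    def make_shares(secret, threshold, count):
        coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
        def poly(x):
            return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
        return [(x, poly(x)) for x in range(1, count + 1)]

    def recover(shares):
        # Lagrange interpolation at x = 0 gives back the secret.
        secret = 0
        for i, (xi, yi) in enumerate(shares):
            num = den = 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = num * -xj % PRIME
                    den = den * (xi - xj) % PRIME
            secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
        return secret

With a threshold of, say, 10: any 10 shares recover the key, 9 or fewer are useless, which is where the "nothing is decodable until multiple matches" property comes from.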


Apple can decode the data either way. They're the ones doing the encryption on their servers.

There are two basic reasons for this: first, it's a backup service, which makes end-to-end encryption risky; second, they also let users share access to their backed-up photos (iCloud > Photos > shared album).


Edit: I incorrectly claimed there wasn’t manual review - see below


What is the basis for your understanding?


Apologies, I was mistaken:

“ Only when the threshold is exceeded does the cryptographic technology allow Apple to interpret the contents of the safety vouchers associated with the matching CSAM images. Apple then manually reviews each report to confirm there is a match”

The design goal was no human review for individual matches.


I knew a probation officer for sex offenders. They told me that most of them were quite dumb. What the repeat offenders were, though, is dedicated. They had all day to try to avoid getting caught, and the PO had a few minutes per week per offender.

It's true that in any arms race, a given advance gets adapted to. This will surely catch a bunch of people up front and then a pretty small number thereafter as the remainder learn to avoid iPhones. But that's how arms races work. You could say that about almost any advance in fighting CSAM.


I think it's only the dumb ones who get caught.

Source: I've met a few white collar criminals.


Probably some of both. One point of the criminal justice system is to shift incentives such that people with their acts together satisfy their desires without criming. There are plenty of smart, greedy people who just go get an MBA and siphon off value in ways that are technically legal. The risk-adjusted ROI is better.


> It won't catch anything but the dumbest of dumb criminals, because those who care about CSAM can surely figure out a better way to share images

Apparently that better way is by using Facebook. Facebook made 20.3 million reports to NCMEC in 2020.

https://www.missingkids.org/content/dam/missingkids/gethelp/...


Yeah, Facebook's blog post makes me wonder what all the stuff they report actually is. When people say CSAM, I think "kids getting raped", but apparently there's stuff that people find humorous or outrageous and spread like a meme (and not like pornography).

"We found that more than 90% of this content was the same as or visually similar to previously reported content. And copies of just six videos were responsible for more than half of the child exploitative content we reported in that time period."

"we evaluated 150 accounts that we reported to NCMEC for uploading child exploitative content in July and August of 2020 and January 2021, and we estimate that more than 75% of these people did not exhibit malicious intent (i.e. did not intend to harm a child). Instead, they appeared to share for other reasons, such as outrage or in poor humor (i.e. a child’s genitals being bitten by an animal)."

Based on this, I wouldn't conclude that FB is the platform where pedos go to share their stash of child porn.

Their numbers also include Instagram, which I believe is quite popular among teenagers? I wonder how likely it is for teens' own selfies and group pics to get flagged and reported to NCMEC.

(https://about.fb.com/news/2021/02/preventing-child-exploitat...)


> Facebook made 20.3 million reports to NCMEC in 2020.

Which appears to have resulted in what... 5 prosecutions?


> It won't catch anything but the dumbest of dumb criminals, because those who care about CSAM can surely figure out a better way to share images, or find a way to obfuscate their images enough to bypass the system (the lower the false positive rate, the easier it must be to trick the system).

Given the reported numbers of illegal images detected by similar systems within Facebook and Google, I think it is very clear that this will catch a lot of illegal content.


Facebook and Google are not catching 20 million people a year; they're mostly flagging and removing Tor/proxy-based throwaway accounts.


The false positive rate reported in the blog post for ImageNet was 1 in a trillion, and the author concludes that this algorithm is better than they expected.


"After running the hashes against 100 million non-CSAM images, Apple found three false positives"

So closer to 1 in 30 million. The reporting threshold is made artificially higher by requiring more than one positive match.
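
Rough math on what the multi-match threshold buys, assuming independent false positives at roughly that per-image rate, a library of 10,000 photos, and a hypothetical threshold of 10 matches (the independence assumption is exactly what the rest of this comment questions):

    # Chance an account crosses the match threshold purely by accident,
    # under an independence assumption. All numbers are illustrative.
    from math import comb

    p = 3 / 100_000_000     # per-image false positive rate (3 in 100M)
    n = 10_000              # photos in the account's library
    k = 10                  # hypothetical reporting threshold

    # Binomial tail P(X >= k); terms beyond k + 20 are negligible here.
    tail = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, k + 20))
    print(tail)             # on the order of 1e-42 under these assumptions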

But anyway, that's beside the point.

A perceptual hash is not uniformly distributed; it's not a random number. Likewise for photos taken in a specific setting; they do not approach the randomness of a set of random images.

So someone snapping photos in a setting that has features similar to a set of photos in the CSAM database may risk a massively higher false positive rate. It's no longer a million-sided die; it could be a thousand-sided die when your outputs happen to be clustered around similar values due to a similar setting.

But I can't say I care about false positives. To me the system is bad either way.


"After running the hashes against 100 million non-CSAM images"

They don't say what kind/distribution of non-CSAM images. Landscapes? Parent pix of kids in the bathtub? Cat memes? Porn of young adults? Photos from real estate listings?

I suspect some pools of image types would have a much higher hit rate.

Edit: And, well "hot dog / not hot dog" is impressive on a set of random landscapes too.


Well the same article also claims zero false positives for "a collection of adult pornography." I don't know if the size of that collection is mentioned anywhere.

Anyway, I suspect that the algo is more likely to pick up defining features of the scene and overall composition (furniture, horizon, lighting, position & shape of the subject and other objects) than the subject matter itself.


That's why I included "Photos from real estate listings?" in my list.


Sometimes the best way to catch the really smart or sophisticated criminals is to exploit their less smart and less sophisticated accomplices, co-conspirators, peers, acquaintances, or even their victims.


The point of these innovations is never the stated purpose. Catching criminals is an excuse. I would bet a great deal that this system was by and large pushed for by state actors for the purpose of creating a new political surveillance tool.


You can try a web demo of it here on Hugging Face: https://huggingface.co/spaces/akhaliq/AppleNeuralHash2ONNX


> False positives. Only false positives.

I really doubt this. In the long term, a few people Apple wants to frame will surely slip into the mix. If Apple didn't want Trump to win, a CSAM flag a week before the election might do it.


> It won't catch anything but the dumbest of dumb criminals

This includes the vast majority of pedophiles.


Do you have any source that pedophilia correlates very strongly with low intelligence?


Where did you find the statistics about pedophiles' intelligence?



