Hacker News | simandl's comments

Last year's ICLR had a paper, "Data Poisoning Won't Save You From Facial Recognition," that evaluated the Glaze team's previous project, Fawkes. This statement from that paper is quite damning.

"This paper shows that these systems (and, in fact, any poisoning strategy) cannot protect users’ privacy. Worse, we argue that these systems offer a false sense of security. There exists a class of privacy-conscious users who might have otherwise never uploaded their photos to the internet; however who now might do so, under the false belief that data poisoning will protect their privacy. These users are now less private than they were before."

Paper: https://arxiv.org/pdf/2106.14851.pdf
Fawkes: https://sandlab.cs.uchicago.edu/fawkes/


This assumes that the filter actually works in practice: https://www.reddit.com/r/StableDiffusion/comments/11v7sv9/ha...


Thank you for the feedback! That's what we're hoping our opt-in tools will help artists do. For the problems you've posed here, you might actually want to both opt in and opt out. You'd be able to flag the image-URL/caption pairs that don't accurately credit you or describe your work, and we'll forward them to the dataset creators for removal. You could then add any works that you're comfortable being used for AI training, and caption them however you'd like. Those would go into the dataset and be used when training future models.

We'd love to have you sign up if you're interested! We expect to start the beta for those tools in 2 to 3 weeks.


We think this is because the images are all links and the browser itself is pulling them in from across the web. With Chrome it happened once during our testing, and we've had one user also experience it, but it's intermittent. Thank you for pointing to Brave, which we haven't tested yet! That might help us reproduce the errors.


These are important questions. We aren't storing any of the images used to search, and the email addresses are going to Mailchimp lists for opt-in and opt-out. As we roll out the next set of tools, which let users flag images and create lists from them, we'll email the Mailchimp lists with more info.

When we enable sign-in, we'll also add a privacy policy, because at that point, we will store some images, on request, to use them to make finding other works by the same artists easier.

Opt-out image URL lists will be made available to the dataset owners for removal. Opt-in image lists will be public.
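To make the removal step concrete, here's a minimal sketch of how a dataset owner might honor an opt-out list. The function name and the (image URL, caption) pair shape are assumptions for illustration, not Spawning's or any dataset's actual API:

```python
# Hypothetical sketch: filter an (image URL, caption) dataset against an
# opt-out URL list. Data shapes here are assumptions, not a real dataset format.

def apply_opt_out(pairs, opt_out_urls):
    """Return only the pairs whose URL is not on the opt-out list."""
    blocked = set(opt_out_urls)  # set membership keeps the scan O(n)
    return [(url, caption) for url, caption in pairs if url not in blocked]

dataset = [
    ("https://example.com/a.png", "a cat"),
    ("https://example.com/b.png", "a dog"),
]
kept = apply_opt_out(dataset, ["https://example.com/b.png"])
print(kept)  # [('https://example.com/a.png', 'a cat')]
```

The same shape works in reverse for opt-in: start from an empty list and append only flagged pairs.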


This article highlights several of the problems that we are working on. Here's another article, about our organization:

https://www.inputmag.com/culture/mat-dryhurst-holly-herndon-...


You might be surprised. Since we announced this today, we've had almost as many opt-in requests as opt-outs.

We don't see this as binary in the long term. Maybe artists want to release art from a prior period, for example, but withhold their current series until they move to the next.


I think we'll see people who want to post images of themselves in there as well. Being a 'playable character' that others could prompt into their artwork could be pretty fun.


Why is opt out a thing?

Who told you you could use the data in the first place?

It's copyrighted. Reproducing any of it is a crime.


Reading through your comments, it appears you're under the impression that we (Spawning) trained models using this data. That is not the case. We're building tools to help people remove themselves from, or add themselves to, the datasets used (by others) to train these models. We're hoping to make that as close to zero effort as possible for all parties. We agree that artists should control how their works are used, and we're working hard to make that the norm.


> It's copyrighted. Reproducing any of it is a crime.

In which case it should be fairly simple to challenge it in court, no?


We don't store any images used for searching.

We are building an opt-in list, because a lot of people do want to be able to prompt AI with something like, "a cat in the style of me" or "me riding a dinosaur". That will be shared publicly, of course.


Thanks! That is definitely on the list, but might be a few months away. We're focusing on using images to find other images so it will be easy for artists to flag all of their stuff quickly. But, once we have that in a good place, we will definitely be adding more to the text search side.
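One common building block for "using images to find other images" is a perceptual hash. Below is a stdlib-only sketch of the classic 8x8 average-hash scheme; it assumes the images have already been decoded and resized to an 8x8 grayscale grid (a real pipeline would use an image library for that step), and it isn't a claim about how Spawning's search works:

```python
# Average-hash sketch for near-duplicate image search.
# Assumption: images arrive as 8x8 grayscale grids (lists of lists of
# 0-255 ints); real code would decode/resize actual files first.

def average_hash(grid):
    """64-bit hash: one bit per pixel, set when brighter than the mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Count of differing bits; small distances suggest similar images."""
    return bin(a ^ b).count("1")

img = [[10 * (r + c) for c in range(8)] for r in range(8)]
brighter = [[min(255, p + 3) for p in row] for row in img]
print(hamming(average_hash(img), average_hash(brighter)))  # prints 0
```

Because the hash compares each pixel to the image's own mean, a uniform brightness shift leaves the hash unchanged, which is what makes it useful for matching re-uploads and light edits of the same work.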


If that's a rights issue, we'll definitely add a link to the source. For now, you can right click -> open in new tab to see where it came from, but we'll look into this asap.

The goal here is to give people the opportunity to remove images they don't want in this dataset or add images they do want in there.


That's not how this works. Copyright owners have the right to control when, where, how and by whom their content may be used.

Not you: https://www.law.cornell.edu/uscode/text/17/106


Isn’t this fair use? Copyright holders don’t get to opt out of fair use, correct?


Fair use doesn't apply here.


It’s clearly transformative so I cannot see a good argument for why it does not apply here. Do you have one?

