Just out of curiosity - if the AI software relies on utilizing images scraped from the web, could one conceivably hit the developer with copyright infringement charges?
It's illegal. Prior authorization _should_ be required to use someone's copyrighted work. In this case, the AI firm that compiled the profile of this poor soul probably never reached out to ask for authorization to use his image in a vast database, which is then productized and sold to law enforcement agencies.
Now, things get complicated when you involve firms which, in the process of signing up a user, might force an agreement to sell/market the data in a non-exclusive manner to "trusted partners". Sooooo...is it illegal? Depends on the lawyer and how long you can fund a lawsuit for, I suppose.
We do not actually have precedent establishing this as illegal yet. You are not using the image in the creation of a work (in which case you would need permission), but using it to train an AI model, and there is currently no legal indication that this is disallowed, even if the image is copyrighted.
Could you explain to me why you don't consider Clearview's product to be a work in this regard? It seems plausible to me that someone could infringe copyright by incorporating copyrighted works into a machine learning model.
I don't have much of an opinion on it myself; I'm just trying to state that, so far, this appears to be the status quo. Here is some recent additional information on it compiled by Gwern: https://www.gwern.net/Faces#copyright (scroll up just above this link):
>Models in general are generally considered “transformative works” and the copyright owners of whatever data the model was trained on have no copyright on the model. (The fact that the datasets or inputs are copyrighted is irrelevant, as training on them is universally considered fair use and transformative, similar to artists or search engines; see the further reading.) The model is copyrighted to whomever created it.
The stance from Clearview AI (henceforth known as "privacy rapists") is that anything posted publicly is 'fair game' for their use. I don't think that stance has been fully challenged yet.
What I'd be interested to know is whether they have pictures of minors in their database, which could theoretically require some sort of release for them to collect/use (a release I'm quite sure they wouldn't have).
* "Twitter sent a cease-and-desist letter to the company ordering it to stop mining the social media platform’s data and delete anything it had already collected."
* "YouTube, and Venmo sent their own cease-and-desist letters"
* "Here’s a LinkedIn spokesperson on Clearview AI: “We are sending a cease & desist letter to Clearview AI. The scraping of member information is not allowed under our terms of service and we take action to protect our members.”"
* "Facebook notably has not sent a formal cease-and-desist letter but claims to have sent other letters to Clearview to request more detail on its practices and then eventually “demanded” that it stop scraping user data. Peter Thiel, a venture capitalist and notable surveillance enthusiast who sits on Facebook’s board of directors, invested $200,000 in Clearview’s first round of funding."
Or would this be fair use?