You know why we put up with copyrighted info in the human brain, right? Because those are human beings, it’s unavoidable. This? Avoidable.
Also, the model isn’t a human brain. Nobody has invented a human brain.
And the model might not infringe if its inputs are licensed, but that doesn’t seem to be the case for most of them, and there’s no real transparency showing that they don’t. If the inputs are bad, the intent of the user is meaningless. I can ask for a generic superhero and not mean to get Superman, but if I do I can’t blame that on myself, I had no role in it; heck, even the model doesn’t know what it’s doing, it’s just a function. If I Xerox Superman, my intent is clear.
> You know why we put up with copyrighted info in the human brain, right? Because those are human beings, it’s unavoidable.
I would hope we put up with it because "copyright" is only useful to us insofar as it advances good things that we want in our society. I certainly don't want to live in a world where, if we could forcibly remove copyrighted information from human brains as soon as the "license" expired, we would do so. That seems like a dystopian hell worse than even the worst possible predictions of AI's detractors.
> I can ask for a generic superhero and not mean to get Superman, but if I do I can’t blame that on myself, I had no role in it; heck, even the model doesn’t know what it’s doing, it’s just a function.
And if you turn around and discard that output and ask for something else, then no harm has been caused. Just like when artists trace other artists' work for practice: no harm is caused, and while it might be copyright infringement in a "literal meaning of the words" sense, it's also not something that we as a society consider meaningfully infringing. If, on the other hand, said budding artist started selling copies of those traces, or making video games using assets scanned from those traces, then we do consider it infringement worth worrying about.
> If I Xerox Superman, my intent is clear.
Is it? If you have a broken Xerox machine and you think you have it fixed, so you grab the nearest papers you can find and, as a result of testing the machine, Xerox Superman, what is your intent? I don't think it was to commit copyright infringement, even if, again in the "literal meaning of the words" sense, you absolutely did.
I’m saying that retaining information is a natural, accepted part of being human and operating in society. Don’t know why it needed to be turned into an Orwell sequel.
I had assumed, when you said that a human retaining information was "unavoidable" and a machine retaining it was "avoidable", that the implication was we wouldn't tolerate humans retaining information if it were also "avoidable". Otherwise I'm unclear what the intent of distinguishing between "avoidable" and "unavoidable" was, and what it has to do with whether or not an AI model that was trained with "unlicensed" content is or isn't copyright infringing on its own.
I’m in the camp that believes it’s neither necessary nor desirable to hold humans and software to the same standard of law. Society exists for our collective benefit, and we make concessions with each other to ensure it functions smoothly. I don’t think those concessions should necessarily extend to automated processes, even if they do in fact mimic humans, because of the myriad ways in which they differ from us.
So what benefit do we derive as a society from deciding that the capability for copyright infringement is in and of itself infringement? What do we gain by overturning the protections the law (or society) currently has for technologies like Xerox machines, VHS tapes, blank CDs and DVDs, media ripping tools, and site scraping tools? Open source digital media encoding, blank media, site scraping tools, and BitTorrent enable copyright infringement on a massive scale, to the tune of millions of dollars or more in losses every year if you believe the media companies. And yet, I would argue as a society we would be worse off without those tools. In fact, I'd even argue that as a society we'd be worse off without some degree of tolerated copyright infringement. How many pieces of interesting media have been "saved" from the dust bin of history and preserved for future generations by people committing copyright infringement for their own purposes? Things like early seasons of Dr Who and other TV shows that were taped over, where the only extant copies are from people's home collections taped off the TV. The "De-specialized" editions of Star Wars are probably the most high quality and true-to-the-original cuts of the original Star Wars trilogy that exist, and they are unequivocally pure copyright infringement.
Or consider the YouTube video "Fan.tasia"[1]. That is a collection of unlicensed video clips, combined with another individual's work which is itself a collection of unlicensed audio clips, mashed together into an amalgamation of sight and sound to produce something new and, I would argue, original, but very clearly also full of copyright infringement and facilitated by a bunch of technologies that enable infringement at scale. It is (IMO) far more obviously copyright infringement than anything an AI model is. Yet I would argue a world in which that media and the technologies that enable it were made illegal, or heavily restricted to only the people who could afford to license all of the things that went into it from the people who created all the original works, would be a worse world for us all. The ability to easily commit copyright infringement at scale enabled the production of new and interesting art that would not have existed otherwise, and almost certainly built skills (like editing and mixing) for the people involved. That, to me, is more valuable to society than ensuring that all the artists and studios whose work went into that media got whatever fractions of a penny they lost from having their works infringed.
The capability of the model to infringe isn’t the problem. Ingesting unlicensed inputs to create the model is the initial infringement, before the model has even output anything, and I’m saying that copyright shouldn’t be assigned to it or its outputs. If you train on licensed art and output Darth Vader, that’s cool so long as you know better than to try copyrighting it. If you train on licensed art and produce something original, and the law says it’s cool to copyright that, or there’s just no one to challenge you, also cool.
If you want to ingest unlicensed input and produce copyright-infringing stuff for no profit, just for the love of the source material, well, that’s complicated. I’m not saying no good ever came of it, and the tolerance for infringement comes from it happening on a relatively small scale. If I take an artist’s work with a very unique style, feed it into a machine, then mass-produce art for people based on that style, and the artist is someone who makes a living off commissions, I’m obviously doing harm to their business model. Fanfics and fanart of Nintendo characters are probably not hurting Nintendo. It’s not black or white. It’s about striking a balance, which is hard to do. I can’t just give it a pass because large corporations will weather it fine.
That Fantasia video was good. You ever see Pogo’s Disney remixes? Incredible musical creativity, but also infringing. I don’t doubt the time and effort needed to produce these works; they couldn’t just write a prompt and hit a button. I respect that. At the same time, this stuff is special partly because there aren’t a lot of things like it. If you made an AI to spit out stuff like this, it would be just another video on the internet. Stepping outside copyright, I would prefer not to see a flood of low-effort work drown out everything that feels unique, whimsical, and personal, but I can understand those who would prefer the opposite. Disney hasn’t taken it down in the last 17 years, and god I’m old.
https://youtu.be/pAwR6w2TgxY?si=K8vN2epX4CyDsC96
Training on unlicensed inputs is the ultimate issue, and we can just agree to disagree on how that should be handled. I think
I’m not saying it’s better because it’s naturally occurring; the objective reality is that we live in a world of IP laws where humans have no choice but to retain copyrighted information to function in society. I don’t care that text or images have been compressed into an AI model as long as it’s done legally, but the fact that they have been has very real consequences for society, since, unlike a human, the model doesn’t need to eat, sleep, or pay taxes, nor will it ever die, which is constantly ignored in this conversation about what’s best for society.
These tools are optional, whether people like to hear it or not. I’m not even against them ideologically; I just don’t think they’re being integrated into society in anything resembling a well-thought-out way.
Firstly, it’s not an appeal-to-nature fallacy to accurately describe how a product of nature works. Secondly, it’s the peak of lazy online discussion to name a fallacy and leave as though it means something. Fallacies can be applied to tons of good arguments, and along with naming the fallacy, you need to explain why the point itself being made is fallacious.