Hacker News | ThunderSizzle's comments

Where does the fine money go? To the victims?

> Since AI companies claim fair use no copyright applies. There is no fixing this.

They can claim whatever they want. You can still try to stop it via lawsuits and make them claim it in court. Granted, I believe some jurisdictions have already sided with fair use in those particular cases.


Laws can be changed. This is right now a trillion dollar industry, perhaps later it could even become a billion dollar industry. Either way, it's very important.

Strict copyright enforcement is a competitive disadvantage. Western countries lobbied for copyright enforcement in the 20th century because it was beneficial. Now the tables have turned, so don't hold your breath for copyright enforcement against the wishes of the markets. We are all China now.


Yes, I think Japan added an AI-friendly copyright law. If there were problems in the US, they'd just move training there.

Moving training won't help them if their paying customers are in jurisdictions which do respect copyright as written and intended.

OP's idea is about having a new GPL-like license with a "may not be used for LLM training" clause.

That the LLM itself is not allowed to produce copyrighted work (e.g. just copies of works or too structurally similar) without using a license for that work is something that is probably currently law. They are working around this via content filters. They probably also have checks during/after training that it does not reproduce work that is too similar. There are lawsuits pending about this, if I remember correctly, e.g. with the New York Times.


The issue is that everyone is focusing on verbatim (or "too similar") reproduction.

LLMs themselves are compressed models of the training data. The trick is that the compression is highly lossy: it detects higher-order patterns instead of focusing on the first-order input tokens (or bytes). If you look at how, for example, any of the Lempel-Ziv algorithms work, they also contain patterns from the input and they also predict the next token (usually byte in their case), except they do it with 100% probability because they are lossless.

So copyright should absolutely apply to the models themselves and if trained on AGPL code, the models have to follow the AGPL license and I have the right to see their "source" by just being their user.

And if you decompress a file from a copyrighted archive, the file is obviously copyrighted. Even if you decompress only a part. What LLMs do is another trick - by being lossy, they decompress probabilistically based on all the training inputs - without seeing the internals, nobody can prove how much their particular work contributed to the particular output.

But it is all mechanical transformation of input data, just like synonym replacement, just more sophisticated, and the same rules regarding plagiarism and copyright infringement should apply.

---

Back to what you said - the LLM companies use fancy language like "artificial intelligence" to distract from this so they can then use more fancy language to claim copyright does not apply. And in that case, no license would help because any such license fundamentally depends on copyright law, which as they claim does not apply.

That's the issue with LLMs - if they get their way, there's no way to opt out. If there was, AGPL would already be sufficient.


I agree with your view. One just has to go into courts and somehow get the judges to agree as well.

An open question would be if there is some degree of "loss" where copyright no longer applies. There is probably case law about this in different jurisdictions w.r.t. image previews or something.


I don't think copyright should be binary or should work the way it does now. It's just the only tool we have now.

There should be a system which protects all work (intellectual and physical) and makes sure the people doing it get rewarded according to the amount of work and skill level. This is a radical idea and not fully compatible with capitalism as implemented today. I have a lot on my to-read list and I don't think I am the first to come up with this but I haven't found anyone else describing it, yet.

And maybe it's broken by some degenerate case and goes tits up like communism always did. But AFAICT, it's a third option somewhere in between, taking the good parts of each.

For now, I just wanna find ways to stop people already much richer than me from profiting from my work without any kind of compensation for me. I want inequality to stop worsening but OTOH, in the past, large social change usually happened when things got so bad people rejected the status quo and went to the streets, whether with empty hands or not. And that feels like where we're headed and I don't know whether I should be excited or worried.


Everyone is trying to be the shovel sales person for AI, not the gold diggers buying shovels.

I'm not sure if even the LLM companies themselves are selling shovels yet. I think everyone is racing to find what the shovel of LLMs is.


It was collectively decided some time ago that this particular shovel is called nVIDIA.


And memory. What I'm surprised by is that memory production isn't being scaled up since it's basically a universal part of any computing device.


That was decided when crypto mining became too expensive, I guess.


"We" don't have passkeys now. Many functional android devices are not being upgraded to the latest Android versions, and simply will never get true passkey support that isn't locked away inside of Google's vault.

Passwords are much better than the OAuth2 Kool-Aid, and passwords will still be better as long as older devices can't support passkeys due to arbitrary restrictions.


That should only be the case if the fine was actually prosecuted in court.

Plenty of people pay the fine and admit to guilt to avoid being further penalized with court fees, etc. In other words, many people just pay an injustice fine to avoid more trouble. This would punish those types of people even more.


>Plenty of people pay the fine and admit to guilt to avoid being further penalized with court fees, etc.

The system is in fact architected to maximize this in some states. Virginia and Ohio come to mind.


Who's gonna pay for it? The companies that laid off those people? They'll just continue on without worrying.


Civil, but unreasonable. An unpaid maintainer of a free library isn't a vendor, and shouldn't be treated in any such way. A vendor is paid.


This isn't the same as bigcorps offloading their compliance costs to open-source "vendors". No one's obligated to do anything. The disclosure window is meant to address a tradeoff between giving the dev a chance to fix it, and minimizing users' risk until patch issuance. But if the dev can't fix it, the risk tradeoff shifts and you do have a duty to make it public for users' sake. You can't take it for granted that you're the first one and only one to have found that vulnerability.


They aren't demanding anything of you. The alternative is immediate disclosure of bugs, not indefinite embargo of bugs.


I don't see how they were treated in that way, though?


Put plainly, any sort of expectations as if the other person is an employee or coworker makes no sense to me.

If Google wants bugs fixed in open source software, they should also submit a PR with the fix, or provide a bounty for the fix.

The way this is done is an unveiled threat (if it was my library, I'd tell them as much. Deadlines are for vendors or employees, not for free libraries).


What did you vote for?


I can't tell you how many people are confused that (1) Microsoft dropped "Core" from .NET 5+, and (2) .NET 4.8 and .NET 8 are not the same thing.

Microsoft jumped from .NET Core 3 to .NET (Core) 5 to avoid people conflating .NET Core 4 with .NET Framework 4.

Now tech-adjacent people in my world, including people from Microsoft, think .NET Core 8 and .NET Framework 4.8 refer to the same version.

Luckily that problem will go away as we do our now biannual ritual of upgrading .NET versions, frustratingly.


Easy. Do you want links to podcast interviews from .NET team members where they mention this still being an issue with .NET adoption outside traditional Microsoft shops?

For example, see Maddy Montaquila's interview with Nick Chapsas; if I remember correctly, it is one of them.


In many languages, the basic version can be just one line of code, if you know the right libraries to leverage. C# leveraging Linq, for example:

    string.Join(" ",
      sentence.Split(' ').Reverse())
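For anyone who wants to run it, here is a minimal sketch of the same idea as a complete program. The class and method names (WordReverser, ReverseWords) are made up for illustration, and dropping empty entries is my own assumption about how repeated spaces should be handled; the "basic version" above doesn't say.

```csharp
using System;
using System.Linq;

public static class WordReverser
{
    // Hypothetical helper name, not part of any library.
    public static string ReverseWords(string sentence)
    {
        // Assumption: repeated spaces should collapse. RemoveEmptyEntries drops
        // the empty strings that consecutive spaces would otherwise produce.
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Enumerable.Reverse returns a new reversed sequence and leaves the array as-is.
        return string.Join(" ", Enumerable.Reverse(words));
    }

    public static void Main()
    {
        Console.WriteLine(ReverseWords("the quick  brown fox")); // fox brown quick the
    }
}
```

Calling Enumerable.Reverse explicitly, rather than words.Reverse(), is just a way to make it obvious that the LINQ version is the one being used.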

