
This is misguided. What you want is to add a special clause to your license that disallows use for training LLMs. Whether or not the code is on GitHub, it’ll be used to train models if it’s publicly available and the license allows it.


That would make the code not Open Source.

What many people actually want is for LLMs to respect Open Source licenses, and propagate those licenses to the derived works they create.


> That would make the code not Open Source.

So? Maybe making so much code open source without any restrictions was a mistake in the first place. I know that I don’t want trillion-dollar megacorps benefiting from my free open source code in any way. That would include LLM training.


The contents of the license are irrelevant, because such training is not being done under that license at all (or else it would already be violating it, failing attribution at the least, and it seems likely to be fundamentally impossible to comply, just as a human could not comply if human learning were subject to such licenses). Instead, the training depends on copyright restrictions not applying to it (under the “fair use” doctrine).



