
Most of Meta's models have not been released as open source. Llama was a fluke, and it helps to commoditize your complement when you're not the market leader.

There is no good or open AI company of scale yet, and there may never be.

A few that contribute to the commons are DeepSeek and Black Forest Labs. But they don't have the same breadth and budget as the hyperscalers.



Llama is not open source. It is at best weights available. The license explicitly limits what kind of things you are allowed to use the outputs of the models for.


Which, given what it was trained on, is utterly ridiculous.


Yup, but that being said, Llama is GPLv3 whether Meta likes it or not. Same as ChatGPT and all the others. All of them can perfectly reproduce GPLv3-licensed works and data, making them derivative works, and the license is quite clear on that matter. In fact, up until recently you could get ChatGPT to info-dump all sorts of things with that argument, but now when you try you hit a network error, and afterwards something seems to break and it goes back to parroting a script about how it's under a proprietary license.


This is interesting but it has not been proven in court, right?


Related stuff has, the core part being that if your model reproduces parts or all of a licensed work, it needs to comply with the license / copyright. Otherwise, why aren't pirates just making 'models' that generate protected material, or music, and completely bypassing all laws?

I know because I wanted to, as a form of protest/performance art, train a model on a few Disney movies and publicly distribute it, but the legal advice was that this would put me directly into hot water, not just because of who I'm pissing off (which I knew and was comfortable with) but also because there was precedent (i.e. newspapers suing LLM providers).

It would be an open and shut case that would leave me in financial ruin.

The reason OpenAI hasn't been struck with this yet is, who has the time? And there isn't much to learn from all that either. Most open source tooling outcompetes OpenAI's offering as is, so the community wouldn't really win beyond punishing someone.


I don't see how this follows at all. GitHub isn't GPLv3 just because it stores and gives you back GPLv3 code.


Read the license, and look up what "derivative work" means. If you're still unclear after that, I'm happy to walk you through it.


Is that easier to enforce than having AI only trained in a legal way (=obeying licenses / copyright law)?


Yes. Having training obey copyright is a big coordination problem that requires copyright holders to band together to sue Meta (and prove it broke copyright, which has not yet been established for LLMs).

Whereas Meta suing you into radioactive rubble is straightforward.


That's not true; the Llama that's open source is pretty much exactly what's used internally.


> There is no good or open AI company of scale yet, and there may never be.

DeepSeek, Baidu.



