Because its a finetune of 3.5 optimized for the use case of computer use. Its ac...

therealmarv · 2024-10-22T15:38:05 1729611485

So 3.5.1 ?

dotancohen · 2024-10-22T16:10:07 1729613407

I think that was the last version number for KDE 3.

Stands out for me as I once replaced a 2.3 Turbo in a TurboCoupe with a 351 Windsor ))

r00fus · 2024-10-22T16:21:04 1729614064

For networks

usaar333 · 2024-10-22T15:41:58 1729611718

I don't think that's correct. This looks like a new model. Significant jump in math and gpqa scores.

diggan · 2024-10-22T15:45:45 1729611945

If the architecture is the same, and the training scripts/data is the same, but the training yielded slightly different weights (but still same model architecture), is it a new model or just a iteration on the same model?

What if it isn't even a re-training from scratch but a fine-tune of an existing model/weights release, is it a new version then? Would be more like a iteration, or even a fork I suppose.

cooper_ganglia · 2024-10-22T16:04:25 1729613065

Yes, it's a new model, but not a Claude 4.

It's the same, but a bit different; Claude 3.6 makes sense to me.

bloedsinnig · 2024-10-23T08:06:31 1729670791

I would assume that 3.5 means that the base training (which takes weeks/month) wasn't changed but only finetuning happened.

HarHarVeryFunny · 2024-10-22T17:39:27 1729618767

Could be just additional post-training (aka finetuning) for coding/etc.