OLMo uses open datasets such as CommonCrawl and StackOverflow for training, about 5 TB worth of text. I wonder how well it would perform if it were also trained on Anna's Archive/LibGen (>600 TB).
A possibly better question is how well it would perform if it were trained on carefully selected material; see the efforts of Mortimer Adler in the USA, or the efforts of any good publishing house in defining its editorial collections.
But I remain skeptical that, without "critical thinking as a condition for writing into 'conscious' memory", the barrier of "conformism" will ever be broken.
Not a lawyer, but I would assume downloading material from LibGen is, in the vast majority of cases, illegal because it's a breach of copyright or similar. That's gotten Meta into quite a spectacle of late [1].
CommonCrawl is composed of copyrighted content too. You gain copyright on your work automatically the moment you create it, including this very comment.
One could argue that using copyrighted content in LLMs, much like reposting, should fall under fair use. This is also Microsoft's claim in the GitHub Copilot lawsuits. It's up to the courts to decide, though. (IANAL)
It’s a catchy term, but loaded. Copyright protects only original expression, not ideas and information. So if a computer algorithm reads the former and outputs the latter, arguably copyright isn’t involved at all.
There are plenty of good counterarguments to this as well, when you consider the effects of automation and scale. I’m definitely interested in seeing how the jurisprudence develops as these cases go through the courts.
I have struggled with SVG generation with just about all models; the SVG demo for this model is more or less what I get from much larger models.
Am I doing something wrong? Everyone seems to say how well models work at producing SVGs, but I get shapes in all sorts of wrong places. SVG documents are quite low level (versus editing them in Inkscape or Illustrator), so they're tricky to modify beyond very simple shapes.
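To make the "low level" point concrete: even a trivial scene is nothing but absolute coordinates that all have to stay mutually consistent. A minimal sketch (the shapes and numbers below are made up, not output from any model), written as a Python string just so it can be saved and opened in a browser:

```python
# Illustrative only: a hand-written "bicycle" as raw SVG, embedded in a
# Python string. Every shape is placed by absolute coordinates, and all of
# them have to agree with each other for the drawing to make sense.
svg = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="130">
  <circle cx="50" cy="95" r="28" fill="none" stroke="black"/>
  <circle cx="150" cy="95" r="28" fill="none" stroke="black"/>
  <path d="M50 95 L95 60 L150 95 M95 60 L120 55" fill="none" stroke="black"/>
</svg>"""

with open("bike.svg", "w") as f:
    f.write(svg)
```

Shift one of those numbers and a wheel detaches from the frame, which is exactly the kind of spatial bookkeeping the models seem to get wrong.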
The models are mostly terrible at SVG output, at least if you ask for something that's hard (or impossible?) to draw like a pelican riding a bicycle. That's why I use it as a benchmark, I think it's amusing: https://simonwillison.net/tags/pelican-riding-a-bicycle/
Some of them can do good SVGs for things that make sense, like simple diagrams.
This works well for some SVGs that are simple and already in the training data, but it doesn't work for harder SVGs, or even simple ones, if they are out of the training data's distribution.
In Simon's example, the whole purpose is to make it draw something it has not seen before but could easily infer from geometry and spatial arrangement. I think it makes a fun problem.
I think it’s a big deal to see a fully open LLM now achieving this level of quality. While the partially open releases we’ve seen from the big labs are quite valuable, models like OLMo-2 are the only way that researchers can truly study this technology to answer questions about how the models’ capabilities are shaped by their training data and training process.
The closed and partly-closed models rely on a lot of secret sauce, so it’s also just really impressive to see their results being replicated in the open.
One of the paramount tasks is to understand the internals of the "black box", gain knowledge, and engineer better. Of course, having "fully open" projects should help with that.
I was wondering whether there was a specific reason MLX was involved with this model, but (thankfully, in the spirit of openness) it has nothing to do with the original model.
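For anyone curious, the MLX angle is just the generic mlx-lm workflow. A minimal sketch, assuming a converted OLMo-2 checkpoint is available; the repository name and prompt below are placeholders, not something stated in the post:

```python
# Minimal sketch of running a converted OLMo-2 checkpoint with mlx-lm.
# The repository name is an assumption (a hypothetical mlx-community
# conversion); substitute whichever converted checkpoint you actually use.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/OLMo-2-1124-7B-Instruct-4bit")

prompt = "Generate an SVG of a pelican riding a bicycle."
# For instruct-tuned checkpoints you would normally wrap the prompt with the
# tokenizer's chat template before generating.
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
```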