
The [schnell] model variant is Apache-licensed and is open sourced on Hugging Face: https://huggingface.co/black-forest-labs/FLUX.1-schnell

It is very fast and very good at rendering text, and appears to use a text encoder that lets the model handle both text and positioning much better: https://x.com/minimaxir/status/1819041076872908894

A fun consequence of better text rendering is that it means text watermarks from its training data appear more clearly: https://x.com/minimaxir/status/1819045012166127921



It's not really fair to conclude that the training data contains Vanity Fair images, since the prompt includes "by Vanity Fair".

I could write "with text that says Shutterstock" in the prompt, but that doesn't necessarily mean the dataset contains that.


The logo has the exact same copyrighted typography as the real Vanity Fair logo. I've also reproduced the same copyrighted typography with other brands, with composition identical to copyrighted images. Just asking for a "Vanity Fair cover story about Shrek" at a 3:2 ratio very consistently gives a composition identical to a Vanity Fair cover (subject in front of the logo typography, partially obscuring it).

The image linked has a traditional www watermark in the lower-left as well. Even something as innocuous as a "Super Mario 64" prompt shows a copyright watermark: https://x.com/minimaxir/status/1819093418246631855


What if the training data includes a public blog post that has a screenshot of a Vanity Fair piece?

It's like GRRM complaining that LLMs can reproduce chunks of text from his books: "they fed my novels into it." Oh yeah? It's definitely not all the parts of your book quoted in millions of places online, including several dedicated wiki-style sites? That wouldn't be it, right?


On my list of AI concerns, whether or not Vanity Fair has its copyright infringed does not appear.


First they came for fashion magazines, and I said nothing.


Just to be clear: you're comparing the collapse of the creative restrictions which the state has cleverly branded "intellectual property" to... the Holocaust?

Of all of the instances on HN of Godwin's law playing out that I've ever seen, this one is the new cake-taker.


No, I’m not making a comparison to the Holocaust. Thanks for asking.


Must we always jump to Nazis?

This is like the fifth time I've seen someone paraphrasing Niemöller in an AI context, and it's exhausting. It's also near impossible to take the paraphraser seriously.

More to the point, AI is a tool. I could just as well infringe on Vanity Fair's IP using MS Paint. Someone more artistic than me could make an oil-on-canvas copy of their logo too.

Or, to turn your own annoying "argument" against you:

First they came for AI models, and I did not speak out, because I wasn't using them. Then they came for Photoshop, and I did not speak out, because I had never learned to use it. Then they came for oil and canvas, and now there are no art forms left for me.


This isn’t paraphrasing, it’s referencing. The reference has become synonymous with saying “this is a slippery slope for X”.

As to your use of the argument in the other direction, I’d say it doesn’t work very well because no one with any power is coming for those things.


Nobody at all is "coming for" fashion magazines, but you sure seem to be "coming for" AI. Whether you have any power or not is beside the point.

Whether you are paraphrasing or referencing a famous confessional poem dealing with the Holocaust, the only reasonable interpretation is that you're drawing a comparison with the Holocaust. Even if you were unaware of the phrase's origins, that's how anyone who does know where it comes from will interpret it. See other comments drawing the same conclusion for reference.

Again: AI is a tool. It can produce illegal material, just like a pencil can, or a brush with oil and canvas. How are they different? They are not.


All journalism is just duplicating the works and performance of others without their permission for profit anyway.


True, but I guess some societies decided there was a greater good in that very specific context.


Are you suggesting that the model independently came up with Vanity Fair's logo, including font and kerning?

https://www.vanityfair.com/verso/static/vanity-fair/assets/l...


How does the licence work when there's a bunch of restrictions at the bottom of that page that seem to contradict the licence?


IANAL but I suspect that "Out-of-Scope Use" has no legal authority.


Thank you. Their website is super hard to navigate and I can't find a "DOWNLOAD" button.


Note that actually running the model without an A100 GPU or better will be trickier than usual given its size (12B parameters, 24GB on disk).
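A back-of-the-envelope check on where that 24GB figure comes from (rough sketch: assumes fp16 weights and decimal gigabytes, and ignores activations and the text encoders):

```python
# Rough footprint of the 12B transformer weights alone, at a few
# precisions (decimal GB; activations and encoders excluded).
params = 12e9  # 12B parameters

for dtype, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    gb = params * bytes_per_param / 1e9
    print(f"{dtype}: {gb:.0f} GB")
```

At 2 bytes per parameter that's 24 GB, matching the on-disk size, which is why a single 24GB gaming card is already borderline before activations are counted.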

There is a PR to that repo for a diffusers implementation, which may run on a cheap L4 GPU w/ enable_model_cpu_offload(): https://huggingface.co/black-forest-labs/FLUX.1-schnell/comm...
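If that PR lands, usage would presumably follow the standard diffusers pattern, something like the sketch below (not a tested recipe: the FluxPipeline class name, the 4-step schedule, and guidance_scale=0.0 for schnell are assumptions based on the repo's docs; enable_model_cpu_offload() trades speed for VRAM by streaming submodules to the GPU on demand):

```python
MODEL_ID = "black-forest-labs/FLUX.1-schnell"

def generate(prompt: str, steps: int = 4):
    # Heavy imports live inside the function so the sketch can be read
    # without torch/diffusers installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    # Keep weights in system RAM and move submodules to the GPU only as
    # needed -- slower, but fits on cards well under 24GB of VRAM.
    pipe.enable_model_cpu_offload()
    # schnell is distilled for few-step sampling without classifier-free
    # guidance, hence guidance_scale=0.0 (an assumption from the model card).
    return pipe(prompt, num_inference_steps=steps, guidance_scale=0.0).images[0]
```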


You don't need an A100, you can get a used 32GB V100 for $2K-$3K. It's probably the absolute best bang-for-buck inference GPU at the moment. Not for speed but just the fact that there are models you can actually fit on it that you can't fit on a gaming card, and as long as you can fit the model, it is still lightyears better than CPU inference.


Why this versus two 3090s (with NVLink for marginal gains) and 48GB for $2K?


3090 TIs should be able to handle it without much in the way of tricks for a "reasonable" (for the HN crowd) price.


Higher-RAM Apple Silicon should be able to run it too, if they don't use some ancient PyTorch version or something.


Why not on a CPU with 32 or 64 GB of RAM?


Much slower memory and limited parallelism: a GPU has ~8K or more CUDA cores vs. ~16 cores on a regular CPU, plus less memory swapping between operations. The GPU is much, much faster.
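On the memory side specifically, a rough lower bound on per-image time comes from just streaming the weights once per denoising step (illustrative, assumed bandwidth figures: ~50 GB/s for dual-channel desktop DDR5, ~900 GB/s for HBM-class GPU memory):

```python
weights_gb = 24  # fp16 weights
steps = 50       # typical step count for the dev variant

for name, bandwidth_gb_s in [("CPU DDR5 (~50 GB/s)", 50),
                             ("GPU HBM (~900 GB/s)", 900)]:
    seconds = weights_gb / bandwidth_gb_s * steps
    print(f"{name}: >= {seconds:.1f}s per image just moving weights")
```

Under these assumed numbers the CPU spends about 24 seconds per image purely on memory traffic before any arithmetic happens, vs. under 2 seconds on the GPU.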


Performance, mostly. It'll work but image generation is shitty to do slowly compared to text inference.


Got it running. But it is a special setup.

* NVIDIA Jetson AGX Orin Dev. Kit with 64 GB shared RAM.

* Default configuration for flux-dev. (FP16, 50 steps)

* 33GB GPU RAM usage.

* 4 minutes 20 seconds per image at around 50 Watt power usage.



