Web Stable Diffusion (github.com/mlc-ai)
254 points by crowwork on March 17, 2023 | 41 comments



Note that (only?) Chrome Canary supports WebGPU, so this won't yet work in most people's browsers.

They kindly provide instructions to run it (even on Apple M1).
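
For anyone who wants to check before clicking through, a minimal feature-detection sketch (not the project's own code, just the standard navigator.gpu entry point) looks like this:

    // Sketch: detect WebGPU support before trying to run the demo.
    async function hasWebGPU(): Promise<boolean> {
      // navigator.gpu only exists in browsers with WebGPU enabled
      // (e.g. Chrome Canary with the right flags at the time of this thread).
      if (!("gpu" in navigator)) {
        console.log("WebGPU is not available in this browser.");
        return false;
      }
      const adapter = await (navigator as any).gpu.requestAdapter();
      if (adapter === null) {
        // Roughly the "no adapter" failure mentioned elsewhere in the thread.
        console.log("WebGPU is present, but no GPU adapter was found.");
        return false;
      }
      return true;
    }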


Hm, it seems it's been in Firefox, pref'd off by default, for a few years now: https://hacks.mozilla.org/2020/04/experimental-webgpu-in-fir...


Thanks for the pointer! As far as we know, WebGPU development in Firefox is lagging a bit behind, so we used Chrome and did not develop this project against Firefox.


I downloaded Firefox Nightly a couple of days ago and turned on the flags accordingly, but it didn't work, saying:

> Find an error initializing the WebGPU device TypeError: adapter.requestAdapterInfo is not a function

No idea how to fix it.


Well, based on the reply below, it sounds like they weren't testing in Firefox at all, so my guess is they were simply using APIs that only exist in Chrome right now. Whether those are actually necessary for the SD implementation, no idea.
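
If so, a guard along these lines (a sketch, not the project's actual fix) would let the init code degrade gracefully instead of throwing:

    // Only call requestAdapterInfo() when the adapter actually provides it;
    // Firefox Nightly was missing it at the time of this thread.
    const adapter = await (navigator as any).gpu?.requestAdapter();
    if (adapter && typeof adapter.requestAdapterInfo === "function") {
      const info = await adapter.requestAdapterInfo();
      console.log("GPU adapter:", info.vendor, info.architecture);
    } else {
      console.log("requestAdapterInfo() unavailable; continuing without adapter info.");
    }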


Submitted a question in the Firefox support forum, and there seem to be some bugs blocking it from running smoothly: https://support.mozilla.org/en-US/questions/1408328


Ah. Thanks for following up. I see it mentions there was a regression in v111 in that particular call, so I guess I could use an older version. I'm going to subscribe to the bug too.


WebGPU will ship this year, so it will be more widely available pretty soon.


It took 10 years for WebGL to be widely available, and it is still barely used beyond some niche use cases like 3D models in ecommerce, or Flash's revenge when coupled with WASM.


With WebGPU it will be faster - libraries like ThreeJS will use WebGPU when possible.


ThreeJS doesn't even have a good answer to a completely incompatible shading language, other than switching to node-based shaders.

Do you expect everyone to rewrite 10 years of shaders just for fun?


Yes.


There is an origin trial. They should enable that, then it would work in Chrome stable today. It's only supported on Windows and Mac right now though, I think.


WebGPU comes out April 26th - then 65% of the world's web browsers will have access to it.


Super interesting! Do you have a source on this? A quick search didn't turn up anything for me.


Anyone interested in this might also be interested in WONNX: https://github.com/webonnx/wonnx


Ohh wow that actually worked, that's awesome: https://i.imgur.com/4tYEphX.png

Tested on an Intel macOS 12.5 machine with an AMD 8GB RX 580 GPU, about 28 secs for 20 steps, surprisingly fast too. I did have to go to chrome://flags and enable "Unsafe WebGPU" even on Chrome Canary (113.0.5656.0) before it would work; otherwise I just got "no adapter" errors.


Yep, it surprisingly works on my AMD GPU too, even if it's designed only for M1/M2.


This is a tangent, but I've been wondering… and perhaps HNers know?

I use some basic libre CAD programs to plan my dream house project, and their renderings are pretty non-photorealistic.

Are there any upscalers that can take an inside or outside house render and make it look like something from Pinterest? Meaning the input is an image, not text?


Yes! Something like ControlNet is great for this. I use this[1] API on Replicate specifically because it offers the various methods (depth maps, edge detection, etc.)

Replicate sometimes gives you free use (I forget if this model does), but if you pay, an image output will cost you about one cent.

Give it your image, write a prompt (something like "a modern living room"), choose your ControlNet model from the drop-down, and submit.

If you choose depth map, for example, it will generate its best guess depth map for your image and use that to steer the Stable Diffusion output. It's fascinating, and a lot of fun to play with.

[1] https://replicate.com/jagilley/controlnet
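
For anyone who'd rather script it than use the web form, this is roughly what a call to Replicate's HTTP prediction API looks like. Treat it as a sketch: the version placeholder and the input field names (image, prompt, model_type) are my assumptions, so check the model page for the real schema.

    // Sketch of restyling a CAD render with ControlNet via Replicate's HTTP API.
    // Field names and the version id are placeholders/assumptions.
    const REPLICATE_API_TOKEN = "<your Replicate API token>"; // placeholder

    async function restyleRender(imageUrl: string, prompt: string) {
      const response = await fetch("https://api.replicate.com/v1/predictions", {
        method: "POST",
        headers: {
          "Authorization": `Token ${REPLICATE_API_TOKEN}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          version: "<model version id from the Replicate page>", // placeholder
          input: {
            image: imageUrl,     // the non-photorealistic render
            prompt: prompt,      // e.g. "a modern living room"
            model_type: "depth", // assumed name for the ControlNet method selector
          },
        }),
      });
      // Replicate returns a prediction object that you poll until it finishes.
      return response.json();
    }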

Edit: I'd love to see what you produce, if you do use it.


Do you think it's possible to use similar tools for visualizing a house from a floor plan only? Or maybe ChatGPT 4.0 is more up to the task these days


ControlNet maybe? https://github.com/lllyasviel/ControlNet

At huggingface: https://huggingface.co/spaces/hysts/ControlNet

Play around with the different models; you might get better results with some than with others.


The existing tooling around Stable Diffusion very often features either a style-transfer tool or lets you supply an input image together with the text prompt.

For example: https://nmkd.itch.io/t2i-gui


You can also try importing your CAD model into Blender, assigning some materials to the different objects and getting renders there, though tuning the render may take more time than tuning a prompt.


Is it possible to integrate this with onnxruntime-web (https://onnxruntime.ai/docs/tutorials/web/)?


Yes, of course. Optimizing and building the model into a format accepted by the ONNX web runtime would get this in. On the other hand, we also need to enhance our own runtime (for example, better memory pool management) in the future.
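
For reference, a minimal onnxruntime-web sketch (not from this project; the model path and input name are placeholders) would look something like this:

    // Sketch: running one ONNX-exported model step with onnxruntime-web.
    // "webgpu" availability depends on the onnxruntime-web version and browser.
    import * as ort from "onnxruntime-web";

    async function runUnetStep(latents: Float32Array, dims: number[]) {
      const session = await ort.InferenceSession.create("/models/unet.onnx", {
        executionProviders: ["wasm"], // swap in "webgpu" where supported
      });
      const feeds: Record<string, ort.Tensor> = {
        // The input name "sample" is illustrative; it must match the exported graph.
        sample: new ort.Tensor("float32", latents, dims),
      };
      return session.run(feeds);
    }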


Confirmed working on Windows with AMD RX 590 in Chrome Canary. About 23 seconds on average using DPM (20 steps), 55 second average using PNDM (50 steps).

I had issues compiling it on my own computer, but the demo version at https://mlc.ai/web-stable-diffusion/#text-to-image-generatio... works fine.


Same on Windows 11, 5800x3d with Vega64, PortableApps Chrome Canary.

17s DPM (20 steps), 41s PNDM (50 steps)


Can someone ELI5 how machine learning compilation works? Is this site basically A1111's SD web UI with fewer bells and whistles but way less intensive?


Thanks for your interest! Most existing Stable Diffusion demos rely on a backend server to run the image generation, which means you need to host your own GPU server to support these workloads. It is hard to have the demo run purely in the web browser, because Stable Diffusion usually has heavy computation and memory requirements.

Web Stable Diffusion puts the Stable Diffusion model directly in your browser, and it runs on the client GPU on the user's laptop. This means there is no queueing for a server's response. It also means more opportunities for client-server co-optimization, since the "client" and "server" are essentially the same laptop. Web Stable Diffusion is also friendly to personalization and privacy: since everything runs on the client side and no interaction with a server is needed, you can imagine having your own custom Stable Diffusion deployed and demonstrated on the web without sharing the model with anyone else, and you can also run it with personalized model input (e.g., the text prompt in this case) without letting others know.

Thanks again for your interest! We are happy to hear your feedback on your experience and the functionality you would like us to add in the future.


Thank you, great explanation.

What happens when someone wants to update/upgrade the model to a newer version? Can they just get a “diff” and “patch” their model, or do they have to download a whole new one?


Upgrading the model is pretty easy. We just need to build the new model locally, the same way we built the current one. This usually takes less than two minutes. If people want to deploy the new version to the web browser and share it for others to use, they just need to upload the model weights to some server (for example, we are currently using a public Hugging Face repo to store the weights) and provide a link pointing to the weights. This can also be done with little effort.
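
To illustrate the "just point at a different weights link" part (a generic sketch, not the project's actual loader; the URL and cache name are placeholders):

    // Fetch a weight shard from the configured link and cache it in the browser,
    // so switching model versions only means changing WEIGHTS_BASE.
    const WEIGHTS_BASE = "https://huggingface.co/<user>/<repo>/resolve/main/"; // placeholder

    async function fetchWeightShard(name: string): Promise<ArrayBuffer> {
      const cache = await caches.open("sd-weights-v2"); // bump per model version
      const url = WEIGHTS_BASE + name;
      const cached = await cache.match(url);
      if (cached) {
        return cached.arrayBuffer(); // already downloaded on a previous visit
      }
      const response = await fetch(url);
      await cache.put(url, response.clone());
      return response.arrayBuffer();
    }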


Amazing, thank you!


Does the functionality differ any from Easy Diffusion?

https://github.com/cmdr2/stable-diffusion-ui

It installs in and runs from a single folder, so it's nice and tidy.


Usually Stable Diffusion (and most machine learning models) runs on a server, with the web front end (e.g. A1111's SD web UI) just providing a user interface. Even if you download it, you need to run the server on your computer, and when you make a request you are using the CPU/GPU of that server.

The linked version runs the Stable Diffusion model in the web browser, so it uses your own CPU (and in this case GPU) via the APIs provided by the browser. This specific implementation uses an API called WebGPU, which isn't yet widely supported.


Where's the 4GB model loaded from and to where?


Interesting choice!

Before reading what they used, I assumed they would run tch-rs (libtorch bindings for Rust) on wgpu and ship it via wasm.


Ouch, libtorch doesn't have a WebGPU target though.


How does this compare to Automatic1111?


(Not the author.) This takes the model and runs it entirely in the browser via WebAssembly and WebGPU, meaning the browser is executing compiled code directly, inside its own process. This is different from the 'web ui' implementations like Auto1111, which are just a website front-end for a Python script that runs a server and runs the model in a background process on your computer. It's a very different type of implementation.


It works (very, very slowly) on my Intel Macbook. Very impressive indeed. WebGPU has a ton of potential.



