The model can generate images in 4-8 steps. 8 steps take 8 seconds on an M1 Max and just 3 seconds on an RTX 3090 Ti. It requires the latest Chrome Canary. The model paper is here: https://latent-consistency-models.github.io
Hold on, to run your demo does one have to click the "Load Model" button before doing anything? 'cos what I see is a form that is greyed out with the error message still at the top:
> You need latest Chrome with "Experimental WebAssembly" and "Experimental WebAssembly JavaScript Promise Integration (JSPI)" flags enabled!
Now I'm wondering whether the top message goes away once the flags are enabled?
> Hold on, to run your demo does one have to click the "Load Model" button before doing anything?
Yes. I thought it wouldn't be good if the page downloaded 3.5 GB as soon as you opened it.
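For anyone curious, deferring a big download until the button is clicked can be sketched roughly like this. It is only an illustration, not the demo's actual code: `lazyOnce` is a hypothetical helper, and the URL and element id in the usage comment are placeholders.

```typescript
// Hypothetical helper: wraps an expensive loader so that it runs at most once,
// and only when first requested (for example from a click handler).
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = load().catch((err) => {
        pending = null; // forget the failed attempt so a later click can retry
        throw err;
      });
    }
    return pending;
  };
}

// Usage sketch (placeholder URL and element id, not the demo's real ones):
// const loadModel = lazyOnce(() =>
//   fetch("/models/model.bin").then((r) => r.arrayBuffer()));
// document.getElementById("load-model")
//   ?.addEventListener("click", () => { void loadModel(); });
```

Repeated clicks share the same in-flight promise, so the weights are not fetched twice.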
> Now I'm wondering whether the top message goes away once the flags are enabled?
No, I haven't added any checks for that (and I'm not sure how the first one can be properly checked), so it's just an info bar. Which, in the end, is misleading.
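A rough sketch of what such a check could look like, under some loud assumptions: it tests for `navigator.gpu` (WebGPU) and for `WebAssembly.Suspending` plus `WebAssembly.promising`, which is the shape of the JSPI proposal as I understand it and may not match what a given Chrome build exposes behind the flag. It does not attempt to detect the generic "Experimental WebAssembly" flag. `detectFeatures` and `FeatureReport` are hypothetical names.

```typescript
// Hypothetical report type for the two features this sketch can probe.
interface FeatureReport {
  webgpu: boolean;
  jspi: boolean;
}

// Takes the global scope as a parameter so it can be exercised with fake objects.
function detectFeatures(scope: any = globalThis): FeatureReport {
  const wasm = scope.WebAssembly;
  return {
    // WebGPU exposes its entry point as navigator.gpu.
    webgpu: Boolean(scope.navigator && "gpu" in scope.navigator),
    // Assumption: JSPI is visible as WebAssembly.Suspending / WebAssembly.promising.
    jspi:
      Boolean(wasm) &&
      typeof wasm.Suspending === "function" &&
      typeof wasm.promising === "function",
  };
}

// Usage sketch: hide the info bar only when both probes succeed.
// const { webgpu, jspi } = detectFeatures();
// if (webgpu && jspi) { /* remove the warning banner */ }
```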
UNET takes about 1:10 on WebGPU and around a minute on the CPU in a single thread. VAE takes 2 minutes on the CPU and about 10 seconds on the GPU. That's probably because most GPU ops for VAE are already implemented, but for UNET they are not. So in the latter case the browser is just tossing data from GPU to CPU and back on each step.