carbocation on Aug 22, 2023, on: I Made Stable Diffusion XL Smarter by Finetuning I...

That's a very low learning rate, two to three orders of magnitude lower than what I've seen for that number of steps. I'll have to give it a try.

AuryGlenz on Aug 22, 2023

I should have been clearer: I'm using the Prodigy settings on that page, not the Adafactor ones. You set the learning rate to 1 and the scheduler to cosine, but the real learning rate is worked out by the optimizer.
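
To make the "learning rate of 1" point concrete, here is a rough sketch in plain Python, not the actual Prodigy code. It assumes the configured learning rate only acts as a multiplier on the step size the optimizer estimates for itself, and that the cosine scheduler scales that product down over training. `D_ESTIMATE`, `TOTAL_STEPS`, and both function names are made up for illustration. A real run would re-estimate the step size as training proceeds rather than holding it fixed.

```python
import math


def cosine_factor(step, total_steps):
    """Cosine annealing multiplier: 1.0 at step 0, falling to 0.0 at total_steps."""
    return 0.5 * (1.0 + math.cos(math.pi * step / total_steps))


def effective_lr(d_estimate, base_lr, step, total_steps):
    """Step size actually applied: optimizer estimate * configured lr * schedule."""
    return d_estimate * base_lr * cosine_factor(step, total_steps)


BASE_LR = 1.0        # the "learning rate" you set in the config
TOTAL_STEPS = 1000   # hypothetical training length
D_ESTIMATE = 1e-5    # hypothetical step-size estimate, held fixed here for simplicity

for step in (0, 250, 500, 750, 1000):
    lr = effective_lr(D_ESTIMATE, BASE_LR, step, TOTAL_STEPS)
    print(f"step {step:4d}: effective lr = {lr:.3e}")
```

With the base value left at 1, the number in the config says nothing about how small the steps really are. Under this assumption, changing it just rescales whatever the optimizer picks.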
