emikulic's comments

It's over for decels.


Three epochs means it sees each token three times. The dataset is ~1T like you said.
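As a quick sanity check on the arithmetic above, here is a minimal sketch. The function name `tokens_seen` is made up for illustration, and the ~1T dataset size is the approximate figure from the comment, not an exact count.

```python
def tokens_seen(dataset_tokens: int, epochs: int) -> int:
    """Total tokens processed in training: each epoch is one full pass over the dataset."""
    return dataset_tokens * epochs


# Approximate figures from the comment: ~1T-token dataset, 3 epochs.
DATASET_TOKENS = 1_000_000_000_000
EPOCHS = 3

total = tokens_seen(DATASET_TOKENS, EPOCHS)
print(f"{total:,} tokens processed, {DATASET_TOKENS:,} of them unique")
```

So "3T tokens of training" and "a ~1T-token dataset" describe the same run.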


I want flying cars and moving sidewalks like I was promised by The Jetsons. :(


> "I want flying cars and moving sidewalks like I was promised by The Jetsons. :("

And worse yet, as promised in Popular Science / Mechanics; "A flying car in every garage by the year 2000!"


Try https://huggingface.co/ehartford/WizardLM-7B-Uncensored and related models. They're not even trained on smut; the neutered responses were just removed before the RLHF stage (IIUC).


Nice! Will do.


I used to be bad at math but then I did a 360° on that.


Emad said SD cost $600,000 to train. I wonder if Midjourney also had to pay that to train from scratch.


What benefits does the Huggingface diffusers(?) implementation have over A1111?


- Compatibility with stuff from research papers and ML compilers since it is the "de facto" SD implementation.

- The codebase is cleaner, more hackable, and (compared to base SAI code) more performant.

- HF continues to put lots of work into optimization and cleanup. For instance, they ensure there are no graph breaks for torch.compile, and work with other hardware vendors on their own SD implementations.


I like to run A1111 in --api mode and write my own script to drive it over HTTP.
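For anyone curious what that looks like, here is a rough sketch using only the Python standard library. It assumes the web UI was started with `--api` and is listening on the default local address `127.0.0.1:7860`, and that the `/sdapi/v1/txt2img` endpoint returns JSON with an `images` list of base64-encoded PNGs. The helper names (`build_request`, `decode_images`, `txt2img`) are made up for this example, and the payload fields shown are only a small subset.

```python
import base64
import json
import urllib.request

# Assumed default address of a local A1111 instance started with --api.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_request(prompt, steps=20, width=512, height=512, url=API_URL):
    """Build the POST request carrying the generation parameters as JSON."""
    payload = {"prompt": prompt, "steps": steps, "width": width, "height": height}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body):
    """Turn the JSON response into a list of raw PNG byte strings."""
    result = json.loads(response_body)
    # Drop a "data:image/png;base64," style prefix if one is present.
    return [base64.b64decode(img.split(",", 1)[-1]) for img in result.get("images", [])]


def txt2img(prompt, **kwargs):
    """Send one txt2img request and return the decoded images (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as resp:
        return decode_images(resp.read())
```

With the server running, `txt2img("a flying car in every garage", steps=30)` should return a list of PNG byte strings that can be written straight to disk.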


Wow, Reddit is going to lose a lot of pagerank.


You're thinking of archive.is

