
From the video, the output seems fine.

But if it is a trimmed version, it is wrong to call it LLaMa.



Could call it Slim LLaMa


SLLaMa?


It's nonsensical: the celeb announces they're going to rehab and notes it (?) is an issue affecting all women; at least, earlier today (??), they also noted it wasn't drugs or alcohol this time, but a life (???)


Without instruction tuning, even a perfect language model produces output that is only as intelligible as random text from the training set. And the training set probably has a lot of spam and junk in it.


What are you comparing it to? Without instruction tuning and a two-character prompt ("He"), I am not sure why you would expect it to perform any better.
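
For context, a base (non-instruction-tuned) checkpoint just continues whatever text it is given, so a two-character prompt mostly samples from the pretraining distribution. A minimal sketch of that setup using the Hugging Face transformers API (the model path here is a placeholder, not the checkpoint from the video):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder path -- point this at whichever base LLaMa checkpoint you have locally.
    model_id = "path/to/base-llama-7b"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # With only "He" as context, the model free-associates from its training data.
    inputs = tok("He", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.8)
    print(tok.decode(out[0], skip_special_tokens=True))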


I was replying to a comment that said it “seems fine.”

It does not seem fine.

It is incomprehensible and doesn’t match the results I’ve seen from 7B through 65B.

It is true that RLHF could improve it, and perhaps then this severe an optimization will seem fine.


I've heard a number of people say (from earlier) that the quantization and default sampling parameters are way whacked. Honestly, even running a model of that size is the big achievement here, and getting the accuracy to actually reach the benchmark is the beeg next step nao, I believe. <3 :'))))
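
If the default sampling parameters really are the culprit, the usual first step is to rein them in. A hedged sketch with the transformers GenerationConfig (these values are guesses at sane defaults, not anything taken from the repo):

    from transformers import GenerationConfig

    # Guessed-at sane defaults; the right values depend on the model and quantization.
    gen_config = GenerationConfig(
        do_sample=True,
        temperature=0.7,          # lower = less random
        top_k=40,                 # keep only the 40 most likely tokens
        top_p=0.9,                # nucleus sampling cutoff
        repetition_penalty=1.15,  # discourage loops
        max_new_tokens=128,
    )
    # Then pass it along: model.generate(**inputs, generation_config=gen_config)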


If you run a quantized 60G model and the output is worse than a raw 7G model's, you can throw your quantizer out.
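
One way to check that claim is to compare perplexity on the same held-out text for the quantized and full-precision checkpoints. A rough sketch (the paths are placeholders, and a real evaluation would use a sliding window over a proper corpus):

    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def perplexity(model_id: str, text: str) -> float:
        # Single-pass perplexity: exp of the average next-token cross-entropy.
        tok = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(model_id)
        enc = tok(text, return_tensors="pt")
        with torch.no_grad():
            loss = model(**enc, labels=enc["input_ids"]).loss
        return math.exp(loss.item())

    sample = "Some held-out text neither model was trained on."
    # Placeholder paths -- point these at the actual fp16 and quantized checkpoints.
    print("raw 7B:       ", perplexity("path/to/llama-7b", sample))
    print("quantized 65B:", perplexity("path/to/llama-65b-quantized", sample))

If the quantized large model can't beat the small full-precision one on a check like this, the quantization (or the sampling around it) is doing the damage.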



