Hacker News
Calculating the cost of a Google DeepMind paper (152334h.github.io)
303 points by 152334H on July 30, 2024 | past | 150 comments
Knowing Enough About MoE to Explain Dropped Tokens in GPT-4 (152334h.github.io)
3 points by 152334H on Aug 8, 2023 | past | 1 comment
Non-determinism in GPT-4 is caused by Sparse MoE (152334h.github.io)
397 points by 152334H on Aug 4, 2023 | past | 181 comments
Why can't TorToiSe be fine-tuned? (152334h.github.io)
1 point by 152334H on Feb 11, 2023 | past
