Wait, so it was never even confirmed or actually leaked by OpenAI that they're using a MoE model? That was just invented by some blog? I've seen it mentioned everywhere as though it's true.

I think it's likely they're using a technique that is similar to, or a descendant of, Tree of Thought (ToT). In Karpathy's talk, where he was not allowed to discuss GPT-4's architecture and so could only discuss public-domain information about other models, he pretty strongly indicated that ToT was the direction of research he thought people should pursue. In the past, Karpathy has communicated basically as much as he can to educate people about how these models are made and how to build one yourself; he has put up one of the best YouTube tutorials on making an LLM. I suspect that he personally does not agree with OpenAI's level of secrecy, but at minimum he shares a lot more information publicly than most OpenAI employees.



