trekhleb's comments | Hacker News

What I mean is training something like GPT-3 in a distributed manner using a large number of regular browsers or laptops with average WebGPU support/power and WebRTC for communication.

Does it even make sense to ask this? Is it reasonable or feasible?

I understand there are many nuances, such as the size and source of the training data, the size of the model (which would be too large for any browser to handle), network overhead, and the challenge of merging all the pieces together, among others. However, speculative calculations suggest that GPT-3 required around 3x10^22 FLOPs, which might (very speculatively) be equivalent to about 3,000 regular GPUs, each with an average performance of 6 TFLOPs, training it for ~30 days (which also sounds silly, I understand).
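
To make that back-of-the-envelope math explicit, here is a rough sketch in TypeScript. Every input is an assumption (the compute budget, the fleet size, the per-device throughput, and the utilization factor), not a measured number:

    // How long would a fleet of consumer GPUs need to reach a given
    // training-compute budget? All inputs are illustrative assumptions.
    const totalFlops = 3e22;        // assumed GPT-3-scale compute budget
    const numDevices = 3_000;       // assumed number of volunteer machines
    const flopsPerDevice = 6e12;    // assumed ~6 TFLOPs sustained per device
    const utilization = 0.65;       // assumed losses to WebRTC sync, stragglers, dropouts

    const effectiveFlopsPerSec = numDevices * flopsPerDevice * utilization;
    const seconds = totalFlops / effectiveFlopsPerSec;
    console.log(`~${(seconds / 86_400).toFixed(0)} days`); // ~30 days with these assumptions

With perfect utilization the same numbers give roughly 19 days, so the ~30-day figure implicitly assumes around two-thirds effective utilization.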

Of course, these are naive and highly speculative calculations that don’t account for whether it’s even possible to split the dataset, model, and training process into manageable pieces across such a setup.

But if this direction is not totally nonsensical, does it mean that even with a tremendous network overhead there is huge potential for scaling (there are a lot of laptops connected to the internet that could potentially and voluntarily be used for training)?


Thanks for the feedback! WebGPT is good. Looks like it is vanilla JS? I used TensorFlow.js to offload all the trouble of working with tensors, gradients, and WebGPU integration. Along with the possibility of training the model in the browser, it also helped keep the actual GPT code pretty concise (<300 lines). Hopefully that will make it easier to learn the model architecture itself for those who're interested.
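
For a sense of what gets offloaded, here is a minimal training-step sketch (a hypothetical toy example, not code from the repository): TensorFlow.js takes care of the backward pass, the optimizer update, and the WebGPU backend, so the model code only has to describe the forward pass.

    import * as tf from '@tensorflow/tfjs';
    import '@tensorflow/tfjs-backend-webgpu'; // registers the 'webgpu' backend

    async function toyTrainingStep() {
      await tf.setBackend('webgpu'); // returns false if WebGPU isn't available
      await tf.ready();

      // A toy linear layer; a real GPT block is built the same way,
      // just with more variables and ops in the forward pass.
      const w = tf.variable(tf.randomNormal([16, 4]));
      const optimizer = tf.train.adam(1e-3);

      const x = tf.randomNormal([8, 16]);                        // fake input batch
      const classIds = tf.tensor1d([0, 1, 2, 3, 0, 1, 2, 3], 'int32');
      const labels = tf.oneHot(classIds, 4).toFloat();           // fake one-hot targets

      // minimize() runs the forward pass, computes gradients for all
      // trainable variables, and applies the update; no manual backprop code.
      optimizer.minimize(() => {
        const logits = tf.matMul(x, w);
        return tf.losses.softmaxCrossEntropy(labels, logits).asScalar();
      });
    }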


It is a very engaging and entertaining visualization, I love it.

It inspired me to experiment with a genetic algorithm in "Self-parking car evolution":

https://trekhleb.dev/self-parking-car-evolution/


Thanks for the feedback! It is already on the roadmap here: https://feedback.okso.app/feedback/p/love-it-is-in-desperate...


Yup, that was me :-)

Can't wait! An OS-wide dark theme makes encountering a big white screen very painful...


No, it is not open source for now.


Person “R” breaks into the private property of person “U” and kills part of their family. Could you give me an example of a properly worded “why” that could justify person “R”’s actions?


That is the "what" happened, it does not give me any meaningful insight as to "why" it happened.


I think if we had to select the best advertising method/channel, it would be people who are in love with the product :)


Yeah, this one is fun :D Really nice catch!


Yeah, I agree, the GA is not the best option for self-driving tasks.

The reason I chose the GA is that I wanted to play around with this algorithm in the first place, and only after that did I try to come up with some artificial problem I could solve with it :)


I'm glad that we're on the same page on this. Your motivation is totally understandable and the end result (including relevant blog post) is both very nicely done and educational. Thank you for this and your other resources that you share online! :-)


Yeah, the linear model is too simple to generalise :)

