trekhleb's comments | Hacker News

What I mean is training something like GPT-3 in a distributed manner using a large number of regular browsers or laptops with average WebGPU support/power and WebRTC for communication.

Does it even make sense to ask this? Is it reasonable or feasible?

I understand there are many nuances, such as the size and source of the training data, the size of the model (which would be too large for any browser to handle), network overhead, and the challenge of merging all the pieces together, among others. However, speculative calculations suggest that GPT-3 required around 3x10^22 FLOPs, which might (very speculatively) be equivalent to about 3,000 regular GPUs, each with an average performance of 6 TFLOPs, training it for ~30 days (which also sounds silly, I understand).
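
To make that back-of-the-envelope math explicit, here is a rough sketch in TypeScript. Every input is an assumption (the compute budget, the fleet size, the per-device throughput, and the utilization factor), not a measured number:

    // How long would a fleet of consumer GPUs need to reach a given
    // training-compute budget? All inputs are illustrative assumptions.
    const totalFlops = 3e22;        // assumed GPT-3-scale compute budget
    const numDevices = 3_000;       // assumed number of volunteer machines
    const flopsPerDevice = 6e12;    // assumed ~6 TFLOPs sustained per device
    const utilization = 0.65;       // assumed losses to WebRTC sync, stragglers, dropouts

    const effectiveFlopsPerSec = numDevices * flopsPerDevice * utilization;
    const seconds = totalFlops / effectiveFlopsPerSec;
    console.log(`~${(seconds / 86_400).toFixed(0)} days`); // ~30 days with these assumptions

With perfect utilization the same numbers give roughly 19 days, so the ~30-day figure implicitly assumes around two-thirds effective utilization.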

Of course, these are naive and highly speculative calculations that don’t account for whether it’s even possible to split the dataset, model, and training process into manageable pieces across such a setup.

But if this direction is not totally nonsensical, does it mean that even with a tremendous network overhead there is huge potential for scaling (there are a lot of laptops connected to the internet that could potentially and voluntarily be used for training)?


Thanks for the feedback! WebGPT is good. Looks like it is vanilla JS? I used TensorFlow.js to offload all the trouble of working with tensors, gradients, and WebGPU integration. Along with the possibility of training the model in the browser, it also helped keep the actual GPT code pretty concise (<300 lines). Hopefully that will make it easier to learn the model architecture itself for those who're interested.
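
For a sense of what gets offloaded, here is a minimal training-step sketch (a hypothetical toy example, not code from the repository): TensorFlow.js takes care of the backward pass, the optimizer update, and the WebGPU backend, so the model code only has to describe the forward pass.

    import * as tf from '@tensorflow/tfjs';
    import '@tensorflow/tfjs-backend-webgpu'; // registers the 'webgpu' backend

    async function toyTrainingStep() {
      await tf.setBackend('webgpu'); // returns false if WebGPU isn't available
      await tf.ready();

      // A toy linear layer; a real GPT block is built the same way,
      // just with more variables and ops in the forward pass.
      const w = tf.variable(tf.randomNormal([16, 4]));
      const optimizer = tf.train.adam(1e-3);

      const x = tf.randomNormal([8, 16]);                        // fake input batch
      const classIds = tf.tensor1d([0, 1, 2, 3, 0, 1, 2, 3], 'int32');
      const labels = tf.oneHot(classIds, 4).toFloat();           // fake one-hot targets

      // minimize() runs the forward pass, computes gradients for all
      // trainable variables, and applies the update; no manual backprop code.
      optimizer.minimize(() => {
        const logits = tf.matMul(x, w);
        return tf.losses.softmaxCrossEntropy(labels, logits).asScalar();
      });
    }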


It is a very engaging and entertaining visualization, I love it.

It inspired me to experiment with a genetic algorithm in "Self-parking car evolution":

https://trekhleb.dev/self-parking-car-evolution/


Thanks for the feedback! It is already on the roadmap here: https://feedback.okso.app/feedback/p/love-it-is-in-desperate...


Yup, that was me :-)

Can't wait! An OS-wide dark theme makes encountering a big white screen very painful...


No, it is not open source for now.


Person “R” breaks into the private property of person “U” and kills part of their family. Could you give me an example of a properly worded “why” that could justify person “R”’s actions?


That is the "what" happened, it does not give me any meaningful insight as to "why" it happened.


I think if we had to select the best advertising method/channel, it would be people who are in love with the product :)


Yeah, this one is fun :D Really nice catch!


Yeah, I agree, the GA is not the best option for self-driving tasks.

The reason I chose the GA is that I wanted to play around with this algorithm in the first place, and only after that did I try to come up with some artificial problem I could solve with it :)


I'm glad that we're on the same page on this. Your motivation is totally understandable and the end result (including relevant blog post) is both very nicely done and educational. Thank you for this and your other resources that you share online! :-)


Yeah, the linear model is too simple to generalise :)

