Possible and has been done, but super-slow and inefficient resulting in long tra...

pk-protect-ai · 2025-07-02T18:39:06 1751481546

Do you mean this one?

https://blog.lambdaclass.com/introducing-demo-decoupled-mome...

reactordev · 2025-07-02T20:15:52 1751487352

This is what piqued my interest in the first place

reactordev · 2025-07-02T20:13:32 1751487212

Yes but could you break it up into chunks of sets of gradients to compute? I know that compute needs the full chunk to compute a set. Again, things I’m exploring but ultimately no different than just having the full dataset on disk and just scaling out compute nodes in ro mode.