(Un?)fortunately, H.264 is doing far more than MPEG I-frames. Each frame can hold look-back references to up to 16 other frames, and each frame is also divided into variable-size blocks, 4 to 16 pixels on a side. This arbitrary blocking of the white frames is likely what is consuming so much space.
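For a rough sense of how many blocks that could mean per frame, here is some back-of-the-envelope arithmetic. The 1920x1080 frame size is my assumption (the video's resolution isn't stated here), and `block_count` is just a helper name for this sketch. It only gives the range between "all 16x16" and "all 4x4"; it says nothing about what a real encoder would actually pick for a flat white frame.

```python
import math

def block_count(width: int, height: int, block: int) -> int:
    """Number of block x block tiles needed to cover a width x height frame."""
    return math.ceil(width / block) * math.ceil(height / block)

# Assumed frame size (not from the original comment): 1920x1080.
WIDTH, HEIGHT = 1920, 1080

largest_blocks = block_count(WIDTH, HEIGHT, 16)   # every block at 16x16
smallest_blocks = block_count(WIDTH, HEIGHT, 4)   # every block at 4x4

print(largest_blocks, smallest_blocks)  # 8160 129600
```

So under that assumption a single frame is somewhere between about 8 thousand and 130 thousand blocks, each of which needs at least a little signalling.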
If you encoded in MPEG, I'm sure you would dramatically reduce the file size, but not as magically as you would think. It will still store a new white I-frame about every 16 frames by default (the exact default varies by encoder), though many encoders will let you specify a different interval.
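To put a number on that, here is a quick sketch of how many I-frames a 24-hour video would carry at a 16-frame interval. The 30 fps frame rate and the alternative interval of 250 are assumptions I picked for illustration, and `keyframe_count` is a made-up helper, not anything from an encoder.

```python
def keyframe_count(duration_s: int, fps: int, interval: int) -> int:
    """I-frames in a stream that starts with one and repeats every `interval` frames."""
    total_frames = duration_s * fps
    # Ceiling division: frame 0 is an I-frame, then every interval-th frame after it.
    return (total_frames + interval - 1) // interval

DAY_SECONDS = 24 * 60 * 60

# Assumed 30 fps; interval of 16 as described above.
print(keyframe_count(DAY_SECONDS, 30, 16))    # 162000

# Same video with a much longer (assumed) interval of 250 frames.
print(keyframe_count(DAY_SECONDS, 30, 250))   # 10368
```

Every one of those is a full white picture stored again from scratch, which is why the interval setting matters so much for a video that never changes.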
In theory, I guess YouTube could spend n times as much processing to encode each video in n formats, find the smallest for that particular video, and serve that encoding. They could even do so only after a video reaches x views, to cut out yy% of wasted computation. But then they would have to support n codecs on m devices instead of just 1.
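The selection logic itself would be trivial; the cost is all in the encoding and device support. A minimal sketch, where `pick_encoding`, the codec names, the view threshold, and the byte sizes are all hypothetical placeholders and not anything YouTube actually does:

```python
def pick_encoding(sizes_by_codec: dict, views: int, min_views: int, default: str) -> str:
    """Serve the default codec until a video is popular, then the smallest encoding."""
    if views < min_views:
        # Not worth the n-times encoding cost yet.
        return default
    # Choose the codec whose encoded file is smallest for this particular video.
    return min(sizes_by_codec, key=sizes_by_codec.get)

# Placeholder sizes in bytes for one video, purely illustrative (not measurements).
sizes = {"codec_a": 900, "codec_b": 400, "codec_c": 650}

print(pick_encoding(sizes, views=10, min_views=1000, default="codec_a"))     # codec_a
print(pick_encoding(sizes, views=5000, min_views=1000, default="codec_a"))   # codec_b
```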
Edit: Tried using yt-dl on a 24-hour white-screen video; it is 130 MB at the 'highest quality'. 10 hours of white noise is 1.55 GB.
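Turning those two downloads into average bitrates makes the gap easier to see. This is just arithmetic on the sizes above, assuming decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes); `avg_bitrate_kbps` is a helper name I made up for the sketch.

```python
def avg_bitrate_kbps(size_bytes: float, hours: float) -> float:
    """Average bitrate in kbit/s for a file of size_bytes lasting `hours`."""
    return size_bytes * 8 / (hours * 3600) / 1000

white_screen = avg_bitrate_kbps(130e6, 24)   # 130 MB over 24 hours
white_noise = avg_bitrate_kbps(1.55e9, 10)   # 1.55 GB over 10 hours

print(f"{white_screen:.1f} kbit/s, {white_noise:.1f} kbit/s")
# 12.0 kbit/s, 344.4 kbit/s
```

So the white screen averages roughly 12 kbit/s, while the white noise needs nearly 30 times that.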