It seems exceedingly unlikely to me that frames from random YouTube videos would have been used to train image generation models. First off, they're difficult to extract, and second, the quality of individual video frames is very low, especially if we're talking about 15-year-old phone videos at, what, 480p at the very best!