Training requires so much data that it would be surprising if humans looked at even a small subset of it. Cleaning it up has to rely on statistical tools instead. Naturally, that’s where attacks will be concentrated, and it’s why synthetic data will overtake real human data, right after we get through the ‘there isn’t enough data, even though there’s already too much to review’ stage.
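
To make "statistical tools" a bit more concrete, here is a minimal sketch of the kind of automated filter I have in mind. Everything in it is an assumption for illustration: the function name `clean_corpus`, the `z_threshold` parameter, and the choice of exact-hash deduplication plus a length z-score are mine, not a description of any real training pipeline.

```python
import hashlib
import statistics


def clean_corpus(docs, z_threshold=3.0):
    """Drop exact duplicates, then drop documents whose length is an outlier.

    Hypothetical helper: real pipelines use far more elaborate heuristics.
    """
    seen = set()
    unique = []
    for doc in docs:
        # Normalize lightly so trivial case/whitespace variants hash the same.
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)

    if len(unique) < 2:
        return unique

    lengths = [len(d) for d in unique]
    mean = statistics.mean(lengths)
    stdev = statistics.pstdev(lengths)
    if stdev == 0:
        # All documents have the same length, so nothing is an outlier.
        return unique

    # Keep documents whose length is within z_threshold standard deviations.
    return [
        d for d, n in zip(unique, lengths)
        if abs(n - mean) / stdev <= z_threshold
    ]
```

A filter like this never reads the content, which is the point: anything that stays inside the expected statistics passes through, and that is exactly the gap a poisoning attack would aim for.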