That is technically true, but when the base model is wasting parameter information on poorly tagged, watermarked stock art and other garbage images, it's not really a meaningful distinction. Better data makes for better models, nobody cares about how well a model outputs trash.
Ok, but you're severely misrepresenting the importance of things. Base SDXL is a fine model. Base SDXL is going to be much better than a materially smaller model that you've retrained with "good data".