Everyone uses "pirated" content, but some are better at hiding it and/or not talking about it.
There is no other way to do it.
Synthetic data will not replace original data such as books. It does work very well for math, though.