It seems you could just take Google's algorithms, adjust the site trust metric with a front-page spam score, and reduce the link juice passed by links whose anchor text reads like marketing copy ("buy the doohickey on this link", or whatever).
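As a rough sketch of that idea (everything here is hypothetical: the constants, the keyword list, and the scoring functions are illustrative, not anything a real search engine uses):

```python
# Hypothetical sketch: downrank a site by its front-page spam score, and
# discount link juice from links with marketing-flavoured anchor text.
# All names and weights here are made up for illustration.

MARKETING_TERMS = {"buy", "deal", "discount", "order", "sale"}
SPAM_WEIGHT = 0.5       # how strongly front-page spam erodes trust
MARKETING_DISCOUNT = 0.2  # fraction of juice a commercial anchor still passes

def link_juice(weight: float, anchor_text: str) -> float:
    """Reduce the juice passed by a link if its anchor text looks like marketing."""
    words = set(anchor_text.lower().split())
    if words & MARKETING_TERMS:
        return weight * MARKETING_DISCOUNT
    return weight

def adjusted_trust(base_trust: float, front_page_spam_score: float) -> float:
    """Blend the site trust metric with a front-page spam score in [0, 1]."""
    return base_trust * (1.0 - SPAM_WEIGHT * front_page_spam_score)

print(link_juice(1.0, "buy the doohickey on this link"))  # heavily discounted
print(link_juice(1.0, "an interesting article"))          # passes full weight
print(adjusted_trust(0.9, 0.8))
```

The point is just that both signals compose multiplicatively with the existing score, so neither has to replace the underlying ranking.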
Keeping marketing sites high in your SERPs would make you way more money on referrals though.
Solving algorithmic tasks by just building an ML model of your competitor's algorithm seems like a funny way to start. I imagine this is how "programming" will stop being a thing in a few hundred years.
For web search it probably wouldn't work at the moment, because I imagine several models sit between feature extraction and the final result, producing intermediate outputs you can't observe from the outside.
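For a toy case where you *can* observe the end-to-end output, the idea is just surrogate modelling: fit your own model to the black box's input/output pairs. A minimal sketch, assuming the competitor's scorer is a simple linear function (a stand-in invented here, not any real ranker):

```python
# Hypothetical sketch: approximate a competitor's opaque scoring function
# by fitting a linear surrogate to observed (features, score) pairs.

def competitor_score(page):
    # The "black box" we can only query, not inspect. Made up for this demo:
    trust, spam = page
    return 3.0 * trust - 2.0 * spam

# Collect observations by probing the black box on a grid of feature values.
observations = [((t / 10, s / 10), competitor_score((t / 10, s / 10)))
                for t in range(10) for s in range(10)]

# Fit surrogate weights w with plain stochastic gradient descent (LMS rule).
w = [0.0, 0.0]
lr = 0.1
for _ in range(2000):
    for x, y in observations:
        err = w[0] * x[0] + w[1] * x[1] - y
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]

print([round(v, 2) for v in w])  # close to the hidden weights [3.0, -2.0]
```

This only works because the toy pipeline is one observable function; a real search stack with hidden intermediate models gives you no such clean training signal, which is the objection above.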