They did as you suggested (you are the PM, after all!), and the next week they rolled out the "likelihood of engagement" model. An independent analysis by another team member, familiar with the old model, confirmed that the new model was still mostly driven by politics (there is not much going on in Elbonia besides politics), but politics was neither the direct objective nor an explicit factor in the model.
The observed behavior is the same: using the new model, most people are still shown highly polarized posts, as indicated by the subjective assessments of user research professionals.
We used newsgroups and message boards long before Facebook. They weren't as toxic, which I assume was due to active moderation. Automated, passive, or slow moderation is perhaps the real issue.
I think they weren't as toxic because content creators hadn't yet realized that divisive content drives much more engagement. It's not about moderation; it's a paradigm shift in the way content is created.
In regard to a predictive model and privacy/ethics/etc.: regardless of your objective function and explicit parameters, a model can only be judged on what it actually predicts, so answering the prior question is enough to answer this one.
This is because machine learning models are prone to learning quite different things than the objective function intended, so differences in stated intent or model structure should be set aside when analysing the results.
To the degree that the models predict similarly, they must be regarded as similar, even if they arrive at their predictions in a roundabout way.
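To make that concrete, here is a minimal sketch on synthetic data (the feature names, such as `keyword_score`, are hypothetical and not from the scenario above): one model uses a "politics" feature explicitly, the other omits it but sees a correlated proxy, and the two are compared by what they actually predict rather than by their declared inputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Synthetic posts: engagement is mostly driven by political polarization.
politics = rng.normal(size=n)                       # explicit political-polarization feature
length = rng.normal(size=n)                         # an innocuous feature
keyword_score = politics + 0.5 * rng.normal(size=n) # innocuous-looking proxy correlated with politics
engagement = (2.0 * politics + 0.3 * length + rng.normal(size=n) > 0).astype(int)

# "Old" model: politics is an explicit factor.
old_model = LogisticRegression().fit(np.column_stack([politics, length]), engagement)
# "New" model: politics is not a feature, only the proxy is.
new_model = LogisticRegression().fit(np.column_stack([keyword_score, length]), engagement)

old_scores = old_model.predict_proba(np.column_stack([politics, length]))[:, 1]
new_scores = new_model.predict_proba(np.column_stack([keyword_score, length]))[:, 1]

# Judge the models by their predictions, not by their objectives or feature lists.
print("correlation between the two models' scores:",
      np.corrcoef(old_scores, new_scores)[0, 1])
print("correlation of the 'politics-free' model's scores with politics:",
      np.corrcoef(new_scores, politics)[0, 1])
```

In a run like this, both correlations come out high: the model with no explicit politics feature still effectively ranks posts by politics, which is the sense in which the two models "must be regarded as similar."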
Agreed; as a general rule I shy away from predicting things I wouldn't claim expertise in otherwise. This is why consulting with subject matter experts is important. Things as innocuous as traffic crashes and speeding tickets form a huge world unknown to the casual analyst (the field of "Traffic Records").