Isn’t fake news detection a solved problem with NLP? Language and grammar detection was already making good progress almost a decade ago. If it works in the fields of terrorism and social engineering, the fact that it isn’t yet applied to fake news seems like reluctance on the part of social media companies. Having said that, I am not a technical person; this is just a layman’s point of view.
How would language and grammar detection aid in fake news detection? What makes news "fake" generally involves dimensions far more complex than grammar.
Well, if I ever start following the textbook rules about the Oxford comma and split infinitives, that's when you'll know I've been replaced by a Russian bot.
In seriousness, language/grammar use analysis could potentially provide a hint that the person behind a user account has been replaced by another writer.
I did not say that language and grammar detection would help in fake news detection. What I meant was: shouldn’t fake news detection be the next logical step after language and grammar detection?
No, this is nonsense. You could, at best, train a model to identify specific fake news claims, or perhaps to identify the style of one particular fake news source. But you will never create a model that can evaluate novel claims from an author you have no strong prior reason to associate with fake or genuine news.
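To make the "at best" concrete: a style classifier for one known source is a few lines of scikit-learn. This is only a hedged sketch with placeholder texts and labels, not a working detector, and note that it says nothing about whether any claim is actually true.

```python
# Minimal sketch of the narrow case: a stylometric classifier for one
# known source. Texts and labels here are placeholders, not real data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "placeholder article written by the known fake news source",
    "placeholder article from an unrelated outlet",
    "another placeholder piece from the known source",
    "another placeholder piece from a different outlet",
]
labels = [1, 0, 1, 0]  # 1 = known source, 0 = anything else

# Character n-grams pick up punctuation and spelling habits, which are
# harder to disguise than word choice alone.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)
print(model.predict_proba(["some unseen article"])[:, 1])  # P(known source)
```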
How could that possibly work? Not even Holmes was so bold in his claims about linguistic analysis.
It’s not entirely nonsense. Not all fake news is social engineering, but almost all social engineering has some ‘fake news’ component.
If NLP can be used to detect social engineering scams, it can also be used to detect fake news. There will likely be false positives, but that’s because the net would be cast wide.
No, it is absolutely nonsense. You cannot possibly use NLP to take novel claims and evaluate them for truthfulness; the model simply has no way to know. Social engineering is detectable because it usually includes something like a request for your personal information.
A novel fake news claim would not look anything like prior fake news claims. You'll have false positives, yes. You'll also have false negatives. Probably a lot of them.
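A rough illustration of the asymmetry: social engineering leaves surface-level fingerprints you can pattern-match, while a fabricated claim doesn't. The patterns below are illustrative toys, not a real rule set.

```python
# Why social engineering has a detectable surface signal while a novel
# fake claim does not: scam messages tend to contain explicit requests
# for credentials or payment. These patterns are illustrative only.
import re

SCAM_PATTERNS = [
    r"verify your (account|password|identity)",
    r"(send|confirm) your (ssn|social security|credit card)",
    r"wire (the )?(funds|money) (immediately|urgently)",
]

def looks_like_social_engineering(message: str) -> bool:
    text = message.lower()
    return any(re.search(p, text) for p in SCAM_PATTERNS)

# A fabricated news claim triggers nothing, because nothing in the text
# itself marks it as false.
print(looks_like_social_engineering("Please verify your password now"))  # True
print(looks_like_social_engineering("An airplane was hijacked today"))   # False
```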
A generic fake news detector is very different from, say, a detector of claims that Ukraine is a Nazi state. The latter is easy to detect.
But how do you tell the difference between a true press release claiming that, say, an airplane was hijacked, and a fake press release claiming the same?
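To show why the specific case is tractable: detecting one known claim is just similarity search. The TF-IDF cosine below is a crude stand-in for real semantic matching, and the 0.4 threshold is arbitrary.

```python
# Toy detector for one specific known claim, via cosine similarity.
# A real system would use embeddings and a tuned threshold.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

KNOWN_CLAIM = "Ukraine is a nazi state"

def matches_known_claim(text: str, threshold: float = 0.4) -> bool:
    tfidf = TfidfVectorizer().fit_transform([KNOWN_CLAIM, text])
    return cosine_similarity(tfidf)[0, 1] >= threshold

print(matches_known_claim("Officials repeated that Ukraine is a nazi state"))  # True
print(matches_known_claim("The president visited a factory today"))            # False

# There is no analogous anchor for "an airplane was hijacked": the true
# and fake press releases contain the same words.
```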
In my opinion, fake news detection is an ill-defined problem, since the definition of "fake news" itself is not clear.
You might be able to spot blatant inaccuracies like "Mount Everest is located in France"; but then, do you really need an NLP tool to tell you that? Verifying anything even slightly more complex is an AI-complete task, given that even humans have a hard time with it.
Besides, you can manipulate people without writing a single factually incorrect statement, just by lying by omission and choosing words carefully. That is basically impossible to detect. Other times you have news that simply can't be verified, even by humans, or whose verification depends on which sources you consider "reliable" (which introduces a whole lot of bias).
I've worked on this problem personally, and I think the only workable solution would be a tool that highlights possibly problematic parts of the article a user is reading, perhaps enriching it with automatically suggested references where the user can read more about the topic.
Unfortunately, while this kind of solution could be helpful for a "trained" reader, most people don't want to put any effort into consuming news carefully; it would end up helping only those who already have a habit of verifying, which does little to solve the "fake news problem".
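For what it's worth, the plumbing for that tool is trivial; the hard part is the quality of the claim detection and the suggested references. Here's a toy sketch where both the heuristics and the search URL are placeholders:

```python
# Toy version of the highlight-and-enrich idea: flag sentences that make
# checkable assertions (a number plus a mid-sentence proper noun) and
# attach a follow-up search link. Real systems use trained
# claim-detection models; these heuristics are placeholders.
import re

def checkworthy(sentence: str) -> bool:
    has_number = bool(re.search(r"\d", sentence))
    has_name = bool(re.search(r"\s[A-Z][a-z]+", sentence))  # mid-sentence capital
    return has_number and has_name

def annotate(article: str) -> list[tuple[str, str]]:
    sentences = re.split(r"(?<=[.!?])\s+", article)
    return [
        (s, "https://duckduckgo.com/?q=" + "+".join(s.split()[:8]))
        for s in sentences
        if checkworthy(s)
    ]

article = "The plant closed quietly. Officials said 4,000 jobs moved to Texas in 2019."
for sentence, link in annotate(article):
    print(f"CHECK: {sentence}\n  see: {link}")
```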
You are absolutely right. But when someone from New Zealand is tweeting as an American, or someone from Manchester is tweeting as a Russian journalist, that subset of fake news can be flagged. (eta: I guess what I am trying to say is that we first have to identify the "unreliable narrator" archetype.) For example, in the United States we have the AP Stylebook, published by the Associated Press. Every publication has a recommended stylebook. Now... can it be mimicked? Yes, of course. But we at least have one level of authentication: it takes a human years to master stylebook rules. A fake news generator might be able to pass the Turing test, but I don't think it's been done yet.
The downvotes came because the phrasing "Isn’t fake news detection a solved problem with NLP?" is a leading construction: it suggests that everyone already knows it is solved, whereas that's not true at all.
Bit of both. The terrorism stuff doesn't work nearly as well as they'd like you to think, and knocking out obvious fakes would depress Twitter's engagement statistics.