If you want examples of AI-generated Hacker News titles (made pretty much the same way as the Reddit example in this post), check out this repo: https://github.com/minimaxir/hacker-news-gpt-2
"Each text is analyzed by how likely each word would be the predicted word given the context to the left. If the actual used word would be in the Top 10 predicted words the background is colored green, for Top 100 in yellow, Top 1000 red, otherwise violet."
What if I find out the supposedly "non-fake" distribution of green/yellow/red/violet words, and generate my text accordingly?
Generally, whenever your detection strategy is "a spam generator would never do X", I simply update my spam generator to do X. (Note that "X" must be something that is relatively easy to calculate, not things like "this actually makes sense", because it must be something your detector can recognize.)
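The counter-move above is easy to calculate: instead of always taking the model's top pick, sample each word's rank bucket to match the bucket frequencies seen in human text. A sketch (the bucket weights below are made up for illustration, not measured from real data):

```python
import random

def pick_next_word(ranked_predictions, rng=random):
    """Pick a word whose rank bucket is sampled from a target
    'human-like' mix, rather than always using the top prediction."""
    lo, hi = rng.choices(
        population=[(0, 10), (10, 100), (100, 1000)],
        weights=[0.55, 0.30, 0.15],  # fabricated bucket frequencies
    )[0]
    hi = min(hi, len(ranked_predictions))
    return ranked_predictions[rng.randrange(lo, hi)]
```

Every word this emits lands in the green/yellow/red buckets at whatever rates you choose, so a detector keyed only on that distribution sees nothing unusual.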
Also, if Google suddenly started penalizing all text that doesn't "seem natural", there would be tons of false positives: languages other than English, jargon-heavy websites, dyslexic people, etc.
At the risk of being downvoted: what's the point of AI-generating text, knowing that it will never understand it like we do, and bearing in mind that poor non-AI-generated text is already used for mostly nefarious purposes? As a problem, I understand the fascination with it: human languages are extremely complex things, and mimicking them has proven complicated. But beyond that, what is a practical purpose for this?
Enjoy it if it's fun. I simply hope you won't use it, or enable others to use it, for any nefarious ends. I once entertained the thought that one day we might be able to AI-generate novels. The more I thought about it, the more pointless it seemed.
> At the risk of being downvoted: what's the point of AI-generating text, knowing that it will never understand it like we do
In part because we're trying to probe what it even means to "understand" something. If an AI is capable of carrying on a cogent conversation with you, does it understand what it's saying? How do you know?
You simply ask it (the AI entity) more and more questions, give it more facts, and see if the answers make sense and get closer to what you consider cogent. Rinse and repeat. So far we're far from it. It's a very complicated problem that may not need to be solved; we have humans doing just that.
It is slightly better now than it used to be, i.e. you now have to read two sentences to realize it's incoherent gibberish, rather than just one. But thinking you're getting closer to understanding this way is like the metaphor of climbing a tree to reach the moon: you can keep reporting progress until you run out of tree.
As for it being entertaining, sure, the first attempts were entertaining and instructive. Now, years later, reading more gibberish from a neural net on HN, not so much.
minimaxir's gpt-2-simple is very nice work for getting anyone started with text generation from the models released by OpenAI. I often wonder how one can keep up so well with the fast pace of NLP and produce things that abstract away the pain and expose simple functions for developers to build on.
Hugging Face is another such company, making a massive impact by enabling non-NLP developers to use such resources.
In my personal experience I get fooled by bots far too often, and it's actually really scary. I can imagine it being used nefariously to push a specific political agenda, spam places, commit SEO fraud, etc.