
As someone who has been ingesting a high volume of reddit "content" for the better part of 3 years - the spam/bot problem is _wild_.

I've implemented a handful of methods to detect 'networks' of these bots/spammers/karma farming accounts and, in the subreddits I'm monitoring, they make up _more than half_ of the total accounts posting. This is across subreddits of all types, sizes and topics. Massive subs, regional subs, local subs, they're all completely inundated with these accounts - and these are the ones that make it through Reddit's own spam detection and whatever each subreddit has in place to handle moderation. So even among the posts that do go public, more than half of the _accounts_ behind them are ones I've determined to be spam/karma farming/bots, and those accounts are responsible for an even greater proportion of the _posts_. (In other words, there are more spammers than "real" users, and they're posting more than the "real" users.)

And this is with rather elementary methods of telling "spam" from "real" users/content. Spammers who aren't being very lazy can pretty easily slide through my filters. (I'm detecting 'duplicates' of images and post titles/account descriptions using perceptual hash/simhash and hamming distance only. I'm rolling out duplicate detection based on text/image vector embeddings now, and the numbers are even worse with that in place, but I don't have it properly tuned yet.) They're literally just re-posting, en masse, the same content that successful/high karma accounts have previously posted, across as many subreddits as they can find/aren't banned from, and it's wildly effective.
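For anyone curious what "simhash and hamming distance" looks like in practice, here's a rough sketch of the general technique for post titles. This is not my actual pipeline: the 3-word shingles, the 64-bit fingerprint, the blake2b hash and the `max_distance=3` cutoff are all assumptions picked for illustration, and the function names are made up.

```python
import hashlib
import re


def simhash(text: str, bits: int = 64) -> int:
    """Return a SimHash fingerprint of `text` (illustrative parameters)."""
    # Normalize: lowercase and keep word characters only, so case and
    # punctuation changes do not alter the fingerprint.
    tokens = re.findall(r"\w+", text.lower())
    # Assumed feature choice: overlapping 3-word shingles.
    shingles = [" ".join(tokens[i:i + 3]) for i in range(max(1, len(tokens) - 2))]

    weights = [0] * bits
    for shingle in shingles:
        digest = hashlib.blake2b(shingle.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Each feature votes +1 or -1 on every bit position.
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1

    # The fingerprint keeps a 1 wherever the votes were net positive.
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    """Treat two texts as duplicates if their fingerprints are close.

    max_distance=3 is an assumed threshold, not a tuned value.
    """
    return hamming(simhash(a), simhash(b)) <= max_distance
```

With this, `is_near_duplicate("Check out my new puppy!", "check out my NEW puppy")` comes back true because both titles normalize to the same tokens. The threshold is the part that needs tuning per data set: too low and lightly edited reposts get through, too high and unrelated short titles start colliding.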

What's crazy to me is that many of them have karma in the 6 and 7 digits - an obvious spam account with > 1,000,000 karma is wild.

It seems to me that Reddit has zero interest in controlling this. Some might argue this is confirmed by the lack of moderation tools available to subreddit mods (which was ultimately my motivation for building this system - but when they changed the API stuff I changed my goals for it).


