
I've just been using a combination of the Reddit API and PushShift since Sept 2020, warehousing the data in a way that makes sense for Shop By Sub.

Using the PushShift API, I started far back in time and searched for a bunch of different regexes I thought were important, slowly creeping my way toward the present.
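
A minimal sketch of that kind of oldest-first crawl, not the actual script. It assumes a hypothetical `fetch(after, before)` callable that returns the items created in a time window (epoch seconds), and a placeholder regex list; the real patterns and field names aren't given here.

```python
import re
from typing import Callable, Iterable, Iterator

# Placeholder pattern (assumption): the real regexes are not stated.
PATTERNS = [re.compile(r"https?://\S+", re.IGNORECASE)]

# Hypothetical fetcher: (after, before) in epoch seconds -> items in that window.
Fetch = Callable[[int, int], Iterable[dict]]


def crawl_forward(fetch: Fetch, start: int, end: int, step: int = 3600) -> Iterator[dict]:
    """Walk [start, end) in fixed windows, oldest first, yielding regex matches."""
    after = start
    while after < end:
        before = min(after + step, end)
        for item in fetch(after, before):
            # "body" is an assumed field name for the item's text.
            text = item.get("body", "")
            if any(p.search(text) for p in PATTERNS):
                yield item
        after = before  # creep one window closer to the present
```

Keeping the fetcher as a parameter means the windowing logic can be tried against canned data before pointing it at a real API.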

Using the Reddit API, I wrote a script to do a deep dive into specific subreddits, so I could index more popular ones first.
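
Roughly, a deep dive ordered by popularity could look like the sketch below. It's an illustration under assumptions, not the real script: `fetch_page(subreddit, cursor)` is a hypothetical callable returning one page of items plus the cursor for the next page (or `None` when done), and the subscriber counts are assumed to be known up front.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical pager: (subreddit, cursor) -> (items, next_cursor or None).
FetchPage = Callable[[str, Optional[str]], Tuple[List[str], Optional[str]]]


def deep_dive(fetch_page: FetchPage, subreddit: str, max_pages: int = 10) -> Iterator[str]:
    """Follow the pagination cursor through one subreddit, up to max_pages pages."""
    cursor = None
    for _ in range(max_pages):
        items, cursor = fetch_page(subreddit, cursor)
        yield from items
        if cursor is None:  # no further pages
            break


def index_by_popularity(
    subscribers: Dict[str, int], fetch_page: FetchPage, max_pages: int = 10
) -> Iterator[Tuple[str, str]]:
    """Deep-dive subreddits in descending subscriber order, most popular first."""
    for name in sorted(subscribers, key=subscribers.get, reverse=True):
        for item in deep_dive(fetch_page, name, max_pages):
            yield name, item
```

The `max_pages` cap is just a guard so one huge subreddit can't starve the rest of the queue.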

It mostly just took patience to get the amount of data I've got :) (and PushShift's usefulness can't be overstated).

A mailing list isn't a bad idea. I'll have to figure out a way to show the user a sign-up box without it being intrusive. I've also been thinking of adding infinite scrolling to the most recents on the homepage, just for folks who are curious about what weird things Reddit is talking about.


