Probably just a big hashtable mapping each word to the number of times it's been seen, plus a hashset of all the words that haven't been seen yet. When a post comes in, you look up each of its words in the hashtable and increment the count; if the old count was 0, you also remove that word from the hashset.
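Something like this in Python, as a rough sketch of the idea above. The class name, the tokenizing regex, and the assumption that you start from a fixed word list are all mine; I have no idea what the real thing looks like.

```python
import re


class UnseenWordTracker:
    """Hypothetical sketch: per-word counts plus a set of never-seen words."""

    def __init__(self, vocabulary):
        vocab = set(vocabulary)
        # hashtable: word -> number of times it has been seen
        self.counts = {word: 0 for word in vocab}
        # hashset: every word that has not been seen yet
        self.unseen = set(vocab)

    def add_post(self, text):
        # Assumed tokenization: lowercase runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", text.lower()):
            old = self.counts.get(word)
            if old is None:
                continue  # not in the word list, ignore it
            self.counts[word] = old + 1
            if old == 0:
                # first sighting, so it no longer belongs in the unseen set
                self.unseen.discard(word)


tracker = UnseenWordTracker(["apple", "banana", "cherry"])
tracker.add_post("Apple pie and more APPLE, plus a banana.")
```

Each word costs one hashtable lookup and at most one hashset removal, so the work per post is proportional to its length.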
250k words at a generous 100 bytes per word is only 25MB of memory...