Yes, I remember trying to use the Google Books Ngram dataset [1], but it was too tedious to set up and maintain a server with the data just for a quick-and-dirty tool (that's why I asked for a ready-made API). Still, using it is probably a nice idea for a more ambitious side project or even a startup.
EDIT: Actually, I would happily pay for a tool that implements the idea. Grammarly has paid plans, but $30/month is too steep for my kind of usage, and the types of grammar checks it performs are not exactly what I need (which is what real people in real situations use).
LanguageTool has limited support for using Google's n-gram data to find spelling errors. It only uses 3-grams, and only for a list of commonly confused words. I'm not aware of any Free Software that does better.
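To make the idea concrete, here is a minimal sketch of a 3-gram check over a confusion set. This is not LanguageTool's actual code: the counts are made-up toy numbers standing in for real n-gram data, and the names `TRIGRAM_COUNTS`, `CONFUSION_SETS`, `suggest`, and the `min_ratio` threshold are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Toy counts, invented for illustration; real counts would come from an n-gram corpus.
TRIGRAM_COUNTS: Dict[Tuple[str, str, str], int] = {
    ("is", "over", "there"): 300,
    ("is", "over", "their"): 4,
    ("over", "there", "now"): 120,
    ("over", "their", "now"): 1,
    ("lost", "their", "keys"): 90,
}

# Hypothetical list of commonly confused words.
CONFUSION_SETS: List[FrozenSet[str]] = [frozenset({"their", "there"})]


def trigrams_around(tokens: List[str], i: int, word: str) -> List[Tuple[str, ...]]:
    """Return the trigrams that cover position i, with tokens[i] replaced by word."""
    patched = tokens[:i] + [word] + tokens[i + 1:]
    result = []
    for start in range(i - 2, i + 1):
        if start >= 0 and start + 3 <= len(patched):
            result.append(tuple(patched[start:start + 3]))
    return result


def score(tokens: List[str], i: int, word: str) -> int:
    """Sum the corpus counts of every trigram covering position i."""
    return sum(TRIGRAM_COUNTS.get(t, 0) for t in trigrams_around(tokens, i, word))


def suggest(sentence: str, min_ratio: float = 10.0) -> List[Tuple[int, str, str]]:
    """Flag (position, word, alternative) when the alternative is far more frequent."""
    tokens = sentence.lower().split()
    suggestions = []
    for i, tok in enumerate(tokens):
        for cset in CONFUSION_SETS:
            if tok not in cset:
                continue
            current = score(tokens, i, tok)
            for alt in sorted(cset - {tok}):
                # Add 1 to the current score so unseen trigrams do not divide by zero
                # in spirit; only suggest when the alternative clearly dominates.
                if score(tokens, i, alt) >= min_ratio * (current + 1):
                    suggestions.append((i, tok, alt))
    return suggestions


print(suggest("it is over their now"))
print(suggest("they lost their keys"))
```

Only words in a confusion set are ever checked, so the lookup cost stays small and a correctly used "their" is left alone as long as its context is well attested.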
Slightly self-motivated, yes. That being said, I've spent two years developing applications that could greatly help communities across the world, and I was having a hard time getting those ventures the proper attention. I theorized that something "less impactful" would get more attention and, in the long run, drive attention to more important projects. I've been sharing my Google Analytics data on the Facebook page and with friends as a way for all of us to better understand the influence of different platforms. 32% of traffic has come from Facebook, which is pretty much free to all startups. I then ran an ad campaign for five days, which performed miserably. I will be carefully examining traffic sources and how to cheaply launch my app in the coming month. I don't expect a great deal from tiny tiny trophies; I put up a WordPress site in a night and have been watching it for the past week.
The logger was obviously there; it was deliberately collecting the SSIDs and MAC addresses.
Possibly a debug option to log whole packets was added during development and accidentally left on in production.
Or the whole packet was always logged, a second process would then skim the logs, extracting just the SSID/MAC (correlating with GPS), and a third process would delete the raw logs. That third process failed.
A few big drives in the data collection devices, and possibly nobody noticed they were filling up a little too quickly.
... these are almost certainly used in many spelling and grammar checkers (to help in cases where the same spelling is used in different contexts).
http://www.aclweb.org/anthology/W12-0304