I'm not sure you'd want to involve NLP. I think it's more down to the fact that it's relatively easy to reverse engineer the function generating the question and get the parameters that way.
Wolfram Alpha won't answer very many questions of that sort[0], but a cracker can spend a couple of hours enumerating all the kinds of questions and writing a tailored function (and a detection routine) for each. If your detection routine is naive (e.g. it chooses randomly) and some or all of your answering functions work badly, no problem: you only have to get it right occasionally anyway.
[0] And indeed it fails, unsurprisingly if comically, on the second question, 'How many times does the word four appear within this sentence?', trimming it to 'How many times' and showing details about the British newspaper.
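To make the "tailored functions plus a detection routine" idea concrete, here is a rough Python sketch covering only the two example question shapes quoted in this thread. The regexes, the `WORDS` table, and the function names are all made up for illustration; a real question generator would have more templates, and each would need its own handler.

```python
import re

# Hypothetical lookup for spelled-out numbers; extend as needed.
WORDS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}


def to_int(token):
    """Turn '4' or 'four' into 4; return None if the token is not a known number."""
    token = token.lower()
    if token in WORDS:
        return WORDS[token]
    try:
        return int(token)
    except ValueError:
        return None


def answer_product(question):
    """Handle questions shaped like "What's 4 times four?"."""
    m = re.search(r"what'?s\s+(\w+)\s+times\s+(\w+)", question, re.IGNORECASE)
    if not m:
        return None
    a, b = to_int(m.group(1)), to_int(m.group(2))
    if a is None or b is None:
        return None
    return a * b


def answer_word_count(question):
    """Handle "How many times does the word X appear within this sentence?"."""
    m = re.search(r"how many times does the word (\w+) appear", question, re.IGNORECASE)
    if not m:
        return None
    word = re.escape(m.group(1).lower())
    # Count whole-word occurrences in the question itself.
    return len(re.findall(r"\b" + word + r"\b", question.lower()))


# Detection here is just "try every handler and take the first that matches".
HANDLERS = [answer_product, answer_word_count]


def solve(question):
    for handler in HANDLERS:
        result = handler(question)
        if result is not None:
            return result
    return None  # unknown question type: give up and request a new one
```

Calling `solve("What's 4 times four?")` goes through `answer_product`, while the word-count question falls through to `answer_word_count`. Anything unrecognised returns `None`, which is fine for an attacker who only needs to be right occasionally.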
That one is actually pretty easy to beat - the OCR is easy enough, and you can refresh the page until you get an easy question. Some of them are very simple integer arithmetic exercises.
As for the harder questions, Wolfram Alpha is better able to do them than the average human.
I'm not sure which OCR you're talking about, but I've tried ABBYY and, because of the fractions (i.e. the top and bottom halves), the thing craps out. Granted, if you get a single-line problem it could easily be solved. There's definitely the question of where it's applied as well. Since this is a quantum bit service site, you can expect people to know a minimum of integration. But I don't expect Facebook login to have anything even close to 8th grade mathematics.
I always seem to strike it lucky with that site. I can remember it coming up twice on HN (once now, once a while ago - http://news.ycombinator.com/item?id=2290466), and both times I got a one-line equation. I didn't save the equation I got this time, but it was maybe five terms long and most of the atoms were zeros.
E.g. 'What's 4 times four?' or 'How many times does the word four appear within this sentence?'