I wish I could publish it, but my company isn't very keen on open source. It's a standard context-free grammar framework modified to generate output stochastically, so it's basically a stochastic context-free grammar (SCFG). I can go into more depth in private if you like.
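
To give a rough idea of what "stochastic" means here, below is a toy sketch of sampling from an SCFG. It is not the actual framework: the grammar, the probabilities, and the names `GRAMMAR` and `generate` are all made up for illustration. Each nonterminal has several possible expansions with probabilities attached, and generation picks one at random according to those weights.

```python
import random

# Hypothetical toy grammar (an assumption for illustration only).
# Each nonterminal maps to a list of (probability, expansion) pairs;
# the probabilities for one nonterminal sum to 1.
GRAMMAR = {
    "S": [(1.0, ["NP", "VP"])],
    "NP": [(0.7, ["the", "N"]), (0.3, ["the", "ADJ", "N"])],
    "VP": [(0.6, ["V", "NP"]), (0.4, ["V"])],
    "N": [(0.5, ["cat"]), (0.5, ["dog"])],
    "ADJ": [(1.0, ["lazy"])],
    "V": [(0.5, ["sees"]), (0.5, ["sleeps"])],
}


def generate(symbol, rng):
    """Expand `symbol` into a list of terminal words by sampling rules."""
    # Anything without a rule is a terminal and is emitted as-is.
    if symbol not in GRAMMAR:
        return [symbol]
    rules = GRAMMAR[symbol]
    weights = [prob for prob, _ in rules]
    expansions = [rhs for _, rhs in rules]
    # Pick one expansion at random, weighted by its probability.
    chosen = rng.choices(expansions, weights=weights, k=1)[0]
    words = []
    for child in chosen:
        words.extend(generate(child, rng))
    return words


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    print(" ".join(generate("S", rng)))
```

This toy grammar has no recursive rules, so generation always terminates. A real grammar with recursion would need rule probabilities that keep expansions finite, or a depth limit.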

The phrase for finding word pairs in text corpora is "cohort analysis". I was on a research team that did studies of that; mostly finding them, not generating anything with them.

It's an interesting subject area.