One of the things that really pops out to me in the Ars Technica interview is his love of writing 'tools' - I love it too..
I had a similar issue when I was working on educational CDs back in the dark ages - 660MB just wasn't large enough for all the voice-over we needed to fit the script we were given.
We couldn't go lower than 16-bit/22kHz/mono without making it painful to listen to, so we had to come up with another idea..
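For a sense of scale, here is a back-of-envelope calculation of how much uncompressed audio at that quality fits on one disc. It assumes "22k" means 22,050 Hz, that 1 MB is 1,000,000 bytes, and that the whole disc is given over to audio; none of those details are stated above, so treat the result as a rough upper bound before the program, graphics and other assets take their share.

```python
# Rough capacity estimate for uncompressed PCM audio on one disc.
# Assumptions (not from the post): 22,050 Hz sample rate, decimal megabytes,
# and the entire 660 MB available for audio.
SAMPLE_RATE_HZ = 22_050
BYTES_PER_SAMPLE = 2      # 16-bit samples
CHANNELS = 1              # mono
DISC_BYTES = 660 * 1_000_000

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
max_hours = DISC_BYTES / bytes_per_second / 3600

print(bytes_per_second)      # 44100
print(round(max_hours, 2))   # 4.16
```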
The key was to work out that the v/o artist we used had 7 phrasings/cadences, and as there were ~7 questions per section she was already phrasing the lesson correctly.
So instead of using an entire intro+question+correct/incorrect recording, I could splice in the intro and correct/incorrect responses (mostly) depending on their location in the lesson.
Substituting purely on position in the lesson didn't work though, so the 'Audio Matcher' tool was born.
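A rough sketch of the general idea, not the actual Audio Matcher tool (whose internals aren't described here): label every recorded fragment with one of the 7 cadences, then build each prompt from the fragments whose cadence matches the question's. The `Clip` type, the `pick` and `assemble` helpers, the numeric cadence labels and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    kind: str      # "intro", "question", "correct" or "incorrect"
    cadence: int   # hypothetical label 1..7 for the artist's phrasing
    path: str      # hypothetical file name


def pick(clips, kind, cadence):
    """Return the first clip of the given kind with a matching cadence label."""
    match = next(
        (c for c in clips if c.kind == kind and c.cadence == cadence), None
    )
    if match is None:
        raise LookupError(f"no {kind!r} clip recorded with cadence {cadence}")
    return match


def assemble(clips, question, answered_correctly):
    """Build the playback order: matching intro, the question, matching response."""
    response_kind = "correct" if answered_correctly else "incorrect"
    return [
        pick(clips, "intro", question.cadence),
        question,
        pick(clips, response_kind, question.cadence),
    ]


# Usage with made-up fragments.
clips = [
    Clip("intro", 3, "intro_3.wav"),
    Clip("intro", 5, "intro_5.wav"),
    Clip("correct", 3, "correct_3.wav"),
    Clip("incorrect", 3, "incorrect_3.wav"),
]
question = Clip("question", 3, "q_017.wav")
playlist = assemble(clips, question, answered_correctly=True)
print([c.path for c in playlist])
```

Matching on a cadence label rather than on position means a fragment can be reused anywhere in the lesson where the phrasing fits, which is where the space saving comes from.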
It allowed us to get over 3 hours of natural-sounding question & answer audio onto a single CD.
(..loved writing and using that tool)
(..and yes of course we could have simply gone to production with 2 CDs - but that would have cost 2x as much, required installation and not been anywhere near as cool)
Also check out Helsing's Fire for devices. I don't generally rave about this kind of puzzle game, but like everything Pope does, the attention to detail is outstanding.