Thanks! I am working in this area and have some ideas for deep learning type methods which move away from concatenative synthesis. It will be nice to compare to what they are using.
We did some work on applying NNs to prosody prediction; see Fernandez, Raul, et al. "Prosody contour prediction with long short-term memory, bi-directional, deep recurrent neural networks." Proceedings of the Annual Conference of International Speech Communication Association (INTERSPEECH). 2014.