In the past, I've found that these higher-level libraries built on top of TF are useful for quick model building, but should be used cautiously. With default hyperparameters, it's easy to blindly build semi-working models without understanding what's happening. I would recommend either reading the associated papers or implementing the models in code (at least once) before using these pre-built models. That being said, I'm really excited by this project! I think it'll save researchers a bunch of time.
This is great feedback! I'd love to hear more - if you send me some examples of the pitfalls you've run into in the past, I'll share them with the team.
I like their focus on flexibility. I've tried a few deep RL implementations in the past and run into issues like their DQN or A3C implementation being hardwired in a number of ways to work only on ALE, with no way to use it on other problems (e.g. the CNN dimensions are hardwired).
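To make the complaint concrete, here's a minimal sketch (purely illustrative, not any particular library's API) of the difference between baking the network layout into the agent and accepting it as configuration:

```python
# Hardwired: this only ever works for 84x84x4 Atari frames.
HARDWIRED_ALE_NETWORK = [
    {"type": "conv2d", "size": 32, "window": 8, "stride": 4},
    {"type": "conv2d", "size": 64, "window": 4, "stride": 2},
    {"type": "dense", "size": 512},
]

def build_agent(network_spec, state_shape, num_actions):
    """Configurable alternative: the network layout is plain data,
    so the same agent code can run on ALE or on other problems."""
    # ... construct the actual layers from network_spec here ...
    return {"network": network_spec, "state_shape": state_shape, "actions": num_actions}

# Same agent code, reused on a non-ALE problem (CartPole-sized inputs):
cartpole_agent = build_agent(
    network_spec=[{"type": "dense", "size": 64}, {"type": "dense", "size": 64}],
    state_shape=(4,),
    num_actions=2,
)
```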
If I understand the project correctly, it precisely does not advocate default hyperparameters but instead exposes all configuration through the declarative interface.
I may have used the term hyperparameter too loosely. Yes, this project does a good job of taking a configuration-first approach, but even they set some defaults. For example, they set relu as their default layer activation function. I haven't had time to see what other such defaults are being set.
Thanks for your feedback! To clarify our philosophy regarding default configurations: On the one hand, we try to make all hyperparameters and settings of the agents/models centrally configurable, by specifying one configuration object/file. On the other hand, we try to provide a set of default values, where it makes sense for the applied user, for whom the full range of hyperparameters is probably not interesting. In doing so, we try to combine "the best of both worlds", but sometimes that might lead to conflicts. This is something we're actively working on, to get the balance right (and comments are very welcome, best on GitHub).
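A minimal sketch of the "defaults plus central overrides" pattern described above (the names and structure here are illustrative assumptions, not the project's actual config schema):

```python
# Every hyperparameter lives in one central place, with documented defaults.
DEFAULTS = {
    "activation": "relu",      # sensible default for the applied user
    "learning_rate": 1e-3,
    "discount": 0.99,
}

def make_config(**overrides):
    """Anything not overridden falls back to the default, and unknown
    keys fail loudly instead of being silently ignored."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown config keys: {unknown}")
    return {**DEFAULTS, **overrides}

# Applied user: just take the defaults.
config = make_config()

# Researcher: override what matters; everything stays visible in one object.
config = make_config(activation="tanh", learning_rate=3e-4)
```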
Hey, glad you like it! We haven't properly benchmarked our models yet, but it's at the top of our to-do list. (However, we will only run benchmarks on a few selected "standard" environments, given our limited resources.)