qwert7890's comments

qwert7890 · on Sept 12, 2018

> They managed to coordinate simply by looking at what the others were doing.

This bit seems incorrect: https://medium.com/@stelmaszczykadam/do-openai-five-dota-2-b....

Ajedi32 · on Sept 12, 2018

In Dota you have perfect information of the state of all allied units, so I think it's debatable whether sharing input (observation) data between the bots really counts as "communication".

Though that same fact also means communication shouldn't really be necessary; the bots are all exact copies of each other and share a copy of the game state, so they should all have similar ideas of what actions are optimal at any given point in the game.

qwert7890 · on Sept 12, 2018

Thank you, Ajedi32. I updated the post: https://medium.com/@stelmaszczykadam/do-openai-five-dota-2-b....

qwert7890 · on July 26, 2018

You broke the LGPL license by not stating your changes:

https://github.com/snowkylin/ntm/blob/master/LICENSE

Moreover, you write this five times in the paper:

"Our implementation"

You also don't acknowledge that you are piggybacking on snowkylin's code. You didn't magically make it "stable" by fixing NaNs or anything like that.

And you want to be cited as follows:

title={Implementing Neural Turing Machines}, author={Collier, Mark and Beel, Joeran},

That's just very bad.

You should:

1. State your changes (orbifold did it for you below). Acknowledge snowkylin and link to their repository.

2. Title it "Improved initialization in NTM" or something like that, not "Implementing NTM".

qwert7890 · on Nov 20, 2017

The simplest RL algorithm (Q-learning) achieves 100 m in QWOP: https://www.youtube.com/watch?v=e27TUmMkOA0

Although it found and exploited a local maximum, the "knee scraping" technique (which humans can replicate) :)