You can attempt to attack, move to, or cast many spells on arbitrary pixels on the map. The bots are shown casting spells on targets that aren't visible in the demo. The number of available targets probably blows up the count.
I figured as much, and I think this also begins to explain why some of the restrictions exist and how difficult it would be to generalize this to the entire Dota action space. I'm assuming they were pretty smart about defining and limiting the action space to get it down to 170K. For example, they restricted the hero pool to 5 heroes that each have a reasonably small number of options in a reasonably small radius around them (I think Sniper's Q ability might lead to the highest number of discretized actions among their chosen hero pool). They also banned Boots of Travel, though I suppose that wouldn't add too many actions, since you have to TP to a friendly unit and there aren't that many of them. So it may not be a problem for the action space size, but it does have strategic implications.
For a hero like Invoker, who can cast Sun Strike anywhere on the map at any time, would you try to come up with domain heuristics (e.g. only consider map locations near enemy heroes), or deal with an explosion of possible actions? This applies to a ton of different hero mechanics that are not in scope here.
If the goal of the project is generalization, you likely want to shy away from opinionated heuristics like the former.
In the development of the Magic: The Gathering AI (Duels), one of the restrictions is "don't cast harmful spells on your targets", even though in some edge cases this is actually the optimal thing to do. They traded optimality for a smaller search space.
I see. It seems like training a model for a hero with only targeted spells would be much faster than training a model for a hero that can cast spells at arbitrary map locations. I don't play Dota, so I'm not sure how many such heroes exist, or whether a team comp of only heroes with targeted abilities would even be viable.
It sounds like 170,000 is every possible combination of actions that might ever be valid. They stated that usually around 1000 are valid at any point in time.
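To make the "170,000 total vs. around 1000 valid" distinction concrete, here is a rough sketch of picking an action from a large fixed action space through a validity mask. This is only an illustration and not how the actual model works: the random mask, the uniform placeholder scores, and the helper names `make_valid_mask` and `sample_valid_action` are all made up for the example.

```python
import random

TOTAL_ACTIONS = 170_000  # total action count quoted in the thread
VALID_AT_ONCE = 1_000    # "usually around 1000" valid at any point in time

def make_valid_mask(total, num_valid, rng):
    """Mark a hypothetical, randomly chosen set of action indices as valid."""
    valid = set(rng.sample(range(total), num_valid))
    return [i in valid for i in range(total)]

def sample_valid_action(scores, mask, rng):
    """Pick one action index among the valid ones, weighted by its score."""
    valid_ids = [i for i, ok in enumerate(mask) if ok]
    weights = [scores[i] for i in valid_ids]
    return rng.choices(valid_ids, weights=weights, k=1)[0]

rng = random.Random(0)
mask = make_valid_mask(TOTAL_ACTIONS, VALID_AT_ONCE, rng)
scores = [1.0] * TOTAL_ACTIONS  # placeholder standing in for model outputs
action = sample_valid_action(scores, mask, rng)
print(sum(mask), mask[action])  # 1000 True
```

The point is just that the policy only ever chooses among the currently valid subset, even though the full index range stays fixed at 170,000.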
Based on the examples under the "Model structure" section, I'm guessing they are counting all combinations of spell and target location, including locations on the ground for ground-targetable spells? That could add up quickly... e.g. 10 spells * 20 target units * a 9x9 grid of locations around each = around 16,000 possibilities.
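Spelling out that back-of-the-envelope estimate as a quick sketch (the function name `count_actions` is made up here, and the inputs are just the guessed numbers from the comment above, not anything confirmed by the source):

```python
def count_actions(num_spells, num_target_units, grid_side):
    """Count (spell, target unit, grid offset) combinations."""
    offsets_per_unit = grid_side * grid_side  # 9x9 grid gives 81 offsets
    return num_spells * num_target_units * offsets_per_unit

# Guessed inputs: 10 spells, 20 target units, 9x9 grid around each unit.
total = count_actions(num_spells=10, num_target_units=20, grid_side=9)
print(total)  # 16200
```

So this guess lands at 16,200, roughly a tenth of the 170,000 figure.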
Rough guesses for available actions:
+ 10 (spell/item activations)
+ 15 (attack commands: 5 enemy heroes and ~10 nearby creeps)
Which still leaves... approximately 170,000 actions unaccounted for.