
DeepMind is an algorithm which clearly improves a lot on traditional tree search. But how will a better tree search algorithm lead to AI taking over? Why does winning at Go mean AI is closer to taking over the world than when it won at Chess?



It's not an improved tree search. DeepMind was almost pro level using the deep learning network alone, before doing any Monte Carlo search.

One way of looking at the significance of this is that it might tell us that relatively simple machine learning algorithms can capture key aspects of the versatile human cortical capacity to learn things like Go using sheer pattern recognition. (It's amazing that human visual cortex can do that.) If the human brain were more mystical in its power, then human-level ability to recognize Go patterns wouldn't have been penetrable at all to a comparatively simple neural algorithm like deep learning.

From another standpoint, this could show that we're reaching the point where, when an AI algorithm starts to reach interesting levels of performance, a much-encouraged Google dumps 100,000 tons of GPU time and 5 months of a few dozen researchers' time into improving the algorithm right past the human performance level. N years from now, when it's a more interesting AI system doing more interesting cognition, we could see a more interesting result materialize when a big organization, encouraged by some key milestone, invests 5 months of time and massively more computing power to further improve an AI system.


Regardless of how good the deep learning network is on its own, the algorithm described in the DeepMind paper is an improved tree search.


Forest for the trees.

Monte Carlo Tree Search was necessary and itself a massive improvement over minimax but not sufficient for creating a Go program to challenge professional players. The true innovation here is the neural networks. Without those networks to guide it AlphaGo plays far worse than existing programs.

The fact that those networks are sufficient is pretty incredible. We already knew that by inventing them we had created a very pure form of pattern recognition, but it's surprising that the pattern recognition coupled with some tree search seems to be all you need to play Go as well as humans.
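To make that concrete, here's a toy sketch (every name, shape, and number below is invented for illustration) of what the pattern-recognition half buys you: a policy net maps a board position to move probabilities, so the search only has to consider the handful of moves the net likes:

    import numpy as np

    def policy_network(board):
        # Stand-in for a trained policy net: board in, move probabilities out.
        # The real thing is a deep convolutional net trained on expert games.
        logits = np.random.randn(19 * 19)        # placeholder score per point
        return np.exp(logits) / np.exp(logits).sum()

    def candidate_moves(board, top_k=5):
        # Pattern recognition prunes ~361 legal moves down to a handful,
        # which is what makes the subsequent tree search tractable.
        probs = policy_network(board)
        return np.argsort(probs)[-top_k:][::-1]  # best moves first

    print(candidate_moves(np.zeros((19, 19))))

The search then only branches on those few candidates instead of all ~361 points, which is the difference between intractable and tractable.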

It's not impressive to you that we've now reproduced a piece of human intelligence, "intuition", which was previously considered out of reach?


> but it's surprising that the pattern recognition coupled with some tree search seems to be all you need to play Go as well as humans.

Is it really all we need? Or is it more that they threw a lot of hardware at it? What if part of its efficiency comes from throwing a lot of GPUs at a huge network, rather than from an NN that is efficient by itself?

We see that: "AlphaGo's Elo when it beat Fan Hui was 3140, using 1202 CPUs and 176 GPUs. Lee Sedol has an equivalent Elo of 3515 on the same scale (Elos on different scales aren't directly comparable). For each doubling of computer resources, AlphaGo gains about 60 points of Elo."

It's a lot of hardware.
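
Taking those figures at face value, a back-of-the-envelope calculation (mine, not from the paper) of what the 60-Elo-per-doubling rule implies:

    # Quoted figures, all approximate: AlphaGo ~3140 Elo with 1202 CPUs +
    # 176 GPUs, Lee Sedol ~3515, and +60 Elo per doubling of compute.
    doublings = (3515 - 3140) / 60
    print(f"doublings needed:    {doublings:.2f}")        # ~6.25
    print(f"hardware multiplier: {2 ** doublings:.0f}x")  # ~76x

So if that scaling held, closing the gap to Lee Sedol on hardware alone would take roughly 76x the machines.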


What does "a lot of hardware" mean? The human brain has about 100B neurons with about 100+ dendrites per neuron, while AlphaGo has about 1K CPUs with about 2B transistors per CPU.
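
Multiplying those rough figures out (and pretending, very loosely, that a dendritic connection and a transistor are comparable units, which is a big assumption):

    # Back-of-the-envelope totals from the rough figures above
    brain_units   = 100e9 * 100   # ~100B neurons x ~100 connections each
    alphago_units = 1000 * 2e9    # ~1K CPUs x ~2B transistors each
    print(f"brain:   ~{brain_units:.0e} connections")    # ~1e+13
    print(f"AlphaGo: ~{alphago_units:.0e} transistors")  # ~2e+12

Even on that generous unit-for-unit comparison, the brain comes out several times larger, so "a lot of hardware" is relative.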


I am wondering how much the amount of hardware they used affected the bottom line, compared to the wisdom of their algorithm. Everybody agrees it's a great step for AI. But how much? How smart is their algorithm, really? Simply put: did they overfit by throwing a lot of layers and GPUs at the task, or is the algorithm truly smart? What is the ratio between the two?

It is the same question for the data they used. Facebook, Google and others seem to agree that, in the end, the quantity and quality of the data matter more than the algorithm itself. So how much is that at play here? Knowing it would show us why the system performs well and how much we can appreciate their work.


You can take a look at their paper... http://airesearch.com/wp-content/uploads/2016/01/deepmind-ma...

Basically, the ("lots of hardware") distributed implementation reaches ~3100 Elo, against ~2900 for the single-machine version. ~2900 is still sufficient to win against Fan Hui. So I would say that yes, the algorithm deserves most of the merit here.


It's totally impressive! I'm excited for the future of DeepMind and to see what other kinds of things we can build from neural networks. I'm just saying that we have to be careful with what kind of intelligence we ascribe to an algorithm like this. An AI is not going to take over the world by impressing us with its game-playing skills.


> An AI is not going to take over the world by impressing us with its game-playing skills.

What if you train an AI to play an RTS where matter, energy, and time are the resources and the goal is to take over the world?


I'm not going to fear an AI whose idea of "the world" is a computer game; an AI that isn't even aware that the real world exists, let alone of the set of actions available in it.


I think, almost by definition, you won't be afraid of anything until it's already coming for you. By that time it will already have the capability to decide your future.


The thing is you are essentially making an unfalsifiable argument, invoking the existence of something that is merely imaginary.


I should have put "RTS" in quotes. I'm referring to the real world -- the original RTS.


DeepMind is the company/lab Google bought. The algorithm is more properly called AlphaGo.


> Forest for the trees.

Heh, nice one.


I don't think it's accurate to call AlphaGo 'improved tree search' the way that Deep Blue was improved tree search. You could with equal justice call it an improved neural net.


The correct characterization is that it's a hybrid of deep learning and tree search. It's also a hybrid in the feature representation for the algorithm. There are raw features (board state), but there are also handcrafted features for the tree search.


I'd say both characterizations are accurate. Neither the tree search nor the neural net could have accomplished this on their own. But the essential interface to the algorithm is the tree search: it's picking the best move from a set of legal moves determined by formal game rules. The real world doesn't follow formal game rules. I find it difficult to see the progression from this victory to some kind of real-world AI takeover.


Sure, but when you say "tree search" people think of the traditional kind of expert system-based tree search. Traditional tree search methods prune trees by approximating subtrees according to some very specific rules decided on by human experts. This means in a tree search system, the computer cannot actually evaluate a position any better than the humans that designed it. The way it performs better is by running very many of these evaluations deep down in the tree.

When you're talking about some other kind of evaluation function, such as Monte Carlo rollouts, you usually prefix that to the tree search (in the case of Monte Carlo rollouts, "Monte Carlo tree search" or MCTS) to indicate that besides the basic fundamental task common to almost all AIs (finding the optimal branches in a decision tree) it functions completely differently from the expert systems.

Such is the case with this program, which (in a first pass) approximates subtrees by a trained neural net, rather than by Monte Carlo rollouts or an expert system. So using terminology that suggests classical expert-system tree search is bound to cause confusion (as you noticed).
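
A sketch of where the evaluation function plugs in (hypothetical code, just to illustrate the distinction, not anything from the paper):

    def negamax(state, depth, evaluate, legal_moves, play):
        # Generic game-tree search. The families of Go programs differ
        # mainly in what `evaluate` does at the leaves:
        #   classical:     a handcrafted expert heuristic
        #   MCTS:          the average result of random rollouts
        #   AlphaGo-style: a trained value network's prediction
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return evaluate(state)   # leaf: approximate the whole subtree
        return max(-negamax(play(state, m), depth - 1,
                            evaluate, legal_moves, play)
                   for m in moves)

    # Toy demo: a "game" where the state is a number and a move adds 1 or 2
    print(negamax(0, 3, evaluate=lambda s: s,
                  legal_moves=lambda s: [1, 2],
                  play=lambda s, m: s + m))

The search skeleton is the same in all three cases; the intelligence lives in `evaluate`.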


> The real world doesn't follow formal game rules.

Really?

Why not?


Infinite state/belief/world space. Infinite action space. It's not so much that there aren't rules (there are - physics), it's that the complexity of the full set of rules is exponential or super-exponential.


You can do math with continuous and infinite dimensional spaces.


And? This does not address my argument that the complexity is beyond-combinatorially explosive (infinite spaces). I'm not talking about the space of possible board states. I'm talking about merely the set of all possible actions.

EDIT: clarified my language to address below reply.


...and it's possible to train learning agents to sense and interact with a world described by high-dimensional continuous vector spaces, for instance using conv nets (for sensing audio/video signals) and actor-critic methods to learn a continuous policy:

http://arxiv.org/abs/1509.02971

Whether the (reinforcement) learning problem is hard is not directly related to whether the observation and action spaces are discrete or continuous.
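
For flavor, here's a heavily simplified linear sketch in the spirit of that paper's deterministic actor-critic (DDPG); every dimension, learning rate, and name below is invented for illustration:

    import numpy as np

    obs_dim, act_dim = 8, 2
    W_actor  = 0.1 * np.random.randn(act_dim, obs_dim)    # obs -> action
    W_critic = 0.1 * np.random.randn(obs_dim + act_dim)   # (obs, act) -> Q

    def act(obs):
        return np.tanh(W_actor @ obs)    # continuous action, no discretizing

    def q(obs, action):
        return W_critic @ np.concatenate([obs, action])

    # One schematic update from a single transition
    obs = np.random.randn(obs_dim)
    action = act(obs) + 0.1 * np.random.randn(act_dim)  # exploration noise
    reward, next_obs = 1.0, np.random.randn(obs_dim)    # pretend env step

    # Critic: move Q(obs, action) toward the bootstrapped target
    target = reward + 0.99 * q(next_obs, act(next_obs))
    td_error = target - q(obs, action)
    W_critic += 0.01 * td_error * np.concatenate([obs, action])

    # Actor: climb the critic's gradient w.r.t. the action (chain rule
    # through the tanh), i.e. adjust the policy toward higher-Q actions
    a = act(obs)
    dq_da = W_critic[obs_dim:]           # dQ/da for this linear critic
    W_actor += 0.01 * np.outer(dq_da * (1 - a**2), obs)

The point is only that nothing here requires enumerating a discrete action set: the actor emits a real-valued vector directly.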


see also by the same team:

http://arxiv.org/abs/1602.01783


There is a near-infinite number of such spaces.


I don't understand why the "number of spaces" would matter. What matters is: can you design a learning algorithm that performs well in interesting spaces such as:

- discrete spaces, such as Atari games and Go,
- continuous spaces, such as driving a car, controlling a robot, or bidding on an ad exchange.


A really really large number of distinct decisions that need to be made. A car only needs to control a small set of actions (wheels, engine, a couple others I'm missing). A game player only needs to choose from a small set of actions (e.g. place piece at position X, move left/right/up/down/jump).


A human brain also has a limited number of muscles to control to interact with the world.


And a much larger set of decisions.


Go is combinatorially explosive, too.
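
A crude upper bound makes the scale concrete (each of the 361 points is empty, black, or white, ignoring legality and ko):

    print(f"{3**361:.2e}")   # ~1.74e+172: finite, but astronomically large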


Your assertion that the universe has infinite anything is a common mistake. A stupidly large number of states is not infinite.


Well, by that logic, why can't you describe Lee Sedol as applying an "improved tree search" that happens to have really good move selection heuristics and static board evaluation function as well? After all, all good human Go players read ahead a few moves.


The issue is that we don't take the threat seriously, since we believe we have plenty of time to solve it. Some don't even believe it's a threat at all (we'll just unplug it, right?).

This "10 years to beat Go" prediction shows that our time estimates are wildly off.


Ah, but it's not just tree search. It uses neural nets, which makes it different from the chess algorithms.


It isn't a tree search algorithm.


@panic I'm sorry, but you don't understand Go.


This reinforces what a bad idea it is to drink and post.



