Training AI to Do Everything in the Digital Universe (singularityhub.com)
149 points by jonbaer on Dec 26, 2016 | 19 comments


Interesting read, thank you for sharing. The author summarizes the thesis of the OpenAI team as: by exposing AIs to a variety of experiences, the AIs will learn flexible problem-solving skills. I'm not convinced this is true. Without imparting to the AI some mechanism for reasoning across experiences (e.g., reasoning through analogy), the AI will simply be trained on many specific experiences. How does the AI make the leap from being trained on these specific experiences to abstracting and drawing comparisons between them? This ability is crucial to generalizable AI.

The author mentions transfer learning but sort of glosses over it. She writes, "And according to OpenAI, we’re slowly getting there: some of their agents already show signs of transferring some learning from one driving game to another." Which signs? How much is "some"? Driving game to driving game is interesting, but what about going from driving a car to driving a boat? I'd be interested to hear more about this.
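For what it's worth, the standard transfer-learning recipe in vision-based RL is to reuse a network's lower "feature" layers across tasks and retrain only the task-specific head. A minimal sketch (PyTorch, all names and sizes hypothetical, not OpenAI's actual setup):

    import torch.nn as nn

    # Hypothetical policy net: a shared visual trunk plus a per-game head.
    class DrivingPolicy(nn.Module):
        def __init__(self, n_actions):
            super().__init__()
            self.trunk = nn.Sequential(                    # generic visual features
                nn.Conv2d(3, 16, 8, stride=4), nn.ReLU(),  # expects 84x84 RGB frames
                nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            self.head = nn.Linear(32 * 9 * 9, n_actions)   # task-specific layer

        def forward(self, x):
            return self.head(self.trunk(x))

    policy_a = DrivingPolicy(n_actions=4)   # ...train on driving game A...
    policy_b = DrivingPolicy(n_actions=6)   # new game, different action set
    policy_b.trunk.load_state_dict(policy_a.trunk.state_dict())  # reuse features
    for p in policy_b.trunk.parameters():
        p.requires_grad = False             # fine-tune only the new head

"Transferring some learning" presumably means the shared trunk gives game B a head start; car-to-boat would be a real test of whether those features are that general.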


I just don't trust general, high-level descriptions. People have been going at this goal for decades without much provable success on general intelligence. So when I read generalities, I think they're no further along, just still trying. But when someone gives you a gory technical breakdown of a subproblem, that's when you know they've got something and they're serious.


I think the concept of reasoning through analogy could work kind of like the "phone a friend" lifeline on game shows. If an AI had a directory of task-specific agents, it could call on those agents to provide "advice". A rough sketch of what I mean is below.
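A toy sketch of that "phone a friend" directory (all names here hypothetical; a real system would weight each specialist's advice by how similar its training task is to the current one):

    from typing import Callable, Dict

    Advice = Dict[str, float]  # action -> recommended score

    class AgentDirectory:
        def __init__(self):
            self.specialists: Dict[str, Callable[[object], Advice]] = {}

        def register(self, task: str, agent: Callable[[object], Advice]):
            self.specialists[task] = agent

        def phone_friends(self, observation) -> Advice:
            # Pool every specialist's recommendation for this observation.
            combined: Advice = {}
            for agent in self.specialists.values():
                for action, score in agent(observation).items():
                    combined[action] = combined.get(action, 0.0) + score
            return combined

    directory = AgentDirectory()
    directory.register("driving", lambda obs: {"steer_left": 0.2, "accelerate": 0.8})
    directory.register("boating", lambda obs: {"steer_left": 0.6, "accelerate": 0.4})
    advice = directory.phone_friends(observation=None)
    print(max(advice, key=advice.get))  # -> "accelerate"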


This is a really interesting idea, thanks for sharing!


I'm not an expert on generalized A.I. training, but it seems like training on audio and visual input from games suffers the same scaling problem AlphaGo had with specific tasks. To train an A.I. to play a complex RPG like The Witcher, I imagine you would first need to train it on a horse-riding game, a running game, a weather-reaction game, a free-fighting game, a trading game, and maybe a few hundred thousand other games, each for a couple million rounds of trial and error. However, it also seems that humans don't need this amount of reinforcement training data to quickly understand the complex mechanics of life. I wonder if there is any comparison of the number of trials and errors a baby needs versus what an A.I. needs, using reinforcement learning, to stand up and walk under a similar gravity and muscle-group setup?


Well, babies don't learn to stand and walk (and many other skills) from scratch by trial and error. The hard part of constructing a walking machine has already been solved by millions of years of evolution.

Indeed, a calf can walk within hours of being born. A human infant might need more time than a calf in part because walking on two legs is harder than walking on four, but much of the difference simply comes down to the fact that humans are born so early that a lot of predetermined brain development has not yet occurred (the muscles and bones are also too weak). One study [1] found that across various mammalian species, walking is learned a predictable amount of time after conception (as opposed to time after birth).

There's surely a continuum between brain functions that are completely hard-wired and those completely learned from scratch. I think it's accurate to say that for many functions, learning is used as a form of adaptive refinement to finalize specific predetermined neural programs. But the exact interaction between learning and pre-programming isn't well understood in most cases.

[1] http://www.pnas.org/content/106/51/21889.abstract


There's a third factor: memory. I.e., a somewhat static NN can still read input from and write output to a more dynamic memory.

E.g. you could have basic feature detection (edges, or maybe even something relating to facial features) prebaked into your visual system. General object categorization gets learned by the visual system, while recognizing locations where you have been before requires access to memory.

Of course in the human brain memory is just another big web of neurons with different tradeoffs, but in software you can use other things.
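A toy version of that split, with a frozen "feature detector" and a growing key-value memory (numpy; everything here is a contrived illustration, not any particular architecture):

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 16))   # frozen weights: the "static NN"

    def encode(x):
        return np.tanh(W @ x)          # same features every call, never retrained

    class Memory:
        def __init__(self):
            self.keys, self.values = [], []

        def write(self, x, label):     # store an encoded experience
            self.keys.append(encode(x))
            self.values.append(label)

        def read(self, x):             # content-based lookup: most similar key wins
            q = encode(x)
            sims = [float(q @ k) for k in self.keys]
            return self.values[int(np.argmax(sims))]

    mem = Memory()
    mem.write(rng.standard_normal(16), "home")
    mem.write(rng.standard_normal(16), "office")
    print(mem.read(rng.standard_normal(16)))  # recalls whichever place looks closest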


It takes new humans 9-12 months to walk; it takes new humans 3-4 years to talk in complex sentences that are mostly grammatically correct and say things most people can understand.

The usual argument in favour of A.I. goes something like this: each new human has to learn these things for themselves, whereas A.I. only has to learn to do it once, ever.


I'm curious about the total number of image recognition neural networks that have ever been trained...


Each instance of an AI only has to do it once, which is an important distinction, but the argument still holds. E.g. once we get to the point where a human-level housekeeping robot is possible, you won't have to wait 10 years after purchasing it for it to do useful things in your house :)


If an AI is duplicated, so that a new instance is created (pro-creation), then surely the new instance already knows what the parent instance learnt?
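In current frameworks, at least, that's literally just a parameter copy. A minimal sketch (PyTorch, toy model standing in for a trained agent):

    import copy
    import torch.nn as nn

    parent = nn.Linear(4, 2)        # stands in for a fully trained agent
    child = copy.deepcopy(parent)   # offspring starts with everything the
                                    # parent learnt, rather than from scratch
    assert all((a == b).all() for a, b in
               zip(parent.state_dict().values(), child.state_dict().values()))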


This reminds me so much of Ted Chiang's story "The Lifecycle of Software Objects". http://subterraneanpress.com/magazine/fall_2010/fiction_the_...


Damn I looked it up to see if that was a real game. haha, still reading

edit: holy christ look at the size of that scroll bar nvm


it's a great short story, well worth the time


Humans have the drive to go about their lives because their bodies make various demands. And with everything they do, humans accumulate Ego, which in turn feeds their drive to learn, explore, do things, and define their behaviour. Humans are self-learning systems. The rewards and penalties are also self-created.

Unless general AI systems start to model the above, which is what creates self-learning, they will just be specific systems with a tiny aspect of the human world ingrained.


You don't need to program an Ego to program a drive to learn new things, just a procedure to calculate the expected value of the new information gained from engaging in some candidate training activity.
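A toy version of that procedure: approximate the expected value of information from each candidate activity by the agent's own prediction error there, and pick the activity it understands least (all names and numbers contrived):

    import numpy as np

    rng = np.random.default_rng(1)

    activities = {
        "driving_game": lambda: rng.normal(0.0, 0.1),  # nearly mastered
        "boating_game": lambda: rng.normal(0.0, 2.0),  # still surprising
    }
    predictions = {name: 0.0 for name in activities}   # agent's outcome model

    def expected_info_value(name, n=100):
        # Mean squared prediction error as a stand-in for information gain.
        return np.mean([(activities[name]() - predictions[name]) ** 2
                        for _ in range(n)])

    print(max(activities, key=expected_info_value))  # -> "boating_game"

This is essentially the curiosity / intrinsic-motivation trick from the RL literature; no Ego required.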


Self-learning is brought about by Ego. Reward/penalty is either self-determined or determined by the environment. Without the self-learning capability that lets humans learn on their own, within the limitations imposed by the environment, by their Ego, and by their body (which is driven by survival), whatever the AI learns has to be taught in some way by training, by reward/penalty imposed by us. Then it will always be a poor, limited imitation of humans, not an intelligent being.


>In other words, the VNC acts like the AI’s eyes and hands.

Is this lossless? If not, would the compression potentially create non-deterministic simulations?
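For context, this is roughly the interaction loop from the Universe README: the observation is the raw VNC screen, and actions go back over the same connection as keyboard/mouse events (environment name from their docs; treat the details as approximate):

    import gym
    import universe  # registers the flashgames.* environments

    env = gym.make('flashgames.DuskDrive-v0')
    env.configure(remotes=1)      # one VNC-connected remote environment
    observation_n = env.reset()

    while True:
        # Hold the up-arrow key down in every connected environment.
        action_n = [[('KeyEvent', 'ArrowUp', True)] for _ in observation_n]
        observation_n, reward_n, done_n, info = env.step(action_n)
        env.render()

Whether the pixels survive that round trip losslessly presumably depends on which VNC encoding is in use; some are lossless and some are not.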


Probably the most useful thing they've done is to get permission to access all those games from an application.



