Agreed. Deep learning has revolutionized AI, and anyone hoping to contribute to AGI is going to have to master DL first, and probably a lot more of AI besides, like a variety of probabilistic methods.
That's a challenging learning curve that's not much different from earning a PhD. And then, to stand out in AGI, you're going to have to integrate a dozen kinds of cutting edge components, none of which are anywhere ready for prime time.
At this moment in time, I think any attempt at implementing AGI is going to be half-baked at best. For now, a Siri / Alexa that can do more than answer single questions will be challenging enough.
I actually don't think mastering deep learning is very difficult. There's a gazillion papers and ideas floating around, but the core concepts that actually work, things like batch normalization, gradient descent, dropout, etc., are all relatively simple. Most of the complexity comes from second-rate scientists pushing their flawed research out into the public as some form of status game.
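To illustrate how simple those core pieces really are, here's a rough numpy sketch of (inverted) dropout and a batch-norm forward pass. This is just an illustration of the math, not a reference implementation; real frameworks add running statistics, backward passes, and other machinery.

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each unit with prob p, rescale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.default_rng(1).normal(5.0, 2.0, size=(256, 4))
y = batch_norm(x)
print(y.mean(axis=0).round(6))   # ~0 per feature after normalization
print(y.std(axis=0).round(3))    # ~1 per feature

h = dropout(np.ones(8), p=0.5)
print(h)                          # survivors scaled to 2.0, the rest zeroed
```

Each of these "breakthroughs" is a handful of lines; the hard part is the intuition for when and why to use them, not the code.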
> [...] but the core concepts, that actually work, things like batch normalization, gradient descent, dropout, etc are all relatively simple.
They may be simple, but why they work is controversial. For example, dropout is not really used much in recent CNN architectures, and it's just - I don't know - ~5 years old? So people don't even agree on what the core concepts are ...
Sure, this is true. I just threw dropout in there without thinking much about it. The point is that even if we include the techniques that have been replaced by newer ones, the total number of techniques is small. Also, if you're learning deep learning for the first time, understanding why dropout was used, and then how batch normalization came to replace it, is key to understanding neural networks. The same can be seen in network architectures: tracing the evolution of CNNs from VGG16 -> ResNet, and why ResNet is better, exposes one to the vanishing gradient problem, shows how the thought evolution happened, and gives hints to what could be next / builds intuition for the design of deep neural nets.
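The vanishing gradient point can be shown in a few lines of numpy. This is a toy scalar model, not a real ResNet: backprop multiplies per-layer local derivatives, so with plain layers the product of many small derivatives collapses to ~0, while a residual block y = x + f(x) has local derivative 1 + f'(x), which stays near 1.

```python
import numpy as np

rng = np.random.default_rng(0)
depth = 50
# Small random per-layer derivatives f'(x), as if from 50 weak layers.
local = rng.uniform(-0.3, 0.3, size=depth)

plain_grad = np.prod(local)           # y = f(x): chain rule multiplies f'(x)
residual_grad = np.prod(1.0 + local)  # y = x + f(x): multiplies 1 + f'(x)

print(abs(plain_grad))     # astronomically small: the gradient has vanished
print(abs(residual_grad))  # order ~1: the skip connection keeps it alive
```

That one identity term is essentially why ResNets train at depths where VGG-style stacks stall.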
Get some basics of linear algebra down. Eigenvectors, Eigenvalues. Nail down Matrix Factorization, Principal Components, and the relationship between the two.
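Here's a quick numpy sketch of how those pieces connect: the principal components are the eigenvectors of the data's covariance matrix, and the SVD factorization of the centered data gives you the same thing (the data matrix here is made up for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy dataset: correlated 3-d points (200 samples).
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 2.0, 0.0],
                                          [0.0, 1.0, 0.5]])
Xc = X - X.mean(axis=0)                  # center the data first

# Route 1: eigendecomposition of the covariance matrix.
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Route 2: SVD of the centered data (a matrix factorization).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Same principal components either way (columns match up to sign),
# and singular values relate to eigenvalues by s^2 / (n-1) = lambda.
print(np.allclose(np.abs(eigvecs), np.abs(Vt.T)))   # True
print(np.allclose(s**2 / (len(Xc) - 1), eigvals))   # True
```

Once this equivalence clicks, PCA, low-rank approximation, and a lot of linear-algebra folklore in DL papers stop looking mysterious.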
Learn softmax, the logit function, and the different activation functions, and when to use them. Know the difference between classification, binary classification, multi-label prediction, etc. They're all similar; the neural net just uses a few different functions.
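The "few different functions" really are few. A minimal numpy sketch of the usual output heads (the example logits are made up):

```python
import numpy as np

def sigmoid(z):
    """Binary classification head: one logit -> P(class 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    """Inverse of the sigmoid: probability -> log-odds."""
    return np.log(p / (1 - p))

def softmax(z):
    """Multi-class head: one logit per class -> probabilities summing to 1."""
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p.round(3), p.sum())     # class probabilities, summing to 1

# Multi-label prediction: independent sigmoids, one per label,
# so the outputs need NOT sum to 1.
print(sigmoid(np.array([2.0, -1.0, 0.5])).round(3))
```

Swap softmax for per-label sigmoids and you've gone from multi-class to multi-label; that's basically the whole difference at the output layer.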
After this, go through some optimization theory and learn the different algorithms for optimizing neural nets, e.g. Adam vs. RMSProp.
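The update rules themselves fit in a few lines. Here's a stripped-down numpy sketch (full-batch, on a toy quadratic; real implementations in frameworks have more knobs, but the core math is this):

```python
import numpy as np

def rmsprop(grad_fn, w, lr=0.01, beta=0.9, eps=1e-8, steps=2000):
    """RMSProp: scale each step by a running RMS of recent gradients."""
    v = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        v = beta * v + (1 - beta) * g**2
        w = w - lr * g / (np.sqrt(v) + eps)
    return w

def adam(grad_fn, w, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, steps=2000):
    """Adam: RMSProp-style scaling plus momentum and bias correction."""
    m, v = np.zeros_like(w), np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g**2
        m_hat = m / (1 - b1**t)          # bias-corrected first moment
        v_hat = v / (1 - b2**t)          # bias-corrected second moment
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Minimize f(w) = ||w - 3||^2; the gradient is 2(w - 3).
grad = lambda w: 2 * (w - 3.0)
w0 = np.array([0.0, 10.0])
print(rmsprop(grad, w0))   # approaches [3, 3]
print(adam(grad, w0))      # approaches [3, 3]
```

Seeing Adam as "RMSProp + momentum + bias correction" makes the whole optimizer zoo easier to keep straight.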
Then I would just get a list of all the top network architectures and go through their papers. Do this chronologically, starting at ~2012. Basically all the network architectures build on each other. So take the first good working deep CNN (AlexNet) and find out why it worked. Then move to VGG: why did that one work? What problems were solved? Then move onwards.
^Do this for computer vision, then again for NLP (word vectors) and transformers (BERT, XLNet, etc.).
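For the word-vector step, the key intuition is that meaning lives in vector geometry. A numpy sketch with made-up 3-d vectors (real embeddings like word2vec or GloVe are learned and typically have 100-300 dimensions; these toy values just make the analogy arithmetic visible):

```python
import numpy as np

# Toy "word vectors", invented for illustration only.
vecs = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.5, 0.9, 0.1]),
    "woman": np.array([0.5, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: how aligned two vectors are, ignoring length."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The famous analogy: king - man + woman should land nearest queen.
target = vecs["king"] - vecs["man"] + vecs["woman"]
best = max(vecs, key=lambda w: cosine(vecs[w], target))
print(best)  # queen
```

Transformers then replace these static lookups with context-dependent vectors, which is a good frame for reading the BERT line of papers.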
Then you're done.
There's also GANs etc., but that stuff is extra.
From there, choose whatever specialty you wanna research, and just grab the state of the art.