Hacker News

Carmack is someone who has proven to be an almost unequalled productivity machine when working on medium-difficulty problems... Now, for the first time, we'll see if his approach to problem solving can also work on a truly difficult problem. I agree it's very much an open question.


Is that really true, though? It seems more like he's good at medium-difficulty problems in a narrow subdomain of software development, which is saying something a bit different. I might even say he's good at hard problems within that subdomain. How transferable those skills are is the most salient point. Even assuming peak genius-level intellect (which, I don't know, maybe?), it would still take something like 4 or 5 years to reach expert-level knowledge in such a complex domain.


Agreed. Deep learning has revolutionized AI, and anyone hoping to contribute to AGI is going to have to master DL first, and probably a lot more AI besides, like a variety of probabilistic methods.

That's a challenging learning curve that's not much different from earning a PhD. And then, to stand out in AGI, you're going to have to integrate a dozen kinds of cutting edge components, none of which are anywhere ready for prime time.

At this moment in time, I think any attempt at implementing AGI is going to be half-baked at best. For now, a Siri / Alexa that can do more than answer single questions will be challenging enough.


I actually don't think mastering deep learning is very difficult. There's a gazillion papers and ideas floating around, but the core concepts that actually work (things like batch normalization, gradient descent, dropout, etc.) are all relatively simple. Most of the complexity comes from second-rate scientists pushing their flawed research out into the public as some form of status game.
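To illustrate how simple the working core is, here's a minimal numpy sketch of two of those concepts, inverted dropout and a plain gradient-descent step. Toy example only; the function names and the scalar objective are my own, not from any library:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.5, train=True):
    """Inverted dropout: zero each unit with probability p,
    rescale survivors by 1/(1-p) so the expected value is unchanged."""
    if not train or p == 0.0:
        return x
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask

def sgd_step(w, grad, lr=0.1):
    """One plain gradient-descent update."""
    return w - lr * grad

# Minimize f(w) = (w - 3)^2, whose gradient is 2*(w - 3).
w = 0.0
for _ in range(100):
    w = sgd_step(w, 2.0 * (w - 3.0))
print(round(w, 3))  # converges to the minimum at w = 3
```

At inference time (`train=False`) dropout is the identity, which is exactly why the train-time rescaling is needed.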


> [...] but the core concepts that actually work (things like batch normalization, gradient descent, dropout, etc.) are all relatively simple.

They may be simple, but why they work is still debated. For example, dropout is not really used much in recent CNN architectures, and it's only, I don't know, ~5 years old? So people don't even agree on what the core concepts are ...


Sure, this is true. I just threw dropout in there without thinking much about it. The point is that even if we include the techniques that have been replaced by newer ones, the total number of techniques is small. Also, if you're learning deep learning for the first time, understanding why dropout was used, and then how batch normalization came to replace it, is key to understanding neural networks. The same goes for network architectures: tracing the evolution of CNNs from VGG16 to ResNet, and why ResNet is better, exposes you to the vanishing gradient problem, shows how the thinking evolved, and gives hints about what could come next / builds intuition for the design of deep neural nets.
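The vanishing-gradient point can be shown with a toy scalar model (an illustration I made up, not a real network): backprop through a plain chain multiplies the gradient by each layer's derivative, so many small factors shrink it to nothing, while a ResNet-style skip connection y = x + f(x) adds 1 to each factor and keeps it alive.

```python
def grad_through_chain(n_layers, layer_grad=0.5, residual=False):
    """Backprop factor through n stacked layers (scalar toy model).

    Plain chain: each layer contributes a factor of layer_grad.
    Residual block y = x + f(x): the factor becomes 1 + layer_grad.
    """
    factor = (1.0 + layer_grad) if residual else layer_grad
    return factor ** n_layers

print(grad_through_chain(30))                 # 0.5**30: gradient vanishes
print(grad_through_chain(30, residual=True))  # 1.5**30: gradient survives
```

Real networks are higher-dimensional and the factors are Jacobians, but the multiplicative-vs-additive intuition is the same.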


For anyone unfamiliar with all but the most trivial details, do you have some good papers to recommend, to save us from wading through all the rest?


Get the basics of linear algebra down: eigenvectors, eigenvalues. Nail down matrix factorization, principal components, and the relationship between the two.
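That factorization/PCA relationship is easy to verify numerically. Here's a small numpy sketch (random data, my own variable names) showing that eigendecomposition of the covariance matrix and SVD of the centered data give the same principal variances:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)                      # center the data

# Route 1: eigendecomposition of the covariance matrix.
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order

# Route 2: SVD of the centered data matrix (a matrix factorization).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vals = s**2 / (len(Xc) - 1)              # singular values, descending

# Same principal variances, up to ordering (eigenvectors match up to sign).
print(np.allclose(sorted(eigvals, reverse=True), svd_vals))
```

This is why PCA implementations typically use the SVD directly instead of forming the covariance matrix.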

Learn softmax, the logit function, and the different activation functions, and when to use each. Learn the difference between classification, binary classification, multi-label prediction, etc. They're all similar, just using a few different functions in the neural net.
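For concreteness, a minimal numpy sketch of those functions: sigmoid for binary classification, softmax for multi-class, and logit as the inverse of sigmoid (the "raw score" a network emits before the squashing function):

```python
import numpy as np

def sigmoid(z):
    """Binary classification: logistic function, the inverse of logit."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    """Log-odds: maps a probability back to an unbounded score."""
    return np.log(p / (1.0 - p))

def softmax(z):
    """Multi-class classification: scores -> probability distribution."""
    e = np.exp(z - z.max())      # subtract max for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])
p = softmax(z)
print(p.sum())                        # 1.0: a valid distribution
print(round(logit(sigmoid(0.7)), 6))  # 0.7: logit inverts sigmoid
```

Multi-label prediction is then just an independent sigmoid per label instead of one softmax over all of them.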

After this, go through some optimization theory and learn the different algorithms for optimizing neural nets, e.g. Adam vs. RMSProp.
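A rough sketch of the difference between the two, on a toy scalar objective f(w) = w^2 with the usual default hyperparameters (illustration only, not tuned): RMSProp scales the step by a running average of squared gradients, and Adam adds momentum plus bias correction on top.

```python
import numpy as np

def rmsprop_step(w, g, v, lr=0.01, beta=0.9, eps=1e-8):
    """RMSProp: divide the step by an RMS of recent gradients."""
    v = beta * v + (1 - beta) * g**2
    return w - lr * g / (np.sqrt(v) + eps), v

def adam_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: RMSProp's scaling plus momentum and bias correction."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g**2
    m_hat = m / (1 - b1**t)          # bias-corrected first moment
    v_hat = v / (1 - b2**t)          # bias-corrected second moment
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(w) = w^2 (gradient 2w) with each optimizer.
w_r, v = 2.0, 0.0
w_a, m, va = 2.0, 0.0, 0.0
for t in range(1, 1001):
    w_r, v = rmsprop_step(w_r, 2 * w_r, v)
    w_a, m, va = adam_step(w_a, 2 * w_a, m, va, t)
print(round(w_r, 3), round(w_a, 3))  # both end up near the minimum at 0
```

Note how both take steps of roughly size lr regardless of the gradient's raw magnitude; that per-parameter scaling is the whole point of the adaptive methods.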

Then I would get a list of all the top network architectures and go through their papers, chronologically, starting at ~2012. Basically all the network architectures build on each other. So take the first good working deep CNN (AlexNet) and find out why it worked. Then move to VGG: why did that one work? What problems were solved? Then move onwards.

^Do this for computer vision, then again for NLP (word vectors) and transformers (BERT, XLNet, etc.).

Then you're done.

There's also GANs etc., but that stuff is extra.

From there, choose whatever specialty you wanna research, and just grab the state of the art.



Yeah but could he build a model railroad set as good as Rod Stewart’s in under five years?


Nobody will want to work in a research program under this guy. He's just way too mean, and being mean is a no-go for researchers. Also, as other commenters have mentioned, this is a demotion.


I hadn't been aware he was mean. What are examples of this meanness?


He took his cat to the animal shelter because it was getting old. Justice for Mitzi!



