This is especially significant for all the stuff outside the virtual environment.
Is there a standard method of making a "deploycontainer" from a devcontainer? Something that strips out everything not necessary to execute the application (however that is defined)?
I've had cases where colour was required, e.g. one of my ID stamps was unreadable in black-and-white, and my tax return form has colour sections (it might have been accepted in black-and-white but was certainly clearer in colour)...
I absolutely 100% guarantee that you can submit tax forms printed in black and white. I suspect the IRS is more surprised when they see printed forms that are anything but.
Thing is, I also own a color scanner. It’s just as easy for me to make a color copy of a doc as a B/W copy. That’s pretty common now. If your bank thinks that a form with red lines on it must be an original, then they suck at technology more than most banks.
> If your bank thinks that a form with red lines on it must be an original, then they suck at technology more than most banks.
Be that as it may, I don't have a banking license and wasn't about to turn my nose up at a bank that finally let me open a corporate account after about 4 months of effort.
I am skeptical that "greedflation" is a real phenomenon. Corporations were no less greedy a few years ago. Recent high inflation rates are more likely connected to monetary policy.
Not sure if you have insight into supply chain costs and how they inflated costs during the pandemic, but I, and most people I know who do, understand the truth behind the greedflation dynamic.
No; prices went up fast, but there is nothing to drive them back down (i.e. competition), even with input costs having declined due to monetary policy. Ergo, inflation was used as cover to hold prices high long after their cause had subsided.
Even with no competition you can’t keep putting prices up unless your customers have the means to pay. Competition didn’t suddenly change, it was the means to pay that did.
4. Language prediction training will not get stuck in a local optimum.
Most previous training objectives would also have been better served if the model had developed AGI, but none of them did. There is no reason to expect LLMs not to get stuck in a local optimum as well, and I have seen no good argument for why they wouldn't get stuck like everything else we have tried.
There is very little in terms of rigorous mathematics on the theoretical side of this. All we have are empirics, but everything we have seen so far points to the fact that more compute equals more capabilities. That's what they are referring to in the blog post. This is particularly true for the current generation of models, but if you look at the whole history of modern computing, the law roughly holds up over the last century. Following this trend, we can extrapolate that we will reach computers with raw compute power similar to the human brain for under $1000 within the next two decades.
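For what it's worth, the "under $1000 within two decades" figure is just a back-of-envelope extrapolation. Here is a rough sketch of it; every number below is an assumption (estimates of the brain's compute alone span several orders of magnitude):

    # Back-of-envelope extrapolation only; all inputs are rough assumptions.
    import math

    brain_ops_per_s = 1e16      # assumed estimate of the brain's raw compute
    budget_usd = 1000
    flops_per_usd = 1e11        # assumed: ~1e14 FLOP/s of consumer GPU per ~$1000
    doubling_years = 2.5        # assumed doubling time of FLOPS per dollar

    target = brain_ops_per_s / budget_usd           # FLOP/s needed per dollar
    doublings = math.log2(target / flops_per_usd)   # ~6.6 doublings needed
    print(f"~{doublings * doubling_years:.0f} years at this trend")  # ~17 years

Change any of the assumed inputs by an order of magnitude and the answer shifts by only a few years, which is why the trend argument is usually stated in decades rather than years.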
It's not just the volume of original data that matters here. Empirically, performance scales roughly with the product (model parameters)*(training data)*(epochs), which is to say with total training compute. If you increase any one of those, you can be confident the model will improve. In the short term, training data volume and quality have given a lot of improvements (especially recently), but in the long run it has always been model size and total time spent training that delivered the gains. In other words: it doesn't matter how you allocate your extra compute budget as long as you spend it.
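A toy illustration of that framing, using the common ~6-FLOPs-per-parameter-per-token rule of thumb for dense transformers; the model and dataset sizes below are made up:

    # Same compute budget, allocated differently; numbers are illustrative.
    def train_flops(params, dataset_tokens, epochs):
        tokens_seen = dataset_tokens * epochs
        return 6 * params * tokens_seen   # ~6 FLOPs per parameter per token

    big_model  = train_flops(params=70e9, dataset_tokens=1.4e12, epochs=1)
    more_steps = train_flops(params=7e9,  dataset_tokens=1.4e12, epochs=10)
    print(f"{big_model:.2e} vs {more_steps:.2e} FLOPs")  # both ~5.9e23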
In smaller models, not having enough training data for the model size leads to overfitting. The model predicts the training data better than ever, but generalizes poorly and performs worse on new inputs.
Is there any reason to think the same thing wouldn't happen in billion parameter LLMs?
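(To make the small-model failure mode concrete, here is a toy numpy version of it; the sine target and noise level are arbitrary choices. The training error collapses once the parameter count matches the data, while the held-out error typically grows.)

    import numpy as np
    from numpy.polynomial import Polynomial

    rng = np.random.default_rng(0)
    x_train = np.linspace(0, 1, 10)
    y_train = np.sin(2 * np.pi * x_train) + 0.3 * rng.standard_normal(10)
    x_test = np.linspace(0, 1, 200)
    y_test = np.sin(2 * np.pi * x_test)

    for degree in (3, 9):   # modest parameter count vs. one per data point
        fit = Polynomial.fit(x_train, y_train, degree)
        train_mse = np.mean((fit(x_train) - y_train) ** 2)
        test_mse = np.mean((fit(x_test) - y_test) ** 2)
        # train error collapses for degree 9; test error typically grows
        print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")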
This happens in smaller models because you reach parameter saturation very quickly. In modern LLMs with current datasets, it is very hard to even reach that point, because the entire training run amounts to just a handful of epochs (sometimes even less than one). It would take tremendous resources and time to overtrain GPT-4 the way you would overtrain a convnet from the last decade.
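The arithmetic behind "sometimes even less than one" epoch, with made-up but plausible magnitudes (the real figures for GPT-4-class models aren't public):

    # Illustrative numbers only.
    dataset_tokens = 15e12    # assumed size of a filtered web-scale corpus
    training_tokens = 10e12   # assumed tokens actually seen during training
    print(f"LLM epochs: {training_tokens / dataset_tokens:.2f}")  # ~0.67

    # Contrast with a convnet-era setup, where ~90 epochs over ImageNet
    # was a typical schedule:
    imagenet_images = 1.3e6
    images_seen = imagenet_images * 90
    print(f"convnet epochs: {images_seen / imagenet_images:.0f}")  # 90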
True, but from general theory you should also expect any function approximator to exhibit intelligence when exposed to enough data points from humans; the only question is the speed of convergence. In that sense we do have a guarantee that it will reach human ability.
It's a bit more complicated than that. Your argument is essentially the universal approximation theorem applied to perceptrons with one hidden layer. Yes, such a model can approximate any algorithm to arbitrary precision (which by extension includes the human mind), but it is not computationally efficient. That's why people came up with things like convolution or the transformer. For these architectures it is much harder to say where the limits are, because the mathematical analysis of their basic properties is infinitely more complex.
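A hand-built toy of that universal approximation point: one hidden layer of sigmoids, with weights chosen by hand rather than learned, can match an arbitrary continuous 1-D function, but the width you need grows with the precision you want, which is exactly the efficiency problem. The target function and knot placement below are arbitrary choices.

    import numpy as np

    def target(x):
        return np.sin(3 * x) + 0.3 * x**2   # arbitrary smooth function

    def shallow_net(x, width, steepness=500.0):
        # One hidden sigmoid layer: each unit is a steep "step" at a knot,
        # and the output weights are the jumps of a staircase approximation.
        knots = np.linspace(0.0, 2.0, width)
        jumps = np.diff(target(knots), prepend=0.0)
        z = np.clip(steepness * (x[:, None] - knots[None, :]), -60, 60)
        hidden = 1.0 / (1.0 + np.exp(-z))    # sigmoid activations
        return hidden @ jumps

    x = np.linspace(0.0, 2.0, 2000)
    for width in (8, 32, 128):
        err = np.max(np.abs(shallow_net(x, width) - target(x)))
        print(f"width {width:4d}: max error {err:.3f}")   # error shrinks with width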
Where did you get that from? It seems pretty clear to me that language models are intended to be a component in a larger suite of software, composed to create AGI. See: DALL-E and Whisper for existing software that it composes with.
The comment said that LLMs are the path to AGI, which implies at least that they're a huge part of the AGI soup you're talking about. I could maybe see AGI emerging from lots of LLMs and other tools in a huge network, but probably not from an LLM with calculators hooked up to it.
You're arguing that LLMs would be a good user interface for AGI...
Whether that's true or not, I don't think that's what the previous post was referring to. The question is, if you start with today's LLMs and progressively improve them, do you arrive at AGI?
(I think it's pretty obvious the answer is no -- LLMs don't even have an intelligence part to improve on. A hypothetical AGI might somehow use an LLM as part of a language interface subsystem, but the general intelligence would be outside the LLM. An AGI might also use speakers and mics but those don't give us a path to AGI either.)
The comment I was replying to was referencing OpenAI's use of the phrase "the path to AGI". Natural language is an essential interface to AGI, and OpenAI recognizes that. LLMs are a great way to interface with natural language, and OpenAI recognizes that.
While it's kind of nuts how far OpenAI has pushed language models, even as an outside observer it's obvious that OpenAI is not banking on LLMs achieving AGI, contrary to what the person I was replying to said. Lots of effort is being put into integrating with outside sources of knowledge (RAG), outside tools for reasoning and calculation, etc. That's not LLMs as AGI, but it is LLMs as a step on the path to AGI.
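For the RAG part, the pattern is simple enough to sketch. Everything below is a toy stand-in: the keyword-overlap scorer replaces a real embedding model and vector store, and the final print replaces the actual LLM call.

    documents = [
        "The invoice from March was paid on April 2nd.",
        "Support tickets are answered within two business days.",
        "Refunds are issued to the original payment method.",
    ]

    def score(query, doc):
        # Toy retriever: count shared lowercase words.
        return len(set(query.lower().split()) & set(doc.lower().split()))

    def retrieve(query, k=2):
        ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
        return ranked[:k]

    question = "How are refunds issued?"
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    print(prompt)   # this prompt, not the bare question, goes to the LLM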
I don’t know if they are or not, but I’m not sure how anyone could be so certain that they’re not that they find the mere idea cringeworthy. Unless you feel you have some specific perspective on it that’s escaped their army of researchers?
Because AI researchers have been "on the path to AGI" several times before, until the hype died down and the limitations became apparent. And because nobody knows what it would take to create AGI. But to put a little more behind that: evolution didn't start with language models. It evolved everything else first, until humans had the ability to invent language. Current AI is going about it completely backwards from how biology did it. Now maybe robotics is doing a little better on that front.
I mean, if you're using LLM as a stand-in for multi-modal models, and you're not disallowing things like a self-referential processing loop, a memory extraction process, etc, it's not so far fetched. There might be multiple databases and a score of worker processes running in the background, but the core will come from a sequence model being run in a loop.
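A rough sketch of what that shape looks like, with stubs where a real system would call a model, a vector database, and background workers; nothing here is meant as anyone's actual architecture.

    def sequence_model(prompt):
        # Stub standing in for a real LLM call.
        return f"(model output for: {prompt[-60:]})"

    def extract_memory(text):
        # Stub memory-extraction worker: keep a compressed note.
        return text[:80]

    memory = []                               # stand-in for the databases
    observation = "user: summarize yesterday's meeting notes"

    for step in range(3):                     # self-referential processing loop
        context = "\n".join(memory[-5:])      # recall recent memories
        output = sequence_model(f"{context}\n{observation}")
        memory.append(extract_memory(output)) # write back to the memory store
        observation = output                  # the model consumes its own output

    print(memory)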
After all these years of VC funding, I can't believe there is still "street cred" attached to it. Not contradicting you; I'm just flabbergasted that people haven't caught on to their track record.