Hacker News

The recent progress in “AI” is almost entirely due to advancements in large language models (LLMs).

In some aspects the hype is real; LLMs are extraordinarily performant for a wide range of previously hard tasks.

On the other hand, people seem to equate these advancements with “strong AI” (or AGI). We are one step closer, sure, but the calculator was also a step forward.

We’ve created a mirror of all (most) human knowledge, queryable via natural language. People look into this mirror and see themselves, sometimes things greater than themselves.

This mirror tricks us into thinking the machine will soon replace us. It’s so accurate, why would it not?

Fortunately, it’s just a mirror, and we’re the bear in the woods seeing its reflection for the first time. Scared and ready to fight.

If you focus on the technology (LLMs) and treat with caution anyone hyping “AI” generally, you can create a filter for what’s real and what should be questioned.




There are research papers left and right.

Stuff like new architectures, and work on extending LLMs to be multimodal.

Then we have stuff like InstructGPT, ML models for robots, lots and lots of research from Nvidia on virtual simulation and transfer to the real world; digital twins are also a relevant area for AGI.

Object detection is also much better and has nothing to do with LLMs. Segment Anything from FB, for example.

Whisper and Stable Diffusion are also not LLMs.

There are a ton of puzzle pieces slowly falling into place left and right.


They may not be "large" in the same sense that GPT-4 is "large" but apart from the simulator stuff, every single one of the models you mentioned is transformer-based. Every one of them basically includes encoders to project different modes of information (images and audio) into a "language-like" space so that it can be compared with and mapped to and from text. I think it's fair to say that language models, if not LLMs, unlocked a surprising amount of power.
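A toy sketch of what "projecting into a language-like space" means (this is not any specific model's code; the dimensions and the random weights are purely illustrative stand-ins for a learned projection): a linear map takes image features into the same vector space as text embeddings, where similarity can then be computed.

```python
import math
import random

random.seed(0)

TEXT_DIM = 8    # toy dimensionality of the shared "language-like" space
IMAGE_DIM = 12  # toy dimensionality of raw image features

# In a real model these weights are learned during training;
# here they are random, just to show the shape of the mapping.
W = [[random.gauss(0, 1) for _ in range(IMAGE_DIM)] for _ in range(TEXT_DIM)]

def project(image_features):
    """Linearly map image features into the text embedding space."""
    return [sum(w * x for w, x in zip(row, image_features)) for row in W]

def cosine(a, b):
    """Cosine similarity between two vectors in the shared space."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

image_features = [random.gauss(0, 1) for _ in range(IMAGE_DIM)]
text_embedding = [random.gauss(0, 1) for _ in range(TEXT_DIM)]

projected = project(image_features)
print(len(projected))  # image features now live in the 8-dim text space
print(-1.0 <= cosine(projected, text_embedding) <= 1.0)
```

Once both modalities live in one space, "compare image to caption" reduces to a vector-similarity lookup, which is the basic trick behind CLIP-style multimodal models.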


"There are a ton of puzzle peaces slowly falling in place left and right."

Yet, we do not seem to have a very good understanding of how many pieces there are in the puzzle.


True.

But I feel well entertained watching them fall. I like using them and experimenting with them.

But it also shows the road ahead quite clearly. For example, where is the money coming from? From millions of people paying for GitHub Copilot, for example.

How is it sold? Via web UI, API, and cloud providers.

Digital twins will also play a huge role in this as a bridge between AGI and the real world.


The issue here is: "It's complicated."

For example, looking at the mechanical replacement of human strength in the 1800s and 1900s shows that the human hardship costs were real. The labor wars in the US are a good example of this. The process of mechanization shifted power into the hands of the capitalists, and it was only wrested back with blood.

The real key to the future with AI will be the question of generalization. Multimodal AI does show a reasonable amount of ability at predicting real-world events. For example, show a picture of a kid opening a bike and ask what comes next in image form, and the AI will return a picture of the kid riding a bike. This ability to make reasonable predictions based on sets of 'real world' input is not something we've had in previous generations of computer systems. Again, if these systems generalize well, rapidly become cheaper, and enable the capitalist class to gain more wealth, expect their use to explode at a near-exponential rate.

Very few reasonably educated people say "AI will never reach human ability"; the only question really being asked is when, and in a lot of people's eyes, "when" has moved much closer.


A definite "maybe": a model can only return elements present in the training material. That is a powerful formula, but not "everything". Blurring the story of the child with prediction as a general capability is moving into "deception technique" territory, consciously or not, IMHO.


People use "AGI" in different ways. The term has vague meaning. Some mean true intelligence. Some mean it can wash their dishes.

That said, technology in this area has generally improved at an exponential pace. So, where will we be in 5 years?
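For a sense of what "exponential" implies here, a back-of-the-envelope compounding calculation (the 12-month doubling time is purely an assumed figure for illustration, not a measured rate):

```python
# Toy compounding: under an ASSUMED capability doubling time of 12 months,
# 5 years of exponential improvement gives 2**5 = 32x.
years = 5
doubling_time_years = 1.0  # hypothetical, for illustration only
factor = 2 ** (years / doubling_time_years)
print(factor)  # 32.0
```

The point is only that modest-sounding doubling times compound into large multipliers over a few years; the actual rate for AI systems is an open question.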


> We are one step closer, sure, but the calculator was also a step forward.

Even that isn’t particularly clear, I don’t think. A speculative future AGI probably won’t be a fancy LLM, or at least there’s no particular reason to think it would be.





