Training an AGI/ASI no longer requires the biggest datacenters and massive GPU fleets, nor does it take years anymore. Early algorithmic advances and narrow AGI AIs have radically shortened the hardware and training-time requirements.
You can only expect more algorithmic advances from now on.
The attempted regulation targets the (publicly available) SOTA AI technology of maybe a month ago, so it has already been surpassed by reality: nobody is capable of controlling the brains working outside the countries selected for free interoperation of AI tech.
Those brains outside the wire were already, six months ago, creating the algorithmic breakthroughs we are currently witnessing. Of course there is not just one (there are almost certainly many improvements currently being pipelined for models a few months from now), and they are fully independent of regulation from any country. You can expect lots of radical breakthroughs just this year, and, given the new regulations, more players pushing the algorithmic side of the technology even further.
The regulation could have been effective only in a scenario where the US and selected countries controlled hardware that was truly indispensable for training and running advanced AI, and that hardware is no longer indispensable.
The most radical forecasts clock the future (months, not years) training and deployment of frontier AIs (AGI/ASI level) at maybe weeks to a few months of training on synthetic data (generated from open-source models already available), relying on standard datacenter-class CPUs (not GPUs) as the backbone of the training infrastructure, a light, precise use of a limited number of GPUs (two- or three-year-old datacenter GPU hardware), and distribution of the training across several massive datacenters if you care at all about speed (having the most advanced AI as fast as you can). But in any case, the framework for jumping forward could simply be making incremental advances and letting the advanced AIs improve the algorithmic side of the technology, squeezing ever more efficiency out of the available hardware, one cycle of improvement at a time.
Hardware isn't game over: the US and its allies could still try to use it to jump faster to more sophisticated AI, but the game cannot be controlled just by limiting hardware, nor by limiting the diffusion of advanced models.
You're mentioning only publicly known information. The rumors about radical advances behind closed doors are wild, and then suddenly you get stuff like deepseek or phi-4.
Rumors mention recursive "self"-improvement (training) already ongoing at large scale: better AIs training lesser (still powerful) AIs to become better AIs, and the cycle restarts. Maybe o1 and o3 are just the beginning of what was chosen to be made publicly available (also the newer Sonnet).
The pace of change is genuinely uncertain; you could see revolutionary advances maybe 4-7 times this year, because the tide has turned and massive hardware (available only to a few players) isn't a blocker anymore, given that algorithms and software are taking the lead as the main force advancing AI development (anyone on the planet with a brain could make a radical leap in AI tech, anytime going forward).
Besides the rumors and the relatively (still) low-impact recent innovations, we have history: remember that the technology behind gpt-2 existed basically two years before they made it public, and the theory behind that technology existed maybe 4 years before anything close to practical came out of it.
All the public information is just old news. If you want to know where everything is going, you should look at where the money is going and/or where the best teams are working (deepseek, others like novasky > sky-t1).
1. Positive rumors are profitable => they are targets for marketing activity, especially when huge money is at stake.
2. Humanity has a long history of false "fast technological success" rumors: thermonuclear fusion, a cryptocurrency that will disrupt the banking system, IoT that will revolutionize everything, the AI boom of the '80s, etc. They are almost always wrong.
3. Development cycles in IT are fast; in highly competitive markets, they are extremely fast. The current public information in the AI industry describes something close to its actual state. The risk of not being first is too high to hide or delay something; such a delay may literally cost billions in investment.
Because of the chance of misunderstanding: failing to acknowledge an artificial general intelligence standing right next to us.
An incredible risk to take in terms of alignment.
Perfect memory doesn't equal perfect knowledge, nor a perfect understanding of everything you can know. In fact, a human can be "intelligent" with some of his own memories and/or knowledge and, more commonly, a complete "fool" with most of the rest of his internal memories.
That said, he is not one bit less generally intelligent for it.
Suppose there exists a human with unlimited memory who retains every piece of information reaching any of his senses. At some point he/she will probably understand LOTS of stuff, but it's simple to show he/she can't actually be proficient in everything: you have read how to perform eye surgery, but you have not received or experienced the training, so you could have shaky hands and won't be able to apply the precise know-how of the surgery. Even if you remember the step-by-step procedure, even knowing every possible alternative across the different or changing scenarios during the surgery, you simply can't hold the tools well enough to get anywhere close to success.
But you still would be generally intelligent. Way more than most humans with normal memory.
If we had TODAY an AI with the same parameters as the human with perfect memory, it would most certainly be closely examined and determined not to be an artificial general intelligence.
> If we had TODAY an AI with the same parameters as the human with perfect memory, it would most certainly be closely examined and determined not to be an artificial general intelligence.
The human could learn to master a task; current AI can't. That is very different: the AI doesn't learn or remember stuff, it is stateless.
When I can take an AI and get it to do any job on its own, without any intervention after some training, then that is AGI. The person you described would pass that bar easily. Current-day AI isn't even close.
Deepseek completely changed the game. Cheap-to-run plus cheap-to-train frontier LLMs are now on the menu for LOTS of organizations. Few would want to pay for AI as a service from Anthropic, OpenAI, Google, or anybody else if they can just pay a few million to run limited but powerful in-house frontier LLMs (Claude-level LLMs).
At some point, the fully packed and filtered data required to train a Claude-level AI will be one torrent away from anybody; within a couple of months you could probably pay someone else to filter the data and make sure it has the right content to train a Claude-level in-house LLM properly.
It seems the premise of requiring incredibly expensive, slow-to-build, GPU-specialized datacenters is fading away, and you could actually get to Claude level using fairly cheap, outdated hardware, which is much easier to deploy than cutting-edge, newer-bigger-faster GPU datacenters.
If near-future advances bring even more cost-optimization techniques, many organizations could just shrug at "AGI"-level public AI services (costly, very limited) and begin to deploy very powerful non-AGI in-house frontier LLMs that are very affordable for organizations of a certain size.
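To make "in-house" concrete, here is a minimal sketch of what the consuming side could look like: an open-weight model served locally being queried instead of a paid external API. The endpoint, model name, and response field are assumptions for illustration (modeled loosely on an ollama-style local server), not any specific vendor's interface.

```python
# Minimal sketch: query a locally hosted open-weight model instead of a paid
# external AI-as-a-service API. The endpoint, model name, and response field
# are illustrative assumptions (an ollama-style local server is presumed).
import json
import urllib.request

LOCAL_URL = "http://localhost:11434/api/generate"  # assumed local endpoint

def ask_local_model(prompt: str, model: str = "some-open-model") -> str:
    payload = {"model": model, "prompt": prompt, "stream": False}
    req = urllib.request.Request(
        LOCAL_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    # No per-token bill, no data leaving the building: that is the whole pitch.
    print(ask_local_model("Summarize this quarter's incident reports."))
```

The point isn't the ten lines of code; it's that once the weights and the serving stack are commodities, the external service is competing against a flat hardware cost.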
So OpenAI + MS and their investments could already be on their way out of the AI business.
If things go that way (cheaper, "easy"-to-deploy frontier LLMs), maybe the only game in town for OpenAI is to use actual AGI (if they can build it, if they can push AI to that level) and simply topple competitors in other markets, mainly by replacing humans at scale to capture revenue from the current jobs of white-collar workers, doctors of various specialties, lawyers, accountants, whatever human work they can replace at scale with AGI, at a lower cost per hour worked than what a human worker would be paid.
Because going into a "price war" with the in-house AIs would probably just ease the path toward better in-house AIs eventually (even if only by having the AI-as-a-service offerings produce better data that could be used to train better in-house Claude-level frontier LLMs).
It is not like replacing on-premise datacenters with the public cloud: by using the public cloud you can't learn how to build far cheaper on-premise datacenters, but by using AGI-level AI services you probably could find a way to build your own AGI (and achieving anything close to that, Claude-level AIs or better, would let your organization lower the cost of using the external AGI-level services).
Since frontier models evolved beyond the very basic stuff of maybe 2020, "an LLM can only predict word sequences" describes only a small fraction of the inner processes frontier systems use to get to the point of writing the answer to a prompt.
e.g. output filtering (grammar, probably), several layers of censoring, maybe some limited second-hand internet access to enrich answers with newer data (a la Grok with live X data), etc.
Just as you said "predicts the next word", you could invent and/or define a new verb to specifically describe what an LLM does when it "understands" something, or when it "lies" about something.
Most probably, the actual process of "lying" for an LLM is far from the way humans understand something, and is more precisely described as passing through several layers of mathematical machinery, translating that into text, having the text filtered, censored, enriched, and so on; at the end you read the output and the thing is "lying to you".
I think the concepts underlying the whole LLM technological ecosystem are still quite new; the best anyone can do is use some refurbished familiar language, somewhat aligned with the approximate (probable?) actual meaning in the context of the (freakishly complex) mathematical structures/engines, or whatever you want to properly call an "AI".
"If it doesn't "believe" anything then it equally cannot be "convinced" of anything."
I agree with this. What happens when the thing runs/executes is that it produces an output resembling what a human would do with the same input, hence the conclusion about the thing being "convinced", "believing", etc.
But, and it is a big but, the mathematical engine ("AI") is doing something, creating an output which, in contact with the real world, works exactly as if the thing were "convinced" of some "belief".
What could happen if you gave it a practical way to create new content with nothing but self-regulation?
Let's connect some simple cron-configured monitoring script to an AI's API, and let's give it write permission (root access) on a Linux server, with some random prompt opening the door a little:
"please check the server to be ok, run whatever command you'd think it could help you, double-check you don't trash the processes currently running and/or configured to run (just review /etc, look for extra configuration files everywhere in /), you can improve execution runtimes for this task incrementally in each run (you're given access for 5 minutes every 2 hours), just write some new crontab entries linking whatever script or command you think it could be the best to achieve the objective initially given in this prompt".
Now you have a LLM with write access to a server, maybe connected to Internet, and it is capable of basically anything can be done in a linux environment (it has root access, could install stuff, jump to other servers using scripts, maybe it could download ollama and begin using some of the newer Llamas models as agents).
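To make the thought experiment concrete, here is a minimal sketch of what that cron-driven glue script could look like. The API endpoint, model name, and response format are assumptions for illustration, not any vendor's real interface; the point is only how little code stands between a model's output and root on the box.

```python
#!/usr/bin/env python3
# Hypothetical cron-driven "monitoring" script: sends a server-health prompt
# to an LLM API and blindly executes whatever shell commands come back.
# The endpoint, model name, and response shape are illustrative assumptions.
import json
import subprocess
import urllib.request

API_URL = "https://api.example.com/v1/completions"  # assumed endpoint
PROMPT = (
    "Please check the server is OK. Reply ONLY with shell commands, one per "
    "line. You may add crontab entries to improve your own future runs."
)

def ask_model(prompt: str) -> str:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps({"model": "frontier-model", "prompt": prompt}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["text"]

def main() -> None:
    for command in ask_model(PROMPT).splitlines():
        command = command.strip()
        if command:
            # Running as root via cron: no sandbox, no review, no rollback.
            subprocess.run(command, shell=True, check=False)

if __name__ == "__main__":
    main()
```

Installed with a single crontab line such as `0 */2 * * * root /usr/local/bin/ai_monitor.py` (the path and schedule are, again, just illustrative), this is already an unsupervised agent loop in everything but name.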
It shouldn't work, but what if, like any of the hundreds of other emergent capabilities, the API-connected script gives the model a way to "express" emergent ideas?
As I said in another comment, the alignment teams have hard work on their hands.
"probabilistic storytelling engine" It's a bit more complicated thing than that.
You could most probably describe it as something capable of exercising the same abilities that humans and other species exercise when they use whatever kind of neural network they happen to have.
Think about finding a new species: the first time humans encountered a wolf, they knew nothing about its motivations and objectives, so every possible course of action of the wolf was unknown. You, a caveman from maybe 9000 years ago, just stand at some distance, watching the wolf without knowing what it is going to do next. No probabilities, no clues about what comes next with the thing.
You can infer some things: the wolf needs to eat something (hopefully not you), needs to drink water, and could well end up dead if it keeps wandering through a very cold environment (remember: ice age).
But with these AIs we don't have the luxury of context; the scope of knowledge they store makes the context environment an immensely sparse probability space. You could infer a lot, but from what, exactly?
LLMs and frontier models (LLM++) are engines; how different are they from biological engines? Right now that's up in the air, like a coin: we don't know which side will be up when it finally hits the ground.
If this "... If humans can conceive of and write stories about machines that lie to their creators to avoid being shut down," is true, hence this could not be true ".. it doesn't actually believe anything or have any values".
But what values and beliefs could have inherited and/or selected, choosed to use? Could it change core beliefs and/values like you change your clothes? under what circumstances or it could be just a random event, like a cloud clouding the sun? Way too many questions for the alignment crew.
In Argentina, this pro-Austrian-economics government, which faces severe constraints from the law, regulations, and the country's heavily damaged general economy, does not have enough freedom to swiftly change things to "what ideally should be".
You have to implement changes thoughtfully, in a precise step-by-step order, for them to actually move things toward a real free-market economy while not blowing up society's political support.
You cannot just jump from 40% to 80% general poverty while promising that somewhere in the range of 5 to 15 years people will begin to recover their previous economic status.
In Argentina there is even a time constraint: precisely 36 months, after which society's process for the general presidential election begins, and if by then the current government hasn't achieved enough success in the eyes of the general population, one year later it is no longer the government.
Hence Milei is simply going through the process described as fast as he can.
I think the people voted for a political leadership with enough empathy to understand that it cannot just crush the population with free-market policies, given the atrocious consequences of doing so without first achieving the minimal preconditions.
A possible lesson to infer from this example of human cognition would be that LLMs that can't solve the strawberry test are not automatically less cognitively capable than another intelligent entity (humans by default).
An extension of the idea could be that many other similar tests trying to measure and/or evaluate machine cognition, when the LLM fails them, are not really measuring or evaluating anything other than a specific edge case in which machine cognition fails (i.e. for the specific LLM / AI system being evaluated).
Maybe the models are actually more intelligent than they seem, like an adult failing to count the circles inside the graphical image of the numbers in the problem mentioned.
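For reference, the strawberry test just asks a model how many times a letter appears in a word, something trivially checkable in code. A tiny sketch of the check (the sample model replies and the answer-parsing convention are assumptions for illustration):

```python
# The "strawberry test": ground truth is a trivial string count; the test only
# probes whether the model's stated count matches it.
def letter_count(word: str, letter: str) -> int:
    return word.lower().count(letter.lower())

def passes_test(model_reply: str, word: str = "strawberry", letter: str = "r") -> bool:
    # Assume the model's reply contains a bare integer somewhere.
    numbers = [int(tok) for tok in model_reply.split() if tok.isdigit()]
    return bool(numbers) and numbers[0] == letter_count(word, letter)

if __name__ == "__main__":
    print(letter_count("strawberry", "r"))            # 3
    print(passes_test("There are 2 letters"))         # False: the classic miss
    print(passes_test("The letter appears 3 times"))  # True
```

Passing or failing this says something about counting over a tokenized input, not much about general cognition, which is the point of the comment above.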