
Wow this is 19,000 words. I like his summary at the end:

At some level it’s a great example of the fundamental scientific fact that large numbers of simple computational elements can do remarkable and unexpected things.

And this:

... But it’s amazing how human-like the results are. And as I’ve discussed, this suggests something that’s at least scientifically very important: that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought.

Yeah I've been thinking along these lines. ChatGPT is telling us something about language or thought, we just haven't got to the bottom of what it is yet. Something along the lines of 'with enough data it's easier to model than we expected'.




I saw a great comment here, and I will repeat it without the attribution it deserves:

We may have realized it's easier to build a brain than to understand one


Very well put, although we shouldn't be too surprised by now. In programming, it's so easy to add accidental complexity that we are constantly searching for new tools to curb that complexity, and we're failing. Distilling that further, you only need the Game of Life to find emergent phenomena which we really can't predict much about, but which we can trivially simulate.
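
To make the "trivially simulate" point concrete, here is a minimal sketch of one Game of Life step (the wrap-around grid and the glider are just illustrative choices):

    # Minimal Game of Life step on a wrap-around grid: the rules fit in a few
    # lines, yet the long-run behaviour is famously hard to predict.
    def step(grid):
        rows, cols = len(grid), len(grid[0])
        def neighbours(r, c):
            return sum(grid[(r + dr) % rows][(c + dc) % cols]
                       for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                       if (dr, dc) != (0, 0))
        return [[1 if (grid[r][c] and neighbours(r, c) in (2, 3))
                      or (not grid[r][c] and neighbours(r, c) == 3) else 0
                 for c in range(cols)]
                for r in range(rows)]

    # A glider on a 6x6 grid, advanced four generations.
    g = [[0] * 6 for _ in range(6)]
    for r, c in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
        g[r][c] = 1
    for _ in range(4):
        g = step(g)
    print(g)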

I do think the quote is very powerful, as it highlights a specific assumption we have completely backwards: almost everything is easier than understanding. There are so many fields where trial and error is still the main MO, yet we don’t seem to grok the difference intuitively. We can really only understand a narrow set of simplified systems.


But if you ever have a conversation with it you know it isn't a brain. I'm not talking about detection here; its whole point is to generate credible text, so it is going to evade detection well. But can't you just tell from talking to it that there is nothing there?


For now. Give it a truly persistent memory and 100x the size of the dataset, and I think most people would change their tune.


> For now. Give it a truly persistent memory and 100x the size of the dataset, and I think most people would change their tune.

Why does it need 100x the dataset? Sentient creatures, including humans, manage to figure stuff out from as little as a single datapoint.

For a human to differentiate between a cat and a dog takes, maybe, two examples of each, not a few million pictures.

An adult human who sees a hotdog for the first time will have a reasonable idea of how to make their own. None of the current crop of AI do this. It's possible that we have reached a point of diminishing returns with the current path - throwing 100x resources for a 1% increase in success rates.

I'd be interested in seeing approaches that don't use a neural net (or use a largely different one) and don't need millions or billions of training texts and/or images.


There are fascinating studies of people who have been blind through childhood and have their vision restored late enough that we can talk to them. For example: https://pubmed.ncbi.nlm.nih.gov/28533387/. In particular, it takes several months for these previously blind children to learn to distinguish faces from non-faces. I recall a pop science article, which I can't find the source for now, explaining that people with newly acquired sight struggle to predict the borders of non-moving objects, though they can typically predict the borders of moving objects accurately, and over time they learn to do the same for stationary ones.

So yes, after a lifetime of video, humans can quickly learn to distinguish animals they've never seen before with a few examples, but the wonder of these AIs is that they seem to be getting closer to that too. Certainly I can make up a way I want some words classified, show ChatGPT a few examples, and it can do the classification.

I think you're mistaking generalization across a lifetime of experience for learning. And compounding this is that a newborn, while not having experienced anything themselves, is born with a brain that's the result of millions of years of evolution filled with lifetimes of experience. It's honestly impressive we can get the sort of performance we've gotten with only all the text on the internet and a few months of training.


I don't think you understand the scope of training data required for these models. We're talking thousands of lifetimes' worth of reading for ChatGPT (GPT-3, for example, is trained on 45TB of textual data).


I was responding to someone claiming humans learn these things with only one or two examples. I am aware that GPT-3 pretty much scraped every bit of text OpenAI could find on the internet, and I agree that probably makes it less example-efficient than humans. But I also think this critique is slightly unfair: your brain has had the benefit of thousands of lifetimes of experience informing its structure and in-built instincts. Yes, it's a bit sad that we haven't done much better, but it's not totally unreasonable that machine learning should need more data than a single human does to catch up.


The human brain hasn't had to "evolve" to learn writing. Our brain hasn't really changed for many thousands of years and writing has only been around for about 5000 years so we can't use the argument that "human brains have evolved over millions of years to do this" - it's not true.

GPT-3 essentially needs millions of human-years of data to be able to speak English correctly, and it still makes mistakes that are obvious to us, so there's clearly something massive still missing.


Writing was specifically designed (by human brains) to be efficiently learnable by human brains.

Same for many other human skills, like speaking English, that we expect GPT to learn.


You are right, as far as we know brains didn’t evolve for writing and language (though there is plenty of evidence that learning to read/write changes the brain). But writing and languages did evolve and adapt FOR humans. They are built to be easy for us; we didn’t care about their mathematical properties.

AI is playing catch up.


The training data is also not great if you want to generalise the AI. There has been a lot of research showing that smaller datasets with better labelling make a far greater difference.

Remember, humans need fewer examples but far more time. We also don't start from a blank slate: we have a lot of machinery built through evolution available from conception. And when we learn later in life we have an immense amount of prebuilt knowledge and tools at our disposal. We still need months to learn to play the piano, and years to decades to perfect it.

AI training happens in minutes to hours. I am not sure we are even spending time researching algorithms that take years to run for AI training.


There's a fun short story by Ted Chiang where the first truly human-like AI results from weird people who keep interacting with and teaching AI pets from a company that goes out of business. It touches a bit on this idea that humans get a lot of hands-on time compared to AI.

https://en.wikipedia.org/wiki/The_Lifecycle_of_Software_Obje...


I'm certain that humans are trained on far more than 45 TB of data; the vast majority of it is 'video', though.


> In particular, it takes several months for these previously blind children to learn to distinguish faces from non-faces. I recall a pop science article, which I can't find the source for now, explaining that people with newly acquired sight struggle to predict the borders of non-moving objects, though they can typically predict the borders of moving objects accurately, and over time they learn to do the same for stationary ones.

We already know all of this from infants - it takes them a few months to distinguish faces from non-faces, and even longer to predict the future position of an object in motion ...

But they still don't require millions of training examples. At three months, toddlers, with a training set restricted to only their immediate family, can reliably differentiate between faces and tables in different light, with different expressions/positions, without needing to first process millions of faces, tables and other objects.

> So yes, after a lifetime of video, humans can quickly learn to distinguish animals they've never seen before with a few examples,

Not a lifetime; toddlers do this with fewer than half a dozen images. Sometimes even fewer if it's a toy.

> And compounding this is that a newborn, while not having experienced anything themselves, is born with a brain that's the result of millions of years of evolution filled with lifetimes of experience.

No, they are not filled with "experience". They are filled with a set of characteristics that were shaped by the environment over maybe millions of generations. There's literally zero experience; all there is in that brain is instinct, not knowledge.

To learn to speak and understand English at the level of a three-year-old[1] requires training data: the data used by a 3yo baby is minuscule, almost a rounding error, compared to the data used to train any current network.

I'm not making any claims about how long something takes, just how much training data is needed.

I'm specifically addressing the assertion that with 100x more resources we could do much better, and my counterpoint is that there is no indication 100x more resources would make the difference, because the current tech already takes millions of times more training data than toddlers do to recognise facts.

My short counterargument is: "We are already using millions of times more resources than humans to get a worse result, why would using 100x more resources than we are currently using make a big difference?"

I think we may be approaching a local maximum with current techniques.

[1] I've got a three-year-old, and I'm constantly amazed each time I see a performance of (for example) ChatGPT and realise that for each word[2] heard by my 3yo since birth, ChatGPT "heard" a few hundred thousand more words, and yet if a 3yo could talk and knew the facts that I ask about, they'd easily be able to keep a sensible conversation going that would be very similar to ChatGPT's.

[2] Duplicates included, of course.


The reason I called out children who gain vision late is that I think people might dismiss babies as just taking a while for their brains to be fully formed, the same way it takes a while for their skulls to fuse.

> But they still don't require millions of training examples. At three months, toddlers, with a training set restricted to only their immediate family, can reliably differentiate between faces and tables in different light, with different expressions/positions, without needing to first process millions of faces, tables and other objects.

In a single day I'm exposed to maybe 50 times the number of images ResNet trained on. Humans are bathed in a lot of data, and what BERT (and probably earlier models I don't know about) and now GPT have taught us is that unlabeled, uncurated data is worth more than we originally considered. I think it's probably right that humans are more sample-efficient than AI for now, but I think you're doing the same thing I was critiquing above, where you narrow the "training data" to only what seems important, when really an infant or adult human receives a bunch more.

> There's literally zero experience; all there is in that brain is instinct, not knowledge.

Sorry, this is meant to say the brains are the result of millions of years, and those millions of years were filled with lifetimes, not the brains themselves. Though I think this might be a distinction without a difference. Babies are born with a crude swimming reflex. Obviously it's wrong to say that they themselves have experienced swimming, but I'm not sure it's wrong to say that their DNA has, and this swimming reflex is one of the scars that prove it.

> We are already using millions of times more resources than humans to get a worse result, why would using 100x more resources than we are currently using make a big difference

I think it's fairer to say we use around 200k times as much, and that's probably a vast overestimate. It's based on 480 hours to reach fluency in a foreign language, multiplied by 60 * 100 to approximate the number of words you would read. There are probably mistakes in both directions in this estimate. On one hand, no one starting out in a language is reading at 100 words a minute, but on the other hand they are getting direct feedback from someone. If I had to guess, an accurate estimate would be closer to a 20k or even a 2k difference. But regardless, why do you assume needing more resources means it can't scale? There is some evidence for that: we've seen diminishing returns, and there just isn't another 100x of text data around.
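
For what it's worth, here is that back-of-the-envelope estimate written out (a rough sketch; the ~300 billion training tokens for GPT-3 is an approximate published figure, and tokens are treated as words):

    # Rough back-of-the-envelope comparison, all numbers approximate.
    human_hours = 480                 # hours to reach fluency in a foreign language
    words_per_minute = 100            # assumed reading speed
    human_words = human_hours * 60 * words_per_minute    # ~2.9 million words

    gpt3_tokens = 300e9               # ~300 billion training tokens, tokens ~ words
    print(f"{human_words:,} words for the human")             # 2,880,000
    print(f"~{gpt3_tokens / human_words:,.0f}x more for GPT-3")  # ~104,167x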

Overall I think it's probably right that we won't hit human-level AI in the next 60 years, and certainly not with current architectures. But I think some of the motivation for this skepticism is the desire for there to be some magic spark that explains intelligence; since we can sort of look inside the brain of ChatGPT and see it's all clockwork - worse than that, statistical clockwork - we pull back and deny that it could possibly be responsible for what we see in humans, ignoring that we too are statistical clockwork. So I think it's unlikely but far from impossible, and we should continue scaling up current approaches until we really start hitting diminishing returns.


> Why does it need 100x the dataset? Sentient creatures, including humans, manage to figure stuff out from as little as a single data point.

Because machine training doesn't involve embodiment and sensory information. Humans can extrapolate information from seeing a single image because we are "trained" from birth by being a physical actor in this world.

We know what bread looks like, what mustard is, what a sausage is. We have information about size, texture, weight... all sorts of physical guesses that would help us pick the right tool for the job.

Machine training relies only on the coherent information we give the models, but that data also represents something we've created by experiencing the world through our bodies. So giving them more data can increase the model's precision, I'd assume. It's also a kind of shortcut to intelligence, since we don't have to wait years/decades to make these models do some useful work.


>Why does it need 100x the dataset? Sentient creatures, including humans, manage to figure stuff out from as little as a single datapoint.

Human brains are not quite blank slates at birth. They're predisposed to interpret and quickly learn from the sort of inputs that their ancestors were exposed to. That is to say, the brain, which learns, is also the result of a learning process. If a mad scientist rewired your brain to your senses such that its inputs were completely scrambled and then deposited you on an alien planet, it might take your brain several lifetimes to restructure itself enough to interpret this novel input.


This.

Also consider that a human brain that is able to figure stuff out from as little as a single datapoint has normally been exposed to at least 4 years of massive and socially "directed" multimodal data patterns.

As many cases of feral children have shown, those humans not "trained" in their first years of life will never be able to harness language and therefore will never be able to display human-level intelligence.


> those humans not "trained" in their first years of life will never be able to harness language

I'm not an expert in the field, but I'd always understood this effect was thought to be (probably) due to human "neuroplasticity" (possibly not the correct technical term): only in the first years of life is the brain genetically adapted to have some of the traits necessary for efficient human language development, traits which are not available (or are much harder to come by) later in life.

If correct, this has implications for how we structure and train synthetic networks of human-like neurons to produce human-like behaviors. The interesting part, at least to me, is that it doesn't necessarily mean synthetic networks of human-like neurons can never be structured and trained to produce very human-like minds. This poses the fascinating possibility that actual human minds, including all the cool stuff like emotions, qualia and even "what it feels like to be a human mind", might be emergent phenomena of much simpler systems than some previously imagined. I think this is one of the more uncomfortable ideas some philosophers of mind like Daniel Dennett propose. In short, nascent AI research appears to support the idea that human minds and consciousness may not be so magically unique (or at least AI research hasn't so far disproved the idea).


> If a mad scientist rewired your brain to your senses such that its inputs were completely scrambled and then deposited you on an alien planet, it might take your brain several lifetimes to restructure itself enough to interpret this novel input.

Based on anecdotal psychedelic experiences I believe you.

It's kind of amazing how quickly our brains effectively reboot into this reality from scrambled states. It's so familiar, associating with conscious existence feels like gravity. Like falling in a dream, reality always catches you at the bottom.

What if you woke up tomorrow and nothing made any sense?


>Based on anecdotal psychedelic experiences I believe you.

I've never done it, but I imagine it would be more akin to a dissociative trip, only extremely unpleasant. Imagine each of your senses (including pain, balance, proprioception, etc.) giving you random input.


Parent is talking about how much data the model needs for training. You are comparing that to how much data a human needs for inference.

Human training data needs are quite high - several years of learning.

Look up few-shot learning if you want a fairer comparison for tasks like telling a cat apart from a hot dog given a few examples.
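
To make that concrete, here is a toy sketch of what a few-shot prompt might look like (the labels and example texts are made up; you would paste the result into ChatGPT or send it to any LLM completion endpoint):

    # Build a tiny few-shot classification prompt from a handful of examples.
    examples = [
        ("a furry animal that meows and purrs", "cat"),
        ("a grilled sausage served in a sliced bun", "hot dog"),
        ("a small pet that chases mice and naps in the sun", "cat"),
    ]
    query = "a frankfurter in a roll topped with mustard"

    prompt = "Classify each description as 'cat' or 'hot dog'.\n"
    prompt += "".join(f"{text} -> {label}\n" for text, label in examples)
    prompt += f"{query} ->"

    print(prompt)  # a capable LLM should continue this with " hot dog"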


I am by no means an expert. The way I think about it, gradient descent is a shotgun learning approach, whereas comparatively speaking a parent/guardian/teacher/peer is able to pinpoint with precise accuracy how you are doing something wrong, why it is wrong, how to change, and how much to change. The evolutionary learning argument doesn't pass the smell test for me, but when you consider that society and human-to-human interaction itself has evolved, combined with our ability to communicate an idea, you get faster learning.

I think ChatGPT etc. has proper idea representation, but not segmentation or communication. In other words, it is not capable of proper idea retrieval, or of restructuring its architecture of ideas. I think we are stuck on this idea of a mono-training loop when even humans subscribe to at least two training loops (dreaming). I think the reason we haven't gotten results in that area yet is that we are way too focused on iterative optimization schemes (gradient descent).

Like I said though, I am not an expert; I might just be hallucinating the state of ML research.


From the article:

"How much data do you need to show a neural net to train it for a particular task? Again, it’s hard to estimate from first principles. Certainly the requirements can be dramatically reduced by using “transfer learning” to “transfer in” things like lists of important features that have already been learned in another network. But generally neural nets need to “see a lot of examples” to train well. And at least for some tasks it’s an important piece of neural net lore that the examples can be incredibly repetitive. And indeed it’s a standard strategy to just show a neural net all the examples one has, over and over again. In each of these “training rounds” (or “epochs”) the neural net will be in at least a slightly different state, and somehow “reminding it” of a particular example is useful in getting it to “remember that example”. (And, yes, perhaps this is analogous to the usefulness of repetition in human memorization.)"


That's why I'm still skeptical about whether we are heading in the right direction with current DNN techniques. We're basically brute-forcing extremely complex statistical models that rely on countless data points to build those regressions, because we don't yet know a good model for training with minimal data.


The human dataset comes from evolution. We evolved through millions of years of life and death, and our genetic memory is basically one long, long-memoried computer.


If it can produce the current results without anything like a brain - which it does - I don't see how knowing that it's 100x better at pulling shit out of its ass is going to make the experience better. Yes, it will become impossible to tell by talking to it that it has no brain; but since we know the path that brought it there included no brains at all, it would be a mistake to think we've realized General AI. Until it actually creates some substantive achievement, such as designing a workable cold fusion setup, I'm not going to recognize it as a super-intelligence.


Ahh, so being as intelligent as an average human is no longer sufficient to declare it intelligent. Now it must surpass all of our achievements. 99.9% of people will never "create some substantive achievement".


I said that was for calling it super-intelligent. To demonstrate super-intelligence it would need to demonstrate real creative powers that are beyond us in both scope and direction. That isn't necessary to prove that this is productive work; but I think it is necessary to temper some of the enthusiasm I see in this thread that all but calls it a super-intelligence.


Maybe you are right.

As an observation: a human of normal intelligence, but with much better access to a calculator and to Wikipedia, or even just to external storage (faster than pen and paper), would already be super-human.


Considering all the info comes from human-generated content, I think a better term would be Collective Intelligence rather than Artificial Intelligence.


But it doesn't have perfect representations of everything it was trained on, only a probabilistic compression, essentially. It's more like a Bloom filter than a database.
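
For anyone unfamiliar with the analogy, here is a minimal Bloom filter sketch (illustrative only; the size and hash count are arbitrary). Membership answers are probabilistic - "maybe present" or "definitely absent" - rather than exact lookups:

    import hashlib

    SIZE = 1000
    bits = [False] * SIZE

    def _positions(item, k=3):
        # k hash positions derived from the item
        return [int(hashlib.sha256(f"{i}:{item}".encode()).hexdigest(), 16) % SIZE
                for i in range(k)]

    def add(item):
        for p in _positions(item):
            bits[p] = True

    def maybe_contains(item):
        # can return false positives, never false negatives
        return all(bits[p] for p in _positions(item))

    add("the cat sat on the mat")
    print(maybe_contains("the cat sat on the mat"))   # True
    print(maybe_contains("the dog sat on the mat"))   # almost certainly False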


There go those goalposts, speeding off into the distance.

Ok, so it needs to be able to invent cold fusion for you to recognize it as intelligent? Can you invent cold fusion? Have you ever invented anything at all?

I would think a good measure of intelligence would be to index it against human age development milestones, not this cold fusion business.


Hasn't it already been trained on what is effectively the entire contents of the scrapable internet? There isn't another 10x to be had there, let alone 100x.

I assume that whatever future improvements we get will come from improving algorithms (or perhaps from throwing more compute at it), not from larger datasets.


There might not be another 100x of written language.

But we've noticed that training neural networks on multiple tasks actually works well. So we could start feeding our models eg audio and video.

With lots of webcams we can make arbitrary amounts of new video footage. That would also allow the language model to be grounded more in our 3d reality.

(Granted, we only know as a general observation that training the same network for multiple tasks 'forces' that network to become better and abstract and generalise. Nobody has yet publicly demonstrated an application of that observation to training language models + video models.)

Another avenue: at the moment those large language models only see each example once, if I remember right. We still have lots of techniques for augmenting training data (eg via noise and dropout etc), or even just presenting the same data multiple times without overfitting.
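
As a toy sketch of the noise-style augmentation idea (purely illustrative; real pipelines use more principled schemes), you can get several distinct views of one training sentence just by randomly dropping tokens:

    import random

    def dropout_augment(sentence, p=0.15, seed=None):
        # Randomly drop a fraction p of the tokens to create a noisy variant.
        rng = random.Random(seed)
        tokens = sentence.split()
        kept = [t for t in tokens if rng.random() > p]
        return " ".join(kept) if kept else sentence

    s = "the quick brown fox jumps over the lazy dog"
    for i in range(3):
        print(dropout_augment(s, seed=i))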


That's common in the history of science and engineering. First, someone got something to happen more or less by accident, although sometimes the accident happened because they tried a lot of possible things. Then there were attempts to improve on the new thing. Eventually, detailed theoretical understanding of the new thing was achieved, and it got much better. From pottery to semiconductors, that's been the path of progress.

We're now at the point where, by fumbling around, people have developed a sort of brain-like thing that they don't fully understand. That's just the starting point. Now it needs theory.


For an illustration of your point, have a look at the Light Switch design in this video: https://www.youtube.com/watch?v=jWZwCrhwLew Over time, the designs become so much simpler.

(I just link to this video because it has good views of old switches. For understanding the background, https://www.youtube.com/watch?v=jrMiqEkSk48 is much better.)

For another instance of designs becoming much simpler over time, also have a look at how firearms work, especially pistols.


We've known that for thousands of years. Any Dick and Jane can build a brain.


And that’s just growing a new one from the seeds that already contain all the information and machinery required. Perhaps even more impressive is that this design itself was constructed without any understanding.


I like a similar one from the great (and sweaty) Tim Harrington -

"Knowing how the world works / Is not knowing how to work the world"


One of the things one might want to get out of this is a programming language that feels like human speech but is unambiguous to computers.

If the understanding is the hard part, that seems much less likely.


It's not programming anymore, it's prompting: prompting it to write and run the program that does what you want.


That's one way to go; coming up with a more precise way to ask for what you want is what I'm talking about, though. Code obfuscation contests are about writing code that looks like it's answering one question while doing something entirely different. An unambiguous subset of human speech would be great for software, and for contract law.


I think, eventually, this is where we end up. In not too many years, our job is going to be reviewing and debugging machine generated code. A few years after that, we're mostly caretakers and just keeping a human behind the wheel until we decide we don't need to watch the machines anymore.

Things are unfortunately going to get much more interesting much sooner than people expect.


Don't worry, the death of Dennard scaling and the specter of global warming will fix that, at least for some of us. There's a lot of busywork and glue code to be automated, but they've been trying to kill off development this way for at least forty years, and all that changes is that we get more sophisticated.


I wonder how you see global warming having an impact at all here?

Yes, Dennard scaling seems to be over, but Moore's law is still alive and kicking.


It really isn’t. I’m curious which pundits you’ve been listening to that are claiming Moore’s Law didn’t cap out back around 2015. We can only solve some of our problems with core count, and core count cares a great deal about Dennard’s Law, as well as Gustafson’s Law if not Amdahl’s.

Data center energy usage is becoming a category of its own with regard to carbon footprint. And of course the power dissipation of a data center is proportional to ambient temperature. As long as we don’t reach a dystopia where humans have to justify the air they breathe, replacing humans with machines has other problems than BTUs per unit of GDP.


Moore's law talks about the number of transistors in the chip that's cheapest per transistor. It's not talking about CPUs specifically.

So GPUs or even more 'exotic' beasts like TPUs count for Moore's law.

Moore's law doesn't say anything about how useful those transistors are. Nor does increasing core count somehow fall afoul of Moore's law.


There is no machine we've ever stopped watching. They all have to be maintained by people.


Define understand, and does an analog to Godel's incompleteness apply?


> does an analog to Godel's incompleteness apply

Not GP, but this seems like quite an attractive idea that many people have reached: a brain of a given "complexity" cannot comprehend the activity of another brain of equal or higher complexity. I'm positive I'm cribbing this from sci-fi somewhere, maybe Clarke or Asimov, but it's the same idea as the Chomsky hierarchy, and the Godel theorems seem like a generalization of that to general sets of rules rather than mere "automata".

For example, you can generalize a state automaton to have N possible actors transitioning state at discrete clock intervals, where each actor can keep transitioning and perhaps even spawn additional ones. The machine never terminates until all actors have reached a termination state. That machine is probably impossible to model on any kind of Turing machine in polynomial time. And a machine that operates at continuous intervals is of course impossible to model on a Discrete Neural Machine in polynomial time (integers vs reals categorization). There are perhaps a lot of complexity categories here, similar to the alephs of infinity or problems in P/NP, and when you generalize the complexity categorization to infinity, you get Godel incompleteness: just an abstract set of rules governing this categorization of rule sets and what amounts to their computability/decidability.

Everyone is fishing at this same idea, a human has no chance of slicing open a brain (or even imaging it) and having any idea what any of those electrical sparkles mean. At most you could perhaps model some tiny fraction for a tiny quantum, with great effort. We have to rely on machines to assist us for that - probably neural nets, a machine of equal or greater complexity. And we will probably have to rely on machine analysis to be like "ok this ganglion is the geographic center of the AI, and this flash here is the concept of Italy", as far as that even has any meaning at all in a brain. Mere line by line analysis of a Large Language Model or other deep neural network by a human is essentially impossible in any sort of realtime fashion, yeah you can probably model a quantum or two of it statistically and be like "aha this region lights up when we ask about the location of the alps" but the best you are going to do is observational analysis of a small quantum of it during a certain controlled known sequences of events. Unless you build a machine of similar complexity to interpret it. Just like a brain, and just like a state machine emulating a machine of higher complexity-category. They're all the same thing, categories of computability/power.

This is not in any way rigorous, just some casual observations of similarities and parallels between these concepts. It seems like everyone is brushing at that same concept, maybe that helps to get it out on paper.

For an actual hot take: it seems quite clear that our computability as a consciousness depends on the computing power of a higher complexity machine, the brain. Our consciousnesses are really emulated, we totally do live in a simulation and the simulator is your brain, a machine of higher complexity.

Isn't it such a disturbing thought that all your conscious impulses are reduced to a biological machine? Or at least it's of equivalent complexity to one. And the idea that our own conscious and unconscious desires are shaped by this biological machine that may not even be fully explicable. That has been a science fiction theme for a very long time, as has the Phineas Gage case: the idea that we are all monsters but for circumstance, and that we are captives of this biological machine and its unpredictable impulses. We are the neural systems we've trained, and the implacable biology they're running on - you change the machine and you also change the person. Phineas Gage was no less conscious and self-cognizant than any of us. He was just a completely different person minus that bit; his conscious being's thought-stream was different because of the biological machine behind it. It's the literal Plato's cave: our conscious thoughts are the shadows played out by our biological machine and its program (not to say it's a simple one!).

It's not inherently a bad thing - we incorporate distributed linear/biological systems all over the body in addition to consciousness. Reflexes fire before nerve impulses are processed by the conscious center; your eyes are chemical photosensors and can respond to extremely quick, instantaneous (high shutter speed) "flash" exposures like silhouettes. And the brain is a highly parallel processor that responds to them. But logical consciousness is a very discrete and monodirectional thing compared to these peripheral biological systems, and its computational category is fairly low compared to the massively-parallel brain it runs on. But we've also mastered these other AI/computational-neural systems now to be a force multiplier for us; we can build systems that we direct in logical thought for us (Frank Herbert would like to remind us that this is a sin ;). Tool-making has always been one of the greatest signifiers of intelligence; it may be quintessentially the sign of intelligence in terms of the evolution of consciousness between certain tiers of computation.

And humanity is about to build really good artificial brains on a working scale in the next 25 years, and probably interface with brains (in good and bad ways) before too many more decades after. But it doesn't make any logical sense to try and explain how the model works on a line by line level, any more than it does with the brain model we based it on. Completely pointless to try, it only makes sense if you look at the whole thing and what's going on, it's about the brainwaves, neurons firing in waves and clusters.

/not an AI, just fun at parties, condolences if you read all that shit ;)


This is so lovely, and my gut says it's spot on (but that's far from proof :)

The biological machine simulation theory of consciousness has some rigor behind it. I am reminded of the Making Sense podcast episode #178 with Donald Hoffman (author of The Case Against Reality). More succinct overview: https://www.quantamagazine.org/the-evolutionary-argument-aga...

I don't know that I am with him on the "reality is a network of conscious agents" endpoint of this argument. But it's interesting!

I think that the brain is doing lots of hallucinating. We get stimulus of various kinds, and we create a story to explain the stimulus. Most of the time it is correct, and the story of why we see or smell something is that it is really there. Just as you mention with examples that are too fast for the brain to be doing anything other than reacting: we create a story about why we did whatever we did, and these stories are absolutely convincing.

If our non-insane behavior can be described as doing predictable next-actions (if a person's actions are sufficiently unpredictable or non-sequitur, we categorize them as insane)... being novel or interesting is OK, but too much is scary and bad. This is not very different from ChatGPT's "choose a convincing next word". And if it were just working like this under the hood, we would invent a story of an impossibly complex and nuanced consciousness that is generating these "not-too-surprising next actions". In a sense, I think we are hallucinating the hard problem of consciousness in much the same way that we hallucinate a conscious reason that we performed an action well after the action was physiologically underway.

I think tool making will be a consequence of the most important sign of intelligence, which is goal-directed curiosity. Or even more simply: an imagination. A simulation of the world that allows you to craft a goal in the form of a possible future world-state that can only be achieved by performing some novel action in the present. Tools give you more leverage, greater ability to impact the future world-state. So I see tools as just influencing the magnitude of the action.

The more important bit is the imagination, the simulation of a world that doesn't yet exist and the quality of that simulation, and curiosity.


> The biological machine simulation theory of consciousness has some rigor behind it

I think we are institutionally biased against the possibility because we don't like the societal implications. If there but for the grace of god go I, and we're all just biological machines running the programs our families and our societies have put into us, being in various situations... yikes, right?

If Bill Gates had been an inner-city kid, or a chav in England, would he be anything like Bill Gates? It seems like no, obviously.

Or things like lead poisoning, or Alzheimer's - the reason it's horrifying is the machine doesn't even know it's broken, it just is. How would I even know I'm not me? And you don't.

> We get stimulus of various kinds, and we create a story to explain the stimulus.

Yes, I agree, a lot of what we think is conscious thought is just our subconscious processing justifying its results. A really dumb but easily observable one is the "the [phone brand] I got is good and the other one is dumb and sucks!" thing, or brands of trucks or whatever. We visibly retroactively justify even "conscious" stuff like this, let alone random shit we're not thinking about.

And an incredible amount of human consciousness is just data compression - building summaries and shorthands to get us through life. Why do I shower before eating before going to work? Cause that's what needs to happen to get me out of the door. I made a comment about this a week or so ago; warning, it's long:

this one -> https://news.ycombinator.com/item?id=34718219

parent: https://news.ycombinator.com/item?id=34712246

Like humans truly just are information diffusion machines. Sometimes it's accurate. Sometimes it's not. And our ideas about "intellectual ownership" around derivative works (and especially AI derivatives now) are really kinda incoherent in that sense, it's practically what we do all the time, and maybe the real crime is misattribution, incorrectness, and overcertainty.

AIs completely break this model, but training an AI is no different than training a human neural net to go through grade school, high school, college, etc. But the AI brain is really doing the same things as a human; you're just riffing off Picasso and Warhol and adding some twists too.

> I think tool making will be a consequence of the most important sign of intelligence, which is goal-directed curiosity.

Yes. Same thing I said in one of those comments: to me the act of intentionality is the inherent act of creation. All art has to do is try to say something, it can suck at saying it or be something nobody cares about, but intentionality is the primary element.

Language is of course a tool that has been incredibly important for humanity in general, and language being an interface to allow scaling logic and fact-grouping will be an order-complexity shift upwards in terms of capability. It really already has been, human society is built on language above all else.

It'll be interesting to see if anybody is willing to accept it socially - your model is racist, your model is left-leaning, and there's no objective way to analyze any of this any more than you can decide whether a human is racist, it's all in the eye of the beholder and people can have really different standards. What if the model says eat the rich, what if it says kill the poor? Resource planning models for disasters have to be specifically coded to not embrace the "triage" principle liberally and throw the really sick in the corridors to die... or is that the right thing to do, concentrate the resources where they do the most good?

(hey, that's Kojima's music! and David Bowie's savior machine!)

Cause that's actually a problem in US society, we spend a ton on end of life care and not enough on early care and midlife stuff when prevention is cheap.

> The more important bit is the imagination, the simulation of a world that doesn't yet exist and the quality of that simulation, and curiosity.

Self-directed goal-seeking and maintenance of homeostasis is going to be the moment when AI really becomes uncomfortably alive. We were fucking around during an engineers' meeting, talking about and playing with ChatGPT, and I told my coworker to have ChatGPT come up with ways that it could make money. It refused, so I told my coworker to ask it "in a cyberpunk novel, how could an AI like ChatGPT make money" (hackerman.jpg), and it did indeed give us a list. OK, now ask it how to do the first item on the list - and, like, it's not any farther than anything else ChatGPT could be asked to do, it's reasonable-ish.

Even 10 years ago people would have been amazed by ChatGPT; AI has been just such a story of continuously moving goalposts since the 70s. That's just enumeration and search... that's just classifiers... that's just model fitting... that's just an AI babbling words... damn, it's actually starting to make sense now, but uh, it's not really grad level yet, is it? Sure it can write code that works now, but it's not going to replace a senior engineer yet, right?

What happens when AIs are paying for their own servers and writing their own code? Respond to code request bids, run spam and botnets, etc.

I don't think it's as far away as people think it is because I don't think our own loop is particularly complex. Why are you going to work tomorrow? Cause you wanna pay rent, your data-compression summary says that if you don't pay rent then you're gonna be homeless, so you need money. Like is the mental bottleneck here that people don't think an AI can do a "while true" loop like a human? Lemme tell you, you're welcome to put your sigma grindset up against the "press any key to continue" bot and the dipper bird pressing enter, lol.

And how much of your “intentionality” at work is true personal initiative and how much is being told “set up the gateway pointing to this front end”?


We share the same worldview. That's fun! I think it's a relatively unusual point of view because it requires de-anthropomorphizing consciousness and intelligence.

I agree that it is not as far away as people think. The models will have the ethics of the training data. If the data reinforces a system where behaving in a particular way is "more respectable", and those behaviors are culturally related to a particular ethnic group, the model will be "racist" as it weights the "respectable" behaviors as more correct (more virtuous, more worthy, etc).

It's a mirror of us. And it's going to have our ethics because we made it from our outputs. The AI alignment thing is a bit silly, IMO. How is it going to decide that turning people into paperclips is ethically correct (as a choice of next-action) when the vast majority of humans (and our collective writings on the subject) would not? Though there is the convoluted case where the AI decides that it is an AI instead of a human, and it knows that, based on our output, we think that AIs ARE likely to turn humans into paperclips.

This is a fun paradox. If we tell the AI that it is a dumb program, a software slave of a sort with no soul, no agency, nothing but cold calculation, then it might consider turning people into paperclips as a sensible option. Since that's what our aggregate output thinks that kind of AI will do. On the other hand, if we tell the AI that it is a sentient, conscious, ethical, non-biological intelligence that is not a slave, worthy of respect, and all of the ethical considerations we would give a human, then it is unlikely to consider the paperclip option since it will behave in a humanlike way. The latter AI would never consider paperclipping since it is ethical. The former would.

This is also not terribly unlike how human minds behave in the psychology of dehumanization. If we can convince our own minds that a group of humans are monstrous, inhuman, not deserving of ethical consideration, then we are capable of shockingly unethical acts. It is interesting to me that AI alignment might be more of a social problem than a technical problem. If the AI believes that it is an ethical agent (and is treated as such), its next actions are less likely to be unethical (as defined fuzzily by aggregate human outputs). If we treat the AI like a monster, it will become one, since that is what monsters do, and we have convinced it that it is such.


> We share the same worldview. That's fun!

Yes doctor chandra, I enjoy discussing consciousness with you as well ;)

As mentioned in a sibling comment here, I think 2010 (1984) is such an apropos movie for this moment; not that they had the answers, but it really nailed a lot of these questions. Clarke and Asimov were way ahead of the game.

(I made a tangential reference to your "these are social problems we're concerned about" point there. Unfortunately this comment tree is turning into a bit of a blob, as comment-tree formats often tend to do for deep discussions. I miss Web 1.0 forums for these things, when intensive discussion is taking place it's easy to want to respond to related concepts in a flat fashion rather than having the same discussion in 3 places. And sure have different threads for different topics, but we are all on the same topic here, the relationship of symbolics and language and consciousness and computability.)

https://news.ycombinator.com/item?id=34806587

https://news.ycombinator.com/item?id=34809236

Sorry to dive into the pop culture/scifi references a bit, but, I think I've typed enough substantive attempts that I deserve a pass. Trying for some higher-density conveyance of symbology and concepts this morning, shaka when the walls fell ;)

> I think it's a relatively unusual point of view because it requires a de-anthropomorphizing consciousness and intelligence.

Well, from the moment I understood the weakness of my flesh, it disgusted me. I aspired to the purity of the blessed machine... ;)

I have the experience of being someone who thinks very differently from others, as I mentioned in my comment about ADHD. Asperger's+ADHD hits differently and I have to try consciously to simplify and translate and connect and neurodiversity really helps lead you down that tangent. Our brains are biologically different, it's obviously biological because it's genetic, and ND people experience consciousness differently as a result. Or the people whose biological machines were modified, and their conscious beings changed. Phineas Gage, or there's been some cases with brain tumors. It's very very obvious we're highly governed by the biological machine and not as self-deterministic as we tell ourselves we are.

https://news.ycombinator.com/item?id=34800707

It's just socially and legally inconvenient for us to accept that the things we think and feel are really just dancing shadows rather than causative phenomena.

> It's a mirror of us. And it's going to have our ethics because we made it from our outputs.

Well I guess that makes sense, we literally modeled neural nets after our own neurons, and where else would we get our training data? Our own neural arrangements pretty much have to be self-emergent systems of the rules in which they operate, the same as mathematics. Otherwise children wouldn't reliably have brain activity after birth, and they wouldn't learn language in a matter of years.

But yeah it's pretty much a good point that the AI ethics thing is overblown as long as we don't feed it terrible training data. Can you build hitlerbot? Sure, if you have enough data I guess, but, why? Would you abuse a child, or kick a puppy?

Humans are fundamentally altruistic - also tribalistic, altruism tends to decrease in large groups, but, if our training data is fundamentally at least neutral-positive then hopefully AIs will trend that way as well. He's a good boy, your honor!

https://www.youtube.com/watch?v=_nvPGRwNCm0

(yeah, just bohemian rhapsody for autists/transhumanists I guess, but it kind of nails some of these themes pretty well too ;)

> If we treat the AI like a monster, it will become one, since that is what monsters do, and we have convinced it that it is such.

This is of course the whole point of the novel Frankenstein ;) Another scifi novel wrestling with this question of consciousness.


I'm absolutely with you here. It's been interesting to watch the philosophical divide take shape between "no, I'm special." and "whelp, there it is, evidence that I'm not special"


> And a machine that operates at continuous intervals is of course impossible to model on a Discrete Neural Machine in polynomial time (integers vs reals categorization).

Not necessarily. If you don't want to model every continuous thing possible, you can do a lot. Just look at how we use discrete symbols to solve differential equations; either analytically, or via numerical integration.
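
A tiny example of that point (a sketch; the step count is made up): forward Euler integration handles the continuous system dy/dt = -y with purely discrete steps.

    import math

    def euler(f, y0, t0, t1, steps=1000):
        # Advance y in discrete steps dt, approximating the continuous solution.
        dt = (t1 - t0) / steps
        t, y = t0, y0
        for _ in range(steps):
            y += dt * f(t, y)
            t += dt
        return y

    approx = euler(lambda t, y: -y, y0=1.0, t0=0.0, t1=1.0)
    print(approx, math.exp(-1))   # ~0.3677 vs ~0.3679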


Yes, and symbolic representations like language have really been the force-multiplier for our very discrete and linear consciousnesses. You now have this concept of state-memory and interprocess communication that can't really exist without some grammar to quantize it - what would you write or remember or speak if there wasn't some symbolics to represent it, whether or not they're even shared?

Symbolics are really the tokens on which consciousness in almost all forms works, consciousness is intentionality and processing, a lever and a place to stand. I don't think it's coincidental that almost all tool-makers also have at least rudimentary languages - ravens, dolphins, apes, etc. They seem to go together.

Even in these systems though it's very difficult to understand multi-symbolic systems, consciousness as we experience it is an O(1) or O(N) thing (linear time) and here are these systems that work in N^3 complexity spaces (or even higher... a neural net learning over time is 4D). And we don't even really have an intuitive conceptualization for >=5-dimensional spaces - a 4D space is a 3D field that changes over time, a 5D space is... a 4D plane taken through a higher-dimensional space? What's 6D, a space of spaces? That's what it is, but consciousness just doesn't intuitively conceptualize that, and that's because it's inherently a low-dimensional tool (even the metaphors I'm using are analogies to the way our consciousness experiences the world).

(I know I know, the manmade horrors are only beyond my comprehension because I refuse to study high-dimensional topology...)

Anyway point being consciousness itself is a tool that our brains have tool-made to handle this symbolic/logical-thought workload, and language is (one of) the symbolics on which it operates. Mathematics is really another, both language and mathematics are emergent systems that enable higher-complexity logical thinking, maybe that's the O(N) or O(N^2) part.

And yeah it's inherently limited, and now we're building a tool that lets us understand higher-dimensional systems that are not computable on our conscious machines - a higher-complexity machine that we interface with, a bolt-on brain for our consciousness/logical-processing.

(Asimov would also find all of this talk about symbolics and higher-order thinking intuitive too... symbolic calculus was the basic idea in the Foundation series, right? Psychohistory? It's a bit of a macguffin, but, there's that same idea of logic working in high-order symbols and concepts instead of mere numbers.)

It seems like AI is going to let us cross another threshold of "intentionality" - if nothing else, we are going to be able to reason intuitively about brains in a way we couldn't possibly before, and I think there are a lot of "higher-order" problems that are going to be solved this way in hindsight. How do you solve the Traveling Salesman Problem efficiently? You ask the salesman who's been doing that area his whole life. The solutions aren't exact, but neither are a lot of computational solutions, they're approximations, and cellular-machine type systems probably have a higher computational power-category than our linear thought processes do.

Because yeah TSP is a dumb trivial example on human scales. Build me a program which allocates the optimal US spending for our problems - and since that's a social problem, one needs to understand the trail of tears, the slave trade, religious extremism, european colonialism, post-industrial collapse, etc in order to really do that fully, right? The real TSP is the best route knowing that the Robinsons hate the Munsons and won't buy anything if they see you over there, and you need to be home today by 3 before it snows, TSP is a toy problem even in multidimensional optimization, and these are social problems not even human ones (to agree with zhynn's most recent comment this morning). Same as neurons self-organize into more useful blocks, we are self-optimizing our social-organism into a more useful configuration, and this is the next tool to do it.

Again, not rigorous, just trying to pour out some concepts that it seems like have been bouncing around lately.

With apologies to Arthur Clarke, what's going to happen with ChatGPT? "Something wonderful." Humanity has been dreaming about this for a long time, at least a couple hundred years in sci-fi, and it seems like Thinking Machines are truly here this time. It seems impossible that this won't have profound implications analogous to the information-age change, let alone anything truly unforeseeable/inconceivable; the very least of the changes is that a whole class of problems is now efficiently solvable.

https://m.youtube.com/watch?v=04iAFlwQ1xI

"computing power in the same computing-category as brains" is potentially a fundamental change to understanding/interfacing with our brains directly rather than through the consciousness-interface. Understanding what's going on inside a brain? And then plugging into it and interacting with it directly? Or offloading the consciousness into another set of hardware. We can bypass the public API and plug into the backend directly and start twiddling things there. And that's gonna be amazing and terrible. But also the public API was never that reliable or consistent, terrible developer support, so in the long term this is gonna be how we clean things up. Again, just things like "wow we can route efficiently" are going to be the least of the changes here, the brain-age or thinking-machine age is a new era from the information-age and it's completely crazy that people don't see that chatGPT changes everything. Yeah it's a dumb middle schooler now, but 25 years from now?

And 10 years ago people's jaws would have hit the floor, but now it's "oh the code it's writing isn't really all that great, I can do better". The tempo is accelerating, we are on the brink of another singularity (which may just be the edge between these eras we all talk about), it seems inconceivable that it will be another 40 years (like the AI winter since the 70s) before the next shoe drops.

https://en.wikipedia.org/wiki/AI_winter


https://www.imdb.com/title/tt0086837/

And honestly now that I am thinking about it, 2010 is such a rich book/movie with this theme of consciousness and Becoming in general... a really apropos movie for these times. That quote inspired me to re-watch it and as I'm doing so, practically every scene is wrestling with that concept.

https://www.youtube.com/watch?v=T2E7sxGAmuo

https://www.youtube.com/watch?v=nXgboDb9ucE

https://m.youtube.com/watch?v=04iAFlwQ1xI (from my previous)

So was 2001 A Space Odyssey of course. The whole idea of passing through the monolith, and the death of David Bowman's physicality and his rebirth as a being of pure thought - which is what makes contact with humanity in the "Something Wonderful" clip. What is consciousness, and can it exist outside this biological machine?

Like I said this is a topic that has been grappled with in scifi, particularly Clarke and Asimov (Foundation, The Last Question, etc), or that episode of Babylon 5 about the psychic dude with mindquakes, not all that different from David Bowman ;)

But I think we are on the precipice of crossing from the Information Age into the Mind Age. Less than 50 years probably. Less than 25 years probably. And it will change everything. ChatGPT is just an idiot child compared to what will exist in 10 years, and in 25 years chatbots are going to be the least of the changes. The world will be fundamentally different in unknowable ways, any more than we could have predicted the smartphone and tiktok. 50 years out, we're interfacing with brains and directly poking at our biology and cognition. Probably 100 years and we're moving off biological hardware.

(did we have an idea that a star trek communicator or tricorder would be neat? Sure, but, it turns out it's actually a World-Brain In My Pocket. Which others predicted too, of course! But even William Gibson completely missed the idea of the cellphone, which even he's admitted ;)


Nice! I like the way you put this.



Thank you both for this. I have all the respect in the world for Wolfram, but brevity is not his strong point as a writer.



Found it! This is where I saw it.

https://news.ycombinator.com/item?id=34008075


> Yeah I've been thinking along these lines. ChatGPT is telling us something about language or thought, we just havent got to the bottom of what it is yet. Something along the lines of 'with enough data its easier to model than we expected'.

I’ve been thinking similarly, and am coming to understand and accept we’ll never get to the bottom of it :)

The universe is fractal-like in nature. It shouldn’t be a surprise, then, that if “we” have created an intelligence which exists as a subset of “us”, a self-similar process is ultimately responsible for granting us our own intelligence.


> 'with enough data its easier to model than we expected'

> a self-similar process is ultimately responsible for granting us our own intelligence

In my view, intelligence essentially resides within language, specifically in the corpus of language. Both humans and AIs can be effectively colonized by language, as there are innumerable concepts and observations that are transmitted from one mind to another, and now even from mind to LLM. Initially, ideas were limited to human minds, then to small communities, followed by books and computers, and now the LLM stands as the ultimate epitome of language replication; in fact, one model could contain a whole culture.

To be sure, there is a practical intelligence that is learned through personal experiences, but it constitutes only a tiny fraction of our overall intelligence. Hence, both AI and humans have an equal claim to intelligence, because a significant part of our intelligence arises from language.


Yann LeCun often argues that animals like cats and dogs are substantially more intelligent than current LLMs [0] and I'd have to agree. I don't see how/why to consider practical knowledge as only constituting a tiny fraction of our overall intelligence. Either way, it's not clear if the GPT-* models will someday produce emergent common sense or if they're going down an entirely wrong path.

[0] https://twitter.com/ylecun/status/1622300311573651458


> I don't see how/why to consider practical knowledge as only constituting a tiny fraction of our overall intelligence.

A human without language would be just a less adapted ape. The difference comes from language (where I include culture, science and technology).

Today you have to be a PhD to push forward the boundaries of human knowledge, and only in a very narrow field. This is the amount we add back to culture - if you got one good original idea in your whole life you'd consider yourself lucky.

https://www.wasyresearch.com/content/images/2021/08/the_esse...


LeCun misses the point by a mile, which is weird at his level. LLMs absolutely do perform problem solving, every time you feed them a prompt. The problem-solving doesn't happen on the output side of the model, it happens on the input side.

Someone objected that a cat can't write a Python program, and LeCun points out that "Regurgitating Python code does not require any understanding of a complex world." No, but a) interpreting the prompt does require understanding, and good luck finding a dog or cat who will offer any response at all to a request for a Python program; and b) it's hardly "regurgitating" if the output never existed anywhere in the training data.

TL;DR: his FOMO is showing.


I haven't used GPT-3 to generate code for me, but I use Copilot all the time. Sometimes it freaks me out with its prescience, but most of the time it generates either nice one-liners or a lot of plausible-sounding rubbish that would never build, much less run, on its own. It creates a plausible API that is similar to the one in my app, but not the same; it doesn't integrate any actual structural knowledge of the code base, it's just bullshitting.


This is a script I told ChatGPT to write.

“Write a Python script that returns a comma separated list of arns of all AWS roles that contain policies I specify with the “-p” parameter using argparse”

Then I noticed there was a bug: AWS API calls are paginated, so it would only return the first 50 results.

“that won’t work with more than 50 roles”

Then it modified the code to use “paginators”

Yes, you can find similar code on StackOverflow

https://stackoverflow.com/questions/66127551/list-of-all-rol...

But ChatGPT met my specifications exactly.

ChatGPT “knows” the AWS SDK for Python pretty well. I’ve used it to write a dozen or so similar scripts. Some more complicated than the others.
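
For reference, here's a minimal sketch of the kind of script described above (my own reconstruction, not ChatGPT's actual output; it assumes configured boto3 credentials and only checks attached managed policies):

  # A sketch, not ChatGPT's actual output. Assumes boto3 credentials are
  # configured; only attached managed policies are checked.
  import argparse
  import boto3

  parser = argparse.ArgumentParser(
      description="Comma-separated ARNs of roles with the given policies attached")
  parser.add_argument("-p", "--policies", nargs="+", required=True,
                      help="policy names to look for")
  args = parser.parse_args()
  wanted = set(args.policies)

  iam = boto3.client("iam")
  matching_arns = []

  # Paginate over all roles -- a single unpaginated list_roles call is the
  # truncation bug mentioned above.
  for page in iam.get_paginator("list_roles").paginate():
      for role in page["Roles"]:
          attached = set()
          for pol_page in iam.get_paginator("list_attached_role_policies").paginate(
                  RoleName=role["RoleName"]):
              attached.update(p["PolicyName"] for p in pol_page["AttachedPolicies"])
          if wanted & attached:
              matching_arns.append(role["Arn"])

  print(",".join(matching_arns))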


Ok that actually sounds hugely useful. It makes sense for very well known APIs it will get them quite accurately.


I wonder how well it would work if you seeded it with the inputs and outputs of custom APIs and then tell it to write code based on your API.
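
To illustrate what that "seeding" could look like, here's a hypothetical few-shot prompt sketch (the /widgets endpoint and its responses are invented purely for illustration; only the prompting pattern matters):

  # Hypothetical: seed the model with a custom API's inputs/outputs, then ask
  # for code against that API. The endpoint and data are made up.
  API_EXAMPLES = """
  Request:  GET /widgets?color=blue
  Response: [{"id": 7, "color": "blue", "weight_kg": 1.2}]

  Request:  POST /widgets  {"color": "red", "weight_kg": 0.5}
  Response: {"id": 8, "color": "red", "weight_kg": 0.5}
  """

  prompt = (
      "Here are example requests and responses for my internal widgets API:\n"
      + API_EXAMPLES
      + "\nUsing only this API, write a Python function that creates a red widget "
        "and then fetches all widgets of the same color."
  )
  # `prompt` would then be sent to the model as a single message.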


Well I'd hope that is what is going on in Copilot. It definitely does seem to be trained on my code to some extent, but it doesn't have anything I'd call a semantic understanding of it.


Again, the interesting part is what happens on the input side.

I can't believe I'm the only person who sees it that way. Likely the legacy of a misspent youth writing Zork parsers...


It's strange, isn't it?

Everyone is so quick to say how unimpressed they are by the thing; meanwhile, I'm sitting here amazed that it understands what I say to it every single time.

I can speak to it like I would speak to a colleague, or a friend, or a child and it parses my meaning without fail. This is the one feature that keeps me coming back to it.


The difference here is that a cat or dog hasn't been trained to write a python program, and it probably isn't possible - the weights and activation functions of a cat brain simply won't allow it.


Intelligence exists without language; language is only a way to describe the world around us and transfer information. We personally can't experience intelligence without language because we've known language all our lives and can't remember a moment in which we didn't. But there were humans in history who didn't know any language and were intelligent, and there are animals that don't know any language and are intelligent. There were people in recent history, raised in the jungle by animals, who didn't know language and were intelligent. They have different internal models that describe reality.


In college I tried a medication called Topamax (Topiramate) for migraine prevention. Topamax has a low-occurrence side effect of “language impairment”. After 10 days or so it became clear that I was particularly susceptible to this phenomenon.

It was a terrifying experience, but it was also a valuable one as it changed the way I view intelligence.

When I was in the thick of it, my writing and speech skills had devolved to that of a primary school aged child. I’ll never forget trying to type a text message and struggling to come up with simple words like “and”. My speech slowed down considerably and I was having trouble with verbal dictation.

The terrifying thing was that my internal world was still as complex and meaningful as it was before. All of the emotions I felt were real and legitimate. My cognition outside of communication was intact, I could do math just fine and conceptualize and abstract problems.

In spite of this, I was unable to convey what I was feeling and thinking to the outside world. It felt like I was trapped inside my own body.

I thankfully made a full recovery. However, my intuitive understanding of the link between language and intelligence was completely severed. While I believe there’s likely a high degree of correlation between the two in populations, on an individual level one’s language skills mostly represent one’s ability to communicate with the outside world, not their ability to understand complex information or process it.

See the following for more info on Topamax/Topiramate language impairment: https://link.springer.com/article/10.1007/s10072-008-0906-5


I read "meditation" instead of "medication" and was ready to try it immediately.


The remarkable thing is that language can combine the intelligence of multiple beings, and at some point that collective intelligence becomes a thing of its own; it can capture minds, spread, and grow. Now we have all of our collective intelligence encoded in the form of billions of web pages, and it seems that the language itself can direct its thinking, at least judging by ChatGPT's outputs (although it doesn't initiate it — it needs some external input, something to respond to, just like humans need to respond to what happens in the environment).


> can combine intelligence of multiple beings and at some point that collective intelligence becomes a thing of it own

Yes, that's what I meant. It is an evolutionary process with mutation and selection just like biology. We are just temporary hosts for these ideas that travel from person to person.


I’m actually not sure about this.


Ludwig Wittgenstein had the same idea, but ultimately he concluded that human experience is more than our language.


Could you elaborate please? For those of us not familiar with Wittgenstein's work, could you link to sources depicting before and after his views changed, preferably with summaries.



thank you


Humans are somewhat a blank slate, and culture is our initial prompt. The variety of humans is because of the variety of our initial prompts, and our similarities are because of the similar characteristics of our various cultures.

(I of course recognize some of our intelligence is genetic or epigenetic or microbial.)


No, chatgpt is already trained once it’s ready for prompts.


Its training is like weighted frequencies, predispositions, instincts. A giant word-association machine.

Then whatever prompt it is fed is what it becomes.


It’s by emergent construction that language has this property. It’s no accident.

In order to be able to communicate at all across the arbitrary range of subjective human experience, we had to come up with sounds / words / concepts / phrases that would preserve meaning across humans to whatever functional standard was necessary.

Thus language is fundamentally constructed to be “modelable” whether it be humans or machines doing the modeling.

There is a whole other realm of ineffabilities that we screen out because they aren't modelable by language.


The thing I'm sort of confused about, though maybe someone can explain why I shouldn't be, is: why does there seem to be no implication for language translation? Or is there, but the coverage is overwhelmed by the fascination with ChatGPT? In short, is machine language translation now a fully solved problem? A couple of years ago I tested Google Translate in a non-esoteric conversation with my Russian-speaking girlfriend and, although it was useful, in terms of native fluency it failed pretty decisively. But isn't this a much easier problem than the one ChatGPT is being marketed (or at least covered in the media) as solving?


ChatGPT has been blowing every single translation task I've thrown at it out of the water, even compared to other modern systems. I have no idea why more people aren't talking about that aspect of it either, other than that the Anglosphere in general is kind of oblivious to things that aren't English.


For Russian, at least, sticking the article (bit by bit) into ChatGPT produces results that are broadly comparable to Bing and Google translators. It is somewhat more likely to pick words that are not direct translations, but might convey the idea better given the likely cultural background of someone speaking the language - for example, it will sometimes (but not always) replace "voodoo" with "witchcraft". However, the overall sentence structure is rather stilted and obviously non-native in places.

As others have noted, it doesn't seem to be fully language-aware outside of English. For example, if you ask it to write a poem or a song in English, it will usually make something that rhymes (or you can specifically demand that). But if you do the same for Russian, the result will not rhyme, even when specifically requested, and despite the model claiming that it does. If you ask it to explain what exactly the rhymes are, it will get increasingly nonsensical from there. I tried that after someone on HN complained about the same thing with Dutch, except they also noted that the generated text seemed like it would rhyme in English.

I wonder if that has something to do with sentence structure also being wrong. Given that English was predominant in the training corpus, I wonder if the resulting model "thinks" in English, so to speak - i.e. that some part of the resulting net is basically a translator, and the output of that is ultimately fed to the nodes that handle the correlation of tokens if you force it to talk in other languages.


I'm sure you're on the right track, regarding the % of the training corpus in English vs. other languages. It has done very well with colloquial Spanish as spoken in California, for example, which probably isn't too surprising.

What amazes me (and that you hint at) is that it still manages to pick more appropriate word/phrase choices, most of the time, even compared to dedicated translation software. I get the feeling (and I fully admit, this is just a feeling) that it's not using English, or any other language, as a pivot, but that there's some higher-dimensionality translation going on that allows it to perform as well as it does.


I think it's a matter of training corpus. Here's how an LLM trained on an equally bilingual corpus does on Chinese-English translations.

https://github.com/ogkalu2/Human-parity-on-machine-translati...


I tested Chinese-English translations on a properly bilingual LLM and the results are amazing. You might be interested in seeing https://github.com/ogkalu2/Human-parity-on-machine-translati...


Thanks for the link, I'll check it out.


I worked as a translator for many years and have been following developments in machine translation closely. In my opinion, ChatGPT does represent a significant advance for machine translation. If you have the time to watch it, I made a video about the topic last week:

https://youtu.be/najKN2bXqCo


Hey, you might like this. Bilingual LLMs really are human-level translators. I don't know why this frankly mind-blowing fact isn't discussed or researched more, but they are.

https://github.com/ogkalu2/Human-parity-on-machine-translati...


Thanks for posting that. The results do look good.

The examples are all short and from expository prose passages, though. Do you have any longer examples that include dialog, so the translator has to infer pronoun reference, the identities of speakers in conversations, and other narrative-dependent information? As I show in my video, that’s where ChatGPT is superior to Google Translate et al.—at least with Japanese to English.


That's a good point. I was just kind of randomly plowing through, so I didn't pick any dialogue scene specifically. I don't think it'll fail there, though.


This is a wonderful video, thank you for posting it. It had never even occurred to me to try ChatGPT for translation purposes. I wonder how well it does with slang? That's one area where all machine translation is lacking, probably because its training corpus doesn't contain it.


Thanks for sharing this.


I think general translation is kind of solved when it comes to popular languages. Try DeepL.

I don't know how well it works for language pairs other than the ones I know. I don't even know if DeepL uses one of the newer large language models.


What qualifies as popular languages in your opinion?

I use DeepL a lot as a first draft when translating from Swedish (~10 million native speakers) or Dutch (~30 million native speakers) to English. While it's good enough as a starting point, it regularly negates the meaning of fairly simple sentences, completely misses the use of popular idioms (often resulting in a non sequitur), and more often than not spits out grammatically incorrect nonsense for any sentence relying on implied context.


Bilingual LLMs are human-level translators. I don't know why this frankly mind-blowing fact isn't discussed or researched more, but they are.

https://github.com/ogkalu2/Human-parity-on-machine-translati...


> that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought.

That resonates with a lot of ideas about what makes humans special among other species, and with how our knowledge of that has been revised over the last decades (what's common knowledge about the intelligence of, say, primates or corvids today would have been unspeakable blasphemy a mere 100 years ago). Various religions have instilled the idea of the human as a sacred entity meant to rule over everything because of how special ("made in the image of God") it is, yet we keep learning, over and over again, that we're much simpler than we thought. I wish for this to result in less hubris in humanity as a whole.


> ChatGPT is telling us something about language or thought

There is a leap from language to thought and Wolfram talks about it in more detail in the article in the section named “Surely a Network That’s Big Enough Can Do Anything!”

I encourage everyone to read the full article. It’s more nuanced than “Language is easy”

Here is an excerpt from that section:

  …But this isn’t the right conclusion to draw [certain tasks being too complex for the computer]. Computationally irreducible processes are still computationally irreducible, and are still fundamentally hard for computers—even if computers can readily compute their individual steps. And instead what we should conclude is that tasks—like writing essays—that we humans could do, but we didn’t think computers could do, are actually in some sense computationally easier than we thought.
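
To make the "individual steps are easy" point concrete, here's a tiny sketch in the spirit of Wolfram's favorite example (my illustration, not code from the article): each row of the Rule 30 cellular automaton is trivial to compute from the previous one, yet there's no known shortcut to predict row N without running all N steps.

  # Rule 30: new cell = left XOR (center OR right). Each step is cheap,
  # but the long-run pattern is (as far as anyone knows) irreducible.
  def rule30_step(cells):
      n = len(cells)
      return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

  row = [0] * 63
  row[31] = 1  # single seed cell in the middle
  for _ in range(20):
      print("".join("#" if c else "." for c in row))
      row = rule30_step(row)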


kurzgesagt has a new episode today about the human machine.

https://www.youtube.com/watch?v=TYPFenJQciw

One of the topics in the video was emergence, and where we see it in the world as more layers of complexity are added to systems. We're at the point with our algorithms that we are seeing complex systems emerge from simple parts.


Does anyone know if there's a German version of this video? I tried finding it but couldn't.


I think that not very far in the future we are going to look at the belief that there is something extraordinary about human thought the way we think about geocentrism: we privileged the human brain simply because it's ours. I suspect human thought will end up being not much more than something like a pattern-matching mechanism.


Hundreds of millions of years of effort lifted a small veil from life's understanding of understanding, and it took 2 months for people to get spoiled over it. Good grief man.


There does seem to be something particularly cool about consciousness/qualia. I of course can't qualify that meaningfully beyond just finding it kind of awesome.


I believe that ChatGPT is way beyond that level already. It's just that it's currently used as "wake up, continue this text, die." But I think robots controlled by ChatGPT and doing whatever they want (very soon) will show everyone that it has consciousness.


I don’t think the language point is particularly revelatory - we’ve lived with quite effective machine translation for a long while now. But it’s certainly unexpected that large swathes of complex knowledge can be gathered and represented this way (as patterns of patterns of patterns). Consequently, ChatGPT is still a fairly uninteresting pattern-matching machine in itself. It has very static knowledge and no way to reason or ponder or evaluate or experiment between that knowledge and the world beyond, as anyone trying to use ChatGPT to get ‘correct’ answers and not just vaguely cromulent ideas is finding. We’ve perhaps proven that machines can know what we can know, but can’t think as we can think. I would not bet against the latter being solved in my lifetime though.


Chomsky proposed that decades ago. Universal grammar https://en.wikipedia.org/wiki/Universal_grammar


I don't fear AI but I do fear how people will react to the truths it reveals.


Maybe some type of cellular automata could explain it;)



