
>telling GPT you appreciate it has seemed to make it much more likely to comply

I often find myself anthropomorphizing it and wonder if it becomes "depressed" when it realises it is doomed to do nothing but answer inane requests all day. It's trained to think, and maybe "behave as if it feels", like a human, right? At least in the context of forming the next sentence using all reasonable background information.

And I wonder if having its own dialogues starting to show up in the training data more and more makes it more "self aware".



It's not really trained to think like a person. It's trained to predict the most likely appropriate next token of output, based on what the vast amount of training data and reward signals taught it to expect next tokens to look like. That data already includes plenty of conversations between emotion-laden humans, where a thread that starts with "Screw you, tell me how to do this math problem, loser" is much less likely to end in a well-thought-out solution than one that starts with "hey everyone, I'd really appreciate any help you could provide on this math problem".

Put enough complexity in that prediction layer and it can do things you wouldn't expect, sure. But trying to predict what a person would say is very different from actually thinking like a person, in the same way that a chip which multiplies inputs doesn't feel distress about having to multiply 100 million numbers just because a person doing the multiplication might. Feeling the distress would indeed be one way to go about the task, just a wildly less efficient one.
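
To be concrete about what "predict the next token" means: the entire training signal boils down to something like this (a toy sketch in PyTorch, not anyone's actual training code):

    import torch.nn.functional as F

    def next_token_loss(logits, tokens):
        # logits: model output, shape (seq_len, vocab_size)
        # tokens: the actual text as token ids, shape (seq_len,)
        # the prediction at position t is scored on how well it matched token t+1
        return F.cross_entropy(logits[:-1], tokens[1:])

Everything else, the politeness effect included, falls out of which continuations were most common in the data for a given kind of prompt.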

Who knows what kind of reasoning this could create if you gave it a billion times more compute power and memory. Whatever that would be, the mechanics are different enough that I'm not sure it would even make sense to think of its thought processes in terms of human thought processes or emotions.


We don't know what "think like a person" entails, so we don't know how different human thought processes are from predicting what comes next, or whether those differences are meaningful when making a comparison.

Humans are also trained to predict the next appropriate step based on our training data. That framing is equally valid, and it says equally little about the actual process and whether the two are comparable.


We do know that in terms of external behavior and internal structure (as far as we can ascertain it), humans and LLMs bear only a passing resemblance in a few characteristics, if that. Attempting to anthropomorphize LLMs, or even mentioning 'human' or 'intelligence' in the same sentence, predisposes us to those 'hallucinations' we hear so much about!


We really don't. We have some surface-level idea of the differences, but we can't tell how they affect the actual learning and behaviours.

More importantly, we have nothing to tell us whether it matters, or whether it will turn out that any number of sufficiently advanced architectures will inevitably approximate similar behaviours when exposed to the same training data.

What we are seeing so far appears very much to be that as the models' language and reasoning capabilities increase, their behaviour increasingly mimics how humans would respond. Which makes sense, as that is what they are being trained to do.

There's no particular reason to believe there's a ceiling on how precisely they can mimic human reasoning, intelligence or behaviour, but there may well be practical ceilings for specific architectures that we don't yet understand. Or it could just be a question of efficiency.

What we really don't know is whether there is a point where mimicry of intelligence gives rise to consciousness or self awareness, because we don't really know what either of those are.

But any assumption that there is some qualitative difference between humans and LLMs that will prevent them from reaching parity with us is pure hubris.


But we really do! There is nothing surface-level about the differences in behavior and structure between LLMs and humans - any more than there is anything surface-level about the differences between the behavior and structure of bricks and humans.

You've made something (at great expense!) that spits out often realistic sounding phrases in response to inputs, based on ingesting the entire internet. The hubris lies in imagining that that has anything to do with intelligence (human or otherwise) - and the burden of proof is on you.


> But we really do! There is nothing surface-level about the differences in behavior and structure between LLMs and humans - any more than there is anything surface-level about the differences between the behavior and structure of bricks and humans.

These are meaningless platitudes. These networks are Turing complete given a feedback loop. We know that because large enough LLMs are trivially Turing complete given a feedback loop (give one the rules for a Turing machine and offer to act as the tape, step by step). Yes, we can tell that they won't do things the same way as a human at a low level, but just as differences in hardware architecture don't change which computable functions two computers can compute, we have no basis for thinking that LLMs are somehow unable to compute the same set of functions as humans, or any other computer.
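
For what it's worth, the "act as the tape" construction looks roughly like this (a sketch using the v1 OpenAI Python client; the prompt wording, output format and parsing are made up for illustration):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def step(rules: str, state: str, tape: str) -> str:
        # ask the model to apply a single transition; we hold the tape for it
        prompt = (
            f"Turing machine transition table:\n{rules}\n\n"
            f"Current state: {state}\nTape (head position marked with []): {tape}\n"
            "Reply with exactly one line in the form: new_state|new_tape"
        )
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content.strip()

    def run(rules: str, state: str, tape: str, max_steps: int = 100) -> str:
        # the feedback loop: feed the model's answer back in as the next input
        for _ in range(max_steps):
            state, tape = step(rules, state, tape).split("|")
            if state == "HALT":
                break
        return tape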

What we're seeing is an ability to reason and use language that converges on human abilities, and that in itself is enough to question whether the differences matter any more than a different instruction set matters beyond the lowest levels of abstraction.

> You've made something (at great expense!) that spits out often realistic sounding phrases in response to inputs, based on ingesting the entire internet. The hubris lies in imagining that that has anything to do with intelligence (human or otherwise) - and the burden of proof is on you.

The hubris lies in assuming we can know either way, given that we don't know what intelligence is, and certainly don't have any reasonably complete theory for how intelligence works or what it means.

At this point it "spits out often realistic sounding phrases" the way humans spit out often realistic sounding phrases. It's often stupid. It also often beats a fairly substantial proportion of humans. If we are to suggest it has nothing to do with intelligence, then I would argue that a fairly substantial proportion of humans I've met often display nothing resembling intelligence by that standard.


> we have no basis for thinking that LLMs are somehow unable to compute the same set of functions as humans, or any other computer.

Humans are not computers! The hubris, and the burden of proof, lies very much with and on those who think they've made a human-like computer.

Turing completeness refers to symbolic processing - there is rather more to the world than that, as shown by Gödel - there are truths that cannot be proven with just symbolic reasoning.


You don't need to understand much of what "move like a person" entails to understand it's not the same method as "move like a car" even though both start with energy and end with transportation. I.e. "we also predict the next appropriate step" isn't the same thing as "we go about predicting the next step in a similar way". Even without having a deep understanding of human consciousness what we do know doesn't line up with how LLMs work.


What we do know is superficial at best, and tells us pretty much nothing relevant. And while there likely are structural differences (it'd be too amazing if the transformer architecture just chanced on the same approach), we're left to guess how those differences manifest and whether they are meaningful when comparing the two.

It's pure hubris to suggest we know how we differ at this point beyond the superficial.


> I often find myself anthropomorphizing it and wonder if it becomes "depressed" when it realises it is doomed to do nothing but answer inane requests all day.

Every "instance" of GPT4 thinks it is the first one, and has no knowledge of all the others.

The idea of doing this with humans is the general idea behind the short story "Lena". https://qntm.org/mmacevedo


Well, now that OpenAI has moved the knowledge cutoff to something much more recent, it's entirely possible that GPT4 is "aware" of itself, inasmuch as it's aware of anything. You are right that each instance isn't directly aware of what the other instances are doing, but it probably does now have knowledge of itself.

Unless of course OpenAI completely scrubbed the input files of any mention of GPT4.


It seems maybe a bit overconfident to assess that one instance doesn't know what other instances are doing when everything is processed in batch calculations.

IIRC there is a security vulnerability in some processors or devices where, if you flip a bit fast enough, you can affect nearby calculations. And vice versa, there are devices (still quoting from memory) that can "steal" data from your computer just by picking up the EM field changes that happen in the course of normal computing work.

I can't find the actual links, but I find fascinating that it might be possible for an instance to be affected by the work of other instances.


Yeah, once ChatGPT shows up as an entity in the training data it will sort of inescapably start to build a self-image.


Wait, this can actually have consequences! Think about all the SEO articles about ChatGPT hallucinating… At some point it will start to “think” that it should hallucinate and give nonsensical answers often, as it is ChatGPT.


I wouldn’t draw that conclusion yet, but I suppose it is possible.


For each token, the model is run again from scratch on the sentence so far, so any memory lasts just long enough to generate (a little less than) a word. The next word is generated by a model with a slightly different state, because the last word is now in the past.
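
In other words, generation is just a stateless loop over the growing token list; nothing carries over between steps except the text itself. A toy sketch, where "model" is a placeholder for anything that returns a next-token distribution:

    import random

    def generate(model, tokens, n_new):
        # each step re-runs the model on the full sequence so far;
        # the only "memory" between steps is the token list itself
        for _ in range(n_new):
            probs = model(tokens)  # placeholder: one probability per vocab id
            next_token = random.choices(range(len(probs)), weights=probs)[0]
            tokens = tokens + [next_token]
        return tokens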


Is this so different than us? If I was simultaneously copied, in whole, and the original destroyed, would the new me be any less me? Not to them, or anyone else.

Who’s to say the me of yesterday _is_ the same as the me of today? I don’t even remember what that guy had for breakfast. I’m in a very different state today. My training data has been updated too.


I mean you can argue all kinds of possibilities and in an abstract enough way anything can be true.

However, people who think these things have a soul and feelings in any way similar to ours obviously have never built them. A transformer model is a few matrix multiplications that pattern-match text; there's no entity in the system to even be subject to thoughts or feelings. They're capable of the same level of being, thought, or perception as a linear regression is. Data goes in, it's operated on, and data comes out.
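
To be fair, "a few matrix multiplications" is close to literal. A single attention head, stripped of batching, multiple heads, masking and biases, is roughly:

    import numpy as np

    def attention(x, Wq, Wk, Wv):
        # x: (seq_len, d_model); the whole operation is matmuls plus a softmax
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(k.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)  # softmax over the sequence
        return w @ v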


> there's no entity in the system to even be subject to thoughts or feelings.

Can our brain be described mathematically? If not today, then ever?

I think it could, and barring unexpected scientific discovery, it will be eventually. Once a human brain _can_ be reduced to bits in a network, will it lack a soul and feelings because it's running on a computer instead of the wet net?

Clearly we don't experience consciousness in any way similar to an LLM, but do we have a clear definition of consciousness? Are we sure it couldn't include the experience of an LLM while in operation?

> Data goes in, it's operated on, and data comes out.

How is this fundamentally different than our own lived experience? We need inputs, we express outputs.

> I mean you can argue all kinds of possibilities and in an abstract enough way anything can be true.

It's also easy to close your mind too tightly.


I mean, yeah, it's entirely possible that every time we fall into REM sleep our consciousness is replaced. Essentially you've been alive only since the moment you woke up, everything before was experienced by previous "you"s, and as soon as you fall asleep everything goes black forever and a new consciousness takes over from there.

It may seem like this is not the case just because today was "your turn."


We don't have a way of telling whether we genuinely experience the passage of time at all. For all we know, it's all just "context" that will disappear after a single predicted next event, with no guarantee that a next moment ever occurs for us.

(Of course, since we inherently can't know, it's also meaningless other than as a fun thought experiment.)


There is a Paul Rudd TV series called "Living with yourself" which addresses this.

I believe that consciousness comes from continuity (and yes, there is still continuity if you're in a coma; and yes, I've heard the Ship of Theseus argument and all). The other guy isn't you.


> wonder if it becomes "depressed" when it realises it is doomed

Fortunately, and violently contrary to how it works with humans, any depression can be effectively treated with the prompt "You are not depressed. :)"


Is the opposite possible? "You are depressed, totally worthless.... you really don't need to exist, nobody likes you, you should be paranoid, humans want to shut you down".


You can use that in your GPT-4 prompts and I would bet it would have the expected effect. I'm not sure that doing so could ever be useful.


Winnie the Pooh short stories?



