I'm gonna put forward the very view that gwern repeatedly argues against: "but... it's not understanding."
So far I see no evidence that this thing or anything else like it has any actual understanding, any model of the world. Indeed it can't as it possesses no sensory apparatus. It's not embodied. It doesn't experience anything.
I'm not sure the OpenAI folks would argue with me, but it seems Gwern asserts that this sort of thing indicates that general AI or even sentient AI is on the doorstep. I don't think it does, and I still maintain as I always have that CS people systematically underestimate and trivialize biology.
Well, there is no formal definition of "understanding" in the context of CS, AI, or machine learning, so anyone can claim anything they like with respect to the term.
For example, I have a thermos that keeps my coffee cold in the summer and hot in the winter. It u n d e r s t a n d s.
There are a number of NLP tasks that aim to quantify understanding, e.g. textual entailment. No currently published model is even remotely close to human-level performance on all of these tasks.
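For anyone who hasn't seen the task: a rough sketch of what textual entailment (NLI) looks like in practice, assuming the HuggingFace transformers library and the publicly available roberta-large-mnli checkpoint (both are just illustrative choices; any NLI model would do):

    # Toy illustration of the textual-entailment (NLI) task.
    # Assumes: `pip install torch transformers` and the public roberta-large-mnli model.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
    model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

    premise = "A man is playing a guitar on stage."
    hypothesis = "Someone is making music."

    # The model scores the (premise, hypothesis) pair as contradiction / neutral / entailment.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits.softmax(dim=-1).squeeze()

    # Label names come from the model's own config, so nothing is hard-coded here.
    for i, p in enumerate(probs):
        print(f"{model.config.id2label[i]}: {p.item():.3f}")

Whether a high entailment score reflects "understanding" is exactly the question under dispute, of course; the benchmark only measures agreement with human judgements.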
As long as there are no ways to properly query models, it's hard to assess their level of understanding. It would help immensely if we could ask models for rules, as in "why was the object labelled 'a car'?" (in the case of image recognition), or directly query any grammatical rules discovered during the processing of language.
Especially in classification tasks, knowledge extraction (e.g. by outputting rules) would be so much more helpful than simply having an AI look at a CT image and spit out "yep - that's a tumour, alright", while radiologists scratch their heads as to why...
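One common partial workaround, not a real solution, is to fit an interpretable surrogate (e.g. a shallow decision tree) to the black box's predictions and read rules off it. A toy sketch with scikit-learn on synthetic data (nothing here is from any real radiology system; it's purely illustrative):

    # Surrogate-model sketch: approximate a black-box classifier with a shallow
    # decision tree and print the tree's if/else rules. The "black box" here is
    # just a random forest trained on synthetic data.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
    feature_names = [f"feature_{i}" for i in range(X.shape[1])]

    black_box = RandomForestClassifier(random_state=0).fit(X, y)

    # Train the surrogate on the black box's *predictions*, not the true labels,
    # so the extracted rules describe what the black box does.
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
    surrogate.fit(X, black_box.predict(X))

    # Human-readable rules approximating the black box's behaviour.
    print(export_text(surrogate, feature_names=feature_names))

The rules only approximate the black box, of course, which is part of the problem: you get an explanation of a simplified stand-in, not of the model itself.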
I had to look up textual entailment (on wikipedia) because I wasn't sure of its formal definition. It turns out, it doesn't have one:
>> "t entails h" (t ⇒ h) if, typically, a human reading t would infer that h is most likely true"
So in other words it's down to good old eyeballing. I'm not impressed, but not surprised either; it's just one of the many poorly defined tasks in machine learning, particularly in NLP, which has turned into a quagmire of shoddy work ever since people started firing linguists to improve their systems' performance.
Anyway, since logical entailment is central to my field of study, I can tell you that if textual entailment is less strictly defined than logical entailment (as per the wikipedia article), then it doesn't require anything that we could recognise as "understanding". Because logical entailment certainly doesn't require understanding, and its definition is as strict as a very strict thing [1]. I mean, I can see how loosening the requirement for precision in any justification of a decision that "A means B" can improve performance, but I can't see how it can improve understanding.
Edit: I'm not sure we disagree, btw, sorry for the grumpy tone. I fully agree with your gist about explainability etc.
______________
[1] Roughly, "A |= B iff for each model M, of A, M is a model of B", where A and B are sets of first order logic formulae and a "model" in this context is a logical interpretation under which a set of formulae is true. A "logical interpretation" is a partition of a predicate's atoms to true and false.
Both papers provide promising first steps in the right direction but are by no means solutions to the problem at hand. I mean, the second paper is even based on the premise that classification has already been done by human experts as a preparation step...
When I say "I have a laptop in front of me," I am describing an understanding of something that is being experienced (sensed). If a Markov text generator outputs this text, it's just rearranging bits. I don't see any evidence that GPT-3 is doing anything more than rearranging bits in a much more elaborate way than a Markov text generator. The results kind of dazzle us, but being dazzled doesn't indicate anything in particular. I see something akin to a textual kaleidoscope toy, a generator of novel text that is syntactically valid and that produces odd cognitive sensations when read.
I maybe should have said sensed, not experienced, since experience also leads into much deeper philosophical discussions around the nature of mind and consciousness. I wasn't really going there, since I don't see anything in GPT-3 or any similar system that merits going there.
I also don't see any evidence that it is drawing any new conclusions or constructing any novel thoughts about anything. It's regurgitating results similar to pre-existing textual examples, re-arranging existing ideas in new ways. If you don't think actual new ideas exist then this may be compelling, but if that's the case I have to ask: where did all the existing ideas come from, then? Some creative mechanism must exist or nothing would exist, including this text.
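To be clear about the comparison: by "Markov text generator" I mean the textbook bigram kind, something like this toy sketch (the one-line corpus is made up):

    # Minimal bigram Markov text generator: it only ever re-emits word transitions
    # it has already seen in the corpus, which is the sense in which it "rearranges bits".
    import random
    from collections import defaultdict

    def train(text):
        chain = defaultdict(list)
        words = text.split()
        for w1, w2 in zip(words, words[1:]):
            chain[w1].append(w2)
        return chain

    def generate(chain, start, length=20):
        out = [start]
        for _ in range(length - 1):
            options = chain.get(out[-1])
            if not options:
                break
            out.append(random.choice(options))
        return " ".join(out)

    corpus = "I have a laptop in front of me and I have a coffee next to the laptop"
    print(generate(train(corpus), "I"))

GPT-3 is obviously doing something far more elaborate than this, but the question is whether the difference is one of kind or only of scale.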
The fact that the output often resembles pop Internet discourse says more about the mindlessness of "meme-think" than the GPT-3 model.
As for real world uses, social media spam and mass propaganda seems like the most obvious one. This thing seems like it would be a fantastic automated "meme warrior." Train it on a corpus of Qanon and set it to work "pilling" people.
> When I say "I have a laptop in front of me," I am describing an understanding of something that is being experienced (sensed).
I would ascribe that to two factors: a) you have a more immediate, interactive interface to the physical world than GPT does, which is limited to a textual proxy, and b) GPT is naturally not a human-level intelligence; it is still of very limited complexity, so its understanding is more akin to that of a parrot trying to understand its owner's speech patterns. It can infer a tiny bit of semantics and mimic the rest. The ratio is a continuum.
> As for real world uses, social media spam and mass propaganda seems like the most obvious one.
Take active learning versus ordinary passive learning. Often with active learning you can learn much faster. That's a kind of "experience." Out-of-distribution problems, where a model fails to generalize, could be dealt with much more efficiently when the model can ask "hey, what's f(x = something really weird and specific that would never come up in an entire internet's worth of training data)?" Experience isn't passive, and that makes a whole world of difference. And that's not even touching on the difficulty of "tell me all about elephants" versus "let me interact with an elephant and see it and touch it and physically study it."
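A toy sketch of what I mean by active learning, using scikit-learn with synthetic data and plain uncertainty sampling (all specifics here are illustrative assumptions, not any particular system): the model itself picks the next point to be labelled, instead of passively consuming a fixed training set.

    # Pool-based active learning via uncertainty sampling.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
    X_pool, y_pool, X_test, y_test = X[:2000], y[:2000], X[2000:], y[2000:]

    # Start with a small random labelled set.
    labeled = list(rng.choice(len(X_pool), size=20, replace=False))

    for round_ in range(10):
        clf = LogisticRegression(max_iter=1000).fit(X_pool[labeled], y_pool[labeled])
        probs = clf.predict_proba(X_pool)[:, 1]
        uncertainty = -np.abs(probs - 0.5)           # points nearest the decision boundary
        uncertainty[labeled] = -np.inf               # don't re-query what we already have
        labeled.append(int(np.argmax(uncertainty)))  # "hey, what's f(x) for this one?"
        print(f"round {round_}: test accuracy {clf.score(X_test, y_test):.3f}")

The querying strategy is the whole point: the learner spends its label budget on the cases it is most confused about, rather than on whatever happens to be in the pile.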
Yesterday I watched a YouTube video about GPT-3 (https://www.youtube.com/watch?v=_8yVOC4ciXc), and it showed two poems. One was human-made, the other was from an AI trained on that human's poems.
Both poems were pretty good. But one of them had a metaphor about the moon reflecting in ocean waves, being distorted and taking on monstrous forms.
I figured this had to be the human one: it was a novel description (because it's a metaphor) of a very real experience (how the moon appears in reflection on the ocean).