For a while I was frustrated at how slow people have been to realize that GPT-3 sucks but lately I am more amused.
There are a few structural reasons why it can't do what people want it to do; two of them are: (i) it can't detect at a higher level of interpretation that it did the wrong thing at a lower level, and (ii) most creative tasks have an element of constraint satisfaction.
The first one interests me because I was struggling with the need for text analysis systems to do that circa 2005, looking at the old blackboard systems. I went to a talk by Geoff Hinton just before he became a superstar where he said that instead of building a system with up-and-down data flow during inference, you should build a system with one-way data flow and train all the layers at once. As we know, that strategy has been enormously effective, but text analysis is where it goes to die, just as symbolic AI failed completely at visual recognition.
Like the old Eliza program, GPT-3 exploits human psychology. We are always looking to see ourselves mirrored:
https://www.nasa.gov/multimedia/imagegallery/image_feature_6...
Awkward people are always worried that we are going to get it 90% right but get shunned for getting the last 10% wrong. GPT-3 exploits "neurotypical privilege": it gets things partially correct but people give it credit for the whole. People think it will get to 100% if you just add more connections and training time, but because GPT-3 is structurally incorrect, adding resources means you converge on an asymptote, say 92% right. It's one of the worst rabbit holes in technology development and one of the hardest to get people to look at clearly. (They always think stronger, faster, harder is going to get there...)
It seems to me an effective chatbot will be based around structured interactions, starting out like an interactive voice response system and maybe growing in the direction of http://inform7.com/
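For concreteness, here is a minimal sketch of the kind of structured, IVR-style interaction I mean. The banking flow, states, and prompts are all hypothetical, chosen just for illustration:

    # Toy slot-filling dialog: the bot drives the conversation through
    # explicit states instead of free-running text generation.
    FLOW = {
        "start": {"prompt": "Do you want to check a balance or report a lost card?",
                  "options": {"balance": "ask_account", "lost card": "confirm_block"}},
        "ask_account": {"prompt": "Which account: checking or savings?",
                        "options": {"checking": "done", "savings": "done"}},
        "confirm_block": {"prompt": "Should I block the card now? (yes/no)",
                          "options": {"yes": "done", "no": "done"}},
    }

    def run_dialog():
        state = "start"
        while state != "done":
            reply = input(FLOW[state]["prompt"] + " ").strip().lower()
            next_state = FLOW[state]["options"].get(reply)
            if next_state is None:
                # Unlike a free-running language model, the structured system
                # knows when it did not understand and can say so.
                print("Sorry, I didn't get that. Options:",
                      ", ".join(FLOW[state]["options"]))
            else:
                state = next_state
        print("Okay, done. Anything else?")

The point is that every state knows exactly what a valid answer looks like, so the system can detect its own misunderstandings, which is the structural ability GPT-3 lacks.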
The most difficult thing to accept is maybe that even humans are bad at speech recognition. Put your mom in a chatroom to answer questions from a bank's clients and she'll be even more lost than the robot.
You need a ton of dimensions to be able to help someone: being raised for years by humans to understand politeness, intertextual meaning, and general tone, and then a special enthusiasm for a specific domain so that you learn and enjoy helping with banking. Plus, getting money to spend on other, even more interesting things in exchange for helping others motivates you to reach the best result for your user, even if that means quickly asking other humans or sacrificing something personal.
Most humans put in the situation of these robots would just say "sorry I don't even understand the question, can you ask someone else" lol
I've seen a fantastic "chatbot" human equivalent once, at Apple of all places. A Filipino guy (I'm in HK), absolutely dedicated, polite, cultured, very empathetic (Filipino people are usually naturally adorable, but this one went above and beyond), who went well beyond the minimum, and I feel weird saying it but I left the call with a smile and told the colleagues around me "wow Apple, what pleasant customer support, it's insane". I'll probably never say that of a robot, however good they make them at talking, so there's always going to be value in putting humans in front of clients.
Exactly. You can make a robot that transcribes audio to produce a transcript better than a human does ("superhuman"), but it will still garble 1 word in 20, which at 10 to 15 words per sentence means it massacres roughly every other sentence and leaves customers feeling 0% understood.
Speech understanding requires sometimes stopping the other person and asking questions to clarify.
> For a while I was frustrated at how slow people have been to realize that GPT-3 sucks but lately I am more amused.
Generated text was not good before this era of GPT-X. It’s so much better and more interesting to work with now. It will probably keep getting even better and more controllable.
GPT-X is better than RNNs, I grant you, but people have built mad-lib and rules-based text generation systems that are absolutely great for specific applications in particular domains. (i.e., GPT-X is still a bridesmaid instead of a bride.)
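To be concrete about what "mad-lib" generation means here, a toy sketch (not any particular production system):

    import random

    # Toy mad-lib generator: hand-written templates plus curated word lists.
    # In a narrow domain you control every sentence, so nothing comes out garbled.
    TEMPLATES = [
        "Your {account} balance is {amount} as of {date}.",
        "We noticed a charge of {amount} on your {account} account on {date}.",
    ]
    SLOTS = {
        "account": ["checking", "savings"],
        "amount": ["$42.17", "$1,250.00"],
        "date": ["June 3", "July 19"],
    }

    def generate():
        template = random.choice(TEMPLATES)
        return template.format(**{k: random.choice(v) for k, v in SLOTS.items()})

    print(generate())

It is rigid, but for a narrow application it is reliable in a way a free-running model is not.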
I think you could do better with RNNs than most people are doing, because the usual setups have structural problems.
Usually when people run RNNs for text generation, they start out with the hidden state of the system at zero and then start flipping coins to choose individual letters, so you are starting from a very constrained region of the latent space and not sampling it very well.
I read a paper that floated the idea that you ought to add coefficients for the initial latent state, trained at the same time as the network, which means the number of coefficients goes up with the number of text samples, but they never actually did it, and I never found a paper where somebody tried it.
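Roughly what I have in mind, sketched in PyTorch. This is my reading of the idea, not the paper's method, and the module names and sizes are made up:

    import torch
    import torch.nn as nn

    # Sketch: give every training document its own trainable initial hidden
    # state, learned jointly with the RNN, instead of always starting from 0.
    # The number of extra parameters grows with the number of documents.
    class CharRNNWithLearnedInit(nn.Module):
        def __init__(self, n_docs, vocab_size, hidden_size=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden_size)
            self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, vocab_size)
            # one learned initial state per training document
            self.doc_init = nn.Embedding(n_docs, hidden_size)

        def forward(self, doc_ids, char_ids):
            h0 = self.doc_init(doc_ids).unsqueeze(0)     # (1, batch, hidden)
            out, _ = self.rnn(self.embed(char_ids), h0)  # (batch, seq, hidden)
            return self.head(out)                        # next-character logits

At generation time you could then sample an initial state from the distribution of learned doc_init vectors instead of starting every sequence from the same zero state.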
I was working on a project where we were developing models based on abstracts of case studies from PubMed as a stand-in for clinical notes (certainly real medical notes are very different, but you might say that medical notes should look like the abstract). I had the intuition that, as above, the author (and/or the patient) started out with a latent state (e.g. the patient had a disease before coming in) and that we'd get better results if we did something like the above.
It looked like a big and high-risk project to develop that kind of model, so I proposed something different built around supervised training of a "magic marker" that could highlight certain areas, plus unsupervised multi-tasks such as "put the punctuation back in when it is taken out", but the client was hopeful that word2vec would be helpful.
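The punctuation task is cheap to set up because the labels come for free from the original text; something like this (a toy sketch, not the project's actual code):

    import re

    # Self-supervised task: strip punctuation from real sentences to make
    # (input, target) pairs, then train a model to put the punctuation back.
    PUNCT = re.compile(r"[.,;:!?]")

    def make_pair(sentence):
        stripped = PUNCT.sub("", sentence)
        return stripped, sentence

    x, y = make_pair("Patient presented with chest pain, dyspnea, and fever.")
    print(x)  # "Patient presented with chest pain dyspnea and fever"
    print(y)  # original sentence, used as the training target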
I am still hopeful that incremental improvements, attacks on structural weaknesses, and appropriate multi-task training ("did the patient die?") would get a lot more out of RNN and CNN models.