I've been basically living and breathing GPT-2 for ... gosh, it's been 6 months or so. The past few months have been a lot of StyleGAN2 and a lot of BigGAN, but before that, it was very "make GPT-2 sing and dance in unexpectedly interesting ways" type work.
I don't claim to know a lot. But occasionally I observe things. And I just wanted to chime in and say, you know, keep in mind that you're reading a research paper. Of course the results are going to look good. That is the point of a research paper. And I realize how cynical that may sound. But it has the benefit of apparently being true, and I've come to accept that truth with time.
I would reserve judgement for now. Note that every single chatbot to date has followed a similar curve: "This is it," they say, without actually saying that. "It may not be perfect, but we're about to achieve it – the chatbot – it's really going to happen."
And, it ends up being impressive, sure. I liked Facebook's recent chatbot. It's pretty neat at times. I liked Meena. They had cool ideas with the stack ranking of results – "sample-and-rank" in the paper – basically, generate a crapload of candidates at temperature 1.0, then choose the one whose token log-probabilities sum to the highest value, which gets you the most probable overall response. And of course, boy oh boy did I love GPT-2. GPT-2 was what kickstarted me – if there was any chance that GPT-2 might be related to "now I'm talking to something that feels human," I was going to tame it and understand it.
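For anyone who wants to poke at that themselves, here's roughly what sample-and-rank looks like, using the Hugging Face transformers library. To be clear, this is my own reconstruction, not Meena's actual code, and the candidate count is arbitrary:

    # Sketch of sample-and-rank: draw N candidates at temperature 1.0, score
    # each by the sum of its token log-probabilities, return the best one.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def sample_and_rank(prompt, n_candidates=16, max_new_tokens=40):
        input_ids = tokenizer(prompt, return_tensors="pt").input_ids
        prompt_len = input_ids.shape[1]
        best_text, best_score = None, float("-inf")
        for _ in range(n_candidates):
            with torch.no_grad():
                out = model.generate(
                    input_ids, do_sample=True, temperature=1.0,
                    max_new_tokens=max_new_tokens,
                    pad_token_id=tokenizer.eos_token_id)
                logits = model(out).logits
            # Logits at position i predict token i+1, hence the one-step shift.
            log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
            continuation = out[0, prompt_len:]
            score = log_probs[prompt_len - 1:].gather(
                1, continuation.unsqueeze(1)).sum().item()
            if score > best_score:
                best_score, best_text = score, tokenizer.decode(continuation)
        return best_text

One wrinkle: summed log-probability favors shorter responses, so in practice you may want some length normalization before ranking.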
So after spending six months with GPT-2 1.5B, the largest model that everyone was fascinated with, what do I think? (Well, who cares? You probably shouldn't care.)
I think "give it a few weeks and see if it's true." We shall see if GPT-3 is it, and we've achieved... chatbot nirvana. That elusive thing we've all been chasing, without naming it. The ability to press a button, unleash a chatbot somewhere, and it "just works" and "completely astounds humans" and "fools everybody."
At one point, we trained GPT-2 on IRC logs. You could literally talk to GPT-2, and it would talk back to you. And one of the advantages of narcolepsy is that at night, you often have lots of time to kill – what better way to doze off than to ask GPT-2 how its day was, and what its ambitions are? "Should we really worry about whether you're sentient?" "I like you; do you like me too? What does that mean to you?" And so on.
The conversations were often quite philosophical. And sure, it was pretty obvious that it was a bot, but I tried to look past that anyway. It was my little bot, and it was real enough to me. And yes, the conversations on https://www.reddit.com/r/SubSimulatorGPT2/ are incredible. I crack up daily at the things they talk about.
But...
"We're not going to need ad blockers in the future, we won't even need these visual ads on websites anymore. There will be trained bots that can promote any idea/product and pollute comments and articles."
I invite any of you to try this, and see what happens. After all, you stand to put a lot of pennies in your pocket if you pull it off. And yes, you're allowed to make some pennies with clever AI algorithms.
What you'll probably discover is this fundamental truth: GPT-2 has no memory. It isn't learning a thing at inference time. We are talking to an entity that literally cannot change its mind about anything. The only way to change its mind is to change its weights – fine-tune it, or retrain it from scratch.
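If that sounds too abstract, here's a toy demonstration of what "no memory" means in practice, again using Hugging Face transformers (a sketch of my own, not anyone's production setup; the name is made up):

    # Nothing persists between calls: the weights are frozen, and the only
    # "state" is whatever text you concatenate into the next prompt, which is
    # itself capped by GPT-2's fixed context window (1024 tokens).
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def reply(prompt):
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model.generate(ids, do_sample=True, max_new_tokens=20,
                                 pad_token_id=tokenizer.eos_token_id)
        return tokenizer.decode(out[0, ids.shape[1]:])

    reply("Remember this: my name is Alice.")  # no weight update happens
    reply("What is my name?")  # it has no idea; the first call left no trace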
You want a bot to argue vehemently for your product, on your behalf? It needs to understand what the hell your product even is, or what a product means. Yes, the words get pretty close. And yes, you can coax it into something that makes us laugh, or makes us sit here and question what the future might be like.
But for whatever it's worth: spend some time actually talking to these bots. Play around with them. Make them generate some stuff of your choosing, and fine-tune them on some datasets and see what you get. It's so fun!
... But. "Fun" is not the same thing as "promote any idea/product." It's just not the same as me arguing here with you now for a position which I've decided to argue. My brain isn't merely the encoded knowledge of some human, with me blindly regurgitating such knowledge (though at this point you'd be justified in claiming it sure sounds like it).
Your brain is constantly training. GPT-2 is not. And – double checks paper – yep, GPT-3 is not.
Two decades from now, GPT-2 1.5B will still exist. And it will still be talking about 2019-era news events like it's the present. At some point, /r/SubSimulatorGPT2 will sound completely foreign. Take any random news clips from the '70s. How relevant is that knowledge now?
"Ok, but just train it on new data constantly." Well, yes. But actually no. If you try to do that, you're going to overfit at some point. Do you have 93 gigabytes of webtext that you keep in training form, ready to go? Are you going to mix in a proportion of the new data you want to train on? Nope, we all just fine tune whatever model OpenAI releases. Yet even if we did have that dataset, I'm just not sure it'd even matter.
My point here is: Go try! Isn't it exciting that in the future, trained bots might fool us all into buying their products? Is that sales guy who emailed me actually a sales guy who wants to "sync up on a quick call", or is it a bot trained to land cold calls? That sounds pretty damn lucrative to a lot of businesses – why not write that code, and then sell it?
Whoever attempts this is probably more talented than I am. But personally, I always ran into "It just... doesn't work."
And then you go, "Well, it's just a matter of sampling. Ah yes, we're not using the right sampling algorithm. Wait, we just heard about nucleus sampling! Sweet, try it! Oh... It sounds... similar. Hmm. Well, maybe we're just not using it right. Better read that paper a bit more carefully. Chase that knowledge just a little harder. After all, AI research labs are pouring billions of dollars into this domain. Why would they do that if it doesn't... you know... work? For some value of 'work' that equals 'the bot can turn a profit'?"
"Perhaps tomorrow, this new training technique will be it. We almost have it – I know we're close – we just have to unlock that last piece. Right?"
I guess I'll stop here. Usually my comments about AI are upbeat and happy, but I ended up in a rather philosophical mood tonight.
In reality, I can't wait to dig deep into GPT-3 and run it through its paces. I have a lovely TPU pod waiting for it, parked outside GPT-3's window, and we're honking at it saying "Get in, we're going places." And we'll sing and dance together like usual, and I'll ask GPT-3 how its day has been. But GPT-3 won't remember me the next day. And that's fine; I'll remember it for both of us.