We have to raise a lot of money to get a lot of compute, so we've created the best structure we could that allows us to do so while maintaining maximal adherence to our mission. And if we actually succeed in building safe AGI, we will generate far more value than any existing company, which will make the 100x cap very relevant.
What makes you think AGI is even possible? Most current 'AI' is pattern recognition / pattern generation. I'm skeptical about the claims of AGI even being possible, but I am confident that pattern recognition will be tremendously useful.
What makes you so sure that what you're doing isn't pattern recognition?
When you learn a language, aren't you just matching sounds with the contexts in which they're used? What does "love" mean? 10 different people would probably give you 10 different answers, and few of them would mention that the way you love your apple is pretty distinct from the way you love your spouse. And yet, even though they failed to mention it, they wouldn't misunderstand you if you did mention loving an apple!
And it's not just vocabulary: the successes of RNNs show that grammar is also mostly patterns. Complicated and hard-to-describe patterns, for sure, but the RNN learns that it can't say "the ball run" in just the same way you learned to say "the ball runs": by seeing enough examples that some constructions just sound right and some sound wrong.
If you hadn't heard of AlphaGo, you probably wouldn't agree that Go was "just" pattern matching. There are tactics, strategy(!); surely it's more than just looking at a board and deciding which moves feel right. And the articles about how chess masters "only see good moves"? Probably not related, right?
You have pointed to examples where the tasks are pattern recognition. I certainly agree that many tasks humans perform are pattern recognition. But my point is that not ALL tasks are pattern recognition: intelligence involves pattern recognition, but not all of intelligence is pattern recognition.
Pattern recognition works when there is a pattern (repetitive structure). But in the case of outliers, there is no repetitive structure and hence no pattern. For example, what is the pattern when a kid first learns 1+1=2, or why 'B' must come after 'A'? It is taught as a rule (or axiom, or abstraction) from which higher-level patterns can be built. So I believe that while pattern recognition is useful for intelligence, it is not all there is to intelligence.
What I'm trying to point out is that if you had asked someone whether any of those examples were "pattern matching" prior to the discovery that neural networks were so good at them, very reasonable and knowledgeable people would have said no. They would have said that generating sentences which make sense is more than any system _which simply predicted the next character in a sequence of characters_ could do.
Given this track record, I have learned to be suspicious of that part of my brain which reflexively says "no, I'm doing something more than pattern matching".
It sure feels like there's something more. It feels like what I do when I program or think about solutions to climate change is more than pattern matching. But I don't understand how you can be so sure that it isn't.
> And it's not just vocabulary, the successes of RNNs show that grammar is also mostly patterns.
The shapes of the resulting word strings do indeed form patterns. However, matching a pattern is, in fact, different from being able to knowledgeably generate those patterns so that they make sense in the context of a human conversation. It has been said that mathematics is so successful because it is contentless. This is a problem for areas that cannot be treated this way.
Go can be described in a contentless (mathematical) way, therefore success is not surprising (maybe to some it was).
It is those things that cannot be described in this manner where 'AGI' (Edit: 'AGI' based on current DL) will consistently fall down. You can see it in the datasets... try to imagine creating a dataset for the machine to 'feel angry'. What are you going to do... show it pictures of pissed-off people? This may seem like a silly argument at first, but try to think of other things that might be characteristic of 'GI' for which it would be difficult to envision creating a training set.
Anyone who argues AGI is possible intrinsically believes the universe is finite and discretized.
I have found quantum ideas and observations too unnerving to accept a finite and discretized universe.
Edit: this is in response to Go, or StarCraft, or anything that is boxed off -- these AIs will eventually outperform humans on a grand scale, but the existence of 'constants' or being in a sandbox immediately precludes the results from speaking to AI's generalizability.
Your arguments seem to also apply to humans, and clearly humans have figured out how to be intelligent in this universe.
Or maybe you're saying that brains are taking advantage of something at the quantum level? Computers are unable to efficiently simulate quantum effects, so AGI is too difficult to be feasible?
I admit that's possible, but it's a strong claim and I don't see why it's more likely than the idea that brains are very well structured neural networks which we're slowly making better and better approximations of.
Unless you assume some magic/soul/etc., the human brain is proof that there exists a non-impossible algorithm that learns to be a General Intelligence, and that it can run on non-impossible hardware.
Yes, I assume a magic/soul/etc. and I believe that the human brain is not stand-alone in creating intelligence.
Check out this exciting video for a discussion of how 'thinking' can happen outside the brain.
https://neurips.cc/Conferences/2018/Schedule?showEvent=12487
I share your sentiment. The majority of the 'research' from OpenAI has been scaling up known algorithms, and almost all the models have been built on top of research from outside OpenAI. My assessment is that OpenAI is currently not the leader in the field, but they want to get there by attracting talent through PR and money, which IMHO is a fine strategy.
- ML is getting more powerful and will continue to do so as time goes by. While this point of view is not unanimously held by the AI community, it is also not particularly controversial.
- If you accept the above, then the current AI norm of "publish everything always" will have to change.
- The _whole point_ is that our model is not special and that other people can reproduce and improve upon what we did. We hope that when they do so, they too will reflect about the consequences of releasing their very powerful text generation models.
- It is true that some media headlines presented our nonpublishing of the model as "OpenAI's model is too dangerous to be published out of world-taking-over concerns". We don't endorse this framing, and if you read our blog post (or even in most cases the actual content of the news stories), you'll see that we don't claim this at all -- we say instead that this is just an early test case, we're concerned about language models more generally, and we're running an experiment.
Finally, despite the way the news cycle has played out, and despite the degree of polarized response (and the huge range of arguments for and against our decision), we feel we made the right call, even if it wasn't an easy one to make.
> - The _whole point_ is that our model is not special and that other people can reproduce and improve upon what we did. We hope that when they do so, they too will reflect about the consequences of releasing their very powerful text generation models.
If this is your whole point, then I think you are missing something fundamental. Implementing these models doesn't require reflection, or introspection, or any sort of ethical or moral character whatsoever; and even if it did, all that will happen eventually is someone (without the technical background) will simply throw a lot of money at someone else (with the technical background, but who needs to, you know, eat, and pay rent, and so on) to implement it. You are fooling yourself if you think your stance makes a single mote of difference in this arms race.
>You are fooling yourself if you think your stance makes a single mote of difference in this arms race...
In fairness, if that's true, then no one has any need of her model.
More seriously speaking, why does anyone need, say, "training set x", or "model y", to make their implementation work? You don't. So I don't really understand why everyone is so worked up about not releasing this stuff? If you want to do it, do it. If not, don't. But there's no need to say, "I demand everyone do it, and I'll have a meltdown if they don't."
No one is saying "I demand everyone do it." There are two points:
- If they are going to publish the research, and want to claim it as research (which they will, either by submitting it to a conference or putting it on arXiv for the citations), then they should publish the supporting material, because without it, reviewers and other researchers cannot evaluate the work. This is not just about the model--they are also not publishing the training code or the dataset.
In short, they want to have it both ways: having their work accepted as scientific research while providing absolutely no way of determining whether the results are reproducible. That is a horrible, horrible standard (other companies are guilty of this as well, btw). I mean, think about how absurd it is that they are saying "our scientific results are too good to publish. Trust us." Why is this acceptable? Because it sure as hell wouldn't be acceptable if it were a random person releasing a paper claiming incredible accomplishments while providing absolutely no evidence.
- The other criticism is that the justification for why they aren't publishing (which is that they are too concerned with the moral and ethical implications of their work) is, well, a load of crap. They aren't doing anything to contribute to the ethical or moral use of these tools by doing this, and they aren't slowing research in the area one bit. If they really wanted to have an impact here, they should have just said nothing (but of course, then the authors couldn't put this on their resumes...).
Whether they release the model is not the issue on its own, and I don't think anyone is throwing a fit just because someone doesn't release their model. It's the _why_ and the implications that bother people.
Exactly. This is like withholding spam samples, or details of how spammers operate, from the people doing spam-detection work. That side (and the cultural discussion) needs all the head start it can get, not complacency that some arbitrary "experts" will patronizingly "protect" them.
If you look at it as a PR stunt, it is almost certainly a good idea. If a bad actor can auto-generate text that is not really distinguishable from something written by a human, how does a community with open membership (e.g., HN) protect itself? I imagine this technology will enable interesting new attacks against online communities; we haven't seen that for a while.
OpenAI are extremely sensible to draw attention to the fact that AI is approaching a boundary that has practical implications. It is good that everyone is being alerted that that boundary might be crossed at any time in the foreseeable future.
But ... it's not novel. We could already generate convincing gibberish years ago.
Now the novelty is that this can be better targeted. But even simple Markov-chain based text generators were good enough to fool people for a bit.
And there have always been people with too much free time who write. A lot. (See, for example, the crackpots and conspiracy theorists who bombard physics forums. See the 9/11, Zeitgeist, etc. movies. See how much has been written about anti-vaxx, about quantum woo, etc.)
Reputation systems work pretty well for countering spammers.
And against APTs (advanced persistent threats, spearphishing attacks, etc.) there's no real "universal" protection anyway. (You need a competent security team to out-think and out-resource the attackers in every possible dimension.)
This AI is the same as the paid Russian trolls and the unpaid scammers, and so on.
The OpenAI samples are leaps and bounds ahead of traditional Markov-chain generated text. I don't think you can compare the two. It's the fluency and plausibility that gives pause around a public release.
I agree with your last point though - it falls into the same category as paid Russian trolls. I think that's exactly why they were hesitant to release the pre-trained models - they didn't want to make it easier/cheaper for a bad actor to replicate the 2016 election interference.
It remains to be seen whether their decision will make an iota of a difference. But I understand their motivation.
No, I'm sorry, I wasn't precise enough. Yes, it's an amazing feat of engineering, and a truly great peak of text generation. But it's that. Text generation.
Yes, it can serve as a great customized propaganda generator, and yes, people can be spun 'round and 'round with it. But they already can be, with pretty much anything, from the simplest of phrases like "make X great again" to the elaborate scams of new-age bullshit.
I simply disagree on the "virulence" or weaponization factor of this with others. (Especially when it comes to the possible "defenses", none can be "deployed" in 6 months. You can't teach critical thinking to billions of people overnight.)
Markov-chain generators are extremely lacking in long-term coherency. They rarely even make complete sentences, much less stay on topic! They were not convincing at all-- and many of the GPT-2 samples are as "human-like" as average internet comments.
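(For anyone who hasn't played with one: a Markov-chain generator only conditions on the last word or two, which is exactly why it wanders off topic mid-sentence. A toy bigram sketch in Python, with a made-up corpus file, just to illustrate the idea:)

    import random
    from collections import defaultdict

    def build_bigram_model(text):
        # map each word to the list of words observed to follow it
        words = text.split()
        model = defaultdict(list)
        for prev, nxt in zip(words, words[1:]):
            model[prev].append(nxt)
        return model

    def generate(model, start, length=30):
        # each step only looks at the single previous word, so there is
        # no long-range state and no topical coherence to speak of
        out = [start]
        for _ in range(length):
            followers = model.get(out[-1])
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    corpus = open("comments.txt").read()   # hypothetical text dump
    print(generate(build_bigram_model(corpus), "the"))

GPT-2, by contrast, attends over a context of about a thousand tokens, which is where the long-range coherence comes from.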
Conjecture: GPT-2 trained on reddit comments could pass a "comment turing test", where the average person couldn't distinguish whether a comment is bot or human with better than, say, 60% accuracy.
That's an indictment of reddit comments more than of AI. Remember that, conditioned on the human-provided seed prompt, there is no statistical surprise (the definition of information) in the generated text. If all reddit comments are riffs on the OP based on second-hand information, well, then they may as well be bot-generated already.
At this stage, these AIs can only help. Imagine we are given a tool that can generate samples from the "uninformative but realistic-looking text" distribution: we could then put it in a discriminator to filter out blabbering bots and humans alike, or invert it to summarize the small kernel of actual information, and that would be a great thing. The better these models learn typical human behavior, the better off we are at identifying the truly exceptional. It's when AI starts to sense and incorporate novel information from the non-human environment that you really have to worry.
>That's an indictment of reddit comments more than AI.
Perhaps, but that's the world we live in. I suspect the average reddit commenter is already more articulate than the average person (citation needed, I know. But reddit skews highly educated young male in a first-world country. There's no way they do worse than a worldwide average).
I know they are extremely lacking, but compared to that, a hyper-fancy NN with layers and layers of the darkest of black magic, trained at the zenith of the night for thousands of man-years in the crypts of the terror itself, the TPU... yeah, it's not surprising it's better.
But it's not symbolic reasoning. It's not constructing a counter-argument from your argument. It simply lives off previous epic rap battles of internet flamewar history about... well, about anything, since it's the Internet, and people like to chat, talk, and write essays on every topic there is. Satire too. So there is always something to build that language model on.
I'm not sure it has much in the way of implications.
There is no real profit to be made by generating realistic-looking text. Spammers don't work that way; spammers haven't cared about realistic-looking text for years. Nor have spam filters cared much about text for a long time, precisely because it's so easy to randomise. Anti-spam is not a good reason to hold back language generation models, in my view.
As for HN, if bots can write posts as good as humans, great, why hold back?
You’re fooling yourself if you think there are no significant uses of text generation. Fake news, propaganda, advertising, fake reviews, fake everything. Fabricated email from friends, family, and colleagues. Whole online communities fabricated out of whole cloth. It is a weapon, and a powerful one.
No, it's useless, and I speak from experience dealing with spammers who forged mail from friends, family, and colleagues in the past.
People are not trivial automatons who can have their opinions rewritten on the fly by auto-generated text. If auto-generated text reaches into its giant grab-bag of learned expressions and produces something actually interesting or insightful, people might be interested in that new line of thinking, but if - like many of these examples - it's essentially rambling if coherent nonsense then it won't have any impact at all.
So I rather think it's you fooling yourself. You've been reading comments online for years without knowing who or what produced them. If you discovered half of them were artificial tomorrow, what difference would it make? The people around you are already judging arguments based on the content, not their volume or who wrote them.
No, a more effective PR stunt would be to release the model, and better ones, and make it so easy any idiot could use them. THAT would catch the attention of Congress, and THAT would result in funds and legislation to combat it. This won’t even register in a subcommittee staffer’s wet dream. It is not human nature to pay attention to far-off, hypothetical, abstract threats, only concrete and immediate ones. You could release a thousand papers like this and it wouldn’t do anything even approaching the effect of congressmen and their staff getting assloads of fake but convincing email/docs/etc, the press being inundated with thousands of fake but convincing tips, of tens of thousands of people calling the police because some asshats are spamming them with convincing letters from their dead grandma or whatever, of convincing communications to banks or brokers, letters to agencies claiming widespread danger (e.g. there is salmonella in half the food at xyz), kids sending forged letters to their school from their supposed parents to let them leave campus, and so on. I’m sure you can think of better examples.
I’m not entirely sure that a bad actor would get any more scalability from it than from a Mechanical Turk farm, at least as far as impact goes.
It seems that, as far as information warfare goes, “less is more” works quite well, and they rely on the targeted people to spread the news for them.
When you want to drive an agenda you don’t need 100,000 unique comments; you need a good copypasta.
Overall I’m sick of this dramatization of the AI catastrophe until there is a proven path, with agency, for it to actually operate in the real world.
A chatbot isn’t a threat to anyone, even if it turns homicidal.
But a Mechanical Turk farm is traceable and definitely not anonymous. Using a self-contained model somewhere on a server/cluster/workstation could be.
Regarding an agenda, sure, good pasta is fine and all, and regular ol’ people are fine, but it is not cost-effective. This is a million times cheaper, which means you can use it everywhere, not just the obvious places; you can be everywhere, and you can do more than just push a couple of big items, you could push tens of thousands of them, micro-targeted all the way down to the individual. Don’t dismiss it so easily—the potential scale is far, far larger than anything existing to date.
And I would note that the reason 100,000 comments aren’t effective now is precisely because they are too formulaic, too obviously fake when used on such a large scale. This has the potential to create real, live, seemingly active and believable online communities of millions of people, all at fractions and fractions and fractions of a penny compared to current methods. People read news, then comments (or reviews or whatever), because they use them to determine the validity of the content they just read; if it’s no longer possible to tell from the comments what’s a scam and what isn’t... well, you could do a lot of things with that.
Ok but isn’t this the opposite of OpenAI’s “nukes are safer when multiple actors have them” strategy wrt AI?
I’m also confused by the threat models earnestly put forth in your blog post. Are we really concerned about deep faking someone’s writing? The plain word already demands attribution by default: we look for an avatar, a handle, a domain name to prove the person actually said this.
Yep. Maybe I misunderstood the subtler points of OpenAI’s “democratize AI” strategy, and this has been the plan all along. But AFAIK they haven’t put an “among a few rational state actors” asterisk on anything up until now.
Regardless, I agree with TFA that this is a silly and arbitrary time to yell “fire.” It’s PR.
> But AFAIK they haven’t put an “among a few rational state actors” asterisk on anything up until now.
True. On the PR side though, it'd be incredibly hard to say "we want to make replication moderately difficult, but not too difficult." Everyone would end up arguing about exactly how much should be released, how it would prevent X, Y, Z folks from contributing to AI, etc.
> Regardless, I agree with TFA that this is a silly and arbitrary time to yell “fire.” It’s PR.
Alternatively, it does provide good insight into the reactions in the community as a whole, and continues the conversation on exactly how much should be released. Maybe I'm not far enough into the ML community, but the decision not to put the "keys to the kingdom" on github for every script kiddie to weaponize seems reasonable to me, especially as a precedent.
Mostly it's scary not because it's good - as writing goes, it's quite bad. It forms coherent sentences, but otherwise it's nonsense. I've seen similar nonsense producers in the early '90s, based on Markov chains and whatnot.
No, the scary part is how much it reminds me of what I am reading in the media all the time. My current pet concern is that AIs will start passing the Turing test not because AIs are getting so good but because humans are getting so bad. A bunch of nonsensical drivel can easily be passed off as thoughtful analysis or a deep critical think-piece - and that's not my conjecture; it has been repeatedly demonstrated by submitting such drivel to various academic journals and having it accepted and published. I'm not saying people are losing critical thinking skills - but they are definitely losing (or maybe never even had?) the habit of consistently applying them.
> I've seen similar nonsense producers in the early '90s, based on Markov chains and whatnot.
Exactly. When it comes to generating a large volume of apparently good sentences, non-AI (or classical) approaches are still more than good enough. Those will be equally disruptive, since the defending side has yet to develop a proper countermeasure based on the "sensible"-ness of content. Plus, they are much easier to customize and adapt to the situation, while ML-based solutions often need remodeling and retraining when repurposed.
> My current pet concern is that AIs will start passing the Turing test not because AIs are getting so good but because humans are getting so bad
AI will start deceiving the public even before it passes the Turing test. It's much harder to spot bots amid a crowd of people than in a 1-vs-1 chatroom.
Actually, this shows why OpenAI matters. Google has been training and refining Transformer architectures for years; how unlikely is it that nobody tried training a language model at this scale or larger, with similar results?
Yet from Google we heard nothing. Which is the optimal decision for them - they only lose by blowing the whistle.
A lot of people have results similar to this - but most people who can generate a paragraph of slightly_weird_but_plausible_if_you_read_quickly text with a primped version of BERT one time out of 25 regard it as more or less pointless. But journalists don't.
This would be OK if this were the first time the media had gone wild over an AI story. But actually this has happened 10,000 times this year already.
Seems like the way it worked is that the blog post was discussed here and on Twitter and many people thought it was interesting. Then some journalists picked it up and wrote about it.
That much is nothing out of the ordinary. It is interesting (at least to those of us who aren't natural language researchers) so why shouldn't we talk about it? Why shouldn't journalists write about it?
Inevitably their mildly controversial decision to hold some data back got a lot of people discussing whether it was necessary. Which is also perfectly okay.
So, in the end, the complaint is just about why people don't have smarter takes on things. I don't know what to tell you; that's just how social media works sometimes.
I'd shrug and move on, but the problem is that I believe that these flaps about AI are distracting attention from the real concerns and forces that are having a serious impact on people now.
The distortion of public debate caused by community exclusiveness on social platforms, by the curation and manipulation of social feeds and by the dynamics of online debate where the loudest and angriest voices dominate is one place that we could do with some focus.
Another place is the management of simple models - plain-Jane stuff like a learned classifier. People are making these with Python and R and releasing them into infrastructures and apps, and we don't know what they are, where they are, or how they are interacting.
Instead we have Wizard of Oz-style stories to distract us from who's actually hiding behind the curtain. If we fall for this, then we may find ourselves living in a vicious totalitarian society with no obvious way out of it.
Journalists should write about it in an informed and professional way, that's fine. But they need to write about stories that are impactful and important, and if they were to write about this one in this way ("text scrambler makes a pretty good paragraph one out of 30 tries, has no idea of what is going on") they would get no clicks (there will now be a second wave of follow-ups like that to ride on the coattails of the story). Instead they have to make it sound like robots are going to take children from schools and experiment on them live on TV, and this makes them famous and rich.
There is no real revision of the story because the follow-on stories disappear from view while search engines and other journalists reuse the original hysteria. Look at what happened with the two negotiating bots at Facebook (the game was to negotiate to get books and balls, and the bots tended to use a shorthand to negotiate rather than the English they were trained on). This became "Facebook researchers have to pull the plug on AI that they no longer understand", and that is the narrative we will have on that story more or less forever.
I've just read, e.g., https://twitter.com/gdb/status/1096098366545522688 and even though it's "best of 25" (I guess cherry-picked by a human), this is mind-blowing. I am actually having a very hard time believing this is legit generated text.
I couldn't be more disappointed with this bullshit honestly. The texts have almost zero coherence and keep repeating the same patterns (which they presumably learned from the data set) over and over again. If this is their best out of 25 samples then they aren't going to fool anyone.
>Recycling is NOT good for the world.
>It is bad for the environment,
>it is bad for our health,
>and it is bad for our economy.
>Recycling is not good for the environment.
>Recycling is not good for our health.
>Recycling is bad for our economy.
>Recycling is not good for our nation.
The first paragraph keeps repeating the <X> is <bad | not good> for the <Y> pattern 8 times.
>And THAT is why we need to |get back to basics| and |get back to basics| in our recycling efforts.
"get back to the basics" is repeated twice in the same sentence.
>Everything from the raw materials (wood, cardboard, paper, etc.),
>to the reagents (dyes, solvents, etc.)
>to the printing equipment (chemicals, glue, paper, ink, etc.),
>to the packaging,
>to the packaging materials (mercury, chemicals, etc.)
>to the processing equipment (heating, cooling, etc.),
>to the packaging materials,
>to the packaging materials that are shipped overseas and
>to the packaging materials that are used in the United States.
It literally repeated "packaging" 5 times in the same sentence, and the overall structure was repeated 9 times. Also, what type of packaging is based on mercury?
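You don't even have to count by hand; a rough repetition metric (a quick Python sketch, file name made up) shows how templated the sample is compared to a human-written paragraph of similar length:

    from collections import Counter

    def repetition_score(text, n=4):
        # fraction of n-grams that occur more than once; text that keeps
        # recycling "<X> is bad for <Y>" style templates scores high
        tokens = text.lower().split()
        grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        repeated = sum(c for c in grams.values() if c > 1)
        return repeated / max(sum(grams.values()), 1)

    sample = open("recycling_sample.txt").read()   # hypothetical path
    print(repetition_score(sample))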
The parts you criticise are the parts I was most impressed with. These sorts of repetitions can be persuasive in writing/arguments, and it's impressive to me that a model learned this type of writing.
> These sorts of repetitions can be persuasive in writing/arguments
That is the saddest part. It's not that the AI is good; it's that we count saying "X is good/bad" 3 times as a persuasive argument. It won't be hard to learn this kind of "arguing"; it's just sad that that's what we're teaching our AIs to do, and that we get excited when they do it.
> saying "X is good/bad" 3 times as a persuasive argument
I didn't say that it's a persuasive argument; I said that it can be persuasive IN arguments. There's nothing sad about an AI learning it, or about people being happy with it; it's very impressive.
Why? It is pretty much a well-juxtaposed mix of random internet comments. And it's the best of 25, which means the other 24 are even noisier, even more regular internet banter.
(This of course doesn't make it an amazing feat of computer engineering.)
The overarching narrative is great, but that's probably driven by the great antithesis supplied by the experimenter.
It'd be interesting to know how this works, what happens if less or more is given as thesis/antithesis/assignment, and after how much output it turns into gibberish (or repeats).
Definitely impressive work, but the fact that this is hard to distinguish from human text, if true, is pretty sad for humans. Even sadder if anyone reading this could be swayed by such an argument.
Heck, maybe having to compete with this will raise human discourse (Joking).
It's impressive in terms of having a coherent flow - there is a clearly stated "opinion" in the beginning and everything that follows is in support of that opinion. However, the dead giveaway is that there is zero reasoning, just related statements linked together.
I read it and found it to be a bunch of walking in circles and repetitive baloney. It starts with a bunch of claims that are just the reversal of a pro-recycling poster, and then goes into a repetitive, meandering exploration of paper being made from materials, which are made from other materials. Probably something a model would regurgitate if fed some popular literature about recycling. The most astonishing fact for me is that people actually think it's somehow surprisingly good.
Have you done a plagiarism search on that text to see how similar it is to the input corpus? I'm by no means an ML expert, but I've played around with models for random name generation and one thing I've noticed is that as the models become more accurate, they also become much more likely to just regurgitate existing names verbatim. So if you search the list of names and notice something that seems particularly realistic, it could be because it's literally taken in whole or in part from the training data set!
(The talking unicorn example on their page is also meant to demonstrate that, no, it's not just memorizing, but I think it's a bit more compelling to check from the raw samples)
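A crude version of that check, assuming you had a plain-text dump of the training data (the file names here are made up, and for a corpus their size you'd stream it or build an index rather than read it all at once), is just to look for long shared n-grams:

    def ngram_set(text, n=8):
        # 8-word sequences almost never coincide by chance, so any
        # overlap is a strong hint of verbatim regurgitation
        tokens = text.lower().split()
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    sample = open("generated_sample.txt").read()      # hypothetical
    corpus = open("training_corpus_dump.txt").read()  # hypothetical

    shared = ngram_set(sample) & ngram_set(corpus)
    print(len(shared), "8-grams appear verbatim in the corpus")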
- Secrecy? But how will you continue to exist on the PR scene if you don't release anything?
- Are you willing to pay every developer who is able to replicate your paper more than what the black market would pay?
- How are you working on incentive alignment to make sure that all the people who can replicate your results have more incentive to do good than bad, especially in the current environment where users and valuable data are siloed by a few companies?
- Misdirection to keep an edge, i.e. planting bugs / not fixing bugs for the public; spreading false results; only working on problems that need high resources, to limit the number of actors who will be able to replicate?
- Tracking the people who have the competence to replicate and taking preemptive measures.
- Restrictions on GPU/CPU/silicon wafers.
Who can regulate? How can we regulate? What are the negative consequences of regulation? What happens if we don't, and at what odds and time horizon?
This seems very reasonable to me. All the outcry seems... disproportionate.
That said, withholding the pretrained models probably won't make much difference, because bad actors with resources (e.g., certain governments) will be able to produce similar or better results relatively quickly.
All it will take is (1) one or two knowledgeable people with the willingness to tinker, (2) a budget in the hundreds of thousands to a few million dollars at most, and (3) a few months to a year. Nowadays a lot of people are familiar with Transformers and with constructing and training models across multiple GPUs.
I think you should at least release a small portion of the training data (e.g. anything recycling related) so people can measure to what extent the model is generating new sentences and to what extent it's just regurgitating training data.
One of the reasons Elon distanced himself was because of what the OpenAI team wanted to do. I am wondering if this new paper has anything to do with that? Or what is it, in general, that Elon doesn't agree with in what OpenAI is doing?
EDIT (I work at OpenAI and wrote the statement about the variance of the gradient being linear): Here's a more precise statement: the variance is exponential in the "difficulty" of the exploration problem. The harder the exploration, the worse the gradient. So while it is correct that things become easy if you assume that exploration is easy, the more correct way of interpreting our result is that the combination of self-play and our shaped reward made the gradient variance manageable at the scale of the compute that we've used.
Re variance, the argument is not entirely bulletproof, but it goes like this: we know that the variance of the ES gradient grows linearly with the dimensionality of the action space. Therefore, the variance of the policy gradient (before backprop through the neural net) should similarly be linear in the dimensionality of the combined action space, which is linear in the time horizon. And since backprop through a well-scaled neural net doesn't change the gradient norm too much, the absolute gradient variance of the policy gradient should also be linear in the time horizon.
This argument is likely accurate in the case where exploration is adequately addressed (for example, with a well-chosen reward function, self-play, or some kind of exploration bonus). However, if exploration is truly hard, then the variance of the gradient may be huge relative to the norm of the gradient (which would be exponentially small), even though the absolute variance of the gradient is still linear in the time horizon.
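To make the first step concrete, here is the standard ES gradient estimator and a back-of-the-envelope version of the variance scaling (a sketch that treats the return as roughly constant, just to show where the factor of d comes from):

    \nabla_\theta \, \mathbb{E}_{\epsilon \sim \mathcal{N}(0, I_d)}\left[ F(\theta + \sigma \epsilon) \right]
        = \frac{1}{\sigma} \, \mathbb{E}_{\epsilon}\left[ F(\theta + \sigma \epsilon)\, \epsilon \right],
    \qquad
    g = \frac{F(\theta + \sigma \epsilon)\, \epsilon}{\sigma}

    % If F is roughly constant, F \approx c, then \mathrm{Cov}(g) \approx (c^2 / \sigma^2)\, I_d,
    % so the total variance \mathrm{tr}\,\mathrm{Cov}(g) \approx c^2 d / \sigma^2 grows linearly
    % with the dimension d of the perturbed (action) space.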