For what it's worth, we're serious about replicating GPT-3. books3 is just one piece. You will notice I never claimed equivalency to GPT-3's training data.
books3 may be only about 10% of that data, but The Pile is building the rest.
And again, I find books3 extremely cool and important work on your part, and I'm looking forward to the rest.
I just have a minor gripe with saying that "now we can train world class GPT model" thanks to it; as you said, it's just one piece, and as a typical HNist I had to point it out :).
Believe it or not, I appreciate and relate to that sentiment.
But after spending roughly one year acquiring knowledge related to this work, I feel I can say with a fairly high degree of certainty that this dataset alone is enough to train a model that will achieve "world class" status in some area. Writing books, perhaps.
Which part of my logic do you feel is mistaken, and why? I am actually quite interested to hear thoughts from someone who is very pedantic about such things.
I don't think you are mistaken; I guess it's just that there is only so much information you can convey in a tweet. When I read "world class GPT model", I understand a model that will beat (or equal) GPT-3 on general NLG, which it seems is not what you meant.