Well ok. A shout-out to my own 9798990595101 and its ebook buddies 9798990595118 and 9798224433186. Some stats: Copies sold: 0. Royalties $0. Family and friends: noncommittal and confused. Personal feeling of accomplishment: priceless I guess. $0.99 ebook promotions so far 2 months, 0 copies. Looking to extend the record for 2 more months with 0 more copies. On-demand technology is fun, but boring if there is no demand. Amazon "Best Sellers" rank: #6,757,387 in Books.
None of those are real unless someone orders them. Lots of scalping by db nerds going on because these companies do not charge on-ramp fees for items.
I priced the paperback at $15 and chose to put the price on the back cover ISBN barcode for Amazon/IngramSpark paper copies, so in every respect it is a traditional book equivalent to those in stores or in libraries. I got a Library of Congress number too and sent them one. I see AbeBooks has '2' of them with the (*101) ISBN, so that seller will order those fake-copies on Amazon and pocket the extra.
But there is another network, Draft2Digital, which insists on no price anywhere on the book and on assigning their own ISBN (the *186). They are aggressive internationally and cater to a lot of deep-pocket impulse buyers (I guess the rich are like that the world over), so you can see your book 'listed' for 3x the price. Only when someone pays that price will the seller pay the $15 to D2D and drop ship to the customer. The no-price policy is so those people will not be annoyed to see $15 once it is in their lap.
Wow, I can only imagine how unwieldy that was to use. Was it just one big list by date published, or did they try to order by title/author/publisher? Books in Print nowadays is tens of thousands of (mostly garbage) records monthly - I imagine the volume was a lot less back then, but it still had to be a lot to sort through on paper.
eBooks and self publishing weren't quite as big a thing in 2010. I wonder what result a similar count would produce today, when one can "write" a book, have it published and listed for sale, and even printed on demand, all in a matter of hours.
Bowker (the ISBN-issuing authority in the US) tracks issued ISBNs by publishing category. At last check a few years ago, this was on the order of 300k "traditional" books (that is, produced through an established publisher) and another 1--2 million or so "nontraditional" books.
Latest report I can find is from 2013, now only available as an archive:
It's interesting to consider books published vs. total market. For the US, there is a reading population of about 300 million people (I'm presuming ~30m are either pre-reading age or nonliterate). For 300k books, that's 1,000 readers per book. (Edit: Not "100" as initially written.) The highly asymmetric long-tail dynamics of book publishing, with a small handful of titles selling 1m+ copies per year and most having far fewer sales (often largely library sales), become highly evident.
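For what it's worth, the napkin division above can be written out as a tiny Python sketch. The figures are just the rough ones quoted in this thread (about 300 million readers, about 300k traditional titles, 1 to 2 million nontraditional ones), and `readers_per_title` is a made-up helper name, not anything from Bowker's reports.

```python
# Napkin estimate of potential readers per newly published title.
# All inputs are the rough figures quoted in this thread, not measured data.

READING_POPULATION = 300_000_000   # assumed US reading population
TRADITIONAL_TITLES = 300_000       # "traditional" titles per year (rough)
NONTRADITIONAL_LOW = 1_000_000     # low end of the "nontraditional" estimate
NONTRADITIONAL_HIGH = 2_000_000    # high end of the "nontraditional" estimate


def readers_per_title(population: int, titles: int) -> float:
    """Average number of potential readers available to each title."""
    if titles <= 0:
        raise ValueError("titles must be positive")
    return population / titles


if __name__ == "__main__":
    scenarios = {
        "traditional only": TRADITIONAL_TITLES,
        "traditional + nontraditional (low)": TRADITIONAL_TITLES + NONTRADITIONAL_LOW,
        "traditional + nontraditional (high)": TRADITIONAL_TITLES + NONTRADITIONAL_HIGH,
    }
    for label, titles in scenarios.items():
        ratio = readers_per_title(READING_POPULATION, titles)
        print(f"{label}: {ratio:,.0f} potential readers per title")
```

This is only an average. It says nothing about the long tail, where a handful of titles take most of the readers and the rest get almost none.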
The US Library of Congress also publishes new additions annually as part of the Librarian's Report to Congress:
As someone who has subscribed to Bowker’s Books in Print data for the last four years, I’d take any stats based on their data with a huge grain of salt. Bowker does issue ISBNs (and the BiP data has tens of millions of them), but they do very little validation, with their data largely input by publishers often long after the ISBN has been issued and with varying standards. For example, their attempt to identify overarching “works” (i.e. The Fellowship of the Ring as a literary work vs its various editions and reprintings) across ISBNs is unusably inaccurate, even for mainstream published titles.
Also as the article mentions, ISBNs are issued for all sorts of things most people would not consider a “book”, like journals (the kind you write in, not the academic kind), coloring books, sales displays, maps, bulk lots of books for schools, box sets, reprints of Wikipedia, calendars, etc and these are not always particularly well distinguished in their data because it’s seemingly up to the publisher to categorize it correctly, and some fly-by-night Wikipedia article reseller is just not going to put in accurate data.
Maybe Bowker has data they don’t include in BiP that would make their stats believable…but I kind of doubt it. LoC seems more reliable, but their corpus is (intentionally) much smaller and more focused, and generally the books libraries care about don’t 100% overlap with “all things published that most people would consider a book” since that’s not their purpose. OpenLibrary is doing good work in this space, but it’s still kinda early and struggles with data quality. It does ultimately depend on how you define a “book”, but for my money I’d say your numbers are low, though you’re spot on that only a very small fraction of those get widely read.
300m / 300k largely just simplifies the maths. That's useful for very rough napkin calculations.
Going beyond that: for starters you can exclude children to a certain age (say 5, 10, 15 years), based on limited literacy, and adults in later years with visual and cognitive deficiencies (glaucoma, macular degeneration, dementia, other cognitive conditions). Ten-and-unders alone are about 10% of the total population: <https://www.neilsberg.com/insights/united-states-population-...>.
Then there's actual measured adult literacy rates which are ... far more sobering than you might think. At least half the U.S. adult population would struggle strongly with any modestly complex text, fiction or nonfiction:
My sense is that it's more of an "it is what it is" situation. That is, if you're operating in a domain which requires or presumes literacy, then you'll do better to have a realistic appraisal of what the reality is.
Among other factors, the level seems to be relatively consistent over time, it corresponds to other similarly nuanced measures (the OECD computer literacy survey mentioned in my linked 2021 comment, Jean Piaget's work on intellectual attainment levels, presumably based on 1950s/1960s France), and other broad measures.
The US has a strong sense of the actual literacy situation because it actually tests for this, where many other countries apparently do not, or don't publish their findings. "Highly literate" is a pride and prestige factor for many countries, and rates of 95--99% literacy (often given) likely are based on very low minimum standards.
I also suspect that there may be some negative consideration given the large immigrant / non-native-English speaking population in the US (where some of the latter is in fact native-born but in insular communities), where individuals may have literate capabilities in their native language but not English. Given that the lowest rates of adult literacy attainment are in southern border communities (most notably in the Big Bend region of Texas) this seems at least possible.
If you are highly literate and technical you're all but certainly an outlier amongst the general population, and your own immediate experience and that of those you encounter most often is probably not a generalisable one.
In the technical context I've called this the Tyranny of the Minimum Viable User, which addresses both the fact that widely-used computer interfaces must be exceedingly basic (to avoid disenfranchising the vast majority of the population) and that this means that proficient or expert users face challenges in trying to address their own complex needs on such systems unless there are ready means of extending the system capabilities to match their personal ability and needs. The tension here is absolutely innate and inevitable.
Also, if you're trying to sell books, you're selling into roughly 10--20% of the population at best, most of the time. Which is why other forms of media (music, video, games) tend to be so much more popular, in all senses of that word.
It’s a damn good thing there’s so much very, very good pre-“AI” media that one could be entertained and engaged and educated for three lifetimes with it. And that’s just the best stuff!
as the beginning of the article says: it all depends on what qualifies as a book
if you're including epub novels, that have never had a print done... then yeah, at least is gonna be the operating word, likely multiple times considering just how many fiction books are being produced on webnovel sites and then published as epubs for their fans to buy and support the author.
It also depends on what qualifies as 'published'. If a bot generates 100 nonsensical children's books with an LLM, lists them all on Amazon, and then takes them down a day later before anyone notices any of them (see https://news.ycombinator.com/item?id=40779643), then were they ever really published?