"Guillotining books is unnecessary in order to acquire a good image". I don't know. I remember fondly a copier with a rounded screen at some university library years ago (University of Colorado?) where one could plunk down a bound volume and get a good copy without damaging the spine. I did foolishly buy a book--specially ordered--which turned out to have been printed from page images taken by copying on a flat-platen copier. Better than fifty percent of the pages lost anything from a letter to several letters in the "gutter".
"based on a pseudoscientific notion that books on wood-pulp paper are quickly turning to dust". Some are, some aren't. Some years back my parents shipped me the set of Mark Twain I had read as a boy. An awful lot of the pages were falling apart. Perhaps our family hadn't protected them well enough from light and air--but it's hard to see how one can read a book without exposure to both.
We recycle probably 10" of newspapers--The Washington Post and The New York Times--every week. In the days before the newspapers lost all the revenue from classified ads, it would have been twice that. So a year of a big metropolitan paper from the 1990s could amount to forty feet of newspaper. That's a fair bit to make room for. I have some sympathy for the librarians there.
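For what it's worth, here is that back-of-the-envelope estimate spelled out. It is only a sketch: it assumes the 10" weekly stack covers both papers, that the 1990s stack was exactly twice as thick, and that the two papers were about the same size.

```python
INCHES_PER_FOOT = 12
WEEKS_PER_YEAR = 52

# Assumption: today's 10" weekly stack is two papers combined.
stack_today_two_papers_in = 10
# Assumption: the pre-classified-collapse stack was twice as thick.
stack_1990s_two_papers_in = 2 * stack_today_two_papers_in
# Assumption: the two papers contribute equally to the stack.
one_paper_1990s_in_per_week = stack_1990s_two_papers_in / 2

feet_per_year = one_paper_1990s_in_per_week * WEEKS_PER_YEAR / INCHES_PER_FOOT
print(f"{feet_per_year:.1f} feet of newspaper per paper per year")
```

That lands a little over forty feet per paper per year, in line with the figure above.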
We talk a lot about archiving stuff - great, but I hardly see the same emphasis on finding things. Search is equally important because that's where the rubber meets the road: it's where the actual utility of an archive is realized and the benefits of long-term archives are reaped. Of course, there are queries such as "Find all newspapers for July 26, 1931" if something important had happened on that day - this would be an O(1) query for a library if they've indexed newspapers by date. Finding all mentions of "Bankruptcy of Cooks Mills in Lawson, Texas in 1931" is a much harder problem if the archive is physical storage. I'd be OK with converting all newspaper archives to a searchable digital format.
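To make the contrast concrete, here is a minimal sketch of the two kinds of lookup once the text is digitized. The three "issues" are invented placeholder data, and `by_date`, `inverted`, and `search` are hypothetical names; a real archive would also need OCR cleanup, stemming, ranking, and so on.

```python
import re
from collections import defaultdict

# Hypothetical toy archive: (date, paper, text). Contents are made up.
ISSUES = [
    ("1931-07-26", "Lawson Daily", "Cooks Mills files for bankruptcy in Lawson, Texas"),
    ("1931-07-26", "Texas Herald", "Cotton prices fall again"),
    ("1931-07-27", "Lawson Daily", "Creditors meet after mill bankruptcy"),
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

by_date = defaultdict(list)   # date -> issue ids (the "shelved by date" index)
inverted = defaultdict(set)   # word -> issue ids (the full-text index)
for issue_id, (date, paper, text) in enumerate(ISSUES):
    by_date[date].append(issue_id)
    for word in tokenize(text):
        inverted[word].add(issue_id)

def search(query):
    """Return the ids of issues whose text contains every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(inverted.get(words[0], set()))
    for word in words[1:]:
        result &= inverted.get(word, set())
    return result

print(by_date["1931-07-26"])          # date lookup: one dictionary access
print(search("bankruptcy Lawson"))    # content lookup: needs the inverted index
```

The date lookup works with nothing more than a shelf order; the content lookup only works because every word was indexed up front, which is exactly what a physical stack can't offer.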
How much of the physical version should be preserved? For how long?
Singletons are probably not enough. But why save the physical aspects? Why not simply digitize people discussing it, take pictures of the original form, etc.? Why invest the time and energy maintaining items of dubious future need?
at least a few of everything forever is a good starting point, as we move more and more into digital spaces it will become even more feasible to archive physical copies when they exist
(with "everything" I also include Walmart leaflets)
a copy of a daily newspaper from Los Angeles in 1937 is not the same as the Gutenberg Bible.
Newspapers were meant to be disposable, then and now. It's good to store the information, but I'm highly skeptical of the need to hold onto the physical copies after scanning.
I think it's worthwhile to keep a few foundational books around, especially those that set a language or were otherwise milestones in technology and cultural progress.
We should not be hoarders. There is a balance to be struck.
I recommend considering that folks can be multifaceted and aren't solely bound to one opposing opinion as you read down a comment chain.
My understanding is that there are specialized copy devices that can do a decent job of bound volumes. But to the basic point...
Paper takes up a lot of space. You can't keep physical copies of everything forever. It is unfortunate that it seems as if film and poor quality scans replaced physical copies in some cases too early. But libraries do run out of space, especially well-indexed space.
I suggest you walk past a paper mill sometime to truly understand how environmentally horrific they are. The one up near me finally paid to put scrubber systems on their air exhaust. Before then, when the wind was blowing the right direction, it made all of North Portland smell like rotting cabbage. Even with the scrubbers you still smell it at the airport occasionally.
Another fun detail is that the lot the workers park in has a drive-through car wash/rinse device. The employees use it every shift when they leave... or at least they should if they don't want the acid fumes to eat the paint off their cars.
Paper mills cause far more pollution than any potential carbon sequestration benefit, which would be temporary in the long term anyhow. Books will rot unless you put them in a controlled atmosphere.
I don’t know nearly enough about the process to challenge your point, but I want to point out that just because something smells bad or produces acid in huge quantities doesn’t mean it’s permanently bad for the environment like unsequestered carbon emissions.
The dose makes the poison and pretty much everything is toxic to a local environment at industrial scales. If the acid is stripping paint from cars faster than parking them on the beach would, you can bet it's a problem for the vast majority of an ecosystem that evolved alongside a freshwater river.
I'm not suggesting we make extra paper just for the purpose of sequestering carbon. That would obviously be foolish. All I'm saying is that if we are going to keep printing newspapers anyway (and at the moment we are) and the paper that goes into them is a sunk cost, then it might make sense to keep them around both for their historical value and as a carbon sink.
But my tongue was part-way in my cheek when I suggested this because you'd have to warehouse an awful lot of newspapers to make a dent in the carbon problem.
Actually, that's a specific thing that does not work (at least long term). Trees die and rot if not cut down, and the carbon from that process has to be absorbed by the younger trees growing in the same space. So once a (mature) forest exists at a given location, it will no longer absorb any more carbon unless some of the trees are cut down and used for something that doesn't allow their carbon content to return to the atmosphere.
The issue is that in America most books are printed on lower-quality paper to reduce costs. This kind of paper decays much faster. To see what better-quality books look like, look at high-end academic publications, which are still produced to be stored in libraries for a long time.
Pulp paper is largely an artefact of the 19th and 20th centuries: cheaper than rag, but also far less durable. It remains in use and is suitable for ephemeral content, but materials meant for long duration are created with archival-quality paper, inks, and other treatments.
Anyone that has taken Edward Tufte's[0] Data Visualization course (I don't know if he gives it anymore), has seen his favorite props, which are a couple of copies of Euclid's Geometry. Supposedly, these are real, 400-year-old books, printed by a 17th-century mathematician.
He likes them, because of pop-out examples of geometric shapes.
I think they are more modern reproductions, but I could be wrong. They are in great shape.
This review implies that the main use case of microfilm transfer was to free up shelf space by destroying books and putting them on microfilm. To my knowledge, that is not the case: the main use case was for periodicals like newspapers, magazines, and journals, not books.
And certainly, microfilm was not intended to free up shelf space in the library: most items in a library's collection aren't on the shelves, they're stored in warehouses and basements.
Paying to store miles of old newspapers was not sustainable for most library systems, and despite what this review implies, newsprint does not last forever. The alternative to microfilm was not that we would have pristine copies of our history easily accessible to every library patron, it was that most copies of a periodical would simply be destroyed.
The remit of public libraries is not to preserve books, but to provide access to information. There's a crucial difference. Microfilm was not ideal, but in a pre-digitization world it was about the best option available to provide access to those old periodicals that would otherwise have been lost, or at least hard to access.
The point of the book is that for many publications not even one library opted to keep physical copies, and that when The New York Times offered cloth paper for libraries (the same paper used in centuries-old books) they refused.
Overall, it turned out paper has a longer shelf life than microfilm.
Yes, in fact, copying for reasons of preservation and research is one of the few copyright exceptions that libraries actually have as opposed to the far-ranging get out of jail free card that many people seem to think they have.
There are numerous cases of media being preserved strictly through unauthorised copying, falling well outside even the broadest interpretation of any library exception.
The Google Books project would be a case in point, as well as (AFAIU) the early stages of the Internet Archive's book-scanning project. Numerous early television programmes were only preserved through off-the-air copies by viewers. Up through the early 1970s, television news programmes did not generally preserve broadcasts, or even have a research library (see Epstein's News from Nowhere, 1973).
I'm not sure where the Vanderbilt Television News Archive (dating to August 5, 1968) originated (Wikipedia claims the Nashville affiliate provided recordings), though I recall a case in which a member of the public, Marion Stokes, recorded nightly news broadcasts for decades.
Note that this is not written by Scott Alexander --- it's part of his "your book review" series in which readers submit book reviews and he invites his audience to read them and vote on them.
The irony... that you're reading some brutally simplified, crude, static, easily-archived HTML text whose fulltext copy has also been mailed out to thousands of readers[1] and archived in their personal mail readers and computers? There are a lot of bad things one could say, and I have said, about Substack's technology, but "it will linkrot and can't be archived" is not one of them.
[1] ACX is #2 on the Technology Substack leaderboard so it has 'thousands'; since my own newsletter is ~6k and Scott is many times more popular than I am in our circles, I'd guess the number of email copies of this ACX post is closer to 50k than 1k.
You made me check the HTML of Substack, and you're right, once you find the "article" tag it is precisely this, the article. In addition to everything you said, making a Substack scraper to extract precisely the article would be trivial.
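Something along these lines, as a rough sketch using only the Python standard library. It assumes the post body sits inside an `<article>` element as described above; the class and function names are made up, the sample markup is invented rather than copied from Substack, and real pages may need more care.

```python
from html.parser import HTMLParser

class ArticleText(HTMLParser):
    """Collect the text found inside <article> elements, ignoring scripts and styles."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <article> elements we are currently inside
        self.skip = 0     # how many <script>/<style> elements we are inside
        self.chunks = []  # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1
        elif self.depth and tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth:
            self.depth -= 1
        elif tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        if self.depth and not self.skip and data.strip():
            self.chunks.append(data.strip())

def extract_article(html):
    """Return the article text, one text fragment per line."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)

# Invented sample page, not real Substack markup.
sample = (
    "<html><body><nav>Subscribe</nav>"
    "<article><h1>Title</h1><p>First paragraph.</p>"
    "<script>var x = 1;</script></article>"
    "<footer>Share</footer></body></html>"
)
print(extract_article(sample))
```

Fetching the page and saving the result are left out on purpose; the point is just that the navigation, footer, and scripts fall away once you key on the one tag.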
I'm not sure why the downvotes. I see at https://en.wikipedia.org/wiki/Link_rot for instance that links have a half-life of two years. Of course, links differ from content, but the point is that online content is ephemeral (the wonderful Wayback Machine notwithstanding).
I know we as HN readers have a tendency to look in the comments for a TL;DR, but this time, do yourself a favor and read the article first. You won't find anything below that wasn't already covered, with more nuance and historical context, in the original linked piece.
We as a society need to become more comfortable with the destruction of information. Not every piece of past data is valuable. The hypothetical future value of most past data is less than the cost of preserving it.
One advantage of keeping everything is that it's a "neutral" approach. Especially with our history as a species, it's easy to think that we should preserve everything to protect us against certain groups of people or ways of thinking that would erase some part of history.
If there's someone very interested in archiving a given set of information I'm all for it.
Where I think we collectively have an issue is when we're super into someone else archiving it. And that they then need to figure out a way to finance it all.
It can be hard to know what should be kept and be tossed. And reasonable people can disagree about which applies. But it's certainly a fair point that we probably don't want to preserve every byte of ephemera forever.
The argument that I hear often is that people in 200/300 years would precisely love that kind of thing. My history book in high school was filled with things like that. Articles from journals, letters from soldiers sent to their families. I think we don't need to keep every single high school newspaper, but I wouldn't know where to start if I had to choose which ones to keep.
"We started dumping stuff that we thought was obviously of no future use, groups that specialized in a lot of talk and no substance, so to speak. For example, fairly early on there was a newsgroup about abortion which specialized in violent arguments."
That's why not only the very earliest Usenet posts, before Spencer started archiving in 1981 (Usenet began in 1979) but even some of the posts in the 1980s are still lost. It's too bad; today, wouldn't more of us rather see what was being said about abortion in 1984 than sift through the arcana of bug fixes in systems that have probably been long since retired? "It was perfectly reasonable from the viewpoint of stuff that we might want to use again, but a little sad from today's viewpoint," Spencer admits.
--------
One of the challenges is that, even if storage comes close to free, the management of "everything" isn't. I've been going through this with cleaning up my photo library. It's not that a TB or so of storage matters much one way or the other but getting rid of dupes and near-dupes, adding better metadata, and so forth is a lot of work.
I agree, it will take a lot of time and energy to sift through this data to find things of value. But on the other hand, it's not even a fraction of what would be necessary if the data was lost.
But whose time and energy? I'm not personally going to spend hundreds of hours scanning and organizing stuff because someone someday might get some value out of it. Maybe a few things I find especially noteworthy or that I myself might want a digital copy of. But not in general.
Researchers mostly. They are the ones currently going through stuff from the Roman Empire through the '80s to try to understand history better. You can see a thread on HN that is about what people ate in the Roman Empire: https://news.ycombinator.com/item?id=27310179. I also remember a YouTube channel called "Depression Cooking" where an old lady explained what they ate and how they lived during the Great Depression. You can argue about the value of knowing what Romans or people during the Great Depression ate, but our society seems to value knowing the past.
My point is that future historians and researchers can spend however much time and energy they want. But I mostly will not go out of my way to assist them. Not out of any active desire not to help them but because I'm mostly not willing to put a lot of time/effort into it.
This is perilously close to the mental illness that results in hoarding behavior.
A friend of my mom's saved every bank statement, utility bill, and cancelled check (this was when you actually got your cancelled checks back from the bank every month).
She thought her kids "might be interested in looking at it" some day. Of course they weren't and when she died they threw it all in a dumpster.
Newspapers and books are one thing. Every random photo on your phone and piece of paper that comes through your home is quite another. Nobody is interested in it. Throw it away.
I don't understand your comment. We were precisely talking about "newspapers and books", which can have historical value, and you present an anecdote about a friend of your mom who was a hoarder, while calling data preservation "perilously close to the mental illness that results in hoarding behavior". Do you not understand the difference between keeping newspapers, books, photos and "every random [...] piece of paper that comes through your house"?
> Nobody is interested in it. Throw it away.
How about you focus on how to live your life and I'll handle how to live mine?
A ton of modern historical research comes from trawling through very boring old records. Import/export financial records, church tithe ledgers, birth lists, etc.
You might be surprised what a researcher in 2060 will get out of your high school newspaper (especially as the ability to computationally aggregate old digitized primary records continues to advance).
I've actually been doing some scanning because I picked up a large format scanner for a non-profit I'm on the board of but haven't been able to arrange a handoff.
The problem I find is that I have a ton of stuff that's neither "must keep" nor "trash can" but, while I don't really begrudge it the space in my house, for now, it's also not stuff I'm going to spend hours and hours scanning against some future want.
> need to become more comfortable with the destruction of information.
I've come to observe that we're much too dominated by a clerkish worldview - the imperial Chinese civil service exams (sold in the West as meritocracy) have mostly taken over the world.
And what file clerk does not shudder at the thought of not keeping archives?
"based on a pseudoscientific notion that books on wood-pulp paper are quickly turning to dust". Some are, some aren't. Some years back my parents shipped me the set of Mark Twain I had read as a boy. An awful lot of the pages were falling apart. Perhaps our family hadn't protected them well enough from light and air--but it's hard to see how one can read a book without exposure to both.