DEAR READER: Digital archives don't last: A tale of corruption and crashes

Friday, April 15, 2011 | 2:56 p.m. CDT; updated 5:05 p.m. CDT, Tuesday, April 19, 2011
Digital archive consultant Vicky McCargar used this corrupted photo as an example of how digital information does not necessarily last forever.

Dear Reader,

There’s an information-keeping crisis out there.


Digital information has a short shelf life. All that news, especially the photos and articles that bypassed a print edition, isn’t safe. Software systems become obsolete. Files become corrupted. Hard drives become worn out.

How long is that photo of your trip to the beach safe on your CD? If you said 10 years, you’d probably be wrong.

Frederick Zarndt, an entrepreneur and digital archives expert, told me you’re pushing your luck if you assume even five years. (Like all things, though, it depends: Record on a high quality CD, keep it stored just so, don’t use it and you could get 50 years. But who does that?)

How about those perfectly preserved newspaper pages that have been digitally fossilized? They’re usually stored on hard drives, which can wear out quicker than your grandmother’s underwear.  

Earlier this week, I sat in with a group of archivists, scholars, vendors and newspaper editors at Reynolds Journalism Institute. These were smart people who clearly had spent a long time thinking about the myriad issues related to preserving our nation’s newspapers.

Digital archive consultant Vicky McCargar found the above photo decaying in the e-files of the Los Angeles Times. It looks like an interpretive illustration, but it’s actually a documentary photo, one that has been corrupted over time.

That piece of history could be gone forever if the original negative wasn’t saved.

And what’s a newspaper to do with all those negatives and prints?

The Daily Oklahoman has 1.6 million photos, a treasure trove inaccessible to people because most of it isn’t digitized.

The format with the longest shelf life is microfilm. Zarndt says it can last 500 years. It’s more accessible, but hardly easy to use.

Other problems:

  • Print archives of out-of-business newspapers. After they die, the archives might go to a local library. They might go to a private collector. Or they might go to the trash.
  • What's an "edition" on an online news site? When news can be updated continuously, what is the right moment in time?
  • Technologies change. For instance: When the Missourian migrated from one content management system to another, three years (2001-2004) of digital articles were corrupted. Stories weren't lost, but all the paragraph marks flew away. Missourian librarians had to put them back in, by hand.
  • Another local example: Massive failure. Missourian librarian Nina Johnson says the files from 1986 to 2002 were lost because of a server crash.
  • Digital news is now a participatory activity, with professional organizations, eyewitnesses and citizens adding to the combined knowledge. What should be saved? And who should save it?

Some people dedicate their careers to trying to prevent the loss of images.

“Even if we can capture a bucket, we can never capture it all,” said the Library of Congress’s Martha Anderson.

She’s trying. Anderson is the director of program management for the National Digital Information Infrastructure and Preservation Program. That’s a fancy title for an effort to preserve at-risk content.

Fixes all cost money. Don’t look for publishers to pay unless there’s an economic incentive.

How much will it cost?

If a single newspaper page costs 50 cents to convert to microfilm and preserve, a Tuesday Missourian would cost $8 to $12, including advertising supplements. Not so bad, right?

Not until you do the math.  Five or six days a week, 52 weeks a year, 103 years …

And that’s just for the print edition.
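
For readers who want to do that math, here is a rough sketch in Python. It assumes every edition costs the same $8 to $12 as a Tuesday paper and that the paper came out five or six days a week for all 103 years. Neither assumption comes from the archivists, and the function name `archive_cost` is made up for illustration.

```python
# Back-of-the-envelope estimate using only the figures quoted in the column.
# Assumption (not from the source): every edition costs what a Tuesday paper costs.
COST_PER_PAGE = 0.50              # dollars to microfilm and preserve one page
ISSUE_COST_RANGE = (8.00, 12.00)  # dollars per Tuesday edition, low and high
DAYS_PER_WEEK_RANGE = (5, 6)      # "five or six days a week"
WEEKS_PER_YEAR = 52
YEARS = 103


def archive_cost(issue_cost, days_per_week, weeks=WEEKS_PER_YEAR, years=YEARS):
    """Total cost of filming every edition at a flat per-issue price."""
    return issue_cost * days_per_week * weeks * years


# Pages implied by the per-issue price at 50 cents a page.
pages = tuple(int(cost / COST_PER_PAGE) for cost in ISSUE_COST_RANGE)

# Cheapest case: $8 issues, five days a week. Costliest: $12 issues, six days.
low = archive_cost(ISSUE_COST_RANGE[0], DAYS_PER_WEEK_RANGE[0])
high = archive_cost(ISSUE_COST_RANGE[1], DAYS_PER_WEEK_RANGE[1])

print(f"Pages per issue: {pages[0]} to {pages[1]}")
print(f"Print archive: ${low:,.0f} to ${high:,.0f}")
```

Under those assumptions, the bill for the print edition alone lands somewhere between roughly $214,000 and $386,000.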

By midday Monday, I was ready to call in the dogs and call it a day. It all seemed too complicated and costly. But this was a curiously optimistic bunch. No one was backing away.

That’s a good thing for you and me.




Michael Williams April 16, 2011 | 9:13 a.m.

Interesting to compare the gist of this article to how long the Dead Sea scrolls lasted.

Or Egyptian/Mayan hieroglyphics. Cuneiform, anyone?

We can store incredibly more massive amounts of data.....for shorter lengths of time. Progress???

I say, "BOOKS, NOT HARD DRIVES!!!!!!!!"

(somewhat tongue-in-cheek.....kinda-sorta...maybe...hmmmm)

Michael Williams April 16, 2011 | 9:49 a.m.

A few more thoughts.......

There is no doubt that digitization of information allows soooo much more than we had before data could be converted into zeros and ones. Heck, in writing my previous post, I couldn't remember how to spell hieroglyphics. So, I googled the word with a gross misspelling, and my trusty computer came back with, "Hey, dumbass...did you mean 'hieroglyphics'?" I can take a lot of software abuse when time and laziness are at issue.

But, my well-worn dictionary on the shelf by my side didn't get...well...any more well-worn.

When was the last time a student in high school or college actually had to read a book chapter to find that one single tidbit of needed information? Now, we simply "google", find a sentence that seems to provide the answer and totally lose context and the wealth of information that comes from simply "perusing" stuff. I wish I knew how often that single "tidbit" got me through a time-dependent effort, but the other 99% of information I really wasn't looking for came back to help me years in the future.

Are our brains REALLY sponges when we selectively feed them only the stuff we need to know right now? Well...yes...I guess even a damp sponge is just as much of a sponge as a really wet one. It's just not using all its capacity.

Sorry...soapbox time. I'm off-topic, so I'll quit.


Tom Warhover April 17, 2011 | 8:44 a.m.

It gets worse, Michael. Even paper doesn't last as long anymore. Zarndt told me that newsprint today is much more acidic than, say, a hundred years ago. It breaks down quicker. Books, too.

Inscriptions on stone, to the best of my knowledge, still last a good while. They hardly fold well though.

Ellis Smith April 17, 2011 | 11:27 a.m.

Which is more important, the medium or the information printed upon, or inscribed upon, or digitally displayed upon the medium (in the third case, read "screen")? I have my answer, what's yours?

Michael, you've given me a trip down memory lane. Thank you. In 6th grade public school in Miss Weisbrod's class each student had a small dictionary furnished by the school (all copies were the same). We would have "dictionary drills." Miss Weisbrod would call out a word, and there would be a race among all students to locate the word in their dictionaries. Miss Weisbrod didn't give the students the word's spelling, either. The student who first located the word was then to spell it and give its definition to the rest of the class.

Michael Williams April 17, 2011 | 11:59 a.m.

Tom: True, stone doesn't fold well.

And can you imagine the chiropractor bills for our little rugrats carrying around all those heavy backpacks? They'd have strong legs, tho.

Ellis: I dunno....medium or the info? I can make a good case that if the ancients had better media for writing stuff down, we'd know one heckuva lot more about them. In the end, I think the next huge job opportunity for a couple of million gov't workers is to archive our archives that were archived back when they were first archived.

(PS: Heck, the medium AND the info get lost. Do you have any 6" floppies lying around? That you forgot to transfer over to 3.5" and then CDs? Bet you don't have the hardware to read them even if you DID have the floppies. Like I said....archive our archives of the original archives. Wow, talk about the electronics industry and their "consumables" racket! It even slops over into our schools and homes and businesses....we need new computers to load the new software that will be upgraded to more megabytes necessitating new computers. I was in the wrong damn business.)

CHALKBOARDS ARE FOREVER! Send the school computers to the junkyard and save MILLIONS IN TAXES EACH YEAR!!!!!!!!

Tom Warhover April 19, 2011 | 2:59 p.m.

Forbes cited the same archives conference in a column published yesterday. It has more about business models, or the lack of them:
