The eternalization of ephemera

How much of the past will you give up? How many Web Archives do we need? Should national libraries diversify this task? What about archiving photos citizens take? Or distributing Beatles tunes? Should taxpayers archive TV shows?

When ephemera go digital, how might we think of preserving some of them?

These are some of the questions asked by Lynne Brindley, head of the British Library, in The Observer today.

The Web breaks a lot… gossamer strands of hyperlinked documents, built over months and years, linkrotted in a blink. It’s wonderful to look at today’s Web, but yesterday’s Web has already been torn away, discarded.

The Internet Archive has been saving the Web for over a decade now. It works by regularly spidering, timestamping, and storing the Web. Enter a URL and you’ll see all the archived pages.
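To make the "timestamp and store" idea concrete, here is a toy sketch of an archive that keeps every capture of a URL and lists them oldest first. This is only an illustration, not how the Internet Archive is actually built: the `SnapshotArchive` class, its method names, the example URL, and the in-memory dictionary are all hypothetical, and a real archive would also need a crawler and durable storage.

```python
from collections import defaultdict
from datetime import datetime, timezone


class SnapshotArchive:
    """Toy in-memory archive: each URL maps to a list of timestamped captures.

    Hypothetical illustration only; not the Internet Archive's real design.
    """

    def __init__(self):
        self._captures = defaultdict(list)

    def store(self, url, content, captured_at):
        # Use a YYYYMMDDhhmmss stamp so that string order equals time order.
        stamp = captured_at.strftime("%Y%m%d%H%M%S")
        self._captures[url].append((stamp, content))

    def history(self, url):
        # Every capture of this URL, oldest first; empty list if never captured.
        return sorted(self._captures.get(url, []))


archive = SnapshotArchive()
# Captures can arrive in any order; history() sorts them by timestamp.
archive.store("http://example.com/", "<p>version two</p>",
              datetime(2001, 3, 24, tzinfo=timezone.utc))
archive.store("http://example.com/", "<p>version one</p>",
              datetime(1994, 5, 17, tzinfo=timezone.utc))

stamps = [stamp for stamp, _ in archive.history("http://example.com/")]
print(stamps)
```

The point of the sketch is that nothing is ever overwritten: a new capture is added next to the old ones, so the same address can be read back as it looked on different dates.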

They already collect movies and other ephemera as well as webpages… one early media archive was the Macromedia Collection, a listing of the 1990s multimedia CD-ROMs submitted to the “Made With Macromedia” program.

In 2003 the Archive’s beta search engine offered a way to trace word occurrences back through time, similar to Google Labs’ timeline view offered for webpages alive today. It would still be nice to search back through every bit of info that’s ever been part of The Web though… the Zoetrope project [PDF] at the Adobe Advanced Technology Labs shows possible interfaces for such engines. We may not have a search engine for the full timebased Web yet, but at least we’ve got one archive.

The Internet Archive is essential, but single-sourcing history is scary. This effort lives on donations. It is mirrored, but one big earthquake could halt the archiving process.

It would be very good to have multiple groups, in multiple regions, archiving as much of the Web as each can handle. It shouldn’t be left up to one group. There’s a risk of politicization if it’s all politically funded (what to monitor, what to feed to the memory hole), so diversity among archivers seems desirable. We definitely need to be able to search the Web as it was.

If the British Library can archive .UK websites during the London 2012 Games, that would be a great help.

But at bottom is the question: “How much of our digital lives should we preserve, and how much should we let just float away?” I’m definitely concerned about “Saving the Web”, and have always admired the Internet Archive folks for getting up and doing something about it.

But “think of those thousands of digital photographs that lie hidden on our computers” doesn’t seem to me as vital — the good photos usually get on the Web at some point. And “a right to the Beatles”, that’s sounding rather grandiose. Start small, archive the Web as it tears apart each day, that seems more important.

I don’t know… do you ever feel a guilty pang about not archiving a hard drive before tossing it away? How much of our bits should we keep?

3 Responses to The eternalization of ephemera

  1. Chris Brind says:

    Unless I missed it, you don’t actually explain *why* archiving the Internet is essential.
    I feel a great loss when I read about ancient monuments and paper based records being destroyed, but the web is not that. I don’t believe it was ever intended that the Internet’s primary purpose was for recording historical information. That can be the function of sites that choose to make it such.
    When does the archiving stop? And what makes one site more important than another? Would you archive the Yellow Pages, for instance?
    To me, the majority of the web is more like a phone call: a short blast of communication. It can be noted or recorded for training purposes, but usually it is transitory. Don’t tell me you record all your phone calls?

  2. Karl says:

    Meet the personal Web “Time Machine” (a metaphor borrowed from Apple’s backup system).
    When we surf the Web, we should be able to keep a local archive of what we have surfed over time. Whether that is by Web site, by Web page, or everything is a UI question. But the principle is that we should all be able to keep memories of it if we wish. The “fragility” of the Web is the lack of duplication. Imagine that, for reason X or Y, Google disappears. There was only one copy in addition to the Web site itself.
    When a library is burning or closing, it is likely that there will be other copies elsewhere in the world, in people’s homes and/or in other libraries. Usenet was saved from darkness in the past because of the way the protocol duplicates the content on different sites.
    So I want to have my own local Web archives of what I have been through. The same URI displayed through time. Visited on May 17, 1994. Visited on March 24, 2001. Etc.

  3. John Dowdell says:

    Chris, true, I wasn’t trying to convince people who don’t think the Web should be saved, or who think the Web should not be saved. Lynne Brindley made some arguments for that case. (I agree with your overall quest of figuring out what’s worth saving.)
    Karl, good point that individuals should be able to store what they see.