I attended the Personal Digital Archiving conference in San Francisco last week. Some of the usual suspects in the world of digital preservation were there, most of whom (myself included) are affiliated with institutions.
But there were also a few rugged individuals who, out of passion or some other impulse, are working alone to collect digital content.
These lone preservers deserve our thanks. Future users will thank them even more.
Most big collecting institutions–libraries, archives and museums–have yet to fully turn their attention to digital content, most especially born-digital material. The problems, wildly generalized, are fundamental:
- Resource demands for managing traditional, non-digital holdings remain substantial.
- New resources are hard to come by, and prospects for cuts loom.
- Digital content is new and trendy, and may seem frivolous; it is hard to know which of it merits saving.
- Many–most?–staff have spent careers apart from digital material and are not eager to deal with it.
- Many–most?–institutions have limited technological capacity or infrastructure to manage digital holdings.
Individuals acting on their own are free from these concerns. They don’t have big legacy collections to worry about. They don’t have to defend their actions to overseers. It’s easy to get cheap technology to do the job.
The prime example of the lone collector is Brewster Kahle of the Internet Archive, which hosted PDA 2011. Kahle and his helpers had web archiving to themselves for the first few years, when there was plenty of skepticism about the value of the content. Around 2000, some institutions began to selectively capture websites, often working in concert with the IA. Today, large-scale web capture is underway around the world: there are now over 30 national libraries and other entities devoted to the job.
Jason Scott spoke at the conference. Scott, proprietor of textfiles.com and collector of “marginalized data, the textfiles and message bases of dial-up bulletin board systems of the 1970s, 80s, and 90s,” is a self-described “tiring activist.” He said that much digital information was at risk, facing a “danger of deletion, a danger of being lost, a danger that a piece of history, with its value unrecognized and a lack of interest in what it might mean, might just be lost forever.”
Scott talked about a recent project to download a copy of the websites formerly housed on the Geocities web hosting service. He passionately defended the value of this information against “the current natural order of things for hosting user-generated content [which] is this: Disenfranchise. Demean. Delete.” Scott also advocated individual responsibility for one’s own personal content. “Go to your own computer, plug in a USB stick and copy your documents folder, because that’s the only thing that nobody’s going to be able to save.”
All of this leads me to speculate that, when it comes to digital content, our culture is reverting to an era when we depended on high-minded individuals to build singular collections of art, books, manuscripts and other documentary material. The survival of much important information is due solely to individual initiative, as its true value only became apparent years later.
Perhaps it is appropriate that the era of user-generated content also includes the return of the heroic private collector. A twist is that the heroics are now scalable. The far end of the scale has people like Kahle and Scott. At the near end are everyday people who do their best to keep family photos and the occasional email.
Libraries, archives and museums, of course, still have a major role to play. If history is a guide, they will eventually assume stewardship responsibility for some private digital collections, and they will also expand their own curatorial interests into this realm.





