Ars Technica is one of the best sources anywhere for insight into technology and its ever-expanding impact. I was especially pleased that the site ran 10 separate articles about digital preservation during the past year.
Special credit goes to John Timmer, “Science Editor et Observatory moderator.” He wrote six excellent pieces on the challenge of preserving and providing meaningful access to scientific data. He treats the issue superbly, bringing it to life using his real-life experience as a genetics and biology laboratory researcher.
Timmer put together a three-part series on scientific data preservation. Part I: Preserving science: what to do with raw research material? refers to the recent fuss about the UK’s Climatic Research Unit, particularly its messy data management. “Poorly commented computer code. Data scattered among files with difficult-to-fathom formats… But the chaos, confused record keeping, and data that’s gone missing-in-action sounded unfortunately familiar to many researchers, who could often supply an anecdote that started with the phrase ‘if you think that’s bad…’”
In Part II: Preserving science: what data do we keep? What do we discard?, he tackles one of the most sensitive—and vexing—issues out there. “The reality is that we simply can’t save everything. And, as a result, scientists have to fall back on judgment calls, both professional and otherwise, in determining what to keep and how to keep it.”
The inescapable matter of digital media obsolescence is considered in Part III: Jaz drives, spiral notebooks, and SCSI: how we lose scientific data. “Over the course of my research career, archiving involved magneto-optical disks, a flirtation with Zip and Jaz drives (which ended when some data was lost by said drives), a return to big magneto-optical disks, and then a shift to CDs and DVDs. Interfaces also went from SCSI to Firewire to USB. Anything that wasn’t carefully moved forward to the new formats was simply left behind.”
Timmer also weighed in on Changing software, hardware a nightmare for tracking scientific data. “My work relied on desktop software packages that were discontinued, along with plenty of incompatible file formats. The key message is that, for even careful researchers, forces beyond their control can eliminate any chance of reproducing computerized analyses, sometimes within a matter of months.”
How science funding is putting scientific data at risk highlighted the stark reality that adequate money is all too frequently not provided to maintain important data. Keeping computers from ending science’s reproducibility explores a huge barrier that gets in the way of confirming research results. “Traditional science involves a complex pipeline of software tools; reproducing it will require version control for both software and data, along with careful documentation of the precise parameters used at every step.” But “this work may run up against the issues of data preservation, as older information may reside on media that’s no longer supported or in file formats that are difficult to read.”
Ars ran two articles about preserving video games. The first, Preserving games comes with legal, technical problems, referred to a paper in the International Journal of Digital Curation, Keeping the Game Alive: Evaluating Strategies for the Preservation of Console Video Games. “Hardware becomes outdated and the media that houses game code becomes obsolete, not to mention the legal issues with emulation.”
The final two articles focused on Library of Congress actions (full disclosure: I work with the Library’s digital preservation team). Why the Library of Congress cares about archiving our tweets delved into the huge interest that flowed from the Library’s announcement about acquiring the Twitter archives. Historic audio at risk, thanks to bad copyright laws discussed a report from the National Recording Preservation Board about problems preserving the complex digital formats that underlie much of today’s music.
Let’s hope that Ars continues its coverage of digital preservation into 2011. There is quite a bit to talk about.