File formats and the software capable of reading them are living longer than previously thought, according to a British Library and UK Web Archive study.
Formats over Time: Exploring UK Web History (PDF, slides as PDF) considers 2.5 billion files that author Andrew N Jackson retrieved with the help of the Internet Archive and the Joint Information Systems Committee (JISC). All the files come from the UK web domain and date from between 1996 and 2010.
Jackson used Apache Tika and PRONOM’s DROID tool to inspect the files and determine their formats. Central to the research was Jeff Rothenberg’s 1997 prediction that “Digital Information Lasts Forever, Or Five Years, Whichever Comes First.” Jackson is also keen on a rebuttal from David Rosenthal, whom he quotes as saying “when challenged, proponents of [format migration strategies] have failed to identify even one format in wide use when Rothenberg [made that assertion] that has gone obsolete in the intervening decade and a half.”
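To give a rough sense of what identifying a file's format involves, here is a minimal Python sketch that guesses a format from the first bytes of a file and tallies the results by year. It is not Tika, DROID, or Jackson's method: the signature table is a tiny hand-picked set of well-known magic numbers, and the names `SIGNATURES`, `sniff_format` and `tally_by_year` are hypothetical, chosen only for illustration. Real tools use far larger signature registries and more checks than a prefix match.

```python
from collections import Counter

# Illustrative only: a handful of well-known leading byte signatures.
# Real identification tools consult much larger registries.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_format(data: bytes) -> str:
    """Guess a MIME type from the leading bytes, with a generic fallback."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return "application/octet-stream"


def tally_by_year(samples):
    """Count (year, format) pairs for an iterable of (year, bytes) samples."""
    counts = Counter()
    for year, data in samples:
        counts[(year, sniff_format(data))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data, not drawn from the study.
    samples = [
        (1996, b"GIF87a" + b"\x00" * 10),
        (2010, b"%PDF-1.4\n"),
        (2010, b"\x89PNG\r\n\x1a\n" + b"\x00" * 8),
    ]
    for (year, mime), n in sorted(tally_by_year(samples).items()):
        print(year, mime, n)
```

Counting formats per year in this way is the kind of tally that lets one see whether a format's use fades over time.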
Jackson’s take is that file formats seem to last rather longer than five years, even if they don’t survive forever (via redwolf.newsvine.com)