Over the last few years, it’s been wonderful to see more and more researchers taking an interest in web archives. Perhaps we’re even teetering into the mainstream when a publication like Forbes carries an article digging into the gory details of how we should document our crawls in “How Much Of The Internet Does The Wayback Machine Really Archive?”
A few months ago, a colleague suggested that we should come up with ways of helping people learn about the main stages of web archiving and understand some of the more common technical terminology.
As a computational physicist working in a library, my background and training are quite different to those of the curators and researchers I now work with. So I try to spend some time following developments in the digital humanities more generally, trying to understand the kinds of questions being asked, the techniques being used, and the assumptions that lie beneath.
I gave the following presentation at the 2015 IIPC GA. If you prefer, you can read the rough script with slides (below the fold) rather than watch the video.
Following Vint Cerf’s talk at AAAS, the “Digital Dark Age” is in the news again (see DSHR’s blog for a good summary, or one of the ~200 other news articles about it!). The coverage spun me into a Twitter rant (documented here), but after reflecting on my reaction, I feel it’s worth exploring the issues in a bit more detail…