The provenance of web archives

Originally published on the UK Web Archive blog on the 20th November 2015.

Over the last few years, it’s been wonderful to see more and more researchers taking an interest in web archives. Perhaps we’re even teetering into the mainstream when a publication like Forbes carries an article digging into the gory details of how we should document our crawls in “How Much Of The Internet Does The Wayback Machine Really Archive?”

Posted: 2015-11-20

Web Archives  BUDDAH  Data Mining  Digital Preservation

Sentiment Trajectories

As a computational physicist working in a library, my background and training are quite different from those of the curators and researchers I now work with. I therefore try to spend some time following developments in the digital humanities more generally, to understand the kinds of questions being asked, the techniques being used, and the assumptions that lie beneath them.

A particularly interesting example came up recently in a blog entry called “Revealing Sentiment and Plot Arcs with the Syuzhet Package” by Matthew L. Jockers.

Posted: 2015-04-28

Visualisation  Data Mining  Digital Humanities


Fighting entropy since 1993

© Dr Andrew N. Jackson — CC-BY