Reflections on the IIPC Web Archiving Conference 2023

Posted: 2023-06-20 | From the Failed-To-Write-A-Full-Conference-Report department.

First published on the UKWA blog

My main goal for the conference was to support the adoption and development of shared open source tools. I’ve been involved in the IIPC project Browser-based crawling for all, and at the conference I helped run a workshop where attendees could start exploring Browsertrix Cloud and give feedback to the project and to Webrecorder. There were some initial problems with the capacity of the demo system, but these were quickly resolved, and the workshop was a success that provided useful feedback for future work.

Photograph taken during the Browsertrix Cloud workshop

I also ended up chairing the SolrWayback session, which showed many great examples of how that search interface and the underlying indexing tools (developed by UKWA) have been used by different web archives to help explore and analyse their collections. It’s heartening to see more and more web archives doing this kind of thing.

There were a lot of good presentations and discussions around tools, but I’d particularly like to recommend that you all check out Warchaeology by the National Library of Norway - Web Archive, and Scoop by the Harvard Library Innovation Laboratory.

Both the Scoop presentation and the Bellingcat keynote provided important insights into what it takes for web archives to be legally admissible evidence (see also, for example, this post about Scoop and this post from Bellingcat). There are interesting questions here about our tools and workflows, like whether the WARC or WACZ formats are sufficient in their current form, and whether there are opportunities for deeper collaboration across the domains of cultural heritage, law, and open source investigation.

Finally, across a number of presentations, the conference also raised questions about the current and future role of cultural heritage institutions. Are our approaches to information literacy fit for an age of fake news and ChatGPT pollution? Is there something libraries and archives can learn from how Bellingcat and fact checkers like Full Fact are helping people find reliable information and avoid conspiracy theories? Can web archives do more to fight disinformation? I look forward to seeing more about this at future conferences!




Posted: 2023-06-20 | anj



© Dr Andrew N. Jackson — CC-BY