Digital Preservation Librarian Corey Davis talks passion for web archiving

 Corey Davis

You are the Digital Preservation Librarian. What does a typical library day look like for you?

I start early, around 5 in the morning, and try to get a good bit of work done before the 4-year-old gets up and wants a lollipop for breakfast. From 5 to 9 (what a way to make a living!), it’s black coffee in my hastily constructed basement office/survivalist bunker, early morning Zoom calls with colleagues out East, email and Slack, checking the news, shaking my head, more coffee… you get the picture. For the rest of the day, my wife and I juggle childcare and work and life as best we can. I need to remind myself, especially when things feel a bit overwhelming, that we’re not working from home at this point, we’re at home during a pandemic trying to get some work done. By the time evening arrives, I like to have a glass of wine and catch up on email and professional reading for a few hours. As per usual, work and life are rather intertwined these days, but so far, so good!

You have written a case study regarding archiving the web. Tell us more about the challenges involved with web archiving.

I like to use the analogy of the proverbial shoe box tucked away in grandma’s attic. Letters, photos, postcards, old medals and maybe the odd cassette tape with a recording of grandpa. These things all help tell the family story, not just in words and images, but through the artifacts themselves, as things we keep and treasure across generations. Now fast forward to 2070.

What story will I be able to tell my grandchildren when they’re all grown up?

My correspondence is through Gmail, Facebook and Instagram (at least for now), and my photos are in the Apple cloud somewhere locked behind a password. My albums are a playlist on Spotify, and basically, my entire documentary presence is an ephemeral gathering of 0s and 1s mediated by an incredibly complex matrix of software and hardware, most of it controlled by large Silicon Valley corporations. Now think about the cultural record writ large. What will we leave behind as a society (and who gets to decide)? How will researchers of 2070 get access to the raw materials of history as it unfolds before us? Look at the foreword to any book written about the 1918-19 Spanish Influenza pandemic and observe the customary acknowledgments to the myriad of archivists and librarians involved. Now flip to the list of primary sources, and consider the contemporary counterparts of those analog sources. If you don’t shudder for the historians of the future trying to do similar work in 50 years on COVID-19, well then, you don’t scare easily.

You are the Co-Principal Investigator for a project funded by CANARIE. What is it about?

We’ve created a Canadian instance of DuraCloud at the University of Toronto data centre, which will enable archivists and librarians and researchers to easily deposit materials into distributed cloud storage infrastructure. It’s the kind of digital research infrastructure that might not be super sexy at first glance (or second), but it’s absolutely critical to build this kind of stuff if we want to tackle the grand challenges of digital preservation in Canada.

For the last two years, you were on secondment as the Digital Preservation Coordinator for the Council of Prairie and Pacific University Libraries (COPPUL). What were your responsibilities, and how did you contribute to COPPUL?

Working at the consortium or ‘network’ level is really interesting stuff. You get to see things from a certain altitude, and there’s significantly less bureaucracy up there, so things can move quickly. At COPPUL, I worked with people from across Canada and the world to create and extend preservation infrastructure like distributed storage, to help raise awareness of digital preservation amongst key stakeholders, and to build capacity at member institutions. All of this required spending lots of time on airplanes and in hotels. Remember those?

Currently, you are working with several organizations, namely the Canadian Research Knowledge Network (CRKN) Platform Technical Task Group; Research Data Alliance (RDA) International Indigenous Data Sovereignty Interest Group; and Research Data Alliance (RDA) Preservation e-Infrastructure Interest Group. What role do you perform while serving these groups?

Everything from setting strategic priorities to sending dreaded Doodle polls halfway across the world. Overall, though, my work at this level mostly involves making connections, building relationships, and helping create systems and services people can use to make a difference at their institutions.

What inspired you to pursue data preservation and web archiving for your career?

I was walking home from the pub about seven years ago with a good friend who manages IT for Islands Trust. I happened to start talking about digital preservation. I quickly realized, under a rather withering cross-examination, that I didn’t really know what the heck I was talking about. As a librarian, this really shook me up. So I resolved to make it my personal mission to increase my understanding of this incredibly important topic, and—if possible—to make a difference in my professional life. As I started to sink my teeth in, I realized that as a problem area, it’s endless, and endlessly interesting, and the people engaged in the field are also some of the best people you could ever work with.

What is the most common thing you have heard people say about web archiving which is not true?

Frankly, I wish web archiving was sufficiently understood for people to be misinformed about it.

Describe your library work using one word?


Read more about Corey's library work.


Interview conducted by Zehra Abrar