AI and archives: Current challenges and prospects of digital and born-digital archives

The Archives in the UK/Republic of Ireland & AI (AURA) network, which is funded by the Arts and Humanities Research Council (AHRC) and the Irish Research Council, aims to discuss how to use artificial intelligence (AI) to unlock cultural heritage archives and explore the challenges around access. The network focuses on three major themes, each investigated in its own workshop: ‘Open Data versus Privacy’, ‘AI and Archives: Current Challenges and Prospects of Born-digital archives’ and ‘AI and Archives: What comes next?’

Last month, The National Archives and the British Library partnered for the second workshop, which explored the current challenges and prospects of born-digital archives. The first day of the workshop, held on 28 January, was organised by The National Archives, and the second, on the following day, was organised by the British Library (read more about the second day here). Both days were held virtually.

The first day’s workshop started with two presentations. Catherine Elliott, Head of Digital Services at The National Archives, presented her team’s work on ‘Transforming how our users engage with the archive online’, which explores what we would create if we were to re-format our website, as part of our new strategy, Archives for Everyone (for more information on this project, please visit this page). Bernard Ogden, Research Software Engineer at The National Archives, and Lora Angelova, Head of Conservation: Research & Audience Development at The National Archives, jointly presented their work on the Deep Discoveries project, which is a foundation project within the AHRC-funded Towards a National Collection programme. Their presentation, entitled ‘Towards computer vision search and discovery of our national collection’, explored challenges and prospects in accessing image collections.

After their talks, the speakers proposed two questions or challenges that the group could discuss in smaller break-out rooms. The questions that came out of the morning session were:

  1. How might we use service design and data to encourage smart assistants to present primary sources to users? (suggested by Catherine).
  2. How should user interfaces for search/discovery of visual collections in digital archives (or visual archives) look? (jointly suggested by Lora and Bernard).

The group split into four break-out rooms to discuss the suggested questions using Mentimeter, a tool for interactive presentations and discussions.

Some of the responses to the first set of questions. Example comments include 'Smart assistants are designed for personalization, they could provide primary sources using a Pandora model' and 'Linked data seems to be a useful solution that could easily be picked up'.

The afternoon session included two more presentations. Lorna Hughes, Professor in Digital Humanities at Glasgow University, discussed the ethical considerations when linking and searching community-generated content, focusing on issues related to copyright, metadata and working with data at scale. Nora McGregor, Digital Curator: European and Americas Collections at the British Library, spoke on ‘The evolution of the British Library digital scholarship staff training programme’, presenting the British Library’s journey from HTML to Ethics in AI.

From the afternoon session, the group was given two more questions to explore in break-out sessions:

  1. Who is in the archive? When using and analysing digital archives, how can we see and respect individuals and their identities within the data? (suggested by Lorna).
  2. To what extent should cultural heritage institutions take responsibility for mitigating ethical risks in the application of AI technologies to their digital collections by staff and/or researchers? (suggested by Nora).

Some of the responses given to the second set of questions. An example comment is 'When accepting submissions encourage creative commons and always use ethical consent - so people know exactly what might happen to their materials/data'.

Each break-out room discussion was led by an expert from The National Archives: Jenny Bunn, Head of Archives Research; Mark Bell, Senior Digital Researcher; John Moore, Head of Emerging Technologies Research; and Leontien Talboom, PhD student at University College London and The National Archives, funded by the London Arts and Humanities Partnership (LAHP). At the end of the workshop, Pip Willcox, Head of Research at The National Archives, chaired a roundtable discussion among the speakers, the break-out room leaders and the workshop participants.

Examples of the responses given to the first set of questions. An example comment is 'Visual search is I believe the most effective way, provided there is enough clarity and transparency about how the algorithmic process selects the images to display'.

The interactive sessions of the workshop allowed fruitful conversations between the participants, the speakers, the project team and the workshop organisers. The conversations explored the current challenges and prospects of digital and born-digital archives, focusing on access, ethics, emerging formats and AI. They brought together experts from a range of disciplines, including archival science, data and computer science and the humanities, with experts and practitioners from cultural heritage institutions.

Participants came from around the world, representing a diverse range of organisations, libraries, universities and other institutions in Europe, Africa, the US and the UK. They brought a variety of views on access, from both infrastructural and user perspectives.

The conversation also highlighted the ethical implications of applying AI and advanced computational approaches to archival practice and archival research, drawing on participants’ experiences from different research contexts and parts of the world.

The final workshop of the AURA network will discuss the next steps of the network. It will be organised by the University of Edinburgh on 16 March. For more information, please see the web page of the event.

Thank you to Rachel Smillie, Head of Academic Partnerships, and Liz Fulton, Academic Communications and Impact Officer, both at The National Archives, for their help organising the workshops. Thank you to Patrick McInerney, Lecturer in Computer Science, and Larry Stapleton, Senior academic and international consultant, both at Waterford Institute of Technology, for chairing the morning and afternoon sessions.
