The Cataloguing, Taxonomy and Data (CTD) department here at The National Archives has been working on a long-term project to export the top three levels of our public catalogue for inclusion in the Archives Portal Europe (APE) website.
APE is an ambitious project, bringing together data from over 7,000 European archival institutions. The web portal allows users to perform searches across all of these archives from a single interface. It provides access to almost 290 million descriptions, as well as information on archival institutions throughout the continent.
Adding our data to APE meant exporting our catalogue records for department, division and series in a particular data format, known as apeEAD. Technical staff at APE provided us with examples of differently structured apeEAD which would be accepted by their system, and my colleagues in CTD were able to confirm the structure which best fitted our data.
I then set about writing a Python application which made use of the Discovery API – a tool allowing programs to query The National Archives’ catalogue directly. The application is provided with a list of department codes to export, and the catalogue data for each department is retrieved from the API, down to the series level in the catalogue.
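To illustrate the retrieval step, here is a minimal sketch of walking one department down to series level. It is not the code used for the export: the record shape (`id`, `level`, `title`), the level numbering (1 for department, 2 for division, 3 for series) and the `fetch_children` callable are all assumptions made for the example. In a real application, `fetch_children` would wrap whichever Discovery API call returns the child records of a given record.

```python
from typing import Any, Callable, Dict, Iterator, List

# Assumed level numbering for this sketch: 1 = department, 2 = division, 3 = series.
SERIES_LEVEL = 3

# Hypothetical record shape: {"id": str, "level": int, "title": str}
Record = Dict[str, Any]


def walk_to_series(
    record: Record,
    fetch_children: Callable[[str], List[Record]],
) -> Iterator[Record]:
    """Yield a record and its descendants, never going below series level."""
    yield record
    if record["level"] >= SERIES_LEVEL:
        return  # a series is the lowest level exported, so stop here
    for child in fetch_children(record["id"]):
        # A series may sit directly under a department, with no division.
        if child["level"] <= SERIES_LEVEL:
            yield from walk_to_series(child, fetch_children)
```

Passing the lookup in as a function keeps the traversal separate from the network code, so it can be tried out against a small in-memory dictionary before being pointed at the live API.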
The data was then processed to conform to the apeEAD format, and any hyperlinks in the description or other fields were rewritten to point to the relevant page in Discovery, The National Archives’ online catalogue. The resulting data files were then checked using a validation tool provided by APE, and uploaded to a version of the APE website which they provide for data checking. Once the data was checked and approved, I was able to upload it onto the live APE site for publication.
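The link rewriting can be sketched in a few lines. This is an illustration under stated assumptions rather than the actual export code: it assumes that links in the source fields appear as `href="..."` attributes holding a bare record identifier, and that a Discovery record page can be reached by appending that identifier to the `DISCOVERY_DETAILS` prefix shown below. Links that are already absolute are left alone.

```python
import re

# Assumed URL pattern for a Discovery record page.
DISCOVERY_DETAILS = "https://discovery.nationalarchives.gov.uk/details/r/"

# Matches an href attribute with a double-quoted value.
_HREF = re.compile(r'href="([^"]+)"')


def rewrite_links(text: str) -> str:
    """Point relative href values at the corresponding Discovery page."""

    def _absolute(match: "re.Match[str]") -> str:
        target = match.group(1)
        if target.startswith(("http://", "https://")):
            return match.group(0)  # already an absolute link, keep as-is
        return f'href="{DISCOVERY_DETAILS}{target}"'

    return _HREF.sub(_absolute, text)
```

For example, `rewrite_links('See <a href="C243">this record</a>.')` would turn the bare identifier `C243` into a full Discovery URL while leaving the rest of the description untouched.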
In June of this year, our data was published for the first time on the APE website. The National Archives’ catalogue content covers information from over 402 departments, and comprises over 22,000 records. The records range from the year 1086 to the present day. As well as the record descriptions, I was able to include other useful information from the catalogue, including the record creator’s history, and the archival (administrative) history, where available. There are also links to information about the government department or other organisation which created the records.
The CTD team will be providing regular updates to keep our data on APE current. We are very pleased to have had our data go live on the APE system, and to be contributing to such a useful resource for the users of archives worldwide.
Further resources
To explore our content and the portal, have a look at: