You may have heard about the ‘digital Dark Age’ in recent media reports…
For us, as for other institutions in the archives sector across the world, managing, preserving and providing access to born-digital records (records natively created in a digital format, such as emails, documents and spreadsheets) is a major challenge, now and for the years to come.
Why is this important now?
This year some government departments are due to transfer born-digital records to The National Archives to meet their legal obligations under the Public Records Act.
I have been leading the Digital Transfer Project since July 2014 to ensure that we, The National Archives, as well as other government departments, are ready to embrace this challenge.
We have been very busy over the past year and a half, and our philosophy has been ‘learning by doing’. To avoid reinventing the wheel, we reviewed what other archival institutions around the world were doing in the field of digital records management and transfer. We interviewed key UK government departments in order to identify their challenges early, and be able to proactively find solutions. Alongside this we launched a series of pilot transfers to design and test the new process to appraise, select, sensitivity review, transfer, preserve and give access to born-digital records.
Two transfers completed!
We are proud to say that two transfers have already been completed. You can find born-digital records from both the Welsh Government (see WA 11, WA 12 and WA 13) and The National Archives (RW 33) on our online catalogue Discovery. These records are available to download for free from anywhere in the world. Four further transfers are planned for the coming months.
Learning by doing
We’ve learnt that two of the main challenges experienced by government departments as part of this transfer process are:
- extracting meaning from unstructured digital record collections in order to make appraisal and selection decisions. We found that up to two thirds of government departments’ information is held on unstructured shared drives. Some departments also had up to 190 terabytes of information in email servers
- sensitivity reviewing born-digital records at scale without having to read all the individual documents
We decided to look at what existing technologies could offer in the field of digital search, digital information management, digital appraisal and selection and sensitivity review to address these challenges. The results are really promising.
We explored whether technology-assisted review – a process involving expert document reviewers using a combination of computer software and tools to electronically classify records – could have interesting applications for the archives sector. Technology-assisted review typically uses eDiscovery software. This type of software was originally designed to extract meaning or identify sensitive information from large unstructured digital collections for the purpose of disclosing electronic information between parties before a trial. Our underlying assumption was that if these technologies were good enough for the legal profession and the courts, they could also be good enough for information and records management.
What we learnt was really exciting. Technology-assisted review is starting to be widely accepted in court cases in the United States. Last year these technologies were also endorsed in a lawsuit by the High Court in the Republic of Ireland. Technology-assisted review can also be as accurate as, if not more accurate than, manual review. We found that traditional ‘keyword’ searches return only 20% of relevant documents, whereas technology-assisted review can return a lot more. We also found that, on average, 40% of a digital collection is duplicated, so a tool that can separate the wheat from the chaff and reduce the amount to review can be particularly helpful!
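To give a feel for how duplicates can be spotted without opening every file, here is a minimal sketch in Python. It is not the eDiscovery software we evaluated, only an illustration of one common technique: grouping files whose contents produce the same SHA-256 hash. The function names and the idea of scanning a folder called `root` are assumptions made for this example, and the approach only catches byte-for-byte copies, not near-duplicates such as a re-saved or lightly edited document.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group the files under `root` (a hypothetical shared-drive folder) by content hash.

    Returns a list of groups; each group lists the paths that hold identical bytes.
    Files with unique content are left out.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[sha256_of_file(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A reviewer could then keep one file from each group and set the rest aside, which reduces the volume to be appraised before any reading starts.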
Although there is no ‘silver bullet’ or completely automated solution, technology-assisted review offers ways to prioritise and reduce the information to be manually reviewed. Particularly useful functionalities include categorisation and clustering, which group contextually similar information and therefore allow for macro-level decisions, be they appraisal and selection decisions or sensitivity review decisions.
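For readers curious about what ‘clustering’ means in practice, the toy sketch below groups documents that share a large proportion of their words. It is a deliberately simplified stand-in, not the method used by any particular eDiscovery product: it uses Jaccard similarity on word sets and a greedy single pass, and the function names, the sample documents and the 0.3 threshold are all assumptions chosen for illustration.

```python
import re


def tokens(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Share of words two sets have in common (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_clusters(documents, threshold=0.3):
    """Assign each document to the first cluster whose seed document is similar enough.

    `documents` maps a name to its text. The threshold is an illustrative value,
    not a recommended setting.
    """
    clusters = []  # each entry: {"seed": word set of first member, "members": [names]}
    for name, text in documents.items():
        words = tokens(text)
        for cluster in clusters:
            if jaccard(words, cluster["seed"]) >= threshold:
                cluster["members"].append(name)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append({"seed": words, "members": [name]})
    return [cluster["members"] for cluster in clusters]


sample = {
    "budget_2014": "annual budget forecast for the finance team",
    "budget_2015": "revised annual budget forecast for the finance team",
    "canteen": "canteen menu for the week",
}
print(greedy_clusters(sample))
# prints [['budget_2014', 'budget_2015'], ['canteen']]
```

Once similar documents sit together like this, a reviewer can make one decision for the whole group rather than one per file, which is the kind of macro-level decision described above.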
We have just published two reports that detail these findings. The first is a snapshot of the digital landscape in the UK government, highlighting some of the current challenges experienced by government departments in the management and transfer of born-digital records. The second showcases how technology-assisted review could help address some of these challenges. You can download both reports for free.
We feel we have started our evolution from a digital dark age to a digital enlightenment. It is still early days and there is still a lot of work to be done, both collaboratively across UK government and also working with third party partners and the academic community. It’s also important to note that the answers to these challenges are not set in stone. We will have to adopt a ‘lean’ approach – evolving our solutions as technology evolves, which is a really exciting prospect!
Don’t hesitate to contact us at DigitalRecordsTransfer@nationalarchives.gsi.gov.uk if you have questions or comments or want to contribute your ideas to address these exciting challenges!