My name is David Clipsham and I have been employed as the File Format Signature Developer for a month, having previously worked as Customer Service Manager for the cross-government social collaboration tool, Civil Pages. My role is to improve the coverage of The National Archives’ PRONOM file format registry. The internal and external signature information contained in the PRONOM registry is used by our file format identification tool, DROID, which identifies file formats so that we can make informed decisions about the long-term preservation of digital records.
My day is typically spent researching obscure and not-so-obscure file formats, picking through the internal code of each format and identifying the key characteristics that make the file format what it is, as described in Ross Spencer’s recent blog post. I then recreate the key byte sequences, test them against sample files and upload them to PRONOM, ready for our bi-monthly signature release.
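As a rough illustration of the "recreate the byte sequences, then test them against sample files" step, here is a minimal sketch in Python. It is not the PRONOM signature syntax or DROID's matching engine: the `Signature` class and the `check_signature` helper are hypothetical names invented for this example, and the sample files are small in-memory byte strings. The only real-world fact it relies on is the well-known eight-byte sequence that begins every PNG file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature: a byte sequence expected at a fixed
    offset from the beginning of the file."""
    name: str
    offset: int
    sequence: bytes

    def matches(self, data: bytes) -> bool:
        # Compare the slice at the expected offset with the sequence.
        end = self.offset + len(self.sequence)
        return data[self.offset:end] == self.sequence


def check_signature(signature, samples):
    """Return the names of the samples the signature fails to match."""
    return [name for name, data in samples.items()
            if not signature.matches(data)]


# The PNG byte sequence is real; the sample contents are made up.
png = Signature("PNG", 0, b"\x89PNG\r\n\x1a\n")
samples = {
    "good.png": b"\x89PNG\r\n\x1a\n" + b"\x00" * 16,
    "not-a-png.txt": b"hello world",
}
failures = check_signature(png, samples)
```

In practice a signature would be run against many genuine sample files, and any unexpected failure or false match would send the researcher back to the format's internals.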
How do I focus my research?
Greetings from Digital Preservation!
One of the challenges we face in our department is coordinating our efforts to satisfy the requirements of The National Archives, other government departments and a wider preservation and archives sector community that makes use of our tools DROID and PRONOM. With such a diverse audience, we work hard to listen to colleagues who visit government departments or who actively take part in discussions about preservation and digital continuity. We also maintain mailing lists and have an email address which allows users to contact us directly. This has allowed us to develop strong relationships with organisations across the pond in the US and in the antipodes. We rely on these relationships to help develop our content and improve our services.
On Monday 28 January, the Digital Preservation Coalition (DPC) hosted a file formats day of action, creatively titled ‘Bring Out Your Dead (Files)’ at the Wellcome Collection. As The National Archives’ resident File Format Signature Developer, I was invited to deliver a presentation on DROID and PRONOM, our file format identification tool and file format registry, and a workshop on Developing File Format Signatures for PRONOM.
My own talk reviewed DROID and PRONOM developments in 2012:
- DROID 6.1 was released in August. DROID development has switched to GitHub, and we have a Google Groups discussion page open for support enquiries
- The PRONOM registry has grown considerably, with 100 new file formats, 177 new file format signatures, and a full-time researcher appointed
- PRONOM has been able to grow this much in part due to the wealth of external contributors who continue to provide us with file format signature and research information. Over a dozen institutions and individuals contributed last year
- Finally I was delighted to announce that the download for our DROID tool now has a permanent home on The National Archives’ own website.
My workshop focused on demystifying the file format research and signature development processes I undertake and allowed willing participants the chance to try developing their own signatures.
Ask anyone in our department at The National Archives and they will say I’m never short of words… Okay, ask anyone out of half a dozen or more departments at The National Archives and they’ll pretty much agree too! Well, that was up to today I suppose. Perhaps it’s writer’s block, perhaps it’s just the natural wrapping up of my duties given that (note it down, Wikipedia!) tomorrow, 7 September, is my last day at the organisation. It has been three years, three months and seven days since I started, a fresh-faced C++ developer from the Midlands. My humanities background was Digital Culture at King’s College London and, between you and me, I think I might have confused digitisation with digital preservation at my interview (they let me through the net though!).
In three years, I’ve seen quite a lot happen in the world of digital preservation. I thought my last blog post for The National Archives might be an opportunity to put a shout-out to some of the existing community projects and initiatives which have already done enormous amounts for the cause and look set to continue this trend for a long time.
Digital Preservation Coalition - Save the Bits
Digital Preservation Coalition
While I am sure I was introduced to the Digital Preservation Coalition long before this, in February 2010 Planets held one of its ‘The Planets Way’ training events in London. The first day of the event was in a conference format and, just after lunch, William Kilbride from the DPC took the opportunity to say a few words about the work they do. The statement he made to the room, which has resonated with me to this day and is a sentiment that can make all of us in digital preservation smile, was (to paraphrase):
“Once you solve the problem of digital preservation, I can retire.”
I think that was the quote we were looking for? OK, maybe not, but if I mention the word DROID you might figure the right one out!
Tenuous links over, in Digital Preservation today we’ve released a new version of the DROID (Digital Record and Object Identification) tool – version 6.1. We’ve spoken about the tool before, when I blogged about the PRONOM and DROID user consultation we held at The National Archives last year. The day resulted in a consultation wiki where contributions are invited from all members of the public with an interest in a potential DROID 7. The wiki page lists the requirements that users of the tool have for DROID 7 and all future versions.
DROID 6.1 User Interface
As I sit and reflect in my home one evening, thinking back to the day’s events and looking around me, I can begin to see a rich digital tapestry woven into my life. This is prompted by thinking about a conversation I was having with a colleague who was trying to understand an export he had relating to horse racing results and wondered if the data could be extracted to be of any potential use.
Looking around, I see my digital piano in the middle of the room and wonder, beyond the MIDI output I can capture, what exists within its ‘mechanics’ to enable the various functions it performs; I receive an email on my iPhone which I know is downloaded from my Gmail account which potentially means two different storage formats for that email; and I flick through the channels on my digital TV which makes me realise the data which allows me to see a seven-day electronic programme guide must actually be stored as a digital format or data structure within the box to allow it to be displayed and searched through.
Other formats that surround me in my daily life include my MP3 collection, GPS fitness information from cycle trips, and even my computer games and the data they use, such as save files. Look around you: what formats do you see?
Look around you: How is digital woven into your daily life?
As part of my Opening Up Archives traineeship at the West Yorkshire Archive Service, I am looking into the world that is Digital Preservation. Like a fellow trainee, I started out with pretty much no knowledge of digital preservation. When presented with the term, although I had my assumptions about its true meaning, I didn’t want to rely on those alone. With a background in IT and languages, getting to grips with digital preservation was a little easier than learning about archives as a whole. Digital Preservation, as mentioned in the previous Trainee Tuesday blog post, Tales from the Dark Archive, is the challenge of preserving digital material so that it can be accessed in the future.
In May, I attended the Digital Preservation Training Programme (DPTP) and it brought clarity to the concepts, models and acronyms associated with digital preservation. Practical activities enabled the other attendees and me to think about the subject and the issues surrounding it, and to see whether we could relate the topics to what we do in our own organisations. One benefit was that the OAIS functional model was broken down into manageable chunks and discussed in great detail. The Open Archival Information System (OAIS) model is a reference model created to give an understanding of the concepts and processes of digital preservation.
Now, after five months, I am comfortable talking about checksums, ingest procedures and the software involved, and I know how an archive works, thanks to a lot of reading on my part and a lot of patience from my colleagues.
A packed out room of eager listeners
So this week at the Our Stories Community Archives Conference 2012, I was asked to deliver a workshop for community groups on digitising collections, with regard to planning and long-term care. This was a great opportunity because it was my first time delivering a workshop at a conference and I could put my knowledge to good use.
Wouldn’t it be cool if every digital file created could be identified with a signature or ‘magic number’ of some kind? This would make preservation, and the concept of knowing what you’ve got in order to be able to preserve it, that much easier.
By design or otherwise, for some file formats this is actually the case. The title of today’s blog post provides two such magic numbers, used to identify Java class files and Java pack200 files. These aren’t the first examples to use magic numbers, but 0xCAFEBABE is the one I find the most striking as an introduction to the concept. You can read more on the origins of 0xCAFEBABE at Wikipedia.
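To make the idea concrete, here is a minimal sketch in Python of checking the first four bytes of some data against known magic numbers. The function name `identify_magic` and the lookup table are assumptions made for this example, and 0xCAFED00D is assumed here to be the pack200 counterpart to 0xCAFEBABE. A match is a strong hint rather than proof, since unrelated formats can share the same leading bytes.

```python
import struct

# Magic numbers as big-endian 32-bit integers (illustrative table only).
MAGIC_NUMBERS = {
    0xCAFEBABE: "Java class file",
    0xCAFED00D: "Java pack200 file",
}


def identify_magic(data: bytes):
    """Return a format name if the first four bytes are a known
    magic number, otherwise None."""
    if len(data) < 4:
        return None
    # ">I" reads an unsigned 32-bit integer in big-endian byte order.
    (magic,) = struct.unpack(">I", data[:4])
    return MAGIC_NUMBERS.get(magic)
```

For a file on disk, the same check could be applied to the result of reading its first four bytes in binary mode, for example `identify_magic(open(path, "rb").read(4))`, where `path` is whatever file you want to inspect.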
In computing, emulation is the practice of creating a virtual environment in order to replicate a different, usually older computer system. I first encountered emulation in the 1990s, when I chanced upon a community of Sinclair enthusiasts who had created an emulator for my beloved ZX Spectrum. I could play the games of my childhood again! In the wider world, emulation has practical applications for computer science and digital preservation.
Microsoft DOS 6.22, Windows 3.11, and Word for Windows disks
The best way to explain the title of this blog is to begin by quoting directly from the Hedgehog Street website:
“Through Hedgehog Street, we are asking people to become Hedgehog Champions to rally support from their neighbours and work together to create ideal hedgehog habitat throughout their street, estate or communal grounds.”
I saw this initiative on BBC Springwatch a while back and, specifically, one simple thing we can all do to become Hedgehog Champions: link your garden. To quote the Hedgehog Street website again:
“Hedgehogs travel around one mile every night through our parks and gardens in their quest to find enough food and a mate. If you have an enclosed garden you might be getting in the way of their plans. Hedgehogs have enough barriers to contend with such as roads and rivers that we can’t do much about. However we can make their life a little easier by removing the barriers within our control – for example making holes in or under our garden fences and walls for them to pass through. The gap need only be around 15cm in diameter and so should not affect your pets’ safety.”
The idea of doing something so simple to protect our cute friends is a nice one. We’re converting one garden into hundreds and, combined with more naturally occurring wildlife corridors, potentially thousands. This is what we’re doing when we link data: the gardens represent our data and datasets, and the link we’ve created gives users and machines unrestricted access to navigate from one dataset to another. It’s an almost perfect analogy – an analogy which I hope will help to open up the concept to all our readers, technical and non-technical alike.
Linky - The Linked Data Hedgehog