In computing, emulation is the practice of creating a virtual environment in order to replicate a different, usually older computer system. I first encountered emulation in the 1990s, when I chanced upon a community of Sinclair enthusiasts who had created an emulator for my beloved ZX Spectrum. I could play the games of my childhood again! In the wider world, emulation has practical applications for computer science and digital preservation.
While The National Archives has no current plans to utilise emulation to make its digital holdings available, emulation is a widely considered strategy for digital preservation, and Jeff Rothenberg published a key piece of research in this area. The basic premise is that, if you need to present a document that was created within an old operating system on an ancient piece of software, the best way to ensure the viewer sees the document as it was originally intended is to emulate the original hardware and software environment, and to make that environment available to the viewer.
There are issues with emulation that currently make it impractical for making digital records widely available in this manner, primarily around intellectual property, copyright, and software licensing, but the purpose of this blog is not to discuss these issues in depth. The thrust is that, in order to remain legal, you need to own the original software and hold a licence to use it.
So why talk about emulation?
In my line of work, emulation really does have practical uses. I have talked previously about how my role involves examining archaic file formats in order to create digital signatures for use within PRONOM. In many cases it is difficult to source examples of formats, particularly those formats that were popular before the internet gained wide use. In these instances it is often easier to create file examples using the original software.
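For readers unfamiliar with the idea, a format signature in this sense is essentially a characteristic pattern of bytes found at a known position in a file. The sketch below is only an illustration of that principle, not how PRONOM stores or applies its signatures: the `SIGNATURES` table and the `identify` function are names I have made up, and the three byte patterns are well-known "magic numbers" rather than anything taken from PRONOM itself.

```python
from typing import Optional

# Hypothetical signature table: (format name, byte offset, expected bytes).
# These are widely documented magic numbers, not entries copied from PRONOM.
SIGNATURES = [
    ("PDF", 0, b"%PDF-"),
    ("PNG", 0, b"\x89PNG\r\n\x1a\n"),
    ("ZIP", 0, b"PK\x03\x04"),
]


def identify(data: bytes) -> Optional[str]:
    """Return the name of the first signature that matches, or None."""
    for name, offset, magic in SIGNATURES:
        # Compare the bytes at the expected offset with the known pattern.
        if data[offset:offset + len(magic)] == magic:
            return name
    return None
```

Real signatures are usually more involved than this (patterns at the end of a file, variable offsets, wildcards), which is exactly why having genuine sample files to test against matters so much.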
At The National Archives, we hold a sizeable library of software, dating back as far as the early 1990s. Unfortunately, much of this software refuses to install or run on a modern system. Attempt to run a 16-bit program on a 64-bit Windows 7 workstation, and the computer won’t even try, instead presenting a rather terse refusal:
I therefore require an older operating environment in which to install these programs, so that I can create my samples to develop my signatures, and this is where emulation comes to my rescue.
I use Oracle’s VirtualBox and Microsoft’s Virtual PC, as both of these are free to use. Getting started can be a bit fiddly, particularly if setting up an MS-DOS environment. Many of you might remember the heady thrill of wrestling with AUTOEXEC.BAT and CONFIG.SYS to get something as apparently straightforward as a CD-ROM drive to work. It isn’t any easier than it used to be and isn’t for the faint-hearted. Fortunately there is a friendly community on hand to help.
Once over the technical hurdles, however, the emulated environment operates exactly as it should. I can install and execute the software I need and generate the sample files I require, and then transfer the files via a virtual disk drive onto my modern system to work on.
There may well be a future for emulation within the wider digital preservation world, but for now, in my little corner of it, emulation is exactly what I need.
Just some thoughts on the point about the practicality of emulation with regard to software licensing and IP. We must remember that, over a historical time frame, any IP in the software and hardware will itself expire.
That is to say, in 2300, when someone needs a Windows 95 environment to look at some record, that IP is unlikely to be an issue. Unless Bill Gates achieves immortality and the life + 70 rule is still active.
In the medium term I can see this is an issue. But the eventual expiry of IP does mean that it would be a good idea to retain copies of environment software.
I’m suggesting that the availability-due-to-licensing of emulated environments may look something like this:
http://www.techdirt.com/articles/20120330/12402418305/why-missing-20th-century-books-is-even-worse-than-it-seems.shtml
http://i.imgur.com/m9zif.png
The gap may be substantially shorter and less dramatic for digital record environments, because patents generally expire sooner than copyright and because of the trend towards open source.