Given that the word ‘digital’ makes up one third of my job title, you might think it an oversight that I did not use it once in my last blog entry. That may be a sign of the variety in my work (or perhaps of forgetfulness), but I will make up for it today by looking at the mutually beneficial relationship between open data and the archiving of datasets.
A colleague recently asked me what a dataset is. This is not as simple a question as it may appear, and I sidestepped it. I think the answer really lies in the term ‘structured data’: the text of an email could not necessarily be termed a dataset, but a table in a PDF, a CSV (Comma-Separated Values) file, or an XML (Extensible Markup Language) file could. A dataset can also be analysed quantitatively, and it is not a collection of different electronic files, like a database. However, the debate rages on, and the terminology is so uncertain that the Government has even consulted on the word itself.



