The first scheduled Preservation Maintenance operation on the large and complex PAWDOC collection started on 1st September. Well, actually, it began a little earlier, in early August, when I set about investigating the items in the ‘Possible Future Issues’ section of the PAWDOC Preservation MAINTENANCE PLAN. There were 15 such items: most related to files that had proved inaccessible in the initial preservation exercise three years ago, but four concerned the numerous contents of CDs that had been included in the collection. Two of these proved particularly demanding: one is a disk that was distributed with the April 2001 issue of PC Magazine, and the other is the Nautilus disk – a 1991 attempt to issue a technology magazine with lots of software, advice, news, and multimedia files on disk. I couldn’t get either to open, and without an interface there’s no way of knowing what they contain or whether the contents still work, so I decided to try to create a guide to the material by going through all the contents. It was a laborious process (there were over 1500 files in all), but I did get a result, and guides to both disks now reside alongside the zipped-up contents.
The challenge presented by the huge volume of files on CDs, as illustrated above, also showed up in the maintenance process proper, which I started at the beginning of September. The process requires that an inventory is made of all files in a collection (which I achieved by using the National Archives’ DROID tool), and that an attempt is made to open two or three files of each type. Problems identified in this investigation stage can then be addressed. The CDs in the collection (now all residing alongside the rest of the collection in Windows folders) make up a large proportion of the overall collection, and this overloads the analysis and investigation process. However, many of the CDs are installation disks for the collection’s document management software (no longer used) and for old versions of its indexing software. In subsequent maintenance operations, all such sets of files will be excluded from the DROID analysis: I have decided that the mere presence of such material in the collection is sufficient to signal its previous inclusion, and there is no need for it to actually work in future. Perhaps this is an example of the sort of additional decision that may have to be made with a digital collection as compared with collections of physical objects. Digital collections are very different animals.
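For anyone wanting to automate the ‘two or three files of each type’ step, here is a rough sketch in Python of how the sample might be picked from a DROID CSV export. It is not the procedure I actually followed: the column names (TYPE, FILE_PATH, PUID, EXT) are my assumption about what the export contains and should be checked against your own file, and the function names and folder paths are made up for illustration.

```python
import csv
from collections import defaultdict


def sample_files_by_type(rows, per_type=3, exclude_prefixes=()):
    """Return up to `per_type` file paths for each file type.

    `rows` is an iterable of dicts, one per inventory entry. The keys used
    here (TYPE, FILE_PATH, PUID, EXT) are assumed column names, not confirmed.
    """
    samples = defaultdict(list)
    for row in rows:
        # Skip folder entries; only files can be test-opened.
        if row.get("TYPE") == "Folder":
            continue
        path = row.get("FILE_PATH") or ""
        # Leave out whole sets of files, e.g. old installation disks.
        if any(path.startswith(prefix) for prefix in exclude_prefixes):
            continue
        # Group by format identifier if present, otherwise by extension.
        key = row.get("PUID") or (row.get("EXT") or "").lower() or "(unknown)"
        if len(samples[key]) < per_type:
            samples[key].append(path)
    return dict(samples)


def load_rows(csv_path):
    """Read a CSV export into a list of dicts keyed by column header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling `sample_files_by_type(load_rows("inventory.csv"), exclude_prefixes=("C:/PAWDOC/CDs/install",))` (both the file name and the folder are hypothetical) would give a short list of candidate files per type to try opening by hand. It takes the first few files it meets rather than a random selection, which keeps the result repeatable from one run to the next.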
The culmination of the investigation phase is a Project Plan with tasks that are specific enough for effort and elapsed duration to be reliably estimated. I got to this point yesterday and, as per the first task, I have started converting 28 Help files from the old .HLP format to the HTML-based .CHM format. The plan prescribes a finish date of 3rd December. After that I shall be producing the final updates to the Preservation Planning templates, which I have been refining since 2015 and which are published on the website of the Digital Preservation Coalition.