In the previous post I identified a need to understand the additional digital preservation requirements of the overall combined set of collections. To investigate this, I listed all the individual collections in a spreadsheet and noted some points which have a potentially significant impact on preservation work, including:
- Does the collection have an index? (if there is no index there is no way to check the inventory – the items themselves define what is in the collection).
- Does the collection have digital items with or without physical equivalents, and/or physical items with or without digital files? (when an item exists in both digital and physical form, there is more preservation work to do).
- The number of digital and physical items (there is substantially less preservation work to do on a folder of 30 digital items than on a collection of 500 digital items of which 175 have physical equivalents).
- Whether there is any duplication with other collections (If a collection is part of a larger set of objects which already has a Preservation Plan, there is no need to specify a separate Preservation Plan for it).
Having populated this Preservation Assessment spreadsheet with its long list of 38 collections that might need preservation work, I was filled with some dismay: I’ve now had several years of implementing Preservation Plans on many hundreds, if not thousands, of objects, and it’s time-consuming and exacting work. I knew that I needed to minimise the time and effort spent on this new set of preservation activities if it was going to be workable and successful. Furthermore, I realised that for many of the collections on the list I was not really that concerned about the long term: they were currently accessible (many without needing an index), required little intervention, and might be of little interest many years hence.
With these thoughts in the back of my mind, I went through the list deciding what preservation work, if any, was to be done on each collection. Fortunately, 8 of the collections either already had a Preservation Plan or were part of a collection that had one; I discounted another 1 altogether as it only had one insignificant digital file; and another 7 were part of another collection on the list. That left 22. I combined 3 of these into a single overall Healthcare collection (because there were fewer than 90 files across them all), and 2 of the Book collections into a single overall Physical Books collection (because I knew the two would need to be done together). Finally, I added one other collection to the list – my other general laptop folders, which I concluded would also benefit from being under the control of a Preservation Plan. Consequently, I was left with 20 collections to define Preservation Plans for (the 22, less the 3 removed by the two mergers, plus the 1 addition). This was far too many to be practical, and, in any case, the more I looked at the digital files involved, the more I realised that they mainly consisted of pdf, jpg, png, doc/docx, xls/xlsx, and ppt/pptx formats – not very problematic. For the most part, an eyeball check would be all that was necessary to identify doc, xls, and ppt files that needed converting to docx, xlsx, and pptx respectively, so the detailed 16-step process required in my comprehensive Preservation Maintenance Plan template would be overkill. I needed to create a LITE version of the Preservation Plan with fewer steps and capable of addressing multiple collections. What I came up with were the following 4 steps:
- Populate a ‘Changes’ section with the significant changes that have occurred to the collection and its digital platform between the previous maintenance exercise and the maintenance you are about to carry out.
- Populate a ‘Hardware and operating system strategy’ section with the strategy you envisage for the future.
- List the collections you want to undertake Preservation activities on in a ‘Contents & Location’ section together with the specific actions you want to take for each one (for example, ‘Check file extensions’ or ‘check inventory’).
- Record a summary of the actions taken and associated results for each collection, in an ‘Actions taken’ section.
With this structure in mind, I separated the 20 collections into two groups – one which included substantial numbers of physical objects, and one which consisted mainly of digital files. The result was two Lite Preservation Plans each dealing with 10 collections (it’s just coincidence that each has the same number of collections).
The actions specified for each collection were established by assessing what I wanted to protect against for each collection and how much effort I was prepared to make. Six different types of possible actions emerged:
- Check file formats: Check that the current file formats will enable the files to be accessed in the future, and if not make changes to ensure they will.
- Check Inventory: Check that the index entries have a corresponding physical item and/or digital file, and rectify any inconsistencies.
- Ensure physical docs are up to date: Ensure that the physical documents are the latest versions.
- Ensure Index is up to date: Ensure that the latest additions to the collection are included in the Index.
- Ensure Digital collection is up to date: Ensure that the latest additions are all included in the digital collection.
- Ensure Physical collection is up to date: Ensure that the latest additions are all included in the physical collection.
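For collections whose index can be exported as a CSV file, the ‘Check Inventory’ action above could be partly automated. The following is only a minimal sketch, not the method used for the work described here: the function name `check_inventory`, the `filename` column, and the idea that the index stores paths relative to the collection folder are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def check_inventory(index_csv: Path, collection_dir: Path, column: str = "filename"):
    """Compare index entries with the files actually present in a folder.

    Assumes (hypothetically) that the index is a CSV file with a header row
    and that `column` holds each item's path relative to `collection_dir`,
    written with forward slashes.

    Returns (missing, unindexed):
      missing   - entries in the index that have no matching file
      unindexed - files in the folder that the index does not mention
    """
    with index_csv.open(newline="", encoding="utf-8") as handle:
        indexed = set()
        for row in csv.DictReader(handle):
            name = (row.get(column) or "").strip()
            if name:  # skip blank index rows
                indexed.add(name)

    # Every file under the collection folder, as a relative forward-slash path.
    present = {
        path.relative_to(collection_dir).as_posix()
        for path in collection_dir.rglob("*")
        if path.is_file()
    }

    return sorted(indexed - present), sorted(present - indexed)
```

Keeping the index file outside the collection folder stops it being reported as an unindexed item. A check like this only covers the digital side; matching index entries to physical items still has to be done by hand.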
The two Preservation Plans fully populated with the results of the preservation work carried out on them can be accessed at the links below:
Objects Preservation Maintenance Plan Lite dealing with 10 collections
Files Preservation Maintenance Plan Lite dealing with 10 collections
The preservation work, as specified and recorded in both plans, took approximately 20 hours over about a week. This included filling in the Plan documents with the results as each collection was tackled. Overall, the main actions taken were:
1,976 .doc files converted to .docx: 1,937 of these were converted in bulk using the VBA code kindly provided by ExtendOffice (see https://www.extendoffice.com/documents/word/1196-word-convert-doc-to-docx.html). The remainder were simply opened in Word and saved as .docx files (a few of these were originally .rtf files).
150 .xls files converted to .xlsx: 141 of these were converted in bulk using another set of VBA code provided by ExtendOffice (see https://www.extendoffice.com/documents/excel/1349-excel-batch-convert-xls-to-xlsx.html), with the remainder being opened in Excel and saved as .xlsx files (a few of these were originally .csv files).
564 files deleted: 464 of these files were in an iTunes folder – and I no longer use iTunes. 36 were CD case covers/spines which I created in an application I no longer have – and the CD covers are all now printed out and in place on the CD cases so I no longer need these files. Most of the remainder were odd files which I no longer have a use for. As is apparent from this description, such files tend to be from folders containing more general material rather than specifically collected and indexed items. Many computers probably have an array of such unneeded material.
Around 9 new items added: 7 of these were added to get a collection up to date, and the others were the two new Lite Preservation Plans which were included in the Backing-up collection.
2 hardcopies updated: One was a physical A5 ring binder of the addresses in my address database; the other was my Backing-up and Disaster Recovery document, which I print out and keep a copy of in my desk drawer. It’s really a bit of an effort to update such documents regularly and so they often get out of date. Having a scheduled Preservation Plan does help to keep them relatively current.
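For the file format checks and bulk conversions described above, a short script could stand in for the eyeball check by listing which doc, xls, and ppt files remain in a folder tree. This is a hedged sketch rather than a record of what was actually done: the names `find_legacy_files` and `summarise` are hypothetical, and it assumes that a converted file is saved beside its original under the same name, which may not match how a given conversion macro behaves.

```python
from collections import Counter
from pathlib import Path

# Legacy extension -> modern replacement, as named in this post.
LEGACY_TO_MODERN = {".doc": ".docx", ".xls": ".xlsx", ".ppt": ".pptx"}


def find_legacy_files(root: Path):
    """Return (unconverted, converted) lists of legacy Office files under root.

    Assumption: a legacy file counts as converted when a file with the same
    name and the modern extension sits beside it in the same folder.
    """
    unconverted, converted = [], []
    for path in sorted(root.rglob("*")):
        modern_ext = LEGACY_TO_MODERN.get(path.suffix.lower())
        if modern_ext is None or not path.is_file():
            continue  # not a legacy Office file
        if path.with_suffix(modern_ext).exists():
            converted.append(path)
        else:
            unconverted.append(path)
    return unconverted, converted


def summarise(paths):
    """Count files per lower-cased extension."""
    return Counter(path.suffix.lower() for path in paths)
```

Calling `summarise` on the unconverted list gives a per-format count that could be copied into the ‘Actions taken’ section of a plan. The script only reports; it does not convert or delete anything.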
The next cycles of these two Preservation Maintenance Plans are now scheduled for 2027 and 2028 respectively: I can now relax, confident that I have done as much as I wish to future-proof the 20 collections that they deal with.
I have included most of my workings in this post largely to help me be clear about what I did. However, the details are of little consequence to readers interested in undertaking digital preservation work on their collections. They only serve to show that you can call anything a collection, and that you can slice and dice collections any way you want. The key point is that, using this approach, it is feasible to exert a measure of preservation control over a large number of collections, including the files on your computer, with relatively little effort. If you try this out, you may find this Preservation MAINTENANCE PLAN LITE Template helpful.