New version 2.5 of the Maintenance Plan Template

A couple of days ago I completed an experiment to use the Maintenance Plan template to undertake initial Digital Preservation work on a collection instead of using the Scoping document. It proved to be very successful. The collection is relatively small with only 840 digital files of either jpg, pdf or MS Office format, so there were few complications and I was able to proceed through the Maintenance Plan process steps without any serious holdups. The whole exercise took just over a week with the majority of the time being taken up by the inventory check of the digital files and of about 300 associated physical artefacts. I used the structure of the Maintenance Plan to document what I was doing and to keep a handle on where I was up to.

As a result of this exercise I’ve now added the following guidance to the beginning of the Maintenance Plan template, and equivalent text to the beginning of the Scoping document template:

If this is the first time that Digital Preservation work has been done on a collection

EITHER use the Scoping template to get started (best for large, complex collections)

OR use this Maintenance Plan template to get started (can be effective for smaller, simpler collections – retitle it to ‘Initial Digital Preservation work on the @@@ collection’ and ignore sections Schedule, 3, 4 and 7)

This concludes the interim testing and revision of the Maintenance Plan template. It has resulted in some substantial changes to the latest version 2.5 of the document (an equivalent version 2.5 of the SCOPING Document Template has also been produced). The final and most substantial test of the Maintenance Plan template will take place in September 2021, when the large and complex PAWDOC collection is due to undergo its first maintenance exercise.

More than a Maintenance Plan?

Yesterday I finished the maintenance work on my PAW-PERS collection and so now have a refined version of the Maintenance Plan template based on two real-world trials. However, before publishing it, I’m going to take the opportunity to see if it could be used to start every Preservation Planning project. I’m able to do this because I have one other collection which has, as yet, had no preservation work done on it. It is the memorabilia that my wife and I have accumulated since we were married, and it is called SP-PERS.

Each of the three collections that I have subjected to Digital Preservation (DP) measures so far has been through the process of creating a Scoping document, followed by the production and implementation of a DP Plan, and finally the creation of a DP Maintenance Plan specifying works a number of years hence. However, my recent implementation of Maintenance Plans has led me to believe they might provide a structured, immediate starting point for any preservation planning project. They do not preclude Scoping documents etc. – indeed they explicitly discuss the possible use of those other tools halfway through the process. So, the opportunity to try using the Maintenance Plan template as a way in to every DP project is too good to miss. I’m starting on it today.

First trial of the Maintenance Plan

Today I completed the first real trial of a Maintenance Plan using the Plan I created for my Photos collection in 2015. It was one of the first Plans I’d put together so is slightly different from the current template (version 2.0 dated 2018). However, both have the same broad structure so the exercise I’ve just completed does constitute a real test of the general approach.

Overall, it went well. In particular, having a step by step process to follow was very helpful; and I found it particularly useful to write down a summary of what I’d done in each step. This helped me to check that I’d dealt with all aspects, and gave me a mechanism to actively finish work on one step and to start on the next. I found this to be such an effective mechanism that I modified the current Maintenance Plan Template to include specific guidance to ‘create a document in which you will summarise the actions you take, and which will refer out to the detailed analysis documents’. It’s worth noting that I was able to include this document as another worksheet in the collection’s Index spreadsheet, along with the Maintenance Plan constructed in 2015 and the Maintenance Plan I have just constructed for 2025. Being able to have all these sub-documents together in one place makes life a whole lot easier.

The exercise also identified another significant shortcoming of the template – it includes no details about the collection’s contents and their location(s). Consequently, an additional ‘Contents & Location’ section has been included at the beginning of the template.

The Photos collection has certainly benefited from the exercise; and the experience has enabled me to make some useful modifications to the template. I intend to tackle the second test of the Maintenance Plan (for the PAW-PERS collection) in the next few weeks, and will then publish an updated version 2.5 of the Maintenance Plan template which will include all the refinements made in the course of these two trials.

Maintenance Plan Template Refinement

The final piece of work in this Digital Preservation work is to test and refine the Maintenance Plan template. I’ll be doing this by implementing the following plans drawn up in earlier stages of this preservation journey:

I’m late in starting the PAW-PERS maintenance work because earlier this year I was focused on completing the ‘Sorties into the IT Hurricane’ book. Now that’s out of the way, I plan to complete the PAW-PERS and PHOTO maintenance during May and to use that experience to update the Preservation MAINTENANCE PLAN Template from v2.0 to version 2.5. The insights gained in the major maintenance exercise on the PAWDOC collection in Sep 2021 will be used to produce version 3.0 of the Maintenance Plan template. Updates to the other templates (SCOPING Document, and Project Plan DESCRIPTION and CHART) may also be made at that point if necessary. I shall offer the revised templates to the DPC for inclusion in their website. These will be the final activities in the Digital Preservation work being documented in this journey.

Self-publishing a Photobook

To get an idea of the possibilities for photobooks, just take a look at the Blurb bookstore; there’s a huge diversity of subject matter, and the books look great. It’s clear that anyone who has a passion can create a permanent record which will sit handsomely on a bookshelf for around the cost of a meal out or less. Furthermore, authors can elect to sell their book in the Blurb bookstore and/or through Amazon; and they can specify how much money they want to make on the sale of each copy. Blurb will keep track of sales and remit the income due to the author each month.

I’d already had a go back in 2012 – but with a service designed more for the presentation of photographs than for discursive text. The result was pleasing but not brilliant. I’d heard there were more appropriate online printing operations – and I determined to try one out sometime. My opportunity came last summer when I decided that I might have more success finding a permanent repository for my work document collection if I had a book of memorable experiences based on the contents of the documents. I decided to use the Blurb service for no better reason than I’d had a brief look at it a few years ago after seeing it get a good rating in a review of self-publishing services. There are many other such services available on the net today and I don’t know how they currently compare to Blurb. You should check them out.

I decided that my book would consist of one page write ups of particular events, each one accompanied by a page of images. I opted to create the text first in Microsoft Word and then to decide what images to include when I imported each piece of text into Blurb’s BookWright page layout package.

I started writing the text in September 2019. It was mostly done by the end of January 2020, at which point I downloaded the BookWright software. Although it took a bit of getting used to, it wasn’t too difficult, and I found the functionality quite good. There were a couple of minor problems: first, the software closed abruptly, without notice, five or six times – but each time it fired up again and opened up the book’s contents successfully without having lost any data. Second, typing was sometimes slow to reproduce on screen. Exchanges with Blurb Support suggested it was due to a lack of virtual memory – which didn’t surprise me because I was using Word, Excel, Powerpoint, Filemaker, and a PDF package all at once to create the contents while I was using BookWright. Closing some of these seemed to resolve the issue.

The biggest issue I faced was with the resolution of the images I was including. The Blurb Help files warn against grainy, blurry or pixelated images, but, of course, you can only be absolutely sure you have avoided this pitfall when you get the printed book. BookWright itself provides a warning when it thinks an image will not be up to standard (which typically occurred when I was trying to expand an image to make it easily readable or to fill a page). I took notice of these warnings and either made the image smaller or found a way of increasing its resolution. I achieved the latter by either rescanning a physical document at a higher resolution, or printing out an electronic document in high quality and then scanning at a high resolution. Although these two approaches did seem to improve the quality of many of the images, they also substantially increased the file size of the book (about 4.2Gb at that point). A search on the net about the size of BookWright files reassured me that uploads of that size and more were not unusual – but I did discover that eBooks cannot be produced for files over 2Gb. I also discovered – rather too late in the day – the BookWright advice to use the png lossless format in preference to jpg. I guess this just highlights the fact that I really don’t know too much about image formats and resolutions. Nevertheless, most of the images seemed to turn out OK in the finished book. The key seems to be to keep image sizes below the threshold of the BookWright warning messages.

I had 195 separate stories, so there was at least one image to find and import for each one – and, in some cases, several images. It was a long haul and took me until the 19th March before I’d finished the first pass through in BookWright, and could start the final edit.

I’d elected to subdivide the stories into nineteen short stories – each one labelled with an icon comprising a unique set of different shapes and including the page number of the next story. The idea was that readers of a particular short story could find the next instalment at the specified page number. The page numbers went into the Contents list, and into the icons, on 27th March, and then it was onto creating the dust jacket and doing final checks.

On 30th March, I was ready to submit the 4.75Gb file using Blurb’s Upload facility. First the system ‘rendered’ the file down to 492Mb; and then it did the Upload. The whole process took about 37 minutes. I was all set to order a copy, but found that the discount code I’d planned to use, didn’t work. I searched the Blurb site and the net for 45 minutes and tried lots of codes – but none were current. I decided to wait – the full price of £103.59 was too much to ignore the possibility of a substantial reduction. It was worth the wait – on 1st April Blurb advertised a 41% discount code, so I paid the overall cost of £73.70 (which included a £2.99 PDF copy, £8.99 delivery, and 60p tax), and was told to expect delivery by 14th April.
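For anyone wanting to check the sums, here is a minimal Python sketch of the arithmetic. It assumes the 41% discount applied to the book price only, with the PDF copy, delivery and tax added afterwards at full price; that assumption is mine, but it does reproduce the figures quoted above. The function name `discounted_total` is purely illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

PENCE = Decimal("0.01")

def discounted_total(list_price, discount_percent, extras):
    """Discount the book price, then add the (undiscounted) extras.

    Assumption: the discount code applies to the book only.
    Returns (discounted book price, overall cost), both rounded to pence.
    """
    fraction_paid = (Decimal(100) - discount_percent) / Decimal(100)
    book = (list_price * fraction_paid).quantize(PENCE, rounding=ROUND_HALF_UP)
    total = book + sum(extras, Decimal("0"))
    return book, total

# Figures taken from the text: full price, discount, PDF copy, delivery, tax.
book, total = discounted_total(
    Decimal("103.59"),
    Decimal("41"),
    [Decimal("2.99"), Decimal("8.99"), Decimal("0.60")],
)
print(f"Discounted book price: £{book}")
print(f"Overall cost: £{total}")
```

Using `Decimal` rather than floats keeps the pence exact, which matters when checking a receipt to the penny.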

The book arrived around 9am on 7th April. It exceeded my expectations, with a bold glossy cover, glossy pages, clear text, and bright images. I spent the rest of the day checking each page noting the corrections needed; and then the next two days making final changes. On the morning of 10th April, I did a final preview of the book and this turned up about a dozen further changes. At around 3.30pm I started the Upload process. The system took about 10 minutes to render the 5Gb file down to 496Mb; and a further 27 minutes to upload it.

Putting the book into the bookstore was not particularly difficult – but it did take a little time. There was a book description to write, categories to select, and keywords to specify. Then I had to decide how much profit I wanted to add onto the price of the book; and finally there was the specification of which pages I wanted people to see in the preview. I completed the whole business by around 5.30pm – glad to be able to take a break from the perishing book.

Overall, I’ve found it to be a very effective and satisfying experience. It has been a long and demanding exercise – but that was to be expected with a 438 page book of this nature. I elected to produce a photobook on 118gsm standard semi-matte high quality paper – however, I’ve no reason to suppose that the results couldn’t be commensurately as good for the other types of book and paper that Blurb offers. The BookWright software provides very flexible text options and layout capabilities, and seems to be able to handle images very well; and the bookstore facility provides a ready made distribution channel for the finished books.

However, there is one aspect that needs to be borne in mind. The price of a print-on-demand book is inevitably going to be greater than the price of mass produced books in a physical bookshop. Blurb books give absolute control to the author – but may price the book out of the market. There are volume discounts to be had – but the demand for a bulk lot has to be created by the author. When authors get publishing deals they do, indeed, cede much power to the publishers; but, in return, the publishers establish markets for the books and keep their prices down. This trade off becomes particularly apparent for large glossy books such as the one I have created. It is far less so for softback books with many fewer pages and of lower quality paper, of which many examples can be found on the Blurb bookstore.

Of course, these price concerns are of little consequence if all you are trying to do is to exploit some of the artefacts that you possess and make them visible. My experience with Blurb – and the huge range of examples in the Blurb bookstore – shows that using a self-publishing service provides ample opportunity to use your creativity and artefacts to bring to life your memories, ideas and passions.

Oh, and the book I created? Well here’s the cover. Clicking it will take you to the Blurb bookstore where some of its contents can be previewed.

A subjective halfway view

I’ve just acted as subject in our first investigation into the memorability and impact of information nuggets. The nugget material, in this case, was mindmaps of key points in nineteen esoteric-type books which explore perceived unresolved mysteries from ancient Egyptology to modern secret societies.  I discovered that I could remember almost none of the points presented to me and was unable to link any of them to a particular development in my thinking. My immediate reaction to this disappointing – but probably to be expected – finding was that these are not actually nuggets of information but instead are just parts of a summary of each book.

However, on reflection, I’ve reversed that view. After all, when I was picking out the points as I read, I must have thought each of them to be significant – otherwise I wouldn’t have picked them out. So, how is a key point in a book different from a key point in, say, a five page article? Well, there are some obvious differences: the book is a lot bigger and has a lot more stuff in it – most of which I’m not familiar with AT ALL. Unless you have a photographic or otherwise superb memory, you wouldn’t expect to remember everything in such a book after one quick casual read. Of course, I have the books on my bookshelf and have the look of each one locked in my memory with some idea of what it’s about. However, this is the case because there are just a few hundred of them, they have a rich content, and the covers and spines usually have distinctively memorable images. In contrast, the articles and documents in my work collection (which are due to be investigated next) are much more numerous; are hidden away in my computer (with just a few in my physical archive box); and all look very similar, with very few distinctive markings.

I guess I’ve expanded my thinking this morning about all this. However, I’m only the subject and we’re only half way through the overall exercise. The interesting bit will be what the researcher concludes from it all.

The truth about truth

Maybe most people have already twigged this, but the BBC programme ‘The Capture’ has made me realise that we can no longer rely on videos for the truth. It illustrates how live camera feeds can be altered – dramatically. I believe sophisticated and moneyed organisations can do this today; and I think it will become easier as time goes by.

So, to add to the possibility of text being untrue, and of people’s accounts and memories being untrue, and of photos being faked, we must add that videos may be false. Is there anything left? Well, perhaps just our own internal thoughts and memories, but no doubt our race will get to manipulating those too.

So, I guess, we are back to a great truth that our enquirers and thinkers have known for hundreds of years: there is no substitute for diligence and multiplicity in our search for what is and what has been. Our modern technology has made us slack and gullible and persuaded us that we can nail down reality. In fact, reality has to be carefully investigated and checked and rechecked, and then still considered with a critical eye as we use it generously to develop our understanding and knowledge.

Power Booking

People in power a few hundred years ago just didn’t have access to up to date global information. These days such people have no excuse, as large numbers of diligent writers research global issues and publish up to the minute resumes on a wide range of topics around the world. There is no excuse for failing to be aware of what humans have done, and continue to do, to each other; what effects we are having on the planet we live on and depend upon; what our universe might consist of; and what possible futures we might have within it. Even I, with just a few books I have read in the last few years, feel informed and broadened. If each world leader were to be given just ten or fifteen books to read at the start of their reigns, perhaps they would act rather more in the interests of all of us than they currently appear to do.

PAWDOC: Requirements and Objectives

The 2001 paper reviewing the first 20 years of use of the PAWDOC system listed 23 requirements for a Personal Electronic Filing System and provided a status for each one.  The table below reproduces that listing and also provides an updated status for PAWDOC in 2019.

| Requirement | 2001 Status | 2019 Status | Notes |
| --- | --- | --- | --- |
| Requirements dictated by The Job | | | |
| 1. Cope with large amounts of material, some of which becomes redundant very quickly. | Fully met | Fully met | |
| 2. Cope with changing terminology | Not met at all | Not met at all | This would require specific functionality in the Index. |
| Requirements dictated by the physical environment in which the system is used | | | |
| 3. Be capable of being operated in temporary and limited office accommodation. | Partially met | Fully met | Now fully met because the whole collection is now digitised on the laptop; and scanners are found in most offices. |
| 4. Be easily portable. | Partially met | Fully met | Now fully met because laptops are now small and powerful, and have more than enough storage. |
| Requirements to support the information sources used | | | |
| 5. Handle hardcopy material in a very wide range of physical sizes and formats. | Partially met | Fully met | Hardcopy that is too large, or too difficult to scan, can now be photographed with a mobile phone at sufficiently high resolution for it to be read on screen. |
| 6. Handle documents containing coloured text, backgrounds, diagrams and pictures. | Not met at all | Fully met | Now fully met because modern scanners handle colour; and they have brightness and contrast settings which can be adjusted to be able to scan most document contents. |
| 7. Record references to material in other people’s filing systems. | Fully met | Fully met | |
| 8. Manage information received via email and computer conferencing systems (including Lotus Notes). | Partially met | Fully met | Now fully met because any file formats can be handled, and getting email content into such files is not the concern of the filing system. |
| 9. Record references to material in remote systems such as Lotus Notes databases and web sites, and access those remote systems and retrieve the relevant information. | Partially met | Fully met | HTML references are included in Filemaker Index entries, and can be opened by selecting them and right clicking which presents an ‘Open’ menu option. |
| Requirements to manage information that is created by the filing system owner | | | |
| 10. Handle handwritten text and diagrams on paper. | Partially met | Fully met | Now fully met because modern scanners have brightness and contrast settings which can be adjusted to be able to scan most document contents. |
| 11. Handle electronic files | Fully met | Fully met | |
| Requirements to help cope with information and communication overload | | | |
| 12. Support the rapid organization of information before it has been dealt with. | Not met at all | Not met at all | This would require special functionality in the filing system. |
| 13. Make visible what information has to be dealt with and support the scheduling of dealing with it. | Not met at all | Not met at all | This would require special functionality in the filing system. |
| 14. Enable information to be filed very quickly. | Fully met | Fully met | |
| Requirements to support information access | | | |
| 15. Enable information to be retrieved simply and quickly. | Fully met | Fully met | |
| Requirements to support the reuse of information | | | |
| 16. Enable templates, best practice and other reusable material to be identified, retrieved and reused. | Partially met | Fully met | This is now fully met because I now believe it is best addressed by including appropriate wording in the Index Title field. |
| 17. Enable existing material to be copied and modified, and to be stored as new material. | Fully met | Fully met | |
| Requirements to support knowledge acquisition and development | | | |
| 18. Identify, store and retrieve the marks highlighting key text. | Not met at all | Not met at all | This would require special functionality in the filing system. |
| 19. Collect together all important points and present them as a coherent set of information | Not met at all | Not met at all | This would require special functionality in the filing system. |
| 20. Enable the user to relate all important points together in such a way that concepts can be developed as new material is acquired. | Not met at all | Not met at all | This would require special functionality in the filing system. |
| 21. Assist the user to identify knowledge developments that are occurring and to choose what areas to focus on. | Not met at all | Not met at all | This would require special functionality in the filing system. |
| Technology support | | | |
| 22. The technology must be cheap enough for an organization not to quibble over, and for individuals to buy for themselves. | Partially met | Fully met | This is now fully met because costs of scanners have dropped; and it has been established that a Document Management System is not required. |
| 23. The technology must be reliable enough not to require expensive maintenance contracts or multiple one-off repairs. | Partially met | Fully met | This is now fully met because the technology required is standard and now very reliable. |

Two things are clear from the comparison of the status in 2001 and 2019: the technology for a personal electronic filing system has now become commonplace and relatively cheap; and, no developments have occurred to support the particular requirements of ‘coping with information and communication overload’, and of ‘supporting knowledge acquisition and development’. I believe the latter is probably true not just for the PAWDOC system but generally – I have not heard of any work going on in these areas.

The 2001 paper also described three objectives for the work:

  • To provide practical feedback to product developers, system designers, and other (potential) users regarding the real day-to-day requirements of individuals using personal electronic filing systems.
  • To establish an office document test set.
  • To meet my own personal need to stay organised and efficient in my day-to-day work.

Of the three, only the final one – to keep myself organised – has been fully achieved.

Regarding the first objective, I have done my best to document my experiences and make that information freely available to product developers, system designers and other users, but I have seen no evidence to suggest much interest – it seems there is not a consumer-led demand for this capability. Certainly, I have come across very few people, if any, who have been operating an all-inclusive PAWDOC-type system without any direct contact with myself.

I believe the second objective – to establish an office document test set – has simply been overtaken by events; the huge strides made in search & retrieval algorithms by internet search engines such as Google over the last 30 years have simply removed the need for such test sets.

Over the last 15 years or so, two more potential objectives – or at least potential uses – have come to mind. First, as huge changes continue to occur as a result of ever-increasing computing power, the universality of the mobile phone, and the ubiquity of the internet, it has occurred to me that the PAWDOC collection provides a unique insight into the early stages of this massive transition in business and society in general. Therefore, I continue to seek a permanent repository for the collection, in the belief that it may hold some value for future researchers.

The second additional objective concerns digital preservation. In order to ensure PAWDOC’s future accessibility, I have had to develop suitable digital preservation processes and documentation and to apply them to the very diverse range of material in the collection. In the course of this work, it has occurred to me that the collection might be useful to the digital preservation community as a test bed and training tool. If I have no success in finding a permanent destination for the collection as a research tool, I may then try to find a home for it within the digital preservation community.

This entry brings to an end my own personal final review of the PAWDOC personal electronic filing system. The remaining work to be done is to assemble a set of Conclusions. However, this will be a joint effort between Peter Tolmie, an independent researcher, and myself.

PAWDOC: Architecture

Four types of prototype solutions have been used in the course of this work: card index, electronic index, electronic index and document management system, and electronic Index and file storage in Windows 10 folders. The timeline of development is shown below and the architecture of each solution is described in the subsequent text.

1980      Visit to Amoco Research Centre, Tulsa: first electronic office filing system seen
1981      First entry placed in card index (1 June 1981)
1987      Implementation of a computer-based index – Filemaker software on a Macintosh
1993      Movement of Filemaker index from Macintosh to a Compaq LTE Elite laptop running MS Windows
1994      Scanner and Magneto-Optical drive loaned by Fujitsu
1995      Paperclip Document Management software loaned by DDS
2000      Paperclip upgraded to FISH product and loaned by Ringwood Software (which was subsequently taken over by Azur Group, which was taken over by Maxima which was taken over by m-Hance)
2018      FISH document management system removed and replaced with Windows 10 folders

1. Architecture of the card index system, 01Jun1981 – 30Dec1987

 This consisted of paper documents stored in an upright cabinet in serial number order, and an index on 6×4 inch cards. The cards were initially held in a small plastic box with a lid, and later on in a metal drawer. The index consisted of a series of title cards with an entry for each document (serial number, title, keywords) and a set of keyword cards – each one having a keyword on the top and then all the entries possessing that keyword listed underneath. The rationale for this filing schema was worked out with my colleague John Pritchard, and it continues to form the basis for the system as it stands today. The Card Index Architecture is illustrated below.
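The filing schema described above (title cards plus keyword cards) is, in modern terms, a small inverted index. Here is a minimal Python sketch of that idea. The class and method names are my own illustrative choices, the sample entries are made up, and the lower-casing of keywords is an assumption that the cards themselves obviously didn't enforce.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class TitleEntry:
    """One entry on a title card: serial number, title and keywords."""
    serial: int
    title: str
    keywords: list[str]

class CardIndex:
    """Title cards keyed by serial number, plus one 'card' per keyword."""

    def __init__(self):
        self.title_cards = {}                   # serial -> TitleEntry
        self.keyword_cards = defaultdict(list)  # keyword -> serials listed underneath

    def file_document(self, serial, title, keywords):
        """Add a title entry and list it under each of its keyword cards."""
        entry = TitleEntry(serial, title, list(keywords))
        self.title_cards[serial] = entry
        for keyword in entry.keywords:
            self.keyword_cards[keyword.lower()].append(serial)
        return entry

    def lookup(self, keyword):
        """Return every entry listed on the card for this keyword."""
        serials = self.keyword_cards.get(keyword.lower(), [])
        return [self.title_cards[s] for s in serials]

# Hypothetical entries, purely for illustration.
index = CardIndex()
index.file_document(1, "Notes on office filing", ["filing", "office"])
index.file_document(2, "Report on electronic filing", ["filing", "electronic"])
for entry in index.lookup("filing"):
    print(entry.serial, entry.title)
```

The point of the sketch is that the documents themselves stay in serial number order; all of the retrieval power comes from the keyword cards.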

2. Architecture of the electronic index, Jan1988 – Dec1994

After being provided with a Macintosh computer at work, I selected Filemaker, a general-purpose database with great flexibility and ease-of-use, to hold the Index. The fields used for the Filemaker index are:

  • Reference number.
  • Title/keywords.
  • Movement status (to record the location of items borrowed by people, archived, lost, etc.).
  • Publication date (to indicate when an individual item was first published).
  • Creation date (to record when an item was catalogued into the filing system).
  • Date last accessed (to record when an item has last been retrieved thereby enabling items that have never been accessed, or least recently accessed, to be identified as most suitable for archiving).
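To make the field list above a little more concrete, here is a minimal Python sketch of an Index record, together with the archiving idea mentioned for the ‘Date last accessed’ field. It is not how Filemaker stores anything; the field types, the function name `archiving_candidates` and the sample records are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class IndexRecord:
    """One Index entry, using the six fields listed above (types are assumed)."""
    reference_number: str
    title_keywords: str
    movement_status: str
    publication_date: Optional[date]
    creation_date: date
    date_last_accessed: Optional[date] = None  # None means never retrieved

def archiving_candidates(records, limit=10):
    """Never-accessed items first, then the least recently accessed."""
    def sort_key(record):
        never_accessed = record.date_last_accessed is None
        return (not never_accessed, record.date_last_accessed or date.min)
    return sorted(records, key=sort_key)[:limit]

# Hypothetical records, purely for illustration.
records = [
    IndexRecord("REF-0001", "Office filing notes", "In cabinet",
                None, date(1988, 1, 5), date(1990, 3, 1)),
    IndexRecord("REF-0002", "Conference paper", "In cabinet",
                date(1987, 6, 1), date(1988, 2, 9)),
    IndexRecord("REF-0003", "Project report", "Borrowed",
                None, date(1988, 4, 2), date(1989, 7, 15)),
]
for record in archiving_candidates(records):
    print(record.reference_number, record.date_last_accessed)
```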

By 1988, the amount of paper had overflowed the upright filing cabinet and I had started to archive less frequently accessed documents to cardboard liquor boxes, which were stored around me in my office. The architecture of the Electronic Index system is illustrated below.

3. Architecture of the electronic index and document management system, Jan1995 – Feb2018

In 1994, Fujitsu offered to support the work by providing, on long-term loan, a scanner and a magneto-optical storage system; and, in 1995, DDS, the European distributors of the PaperClip document management system, provided PaperClip on a long-term loan basis. By the beginning of 1996, these components had all been trialled and tested and were operational. Paperclip provides drawers, folders within drawers, and documents within folders. All the PAWDOC folders were contained in a single drawer; and a single folder was allocated to each PAWDOC Reference Number. Multiple documents of one or many formats could be contained within a folder.

Paperclip interworked with Filemaker by control key combinations which, when selected within Filemaker, copied specified information on the screen, imported it into Paperclip and enacted a Paperclip action. Two interworking functions were set up:

  • Search (search for a folder in Paperclip based on a reference number picked up in Filemaker);
  • Create (create a new folder in Paperclip and insert the Filemaker reference number into the folder index field).
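The two interworking functions can be pictured with a minimal Python sketch. To be clear, this is not PaperClip’s interface, which was driven by control key combinations rather than code; the `Drawer` class, its methods and the reference number format are hypothetical, and simply model one folder per Reference Number within a single drawer.

```python
class Drawer:
    """A single drawer holding one folder per Index reference number."""

    def __init__(self, name):
        self.name = name
        self.folders = {}  # folder index field (reference number) -> document names

    def create(self, reference_number):
        """Create: make a new, empty folder indexed by the reference number."""
        if reference_number in self.folders:
            raise ValueError(f"Folder {reference_number} already exists")
        self.folders[reference_number] = []
        return self.folders[reference_number]

    def search(self, reference_number):
        """Search: return the folder's documents, or None if there is no folder."""
        return self.folders.get(reference_number)

# Hypothetical reference number and file names, purely for illustration.
drawer = Drawer("PAWDOC")
folder = drawer.create("REF-0001")
folder.extend(["scan-page1.tif", "covering-note.doc"])  # many formats in one folder
print(drawer.search("REF-0001"))
print(drawer.search("REF-9999"))
```

The reference number is the only link between the Index and the stored documents, which is why just these two functions were enough.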

PaperClip was upgraded to FISH in 2000. The architecture of the combined Index and document management system is illustrated below.

4. Architecture of the electronic index and Windows 10 folders, Mar2018 – present

The companies that owned the European rights to the FISH document management system were successively taken over in the first 15 years of the new millennium, and, by 2017, FISH was owned and supported by m-Hance. However, the digital preservation exercise undertaken on the PAWDOC collection in 2017 established that there were no plans to develop FISH any further, so it was decided to remove FISH and replace it with Windows 10 folders. The conversion was undertaken successfully in 2018. The Electronic Index and Windows 10 Folders architecture is illustrated below.