Hikes through the preservation hinterland

I’ve just finished dealing with two particular digital preservation challenges that exist within the document collection I’m currently working on. The first involved two Lotus Notes files; and the second concerned some Windows Help files. My experience with these issues illustrates a) how just a few files can take a lot of work to resolve, and b) that there’s often an answer out there to seemingly impossible preservation problems provided you are prepared to look diligently enough.

I really didn’t believe I was going to find a way to unlock the Lotus Notes files since Notes is a major and very expensive piece of software that I don’t possess; and, in any case, it applies sophisticated time-limited password and encryption controls for its use. Despite being aware of these issues, I thought I’d take a quick look on the net to see if I could find any relevant advice. It was time well spent; I discovered that it’s possible to download a local evaluation copy of Notes for 90 days, and that, because it doesn’t run on a server, this sometimes enables old Lotus Notes files to be opened. I duly downloaded the software and installed it; and then, regardless of the mysteries of Notes access controls, had access to the whole of one of the files (which contained conference-type material) and to parts of the other (which contained sent messages). I still had the username and expired password from the time the files were created and I think this may have helped to access the latter – though I’m not sure about that. Anyway, in both cases, I was able to print out the material to PDF files. I had to manually reorder the conference-type material and to reinstate a few hundred links in it, but that was it – job done!

The Windows Help files were a lot more demanding. Microsoft stopped supporting the WinHelp system (.HLP files) in 2006 in favour of its replacement, Compiled HTML Help (.CHM files). Although Microsoft did issue a WinHelp viewer for Windows 7 in 2009, WinHelp is essentially an obsolete format – it isn’t supported in Windows 10. I’m still running a Windows 7 system so am still able to view the HLP files – but they needed to be converted now if they were ever to be accessed again in the future.

There is much material on the net about how to convert HLP files into CHM files, but, as someone with no knowledge at all about how files in either of these systems are constructed, I didn’t find it easy to understand. I soon realised that converting from one to the other was going to be a challenge. However, I did eventually find a web site offering clear practical advice that I could follow (http://www.help-info.de/en/Help_Info_WinHelp/hw_converting.htm), and I duly downloaded the recommended HLP decompiler and the Microsoft HTML Help Workshop software. The process to be followed went something like this:

  • Decompile the HLP file into its component parts (consisting of a help project file with the extension .hpj, along with one or more .rtf documents, an optional .cnt contents file, and any image files – .bmp, .wmf, or .shg – that are used within the Help file).
  • Convert the decompiled files into HTML Help files using a wizard in the HTML Help Workshop tool (the new files consist of a project file with the extension .hhp, one or more HTML files, a .hhc contents file, an optional .hhk index file, and any image files that are used within the Help file).
  • Set parameters in the hhp file to specify a standard Window name and size; and to have a search capability created when the files are compiled into a single CHM file.
  • Reconstruct the Table of Contents using the original HLP file as a guide (in many cases no Table of Contents information comes through the conversion process – and, even when some did, it had lost its numbering). Where the contents had to be created from scratch, each new content item created had to be linked to the specific HTML file to be displayed when that content item is selected.
  • Re-insert spacings in headings: The conversion process also loses the spacing in headings in the base material resulting in headings that look like this, ‘9.1Revised System’ instead of like this ‘9.1  Revised System’. To rectify this problem, the spacings have to be manually re-inserted into each HTML file of base material.
  • Compile the revised files into a single CHM file.
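The heading-spacing step is the kind of repetitive edit that could, in principle, be scripted rather than done by hand. The following is only a minimal sketch in Python, not the method used above: it assumes the converted HTML files mark their headings with standard h1 to h6 tags (which may not be true of every conversion), and the function name respace_headings is purely illustrative. Note that a browser collapses runs of ordinary spaces, so the sketch inserts a single space by default; a different gap (for example a non-breaking space entity) can be passed in.

```python
import re

# An opening heading tag followed by a section number ("9", "9.1", "9.1.2")
# that runs straight into a letter with no space in between.
# Assumption: headings are marked up as <h1>..<h6>; adjust if the converter
# produced something else.
HEADING_NUMBER = re.compile(
    r"(<h[1-6][^>]*>\s*\d+(?:\.\d+)*)(?=[A-Za-z])",
    re.IGNORECASE,
)


def respace_headings(html: str, gap: str = " ") -> str:
    """Insert `gap` between a heading's section number and its title text."""
    return HEADING_NUMBER.sub(lambda m: m.group(1) + gap, html)


if __name__ == "__main__":
    sample = "<h2>9.1Revised System</h2><p>Body text 9.1Revised stays as is.</p>"
    print(respace_headings(sample))
```

A caveat: a heading that legitimately starts with a digit followed by a letter (say, ‘3D Views’) would also be split, so the output would need checking before the files are recompiled.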

The first HLP file I tried this out on contained just a single Help document with some 130 pages. It took a bit of figuring out, but I eventually got the hang of it. However, the second HLP item was in fact made up of 86 separate HLP files all stitched together to present a unified Table of Contents in a single window in which the base material was also displayed. Many of these 86 separate files had 50 or more pages, and some had many more than that; and each page had to be represented separately in the Table of Contents. It was a very long, tortuous job converting all 86 HLP files and ensuring that each one had a correct Table of Contents (I didn’t attempt to re-introduce the spacing in the headings – that would have been a torture too far). However, that was not the end of it; the files then had to be stitched together in a single overall file that combined all the individual Tables of Contents and that displayed all the base material. This involved inserting a heading for each document in the master file, and inserting a linking command to call up the Table of Contents for that particular document. Oh, and I should also mention that the HTML Help Workshop software was very prone to crashing – not a little irritating – I soon learnt to save regularly…

This overall task must have taken at least 30 or 40 hours – but I did get there in the end. The new CHM file works fine and is perfectly usable, despite three of the documents being displayed in separate windows instead of the single main window (although I spent some time on this issue I was unable to eliminate the problem). Of course, the lack of spacing in the headers is immediately noticeable – but that’s just cosmetics!

No doubt there are specialists out there who would have made a quicker and better job of these conversion activities. However, if you can’t find such people or you haven’t got the money to throw at them, the experiences recounted above show that, with the help of the net, it’s worth having a go yourself at what you may consider to be your most difficult digital preservation challenges.

Scoping Document Finalised

Back in February, work started on the draft Scoping Document for the digital preservation actions required on the PAWDOC collection. Having spent some months actually doing bits of the work identified in the document, and refining the document with the insights gained in the process, I have now completed the final version of the Scoping Document. It includes the following list of things that have to be done before a Project Plan can be produced:

  • Decide what document management system or alternative, and any associated databases, are to be used going forward.
  • Decide if Filemaker is to be retained as the platform for the Index or if it is to be replaced going forward.
  • Establish the future platform strategy.
  • Research and understand the actions required to:
    • make any moves planned from one piece of software to another; or from one platform to another;
    • be able to open those documents that don’t currently open;
    • promote the long term accessibility and survivability of all categories of document in the collection;
    • mitigate against the collection’s CDs and DVDs becoming unreadable;
    • mitigate against the electronic part of the collection being separated from the physical part.

Unfortunately Jan Hutar and Ross Spencer have decided they are unable to make any further substantial contributions to the project due to time pressures and other reasons. However, I continue to hope that they will remain associated with the work and be prepared to answer questions by email as needed. Their input to the early part of the work has been invaluable in getting the project to the point where I am actively investigating the practicalities of moving the electronic documents out of the Fish document management system into flat files in a Windows directory. The Fish supplier has a utility which will perform such a transformation, but much will depend on whether it can be customised to produce the file title format required and how much it will cost.

Alongside this activity, work continues on files that can’t be opened and on issues identified by the DROID analysis. Given the position that the project is in at present I would anticipate being able to complete the project plan sometime in the next 9 – 12 months.

Droid explorations and DMS alternatives

Things have started to move in our efforts to perform digital preservation on the PAWDOC collection. I’ve been running the National Archives DROID tool across the 190,000 files and Ross’s automated analysis of the results has turned up a number of issues including several hundred duplicates which we are investigating. Among other things, DROID identifies file types and versions, and this has helped another strand of our investigations to try and gain access to about three hundred files which can no longer be opened. 150 of these are old PowerPoint files from the early 90s which neither the Microsoft viewer nor the earliest version of OpenOffice can open. However, the Zamzar online service, to which you upload a file and specify what format you want it to become, successfully converted all of the examples which I submitted into a version of PowerPoint I can open. Zamzar can’t deal with every problem file, especially those for which I no longer have the relevant application, for example, MS Project and iThink, though it did convert Visio drawings into PDF. We’re continuing to work through these files with the intention of getting a clear decision about what to do with each one so that specific actions can be included in our eventual preservation project plan.
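For anyone wanting to check their own collection for duplicates without any special tooling, the general idea can be sketched in a few lines of Python. This is not how DROID or Ross’s analysis works; it is simply an illustrative sketch using the standard library, which treats two files as duplicates when their SHA-256 checksums match. The function names file_digest and find_duplicates are my own inventions for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> List[List[Path]]:
    """Group the files under `root` that share identical content."""
    by_digest: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only checksums seen more than once indicate duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in find_duplicates(Path(".")):
        print("Identical content:", ", ".join(str(p) for p in group))
```

Matching checksums only show that the bytes are identical; deciding which copy to keep, and whether the duplication is intentional, still needs a human look at each group.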

Another substantial investigation underway is to try and identify a suitable alternative to the document management system (DMS) that controls the collection’s files. The future of the current DMS is uncertain, and it is too complex to reinstall on upgraded hardware without expensive consultancy support. Jan’s exploration of alternative DMS and preservation repositories highlighted the fact that, while there are several free-to-use public domain systems available, they all require multiple components and appear to be relatively complex to install, configure, and maintain. This observation has prompted me to be a lot clearer about the immediate requirements for the collection. It is hoped to find a long term owner, perhaps working in the field of modern history, and it’s possible that that person or organisation may require more sophisticated search and access control functions. However, until that eventual owner is found, only a minimal level of single user functionality is needed, and minimal system management and cost demands are essential. In light of this greater clarity, we are now also considering a low tech, low cost alternative which would involve inserting the Index reference number into the title of every file and storing all the files in the standard Windows folder system. After identifying a required Reference No in the Index, files would be accessed by putting the reference number into the folder system’s standard search facility. As well as looking at the pros and cons of such a solution, we are also investigating the feasibility of getting the necessary information out of the current DMS and into the titles of all the document files. A further challenge that would have to be overcome is that the current DMS stores multi-page documents as a series of separate TIF files. If we were to move to the low tech Windows folder system solution, it would first be necessary to combine the files making up a single document into one single file. This would need to be an automated process as there are too many documents to contemplate doing it manually.
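To give a feel for what the planning half of such an automated process might look like, here is a rough Python sketch. Everything specific in it is an assumption made for illustration: I don’t yet know how the DMS names its exported page files, so the pattern ‘<document>_p<page>.tif’ is hypothetical, as are the function names group_pages and target_title and the ‘Reference No - Title’ layout. The sketch only works out which pages belong together and what the combined file would be called; actually merging the TIF images would need an imaging library or the supplier’s utility.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical export naming: "<document>_p<page>.tif", e.g. "PAW-0001_p2.tif".
PAGE_FILE = re.compile(r"^(?P<doc>.+)_p(?P<page>\d+)\.tif$", re.IGNORECASE)

# Characters that Windows does not allow in file names.
ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*]')


def group_pages(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Map each document id to its page files, in numeric page order."""
    groups = defaultdict(list)
    for name in filenames:
        match = PAGE_FILE.match(name)
        if match:  # files that don't follow the pattern are ignored
            groups[match.group("doc")].append((int(match.group("page")), name))
    return {doc: [name for _, name in sorted(pages)] for doc, pages in groups.items()}


def target_title(reference_no: str, title: str, extension: str) -> str:
    """Build a file name that starts with the Index reference number."""
    safe_title = ILLEGAL_CHARS.sub("_", title).strip()
    return f"{reference_no} - {safe_title}.{extension}"


if __name__ == "__main__":
    pages = ["PAW-0001_p10.tif", "PAW-0001_p2.tif", "PAW-0001_p1.tif", "PAW-0002_p1.tif"]
    print(group_pages(pages))
    print(target_title("PAW-0001", "Office Automation: review", "tif"))
```

Putting the reference number at the front of the title is what would let the standard Windows search facility find a document from its Index entry.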

All these activities and more are required in order to be able to assemble a project plan with unambiguous tasks of known duration. We are continuing to work towards this goal.

Work Underway

Digital preservation work on the PAW/DOC collection has now started in earnest. The first couple of months were spent getting participants up to speed with a common understanding of what the collection consists of and what process we are going to follow in the work. This was achieved with the following reading list:

PawdocDP-N1 – Ergonomic aspects of computer supported personal filing systems, April1990

PawdocDP-N2 – 20 years in the life of a long term personal electronic filing system, Sep2001

PawdocDP-N3 – Checking PAW-DOC, v1.0, 31May2016

PawdocDP-N4 – Preservation Planning for a Personal Digital Archive – DPC Webinar with presenter notes, 29Jun2016, v1.1

PawdocDP-N5 – Preservation Planning for Personal Digital Collections – DPC Case Note, Apr2016

PawdocDP-N6 – Preservation Planning SCOPING Document Template – v1.1, 11Sep2015

PawdocDP-N7 – Preservation Project Plan DESCRIPTION Template, v1.2 – 10Apr2016

PawdocDP-N8 – Preservation Project Plan CHART Template, v1.1 – 11Sep2015

PawdocDP-N9 – Preservation MAINTENANCE PLAN Template – v1.0, 11Sep2015

On Sunday 2nd April we held our first conference call during which I gave a demo of the collection’s Index and Document Management System; and in which we went through the first draft of the Scoping paper and allocated some of the activities that need to be completed before we will be in a position to create a project plan. These include the following:

  1. Document the FISH supplier’s recommended replacement route – Paul – End April
  2. Document possible alternative Document Management systems/Database Systems and their costs – Ross/Jan to advise on how to redefine tasks 2&3
  3. Document any alternative solutions to using a Document Management System for storing and retrieving the collection’s electronic documents – Jan/Ross to advise on how to redefine tasks 2&3
  4. Perform a DROID analysis of the current version of the PAWDOC collection and send out to the team – Paul – End April
  5. List the documents that can’t currently be opened and categorise them – Paul – End April
  6. Decide what should be done for each category of document that can’t currently be opened – Ross/Jan
  7. Identify what categories of document may not be able to be opened in future – Matt
  8. Decide what should be done for each category of document that may not be able to be opened in future – Ross/Jan
  9. List the CDs and DVDs that may become unreadable and categorise them – Paul – End April
  10. Decide what should be done for each of the categories of CD and DVD that may become unreadable – Matt to decide if he can do this after seeing the list
  11. Document possible solutions to the possibility of the electronic documents becoming separated from the physical documents, and recommend a course of action – Paul – End June

Work is now proceeding on these tasks, though timescales are uncertain since all team members other than myself have full time jobs. Our next conference call is scheduled for Sunday 21st May.

PawdocDP Participants

The project to perform digital preservation on the PAW/DOC collection (PawdocDP) started at the beginning of 2017 with myself and four other participants: Matt Fox-Wilson (Ambient Design), Ross Spencer and Jan Hutar (Archives New Zealand), and Nicolaie Constantinescu (Kosson). We have exchanged introductory emails (see below), and the team is now reading background material to get up to speed with what the collection is and what state it is in. We aim to hold a screen sharing conference call in March to demonstrate and explore the digital collection and its supporting systems. The introductory texts sent by each member of the team are shown below.

From: Paul Wilson [mailto:pwilsonofc@btinternet.com] Saturday, 14 January 2017 12:15 a.m.

Hello Ross, Jan, Nicolaie and Matt. Very pleased to have you all on board at the start of this project undertaking Digital Preservation on the pawdoc collection (PawdocDP). To give us all some background on each other, I suggest you reply-to-all to this email with a brief intro about yourself. My intro is below. I’m a retired computer consultant living in Lavendon between Northampton and Bedford in the UK. I got a degree in Ergonomics from Loughborough University in 1972 and got my first job with Kodak where I first started working on the application of computers.  In 1978, I joined the UK’s National Computing Centre where I investigated best practice in Office Automation – that’s when the pawdoc collection came into being. I then spent 28 years with Computer Sciences Corporation (CSC) as, first, a computer consultant, and then as a Bid Manager for IT outsourcing deals. During my professional career I’ve been particularly involved in Office Systems, Requirements Analysis, Process Definition, Workflow Technology, HCI, CSCW, and Architecture Definition and Management. I play golf, and collect stamps and first edition books. My study window looks out on a side road with open fields beyond and 7 wind turbines in the far distance.

From: Constantinescu Nicolaie <kosson@gmail.com> 16 Jan at 7:25 AM

Hello! I’m an information architect for a library and information science community online – kosson.ro and a private enterprise manager. I have been involved with building useful content for all parties from my country that are interested in digital practices and resources preservation since over 10 years now. Right now I’m toiling on a JavaScript manual in Romanian needed so much for a solid foundation that will be followed by a series of data management for librarians. My languages is JavaScript and the Web APIs. Part of my time is dedicated now to writing essential learning materials and advocating for Open Access in Romania.

From: Ross Spencer < Ross.Spencer@dia.govt.nz> Monday, 16 January 2017 7:34 a.m.

Hello everyone! Thank you Paul.  I am a digital preservation analyst at Archives New Zealand. My background is in software engineering and digital humanities. I have worked at Archives New Zealand for three years, and before then, The National Archives, UK. My primary interests are developing tools for others to use to analyse and sentence digital records within an archival context. I release open source tools on GitHub. My languages are Python and Golang, with a modern day preference for Golang because of its easy portability across platforms without the need to run an interpreter. Outside of work I’m still a programmer, but, I’m also a cyclist. Interested in movies and music. I’m also attempting to learn French – but I have been attempting that endeavour for a long long time now!

From: Jan Hutar <Jan.Hutar@dia.govt.nz> 17 Jan at 11:33 PM

Hi all, Similarly to Ross I am a digital preservation analyst at Archives New Zealand, same role, different focus as you would expect. My background is classic archival science and then libraries. Before joining Archives NZ in February 2012 I was at the National Library of the Czech Republic in Prague, managing the Digital preservation team there for 5 years. I have got a PhD, my dissertation was about metadata for digitisation and digital preservation, the proposed metadata standard and schema is being used across Czech republic libraries since 2012. My main focus at Archives NZ is keeping our digital preservation system in shape, managing the data in it, getting data in and dealing with all sorts of digital preservation problems. Also digital preservation related policies. Mountain biking is my thing.

From: Matthew Fox-Wilson <ambientmatt@me.com> 22 Jan at 9:52 AM

Hi everyone, Sorry for the slow introduction! My exact job is sort of hard to describe but technically I’m the director/owner of a software development company here in New Zealand specialising in creative software for the consumer / prosumer market. My main focus here is on application architecture and UI design, but I’m also responsible for coding application structure and front end systems for our products. We’ve been in operation since 2001 but before then I’ve worked for a variety of companies in NZ and remotely for the US on consumer and pro-level graphics software, and consulted on a variety of projects relating to data sorting and natural methods for presentation, hence my interest in this project. When I’m coding I’m mainly old-school, focused primarily on C++ with a bit of Objective C, for Windows, MacOS, and iOS. Outside work I enjoy trying to recover from work, which mainly takes the form of gym, running, and a sword based martial art.

Collaborators sought – and found!

I started digital preservation work on my pawdoc document collection by first recruiting my son, Matt Fox-Wilson, to help with the project. Matt is the co-owner/developer of the ArtRage natural painting software package, and his experience as a software engineer is likely to be of use in the work. I then sent out the following request to other potential collaborators via the JISC Digital Preservation email list:

————————————————————————————————————

From: Digital-Preservation Announcement and Information List [mailto:DIGITAL-PRESERVATION@JISCMAIL.AC.UK] On Behalf Of Paul Wilson
Sent: Wednesday, 7 December 2016 5:53 a.m.
To: DIGITAL-PRESERVATION@JISCMAIL.AC.UK
Subject: Learning opportunity for digital preservation in document management systems

Collaborators are sought for a digital preservation project on a collection of some 40,000 documents stored in a document management system. This will be an ideal opportunity to gain practical experience of a variety of digital preservation challenges in an informal, small scale environment, unfettered by organisational constraints.

The project will apply the preservation planning methodology developed by Paul Wilson and published by the Digital Preservation Coalition in May 2016 – see http://www.dpconline.org/advice/case-notes

The project will start early in 2017 and is likely to last for between one and two years. Collaborators will be expected, as a minimum, to familiarise themselves with the background and high level contents of the collection in question, to comment on the preservation planning documents that are produced, and to participate in the project meetings at which decisions are taken. Such participation need only be remote and will probably take up about one day a month excluding the time spent on familiarisation. However, it is hoped that participants will wish to benefit from the project in more substantive ways; they will be encouraged to play a full part in the decision making that will be required and in the detailed preservation work itself. Such involvement will be determined in the course of the project. It is anticipated that the learnings from the project will be written up and submitted for publication.

The project is likely to be of particular interest to collaborators who have experience of managing collections of Personal Archives, or who work in organisations that own such collections. Location is not a constraint for participation, however people and organisations from the UK and New Zealand are particularly encouraged to get in touch. Interested parties should contact Paul Wilson at pwilsonofc@btinternet.com

————————————————————————————————–

I had two responses and am currently confirming the involvement of the people concerned. Things are looking good for commencement of the project in January 2017.

A Platform Challenge story (Win10)

My recent attempts to upgrade from Windows 7 to Windows 10 are a good illustration of some of the platform challenges associated with digital preservation planning. The background to this tale is that my lifetime collection of work documents is held in a document management system called Fish which employs an underlying SQL Express database to store the digital documents. The high level index to the collection is contained in a FileMaker database which integrates with Fish via some simple commands. All these pieces of software run on a laptop under the Windows 7 operating system.

Last year, when I first heard about Microsoft’s plan to enable users to upgrade to Windows 10 for free for an initial period, I decided that I would take advantage of the offer but would leave it till close to the cut-off date – which turned out to be the day before yesterday (29Jul2016).  In the intervening months, FileMaker issued yet another new version of its database (15) which I decided to take up (at £280) as my current  version (11) was no longer going to be supported, and I wanted to have a version which I could be sure would work successfully under Windows 10.

With FileMaker 15 in place, I got confirmation from my document management supplier that Fish does work under Windows 10, and so set about preparing to undertake the Win10 upgrade. The upgrade screen informed me that there were no incompatibility problems with any of my software, and my wife had already undertaken the upgrade successfully on her laptop at the first time of asking with the anti-virus programme that we both employ still running, so I thought there was a fighting chance that the upgrade might go smoothly. I made a comprehensive set of backups, and set the upgrade going. It failed, giving me the rather cryptic error message 80070004-3000D. I soon discovered that, despite this being such a specific error number, there is no specific reason for failure associated with it. I spent many hours over the following four weeks trawling the net and reading a whole variety of advice from Microsoft and others about this error.

One of the first things I came across alerted me to potential problems with the SQL Express database that I was running. After further research I eliminated that as being a reason for the failure of the upgrade, but I did discover that Microsoft were not going to support the version I am running (2008R2) under Windows 10. I discussed this with my document management system supplier who advised that they had recently performed an upgrade to a later version for a client but that it hadn’t been entirely straightforward. They advised me to delay upgrading the database as long as possible. I checked the net again and found at least one entry saying that SQL Express 2008R2 was working under Windows 10, so I decided to set aside the SQL challenge for the time being.

I subsequently tried out a whole variety of suggestions I found on the net to overcome the error including removing superfluous user profiles; checking that folders such as Program Files, ProgramData and Users are in the same directory as the OS; running sfc /scannow; checking I don’t have a proxy server; checking I have no empty folders in the Start Menu; checking that my computer name is not System or other reserved name and is more than 8 characters long; checking regedit to ensure that OS upgrades are allowed; performing the upgrade in Administrator mode; and creating a new Administrator role and upgrading from that. None of these worked across about 8 upgrade attempts, and each time I got the same 80070004-3000D error message.

Finally the deadline passed, and I was glad to be able to stop the whole very time-consuming and frustrating exercise. However, Microsoft was able to deliver a final sting from its very long and uncontrolled tail: I tried to write to them alerting them of my inability to find a solution to 80070004-3000D and asking them to confirm I would still be eligible for a free update if and when I did. I used a box on one of their ‘Contact us’ support screens which said something like ‘describe your problem here’ and which had a NEXT button underneath. I wrote out my problem, but, on pressing the NEXT button, another screen appeared which said ‘this page doesn’t exist’. Highly annoyed, I returned to the previous screen, copied my text, put it in a mail message to myself so it was properly date stamped, printed it out and sent it to the UK Microsoft HQ in Stockley Park. I do not expect to hear back from them.

For now I will continue using Windows 7, and the question of whether and when to upgrade to Windows 10 will become just another platform question that will need to be addressed in the Digital Preservation Planning exercise I intend to embark on for my document collection in the next six months or so.

DPC Webinar

The DPC webinar on ‘Preservation Planning for Personal Digital Archives’ took place last Wednesday (29th June), and I duly gave my talk to a small select audience of about a dozen people. I believe this included one person from the Bodleian Library, one from the UK Parliamentary archives and two separate groups from the UN Archives and Records group – one based in Long Island and the other in New York.  The Q&A at the end was interesting, but too short – I know I would have enjoyed spending more time talking about practical problems with these professionals. The two questions I can remember both came from the UN groups who are considering providing guidance to UN staff about how to preserve their digital files. The discussion highlighted that the Maintenance Plan I am proposing should eventually result in people not having very old unreadable files because the Maintenance Plan would be ensuring that they are regularly updated.

The full Webinar was recorded and is available via the DPC website at this link http://www.dpconline.org/index.php?option=com_content&view=article&id=1720:dpc-webinar-preservation-planning-for-personal-digital-archives-with-paul-wilson&catid=33:conference-reports

However you may just want to look at the Powerpoint slides that I used which have speaker notes included.

In both the DPC paper and the webinar, I made it clear that I was looking for collaborators to apply the Preservation Planning process to my document collection; and a repository for the collection. Now that I’ve been able to publicise these wants through these DPC activities, I’m hoping that I might hear from someone who is interested. However, whether or not any such people emerge, I’m aiming to start the Preservation Planning work on my document collection towards the end of this year or early next – that will be the next phase of my digital preservation adventure.

Webinar and Staging Posts

The DPC is running a webinar on the contents of my paper on 29th June, and yesterday I completed the slides for it. With that done, I am completely up to date with all of my Journey activities and this left me feeling unburdened and relaxed.

By chance, yesterday was also the day on which Richard Harper, a sociologist colleague from my CSCW days, dropped in for a chat. I’d come across Richard’s name in a paper I’d read in the course of my investigation into Digital Age Artefacts. The paper recounted insightful work into what people actually kept in their houses – highly relevant to stuff I’m doing – and I remember Richard as being a particularly interesting person to talk to. I looked him up on the web and discovered he had spent many years at the Microsoft Research Centre in Cambridge in the Socio-Digital Systems Group, investigating topics such as the myth of the paperless office and  communications in the digital age. I got to thinking that it would be great to talk to him to get a different take on what I’d been doing and what I might do next. I managed to reach him through his blog and we arranged for him to call in here in Lavendon on his way home from a trip to Chipping Norton (the scene of the three CSCW meetings which he and I attended in the early 90s). Before he arrived I spent an hour listening to four or five of his talks on YouTube, and then we had a very pleasant couple of hours discussing digitisation stuff over lunch.

It was good to see Richard again – and very kind of him to spend the time with me. As a result I have all sorts of new thoughts and perspectives rolling round my head which I know will take a few weeks to work through and take shape. Our conversation was the perfect opportunity to reassess what I’m doing before I set out on the next phases of these journeys. An immediate thought that shines through, though, is that perhaps I should treat my future ventures to make use of the digital artefacts I have collected as though they were one of the treasure hunt type games that I have occasionally devised for my family, i.e. things that are occasional, intriguing and fun.

So, yesterday certainly felt like a memorable staging post in the things I have been doing. Oh, and by the way, as I write this I’m discovering that this morning appears to be a staging post in the UK’s journey – we appear to have voted to leave the European Union…

DPC Press Release

The DPC press release announcing the availability of my paper on its website was issued today. Its contents are below:

Paul Wilson, formerly of the Office Systems Division at The National Computing Centre, has contributed a new addition to the Case Notes now available on the Digital Preservation Coalition (DPC) website.

In this new Case Note, Paul narrates his attempts to create a preservation plan for a small personal collection. In the fuller article (which can be downloaded as a PDF), he outlines his experiences to provide insights into the practical outcomes of using published guidelines and tools for preservation planning. Since he could find no preservation planning process appropriate to individuals, Paul obtained a slide set detailing a simple preservation workflow from the Digital Preservation Coalition, and used that as a foundation on which to establish an approach to the work.

This general approach and accompanying documentation was tested and refined on two of his personal digital collections (one of 800 mementos and the other of 17,000 photos).

“I recounted my PDF experiences not to alert others to specifics about PDF (about which I know very little) or the eCopy software (which I am generally very pleased with),” he explains, “but to illustrate how complicated and time-consuming work on file formats can be.”

The detailed account of his research and preliminary trials provides a set of guidance for any individual or institution looking to preserve their own small, digital collection. Paul has also provided the documents he created from scoping to maintaining his collection, along with blank template versions that can be easily used and adapted by others. All of the documents, as well as blank templates, are available to download as a Toolset.

This case note also appears in the DPC’s Technology Watch Report Personal Digital Archiving by Gabriela Redwine

*Apologies for cross posting*

Sarah Middleton, Head of Communications and Advocacy, Digital Preservation Coalition, 37 Tanner Row, York, YO1 6WP