Personal filing in a 40 year vortex of change

This entry has been jointly authored by Paul Wilson and Peter Tolmie

The PAWDOC filing system was set up in 1981 to try and understand how the newly emerging office technologies of that era might assist individuals to manage their office documents. Over its 35+ years of operation much has been learnt, and it is our intention to try and understand and describe those findings. However, the system was set up to address requirements in the office of the 1980s, and there has been a revolution in the way business operates since then. In order to be able to relate the findings to office work today and in the future, this entry explores the differences in requirements for personal filing in the office between the early 1980s and 2019.

Perhaps the most significant difference is the transfer of huge amounts of information from paper-based documents to digital files. Note that this is not saying anything about the current volume of paper in the office (though we would suggest that there is probably less paper filed by individuals now than in the 1980s) – just that individuals now have to deal with huge amounts of electronic material in contrast to the early 1980s when they dealt with virtually none. This transition has been facilitated by a huge growth in the use of computer hardware – desktop computers, laptop computers and mobile phones – throughout the world, from a base of zero to near-universality.

The growth in the use of computers has also prompted huge changes in office work. At the beginning of the 1980s, office professionals used support staff – typing pools, secretaries, and admin support staff – to perform their administrative tasks. However, as office technology became widespread, professionals started to do their own typing and support staff became a luxury which could be cut to reduce budgets. In 2019, only very senior management have secretaries and office workers are expected to be self-sufficient and fully competent in the use of all hardware and software relevant to the realisation of their work.

These changes were accompanied by a revolution in communications. Electronic mail has now almost entirely replaced internal memos and external letters, and has prompted massive increases in the amount and speed of communication. Email also rapidly became the key mechanism for supporting distributed teamwork – nationally and globally – and now underpins a battery of related interests, from the sharing of documents to the organisation of voice conference calls (which are the unsung foundation upon which much of business and government now operates). In more recent years, as mobile phones have permeated throughout the world’s populations, text messaging and chat applications have become an integral element of personal and business relationships. To this highly significant mix of new technologies must be added the recent massive uptake in Social Media. An unfortunate side-effect of the sheer effectiveness and pervasiveness of these mechanisms is high levels of information overload across a large proportion of office workers.

Importantly for the work to derive findings from the long-term operation of the PAWDOC filing system, the changes described above have impacted filing activities in the office. Hot desking and home working have made personal filing cabinets and bookshelves a luxury. The folder systems integral to computer operating systems (primarily from Microsoft and Apple) are now used to store the electronic documents created and received by the individual. At the same time, email systems have their own integral filing systems into which mail can be rapidly sorted and stored indefinitely in the cloud; text messages are stored on users’ mobile phones in the form of text streams by both senders and recipients; and Social Media systems have their own self-contained environments distributed across vast computing networks. The further evolution of cloud-based repositories, such as Dropbox and Google Drive, has led to increased use of (if not trust in) distributed document stores. Even if users wanted to integrate these different collections, it would be almost impossible for them to do more than just copy selected elements from one to another or to a dedicated filing system: these stores are separate silos and will probably continue to be so for many years to come.

The design of the PAWDOC system in 1981 was based on an understanding of office filing requirements at the time. There was an expectation of how emerging office technology might be used to support those filing requirements, but little appreciation of how the technology itself would change the way business operates. Initially, then, learnings from the development of the PAWDOC system were entirely focused upon what the impact might be of new assumptions about filing built into the construction of computer systems in the early 1980s. Later on, in the middle period of PAWDOC operation, the findings speak to what it was taking to manage a filing system in a changing work environment populated by imperfect but maturing technologies. More recent findings give a somewhat different picture, as many of the troublesome technologies of the middle era have come to be taken-for-granted resources, giving rise to new kinds of problem, of which information overload is but one potential symptom. What is clear at present is that computer technology and the business world are now changing so rapidly that the presumption present in the early days of PAWDOC – that one could readily identify needs and solutions for the future – now seems somewhat naïve (if still just as pressing).

One thing, however, we believe has remained constant and that is the attitude towards filing across the population. Most people are not motivated to put effort into filing because it is extra work for an indeterminate reward at some undetermined point in the future. A smaller subset of people is willing to put varying degrees of effort into the activity. We believe this has changed little between the early 1980s and the present day. As it happens, the PAWDOC owner was at the more extreme end of this latter group and wanted to file both effectively and comprehensively. Hence the PAWDOC collection contains most of the documents that the owner read and/or believed to be significant in his work; and consequently it should be borne in mind that the learnings derived from his experiences concern almost the worst case requirements of filing load and effort. It should be easier for most of the population. Certainly, it would seem easier, for digital copies are now retained of virtually everything as a matter of course. The extent to which that is oriented to as a personal collection of materials is a different matter, as is the probity of third parties hanging on to everything in that way. These, of course, are burning questions of the moment, and ones to which we shall ourselves return.

Opening the channels

Since our initial phone conversation on 28th Feb, Peter Tolmie and I have Skyped twice more – we seem to have got into a pattern of speaking every four weeks or so. In our second conversation, Peter pointed out to me that my PAWDOC filing system was just another manifestation of my inclination to keep things – as amply demonstrated in the various journeys documented in pwofc.com. He asked me what I thought I’d learnt from all these experiences, and I recounted a few things that immediately came to mind. Afterwards, however, I began to think that there were a great many more learnings dotted around the website. So I duly trawled through pwofc.com and recorded in a spreadsheet anything that looked like a finding. For good measure, I used another worksheet in the same spreadsheet to list all the requirements and findings specified in the paper about PAWDOC that was published in Behaviour & Information Technology (BIT) in 2001. I’ve given the spreadsheet to Peter and it will provide a base set of information for our investigations going forward.

My re-assessment of the BIT paper reminded me that one of the things I was thinking about when I wrote it was how one could use the key points in the documents one reads to develop one’s knowledge. This idea stemmed from my practice of putting a line next to key points – or nuggets as I termed them – in documents. I remembered that I’d made a start on this work some 17 years ago by recording in a Mind Mapping programme nuggets I found in books about the Pyramids etc. Peter and I discussed the possibility of my revisiting this material in a ‘Nugget Management’ journey sometime.

In our last Skype call on 25th April, Peter asked if I could keep an auto-ethnographic log of my keeping activities to provide us with more base material to draw on in our analysis activities. I duly created a spreadsheet with the headings listed below and am now recording all instances in which I make a specific effort to store a physical or digital artefact. The word ‘specific’ is used to exclude general keeping of things like email messages in email folders; and the word ‘artefact’ is used to explicitly require that a whole integral item is kept, not just information removed from it, like the name of a species from a plant label.

  • Ref No
  • Date
  • Item
  • How the instance arose
  • Reason for keeping
  • Initial actions and decisions made
  • Actions taken
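For anyone who would rather keep such a log as a plain CSV file than in a spreadsheet, the headings above could be captured roughly as follows. This is only a sketch: the function names and the sample entry are made up for illustration, and my own log is simply a spreadsheet.

```python
import csv
import io

# Column headings copied from the list above.
HEADINGS = [
    "Ref No",
    "Date",
    "Item",
    "How the instance arose",
    "Reason for keeping",
    "Initial actions and decisions made",
    "Actions taken",
]


def make_entry(ref_no, date, item, how_arose, reason, initial_actions, actions_taken):
    """Build one log row as a dict keyed by the headings (hypothetical helper)."""
    values = [ref_no, date, item, how_arose, reason, initial_actions, actions_taken]
    return dict(zip(HEADINGS, values))


def to_csv(entries):
    """Render the log as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADINGS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


# Invented sample entry, purely for illustration.
sample = make_entry(
    "1", "2019-04-26", "Plant label", "Bought a new plant",
    "Care instructions", "Decided to keep the whole label", "Filed in garden folder",
)
print(to_csv([sample]))
```

Keeping the headings in one list means the header row and every entry stay in step if a column is ever added or renamed.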

Peter’s comment on my request for his views on my recording scheme was “This is great. It’s not how I would have done it myself, but that doesn’t matter at all. The main thing is that it works for you. Just different work practices because we come from different backgrounds. Nothing more.”; and I doubt that I, on my own, would have come up with the idea of a generalised keeping log. Herein are clues as to the sheer unique and precious value of collaboration with our fellows.

Getting started with the Findings

Having initiated a preservation planning regime for the collection, and having moved it onto the Windows 10 platform, I’m feeling that the only remaining things I need to do with it are to find it a permanent home and to write up the findings of this lengthy experiment. I took a step forward on the latter activity earlier this week when I had a very interesting phone call with Peter Tolmie, a UK Ethnographer based in the School of Information Systems and New Media at the University of Siegen in Germany. I was given Peter’s name by Richard Harper when I asked if he knew of anyone who is knowledgeable about how professionals manage their documents and who would be interested in working on a wrap-up paper with me. An initial phone call with Peter last Thursday indicated that we have a great many common interests – I found it a very stimulating conversation indeed. I’ve sent Peter some documents describing the collection and we’ve agreed to talk again on 21st March.

Regarding the search for a home for the collection (which is documented in various posts in this Blog going back to 2015), my current efforts lie in conversations I’m having with Dr James Peters, the Archivist of the National Archive for the History of Computing at Manchester University, who has kindly agreed to help me in my search. In a phone call last month, James told me he was waiting for a response from someone he had emailed, but that, if there was no interest from that source, he could issue a note to a relevant mailing list on my behalf. If it is to be the mailing list route, I’m hoping to get James’ advice on what needs to go in the note.

Backup Bolstering

Backing Up has always been an essential part of maintaining my personal document collection; but it was never something I enjoyed – I did it out of a fear of loss. And I have, indeed, experienced loss: in 1996 one of the MO Disks I was using became corrupted and I lost a number of files; in 2004 my laptop was stolen and my whole document collection had to be re-instated from the backups; and in 2017 I had a system crash and, although the repair company was able to recover all my data in that instance, that might not always be the case.

When I was working, I used to take a backup of the more recently created material every month or so, as well as complete versions of the whole collection as it kept growing. This produced multiple copies on many disks which increased my confidence in being able to replace any file that got corrupted or mislaid, but which required managing in its own right as the number of disks grew. As time went by I added other backup mechanisms including storing a copy on another laptop in the household, storing a copy on disk in a relative’s house located many miles away, and storing a copy on disk at my son’s house in New Zealand.

After I retired I tried to put the backing up on a more orderly basis and finally fixed on five different types of backup – Cloud, copy on another laptop in the house, local hard disk, remote (in the UK) hard disk, and New Zealand copy on memory stick. I scheduled backups in my iPad calendar for each of these (though, for the Cloud, it was more a matter of checking that it was working and that I could recover from it). However, the iPad calendar doesn’t have a To Do mechanism and I wasn’t looking at the calendar anything like as often as I used to at work. Consequently, I kept missing scheduled backup activities – and, in most cases, didn’t realise I’d missed them; and when I did realise I just kept putting off what was an annoying extra thing to do. One answer would have been to get a To Do app – but I’d had enough of To Dos at work.

The opportunity to come up with an alternative approach came when I created a Users’ Guide for my document collection in May 2018. I structured the Guide so that it had a Quick Reference Guide to the Collection on the front page, and a Backup Quick Start Guide on the back page. The latter listed the different types of backups to be performed and provided cells to be filled in with a date when that particular backup had been done, as shown below.

This was a definite improvement over dates dotted about a calendar, but unfortunately the schedule was still hidden because the Users’ Guide was tucked away inside an archive storage box.

When I replaced my Windows 7 laptop with a Windows 10 version in December 2018, I decided to review all my backup arrangements again and to try to overcome this lack of visibility. The answer turned out to be really quite simple: I have a display frame for the latest issues of UK postage stamps, on the wall in front of where I sit at my desk. So, I created a table with columns for when backups have been done and when they are due; and this table now resides in the display frame as shown below.

I have a clear view of when the next backups are due every time I sit down at my desk. The next time I miss a backup it’ll be because I just don’t enjoy doing them, not because of blissful ignorance!
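For readers who prefer a screen to a picture frame, the same done/due table could be worked out in a few lines of Python. This is a sketch only: the five backup types are the ones I listed earlier, but the intervals between backups are assumed figures chosen for illustration, not my actual schedule.

```python
from datetime import date, timedelta

# Backup types come from the text; the intervals (in days) are assumptions.
INTERVAL_DAYS = {
    "Cloud (check recovery works)": 30,
    "Other laptop in the house": 30,
    "Local hard disk": 30,
    "Remote UK hard disk": 180,
    "New Zealand memory stick": 365,
}


def next_due(last_done: date, backup_type: str) -> date:
    """Date on which the given backup type falls due again."""
    return last_done + timedelta(days=INTERVAL_DAYS[backup_type])


def overdue(last_done_by_type: dict, today: date) -> list:
    """Names of backup types whose due date is before today."""
    return [
        name
        for name, done in last_done_by_type.items()
        if next_due(done, name) < today
    ]


# Illustrative dates only.
done = {
    "Local hard disk": date(2019, 1, 1),
    "New Zealand memory stick": date(2019, 1, 1),
}
print(overdue(done, today=date(2019, 3, 1)))
```

Of course, a script only helps if something prompts you to run it, which is precisely the visibility problem the display frame solves.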

From Nottingham to Manchester

Last month I heard back from the Keeper of Manuscripts and Special Collections at the University of Nottingham, Mark Dorrington, who said that my collection may not be a good fit with their archives and that, in any case, they were not geared up to deal with such a large digital collection. However, he did suggest trying the National Archive for the History of Computing at the University of Manchester and provided a link to its web page.

I have, in fact, already been round the houses with the University of Manchester Library; however, that was not specifically in relation to this particular archive, and it was before I had done any digital preservation work on the collection. So, today I tried making contact with someone specifically concerned with this particular archive and was told that the archivist for this and a number of other special collections is Dr. James Peters. I duly emailed him with the following opening para: “Dr. Peters, I’m contacting you as the Archivist in charge of the National Archive for the History of Computing (NAHC). I have a collection of documents which reflect the development and application of computers over the last 40 years, and would be grateful for your advice as to whether the collection has any merit and where it could be placed.” I followed this with a description of the background to the collection and of its contents. I’m hoping that my rather indirect approach on this occasion might engender some discussion rather than the outright rejection which I’m becoming used to.

Still looking for a home

Back in 2015 I reported on my efforts to find a permanent home for my document collection. I had no success with any of the organisations I mentioned in that post, and subsequently turned my attention to trying to find a contemporary historian who is interested in the development of computing. I came across one Daniel Wilson (no relation) based at Cambridge University who has a particular interest in the history of science and technology; and I duly contacted him. Despite being interested in hearing about the contents of the collection, he felt unable to help, explaining that “this will require significant work and few people have the budget or the time, given current pressures”. He gave me the name of another contemporary historian at Leicester University who I also tried emailing, but, despite sending a follow-up, I got no response. I’ve concluded that individual academics just have too little time to take on the management of a collection that isn’t absolutely central to a specific piece of work that they are doing.

I am now turning my attention, once more, to institutions, and have just sent an email to the Keeper of Manuscripts and Special Collections (MSC) at the University of Nottingham. I came across this organisation in a JISC email which advised that MSC has just joined the DPC. I was able to mention in the email that, not only have I just completed a digital preservation exercise on the PAWDOC collection using templates which are published in the DPC website; but also that the PAWDOC collection contains much material from the Cosmos project in which the University’s Department of Computer Science took part – perhaps those little extra bits of information might spark an extra bit of interest.

Relief

As reported in the Preservation Planning Journey in this Blog, my document collection has just been exported from the Document Management System (DMS) that it has been in for the last 22 years, and now resides in some 16,000 Windows folders. I feel a strong sense of relief that I will no longer have to nurture two complicated systems – the DMS and its underlying SQL database – in order to access the documents.

Over the years I have had to take special measures to ensure the survival of the collection through 5 changes of hardware, one laptop theft and a major system crash. This included:

  • trying to configure and maintain complex systems I had no in-depth knowledge of
  • paying out hundreds of pounds for extra specialist support (despite the software cost and most general support being very kindly provided free because this has always been a research-oriented exercise)
  • engaging with support staff over phone, email, screen sharing and in person for hundreds of hours to overcome problems (it starts to add up over 22 years…)
  • backing-up and protecting large amounts of data (40GB total) regularly and reliably.

That’s not to say that DMSs are not worth using – they have characteristics which are essential for high usage, multi-user, systems in which regulatory and legal requirements must be met. However, such constraints don’t apply to the individual. The stark conclusion has to be that, for a Personal Information System, using a DMS was serious overkill.

I guess I’d already come to that conclusion back in 2012 when I set up a filing system for my non-work files using an Excel index and a single Windows Folder for all the documents. That has worked pretty well; however, it’s slightly different from the way the newly converted work document collection is stored, which has a separate Folder for each Ref No as shown below.

Experience so far with the Windows Folder system indicates that it is very easy and quick to find documents by scrolling through the Folders – quicker than it was using the DMS since there is no need to load an application and invoke a series of commands: Windows Explorer is immediately accessible. As for the process of adding new documents, that too seems much simpler and quicker than having to import files into a DMS, because it involves using the same Windows file system within which the digital files reside in the first place.
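One consequence of relying on an index plus one Folder per Ref No is that the two can drift apart. A check along the following lines could flag any mismatch. It is a sketch under assumptions: it supposes each Folder is named exactly after its Ref No and that the Ref Nos have already been read out of the index into a list; the function names and the Ref No format in the example are hypothetical.

```python
from pathlib import Path


def folder_names(root: Path) -> set:
    """Names of the sub-folders directly under the collection root."""
    return {p.name for p in root.iterdir() if p.is_dir()}


def missing_folders(ref_nos, root: Path) -> list:
    """Ref Nos that appear in the index but have no folder under root."""
    existing = folder_names(root)
    return [ref for ref in ref_nos if ref not in existing]


def unindexed_folders(ref_nos, root: Path) -> list:
    """Folders under root whose name is not a Ref No in the index."""
    return sorted(folder_names(root) - set(ref_nos))
```

For example, `missing_folders(["PAW-0001", "PAW-0002"], Path("C:/Collection"))` would list the index entries that have no Folder; both the path and the Ref No format here are invented for the example.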

It’s early days yet so it’ll be a while before I have an in-depth feel for how well other aspects of the system, such as backup requirements, are working; watch this space.

Disks and DMS

As part of the digital preservation work (documented elsewhere) that I’m doing on my document collection, I’ve just completed an exercise to organise and index all the associated physical disks. It turns out that there are 156 disks of which 16 are actually contained in the collection, and the remaining 140 are backup disks (which have been accumulating over the years) of the collection’s computer system and digitised contents. Old backup disks may not be useful for restoring the system after a crash, but I have kept them to provide an audit trail over the 20+ years that the digital system has been in operation. Over that period documents have been lost, the index has had fields deleted by mistake, files have been corrupted, and no doubt other errors have occurred. Although the number of such occurrences is low, when such problems are identified it is very useful to have the ability to trace back through previous states of the system.

Another activity that has been prompted by the digital preservation work is to establish what future plans the current supplier of FISH (the document management system I use) has for the system. The last time I asked the question, in February 2016, I was told that there were no plans to upgrade the product and that current customers who wanted to look at alternatives were being advised to consider a product called File Stream supplied by Filestream Ltd, which is based in Berkshire in the UK. I spoke to the FISH supplier, m-hance, again earlier today and was told there had been no change – it is unlikely that FISH will be upgraded and Filestream is still the recommended replacement product. When I contacted Filestream last year I was told that the product would cost £750 to purchase and £250 a year for support including upgrades.

When I was investigating Filestream last year, I also took a quick look at Open Source document management systems and found several – some of them being free to use. However, further investigation would be required to establish what other components (such as the back-end database) would have to be acquired and whether they would also be free.

These and other options to future proof the collection will all be considered in the digital preservation project currently underway.

Digitised and Checked

I reached a milestone today: my document collection is totally digitised, and every Index entry and associated Document Management folder has been checked. It’s been a very laborious process – which is why my last entry here was over four months ago. However, the collection is now in good shape for a digital preservation exercise, and is ready for transfer to a long-term repository if one can be found.

Following the checking exercise, a detailed analysis was performed to derive statistics and rectify problems where possible. The report documenting the analysis serves as a comprehensive status report on the whole collection at the end of May 2016.

Digitisation in progress

Since my last entry I’ve been steadily digitising the remaining paper in my lifetime work document collection. These are documents I want to retain in original form (some of which have a comb binding), documents that need to be scanned in colour, or documents that were too large to go through the scanner. I acquired a better comb binding machine at the end of October, my current scanner has full colour capability, and I’ve found that photographing large items with my modern camera produces a perfectly readable on-screen image. So there’ve been no more obstacles to getting the job done. As each item is digitised and the file inserted into the FISH document management system, I’m checking the index entry and updating the Movement Status field with either OK or XX as described in my last entry. At the current rate of progress I should finish the digitisation work by the end of January.