PAWDOC: Reliability and Longevity

Operating a Personal Electronic Filing System is just an adjunct to one’s main work and, consequently, it’s at the bottom of the pecking order when it comes to an individual’s time and attention. This, combined with the fact that we humans do make mistakes, means that filing tasks may build up, documents may get lost, scans may miss out pages, file titles may include incorrect Reference Numbers, etc. Despite all these problems, experience with the PAWDOC system has shown that it is possible to operate such a system successfully over the long term. It has also demonstrated very clearly that it would be almost impossible to maintain a hardcopy-based personal filing system across a lifetime of work, but that it is certainly possible to do so with a digitised version. The reason is simply that the volume of paper is overwhelming, whereas an equivalent digital collection is eminently manageable.

The very intangibility of a digital collection does, however, present dangers which need to be addressed if it is to survive. Backing up is essential, and creating multiple backups placed on various media in different and distant locations is a wise move. Technology’s current incessant charge of development also presents challenges to a collection’s long-term readability, and owners must be prepared to perform digital preservation work periodically to keep their hardware and applications operational and up to date.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q55. Will human errors make the filing system unworkable?

2001 Answer: Experience gained: No, because the number of errors is relatively low and they fall mainly into the following categories:

  • Duplicate reference numbers on physical documents (the indexing system precludes duplicate reference numbers in the index) (Wilson 1992a: 2, 1992b: 2.10).
  • Hardcopy documents out of order in the cabinet/box.
  • Archived items not marked as archived in the index, and vice-versa (Wilson 1992b: 2.10).

2019 Answer: Fully answered: Human error will creep into most systems which have human operators – and probably even more so in personal filing systems which have to be managed alongside heavy workloads. The 2001 answer identified three types of human error discovered in the PAWDOC collection (duplicate Reference Numbers on hardcopies, misfiled documents, and errors in the Index Movements field). In addition to these, the recent checking and Digital Preservation work that has been undertaken on the PAWDOC collection identified several other types of error including:

  • 285 items have been lost over the years – about 1.6% of the total.
  • 33 instances of missing pages have been identified in scanned documents – probably caused by human errors in the course of scanning.
  • 9 records in which text was not copied correctly from emails into a Word document (which was my preferred approach to capturing email text for inclusion in the collection) have been identified. This probably occurred because I failed to check that all the text had been pasted in.
  • 5 records where index entries have been inadvertently left empty – probably caused by a mix-up in the course of creating new records.
  • 2 instances in which the wrong document was scanned so that the digitised document is not the document that is specified in the relevant index entry.

No doubt there are others. However, despite this, the filing system continues to work successfully, and, over the years, I have rarely come up against such errors when I have been searching for documents.
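
As an illustration only, some of the error types above (duplicate Reference Numbers, index entries with no corresponding file, and files with no index entry) lend themselves to an automated cross-check. The sketch below is not how the PAWDOC checks were carried out. It assumes, hypothetically, that each file title begins with its Reference Number followed by a separator, that each Reference Number should appear on exactly one file, and that the index can be exported as a plain list of Reference Numbers. The function name and the "REF-0001" number format are invented for the example.

```python
from collections import Counter


def check_index_against_files(index_refs, file_names, sep=" - "):
    """Cross-check index Reference Numbers against file titles.

    Assumption (hypothetical): every file title starts with its
    Reference Number, followed by `sep` and a descriptive title.
    """
    # Take the part of each file title before the first separator.
    file_refs = [name.split(sep, 1)[0] for name in file_names]
    index_set = set(index_refs)
    file_set = set(file_refs)
    return {
        # Reference Numbers that appear on more than one file.
        "duplicate_file_refs": sorted(
            ref for ref, count in Counter(file_refs).items() if count > 1
        ),
        # Index entries for which no file was found.
        "index_without_file": sorted(index_set - file_set),
        # Files whose Reference Number is not in the index.
        "file_without_index": sorted(file_set - index_set),
    }


report = check_index_against_files(
    ["REF-0001", "REF-0002", "REF-0003"],
    ["REF-0001 - Minutes.pdf", "REF-0001 - Agenda.pdf", "REF-0004 - Letter.pdf"],
)
for problem, refs in report.items():
    print(problem, refs)
```

A check of this kind only finds mismatches between the index and the file titles; missing pages or a wrongly scanned document would still need a manual look at the content.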

Q56. What backup arrangements should be put in place to protect the integrity or sheer existence of the filing system?

2001 Answer: Partially answered: A comprehensive collection of all one’s files becomes a unique irreplaceable entity over a period of years. To ensure its availability and existence the following measures need to be taken (Wilson 1992b: 2.13):

  • Regular backup of the index – daily is preferable, weekly is realistic, monthly is essential.
  • Regular backup of the electronic files and scanned images – daily is preferable, weekly is realistic, monthly is essential.
  • Printout of the index in KWIC (Keyword in Context) format – every six months or yearly (though I have never had software able to do this).
  • Secondary backups of index, electronic files, scanned images and KWIC index stored in a location different to the location of the primary backup media – every six months or yearly.
  • Tertiary backup in a secure environment such as a bank – every six months or yearly (I have not done this yet but am seriously contemplating it).

2019 Answer: Fully answered: Backing up is an essential element in any computer system. It is advisable to have at least two copies, one of which is held some miles away from the master. The PAWDOC backup regime is clearly described in the PAWDOC User Guide; and, to prompt me to actually perform the backups, I have a table with upcoming backup dates in a frame on the wall that is directly in front of me when I sit at my desk. The backup regime that is applied to the PAWDOC collection is as follows:

  • Cloud: New files and changes to files are backed up to a cloud service on an ongoing basis.
  • Offline backup to an external drive at home: New copies of the whole collection are taken once a year.
  • Copy on other laptop at home: The backup on the external drive described above is copied to the other laptop in the house immediately after the new copy has been taken, i.e. once a year.
  • Remote UK external drive: The whole collection is copied onto this hard drive once every two years and it is stored at least 10 miles away from the master laptop.
  • Remote out-of-country backup: The whole collection is copied to a 128GB memory stick and given to a person who lives in another country whenever I meet up with that person.
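
The table of upcoming backup dates mentioned above could be produced by hand or, as a minimal sketch, calculated from the date each backup was last taken. The example below covers only the three backups with a fixed interval in the regime above (the cloud backup is ongoing and the out-of-country copy depends on meeting the person concerned). The dictionary labels, function names and sample dates are illustrative assumptions, not part of the PAWDOC User Guide.

```python
from datetime import date

# Intervals in years, taken from the regime described above.
# The labels are illustrative, not official names.
BACKUP_INTERVAL_YEARS = {
    "external drive at home": 1,
    "other laptop at home": 1,
    "remote UK external drive": 2,
}


def next_due(last_done, interval_years):
    """Return the date `interval_years` years after `last_done`."""
    target_year = last_done.year + interval_years
    try:
        return last_done.replace(year=target_year)
    except ValueError:
        # 29 February with a non-leap target year: use 28 February.
        return last_done.replace(year=target_year, day=28)


def upcoming_backups(last_done, today):
    """Return (due_date, name, overdue) tuples, earliest due date first."""
    rows = []
    for name, years in BACKUP_INTERVAL_YEARS.items():
        due = next_due(last_done[name], years)
        rows.append((due, name, due < today))
    return sorted(rows)


# Sample dates, invented for the example.
last_done = {
    "external drive at home": date(2019, 1, 10),
    "other laptop at home": date(2019, 1, 10),
    "remote UK external drive": date(2018, 6, 1),
}
for due, name, overdue in upcoming_backups(last_done, today=date(2020, 3, 1)):
    print(due.isoformat(), name, "OVERDUE" if overdue else "")
```

Printing the result once a year and putting it in the frame would keep the prompt visible without relying on the computer being switched on.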

Q57. Are electronic filing systems reliable over very long periods?

2001 Answer: Partially answered: Over the 20 years of this project, the system has been very reliable. However, the following problems have been experienced or are anticipated as the system gets older:

  • Crashes of the index database – recovered either by functionality in the software or by using backups.
  • Magneto-Optical disk corruption (has happened to just one disk) – recovered by using backups.
  • Document management system has lost about 30 files – not sure how this happened and it was not recoverable.
  • Longevity of other people’s files – I am sure I could not now obtain some of the items belonging to other people to which I put a reference in my index 10 or 15 years ago.
  • Longevity of web addresses – I do not think that some of the web addresses the index points to will be still live after several years. We have yet to see whether web addresses of journal contents will be reliable over long periods (Wilson 1996a).
  • Electronic files stored in old versions of software when the original application software may no longer exist on your PC, or may have been upgraded beyond recognition. This is a potentially very serious problem over periods of 10 or 20 years or more (Wilson 1997: 1).

2019 Answer: Fully answered: The fact that the PAWDOC system is still fully operational after 38 years does demonstrate that such systems can be reliable in the long term, despite the inevitable loss of some documents or pages within documents. However, in practice much depends on the diligence of owners and whether they are sufficiently motivated to take regular backups and to perform digital preservation activities on their collections. Taking an overall very long-term view, the longevity of such systems relies on the following 4 characteristics:

  • Visibility: Because an electronic filing system (EFS) is, by its nature, intangible and locked away somewhere inside a computer, the first essential requirement for it to survive is for one or more people to be aware of its existence. This can, of course, be achieved by simply telling people. However, PAWDOC’s existence is also fully documented in a Hardcopy User’s Guide which is contained in one of the two archive boxes in my study.
  • Accessibility: Knowing that an EFS exists isn’t the same as being able to get at it. Over a period of years, laptops become defunct and inaccessible, and backup technologies may cease to work. Therefore, for EFSs to continue to work long-term, the platforms they run on must be kept up to date.
  • Integrity: For an EFS to work properly it is necessary to have all the software and data that it uses in place. Missing data can be very annoying and even disastrous, whilst missing or corrupt application software can preclude the system working at all. Effective backup regimes can help to alleviate this problem.
  • Readability: The data files in an EFS can’t be read unless there is an application that can open them up and display them. Over time, applications get upgraded or may become defunct. Therefore, it is essential to implement a Digital Preservation routine that identifies files in danger of no longer being accessible and that takes steps to rectify the problem.

If all these aspects are addressed, an EFS should be able to survive for many, many years.
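
To illustrate the Readability point, the first step of a Digital Preservation routine, identifying files that may be in danger of becoming unreadable, can be sketched as a scan of the collection by file extension. This is a hedged example rather than a description of the PAWDOC preservation work: the watch-list of extensions is hypothetical and would need to reflect the formats an owner actually considers at risk, and the function name is invented. An extension is also only a rough proxy for the real file format.

```python
from collections import Counter
from pathlib import Path

# Hypothetical watch-list: extensions the owner considers at risk.
AT_RISK_EXTENSIONS = {".wks", ".wpd", ".lwp"}


def scan_formats(root):
    """Count files by lower-cased extension and list those on the watch-list."""
    counts = Counter()
    at_risk = []
    # Walk every file under `root`, including sub-folders.
    for path in Path(root).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower()
            counts[ext] += 1
            if ext in AT_RISK_EXTENSIONS:
                at_risk.append(path)
    return counts, sorted(at_risk)


if __name__ == "__main__":
    # Scan the current folder as a stand-in for the collection's root folder.
    counts, at_risk = scan_formats(".")
    for ext, n in counts.most_common():
        print(ext or "(no extension)", n)
    print(len(at_risk), "file(s) on the watch-list")
```

The files listed would then be candidates for conversion to a current format, with the originals retained until the converted copies have been checked.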
