Taking Stock and Set to Go

In 2019, I started collaborating with Peter Tolmie with the aim of producing some overall results from my 40 years’ experience of personal electronic filing. It wasn’t long before Peter observed that my PAWDOC filing collection was just another manifestation of my inclination to keep things; and he suggested I keep a log of my keeping activities. I realised then that whatever we produced would be about more than my PAWDOC activities, and that I might as well write up my latest thoughts on PAWDOC there and then in this blog. Peter and I prefaced this summation with a post about the impact of digitisation over the last 40 years. Since then, Peter has gained further insights into my activities by investigating my attempts at understanding knowledge development; and by reading my write-up of comments I made when being reunited with certain documents after many years.

We both now feel it is time to get on and do what it takes to produce some outputs: namely, a book on the subject of digitisation’s impact on personal curation of any assemblage of materials where the assemblage is premised upon not only current but also potential future use. This will be based upon all the investigations and writings already described, as well as auto-ethnographic investigations of a variety of collections that Peter and I have been associated with. The questions to be asked range from the Use, Curation and Searching of the collections, to the Security, Preservation and Loss of the contents; all considered from both pre- and post-digitisation perspectives. We now have the provisional list of collections down the left-hand side of a spreadsheet and the questions along the top, so we’re pretty much set to go.

Comments on reunions with old documents

My colleague, Clive Holtham, was instrumental in putting me in touch with suppliers who loaned me a scanner and document management software around 1995, to enable me to progress my mission to understand how personal electronic filing would work in practice. Some six years later, in February 2001, Clive and I met up for dinner and a catch-up on what we’d both been doing. I explained that as well as scanning new hardcopy as I acquired it, I was also trying to scan all the legacy documents I had acquired since 1981, when I started this electronic filing adventure. Clive pointed out that it would be interesting to see what I thought of each document in retrospect, as I carried out the scanning process. After all, indexing and filing the documents was based on the assumption that some of them would have some value downstream. Here was an opportunity to get an insight into what their downstream value might be.

I took Clive’s suggestion on board towards the end of 2001; but, to minimise the effort required, I decided I would only comment on those documents which prompted some particular thoughts. The comments would be recorded at the end of the Title field in my filing index; and they would be identifiable by being placed within a special set of characters in the following format: <<! Date: Comment Text here !>>. To make it easier, I created a script in my indexing software to automatically place the delimiter characters with the current date at the end of the Title field, and assigned it the keyboard shortcut Ctrl-8. This seemed to work in practice, and I got into the habit of making my comments in real time as they occurred to me. After a while, I started to use the facility to record other information, such as a document being duplicated in another Index entry, or problems I had had with scanning a document. Now, in 2021, 20 years after starting to record these comments, I find that 584 records within my filing index possess such comments; and 20 of those have two comments.
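For readers who want to see how comments in this delimiter format could be pulled back out of a Title field, here is a minimal Python sketch. It is not the script used in the indexing software: the function names are hypothetical, the date is treated as free text up to the first colon, and the DDMonYYYY stamp used for new comments is an assumption.

```python
import re
from datetime import date

# Assumed layout, taken from the format described above:
#   <<! Date: Comment Text here !>>
COMMENT_RE = re.compile(
    r"<<!\s*(?P<date>[^:<>]*?)\s*:\s*(?P<text>.*?)\s*!>>", re.DOTALL
)


def extract_comments(title: str) -> list[tuple[str, str]]:
    """Return (date, text) pairs for every delimited comment in a title field."""
    return [(m.group("date"), m.group("text")) for m in COMMENT_RE.finditer(title)]


def strip_comments(title: str) -> str:
    """Return the title with all delimited comments removed."""
    return COMMENT_RE.sub("", title).strip()


def append_comment_stub(title: str, today: date) -> str:
    """Add empty delimiters stamped with a date, ready for text to be typed in.

    The DDMonYYYY date style is an assumption, not a confirmed format.
    """
    return f"{title} <<! {today.strftime('%d%b%Y')}:  !>>"


# Hypothetical title field carrying two comments.
example = (
    "Report on OA <<! 12Mar2005: Important paper !>> "
    "<<! 03Jan2019: Duplicated in PAW/DOC/7971/01 !>>"
)
for comment_date, text in extract_comments(example):
    print(comment_date, "->", text)
```

Because the comments live inside the Title field itself, any search of the index also searches the comments, which is part of what makes the approach cheap to operate.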

This is an analysis of what those comments say. They have been placed into one or more of 5 categories:

  • Comments on the impact of the material (7% of the 584 records with comments)
  • Comments on the contents of the material (32%)
  • Comments which prompted questions and thoughts (23%)
  • Comments about memories forgotten and/or remembered (17%)
  • Comments about filing, indexing and scanning activities (46%)
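The percentages in this list can be reproduced from the record counts that appear in the headings of the sections that follow (43, 188, 135, 100 and 266). The short Python check below is only an illustration of that arithmetic; the dictionary labels are abbreviated for convenience.

```python
# Record counts per category, as given in the section headings of this post.
TOTAL_RECORDS_WITH_COMMENTS = 584
category_counts = {
    "Impact of the material": 43,
    "Contents of the material": 188,
    "Questions and thoughts": 135,
    "Memories forgotten and/or remembered": 100,
    "Filing, indexing and scanning activities": 266,
}

# Share of the 584 commented records that fall into each category.
shares = {
    name: round(100 * count / TOTAL_RECORDS_WITH_COMMENTS)
    for name, count in category_counts.items()
}
for name, pct in shares.items():
    print(f"{name}: {pct}%")

# A record can sit in more than one category, so the counts add up to
# more than 584 and the percentages add up to more than 100.
total_allocations = sum(category_counts.values())
print(total_allocations)
```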

The full list of comments and the categories to which they have been allocated is provided in this link. The comments have been further allocated into sub-categories which are used in the discussion below. However, the following two salient points need to be borne in mind when considering the results of this investigation:

Scale: Although 584 records with comments may sound a large number, in fact comments have only been made on a small subset of the contents of the filing system: 584 is only about 3% of the 17,350 records in the index. This could indicate that the sample size is too small to be generalised; though, I believe it is more likely to indicate that relatively few documents merited a comment. Unfortunately, there is no data to investigate which of these two possibilities is the case – the decisions to include comments were made in an arbitrary manner over many years.

Lag: The time between a document being included in the filing system and when a comment was made about it, has almost certainly affected many of the comments. Presumably, the more time that passes, the less likely the contents of a document are to be remembered, and this may make them more remarkable when they are encountered again. The actual lags that occurred have been calculated as a number of years by the difference between the Creation Date field in the Index, and the date recorded at the beginning of each comment. This shows that over 93% of the comments were made more than 10 years after the documents were included in the filing system; and over 50% had comments with a lag of over 20 years. Only 12 items had comments with a lag of less than 5 years.
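The lag calculation described above can be sketched in a few lines of Python. This is an illustration rather than the method actually used: it assumes both dates are available as full calendar dates and measures the lag in fractional years, and the sample pairs are invented.

```python
from datetime import date


def lag_in_years(created: date, commented: date) -> float:
    """Years between cataloguing a document and commenting on it."""
    return (commented - created).days / 365.25


def share_over(lags: list[float], threshold_years: float) -> float:
    """Percentage of lags strictly greater than the threshold."""
    if not lags:
        return 0.0
    return 100 * sum(lag > threshold_years for lag in lags) / len(lags)


# Hypothetical (Creation Date, comment date) pairs, for illustration only.
sample = [
    (date(1981, 6, 1), date(2005, 3, 12)),
    (date(1986, 8, 15), date(2001, 11, 2)),
    (date(1999, 4, 20), date(2002, 1, 9)),
    (date(1984, 2, 3), date(2019, 7, 30)),
]
lags = [lag_in_years(created, commented) for created, commented in sample]
print(share_over(lags, 10), share_over(lags, 20))
```

Run over the real index, the same two calls would give the "over 10 years" and "over 20 years" shares quoted above.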

Comments on the impact of the material (43)

These comments include remarks about documents which have influenced my thinking (8). For example, “This is a most important paper because it alerted me to the key insight that to get the most out of an OA investment the organisation must change the way it does business”. A further 9 comments relate to documents which were more generally important to my work, for example, “This was an important edition of EDP analyser and highly relevant to NCC’s OA team of which I was a part”. Finally, 26 comments were made about documents that are special in a variety of other ways, for example, “This is a great example of how to do brainstorming”, and “This is an interesting document to have from the early days of the net”.

Comments on the contents of the material (188)

Just over a third of this category is concerned with comments about a document I wrote or activity I was involved with. This is hardly unexpected given my intimate relationship with the events. For example, “Have just read the suggestions I made to Esprit about its CSCW program. I wonder if they made any kind of difference”; and “This was my one not very successful claim to broadcast fame – and I’m not even sure it got broadcast”. A quarter of the comments just remark on ‘interesting content’, for example, “This is a fascinating article because it represents a twilight period in the change from old style typists to individuals doing the typing all themselves”; and “This was worth another read – definitely food for thought…”. The remainder includes comments in a range of other sub-categories – listed below together with an example for each one.

  • Comments on technology developments (16) – “Seems very advanced for 1978”
  • Assessments of predictions (8) – “The prediction of a day in the life of the CEO in 2013 didn’t get it quite right”
  • Comments on the author or other people (8) – “I’ve been thinking about getting in touch with X again”
  • Comments on photos in documents (7) – “While scanning this I discovered that it contains a photo of X”
  • Content which I thought I might find useful (26) – “This document is highly relevant to the assignment I am about to start”
  • Comments which provide a critique of content (5) – “I think this process missed out the key element of Improvement by Learning by Doing”.

Comments which prompted questions and thoughts (135)

The majority of these comments – some 60% – were general reflections and musings prompted by the documents concerned. For example, “I think it demonstrates that prior to the internet and the web there was a different way of thinking about information: in those days having the information meant having the actual item, whereas today in the internet/web/mobile era having the information is all about having a device and knowing where to look”; and “It would be interesting – amazing – to re-run this event with the same people”. The other six sub-categories are all specific questions:

  • Is this still around/available/the case today? (10 comments – for example “I don’t hear the term ‘Groupware’ much these days – I wonder if it has fallen out of use”)
  • What’s a person doing today? (15 – “I wonder If X is doing anything related to this now – havn’t seen him for about 20 years”)
  • Is this still relevant today? (12 – “This might be interesting to read to see if 25-year-old advice about dealing with Info overload still applies”)
  • How does this look in retrospect? (4 – “There was a big fuss about X’s thinking on this – would be interesting to see how it all looks in retrospect”)
  • What was the impact of this? (5 – “This work on Teletel was ground breaking and was subsequently successful. How it affected the French use and take-up of the web I don’t know”)
  • How did these predictions fare? (6 – “The Booze-Allen Hamilton report was very influential. It would be interesting to see how its predictions fared”)

Memories forgotten and/or remembered (100)

70% of these comments are about things I’d forgotten either partially or wholly; and 30% about things I remembered about associated aspects, or about people. Examples of each are provided below:

  • Forgetting something about a document or a related activity (28 comments), for example, “I’d forgotten these details and didn’t know I had these notes”
  • Forgetting about the document or activity altogether (41): “Can’t remember giving this talk”
  • Remembering associated aspects or it prompted memories (19); “This was a pioneering machine – we really liked the Snake game, and the early type of remote access mail through the phone lines was relatively quite advanced”.
  • Remembering the author/other person (12); “That’s a name I haven’t thought about for years! – think I met him”

Filing, indexing and scanning activities (266)

Over a third of all these comments concern filing practicalities – not an aspect which was envisaged when I established this comment facility. Recording information about the operation of a filing system is definitely an overhead, so there is a natural tendency to minimise the effort spent on it. Consequently, the fact that it was quick and simple to create comments in a form which was tightly coupled with individual documents and their index entries, made this facility an obvious choice for quickly documenting issues or important observations. The 22 separate sub-categories of comment listed below together with an example for each one, illustrate the extensive range of topics that were encountered as the PAWDOC collection grew and aged (note that over 93% of these comments were made at least 10 years after the document concerned had been included in the collection).

  • Practicalities of using PAWDOC (5) “Must force myself to search for stuff even if I don’t think it’s in this index!”
  • Deciding what to include/remove (5) “Artefact removed for inclusion in PAW personal collection”
  • Notes about where items originated (6) “The Quick Reference Card was included in Nov2018 when I found it inside the WGEM starter pack”
  • Notes about what version is filed (8) “This Aug86 version must have replaced an earlier version in my collection”
  • Notes about artefacts (6) “Specified this as an artefact at this late date because it’s the first issue I have in this new format”
  • Notes about cross-references in the collection (7) “See also PAW/DOC/0110/145”
  • Notes about duplicates in the collection (87) “Some of these documents are duplicated in PAW/DOC/7971/01”
  • Notes about Archiving (6) “This was in an archive box but archive status had not been specified in the Movement field”
  • Comments on Reference Number (16) “This document has the number PAW/DOC/0005/03 at the top – but that number is for something else”
  • Comments on Title field (9) “Inserted the info about the abstract when I was scanning because there was no reference to it in the title”
  • Comments on Creation date (12) “Don’t understand how the date on this paper is 1986 but the record was created in 1984”
  • Comments on Publication date (3) “2019 properties of the word doc say this was modified on 31May1985 so this was probably the publication date”
  • Comments on Movement field (10) “Don’t know why this says it was scanned and paper destroyed in 2004 – in Feb 2006 there was a full envelope of material in the box”
  • Losing/deleting index information (4) “I deleted the title text of this accidentally when scanning so this is a replacement title text”
  • Lost or misplaced documents (17) “Found the electronic version of this filed in FISH under PAW/DOC/4052/01”
  • Relationship with Personal files (8) “I found these PAW/DOC papers in one of my personal home files”
  • Notes about physical characteristics of items (17) “This printout had almost completely faded so it was a challenge to see if the scanner would bring the text to light – and it didn’t do a bad job!”
  • Notes about disks in the collection (6) “This included a disk containing a DOS version of the ITSforGKProposal”
  • Management of the FISH DMS (8) “This seemed very necessary at the time when disk space was short – and very complicated. Now in 2006 with 40Gb on my PC it doesn’t seem to be an imperative at all”
  • File formats & Digital Preservation (7) “No longer able to read the floppy disk when it came to take this material out of archive to scan it in 2006”
  • Notes about loading electronic files to FISH (17) “The Word version doesn’t have the appendices so I PDF’d the Word version and then scanned the appendix pages from the hardcopy. Unfortunately, the pagination of the Word document is slightly different from that of the hardcopy – but the words are all the same”
  • Notes about Scans and Scanning (49) “These pages were too thick to go through the duplex scanning process so I had to do one side first and then the other side”.

Conclusions

No great revelations have emerged from this investigation. However, it’s clear that reviewing old material in this way provides an opportunity to reflect, and perhaps to rediscover potentially useful material. These are luxuries that are hard to come by amidst the pace of modern life. Whether such activities actually provide any tangible benefits is hard to say: I can’t remember if any of the rediscovered documents made a difference in my subsequent assignments; and the benefits of reflection are difficult to pin down at the best of times (though I personally feel it is always worthwhile).

The one practical finding that has emerged from this exercise is that there are significant advantages in being able to quickly and easily annotate a filing index with any relevant additional information, be that extra detail about content, or factual information about the way that content has been filed. The former augments the information provided by the filing system, and the latter assists in its smooth operation. In fact, the latter is more than a mere nicety. My experience has shown that, as this type of personal filing system grows and ages, the number of imperfections it possesses increases substantially. The long list above of sub-categories of ‘Filing, indexing and scanning activities’, and their associated examples, provides an indication of the range of issues that can arise. Having the ability to quickly note details of those issues in a place where they are likely to be immediately visible to the user is of great benefit.

PAWDOC: Requirements and Objectives

The 2001 paper reviewing the first 20 years of use of the PAWDOC system listed 23 requirements for a Personal Electronic Filing System and provided a status for each one.  The table below reproduces that listing and also provides an updated status for PAWDOC in 2019.

Requirement | 2001 Status | 2019 Status | Notes

Requirements dictated by The Job
1. Cope with large amounts of material, some of which becomes redundant very quickly. | Fully met | Fully met |

Requirements dictated by the physical environment in which the system is used
2. Cope with changing terminology. | Not met at all | Not met at all | This would require specific functionality in the Index.
3. Be capable of being operated in temporary and limited office accommodation. | Partially met | Fully met | Now fully met because the whole collection is now digitised on the laptop; and scanners are found in most offices.
4. Be easily portable. | Partially met | Fully met | Now fully met because laptops are now small and powerful, and have more than enough storage.

Requirements to support the information sources used
5. Handle hardcopy material in a very wide range of physical sizes and formats. | Partially met | Fully met | Hardcopy that is too large, or too difficult to scan, can now be photographed with a mobile phone at sufficiently high resolution for it to be read on screen.
6. Handle documents containing coloured text, backgrounds, diagrams and pictures. | Not met at all | Fully met | Now fully met because modern scanners handle colour; and they have brightness and contrast settings which can be adjusted to be able to scan most document contents.
7. Record references to material in other people’s filing systems. | Fully met | Fully met |
8. Manage information received via email and computer conferencing systems (including Lotus Notes). | Partially met | Fully met | Now fully met because any file formats can be handled, and getting email content into such files is not the concern of the filing system.
9. Record references to material in remote systems such as Lotus Notes databases and web sites, and access those remote systems and retrieve the relevant information. | Partially met | Fully met | HTML references are included in Filemaker Index entries, and can be opened by selecting them and right clicking which presents an ‘Open’ menu option.

Requirements to manage information that is created by the filing system owner
10. Handle handwritten text and diagrams on paper. | Partially met | Fully met | Now fully met because modern scanners have brightness and contrast settings which can be adjusted to be able to scan most document contents.
11. Handle electronic files. | Fully met | Fully met |

Requirements to help cope with information and communication overload
12. Support the rapid organization of information before it has been dealt with. | Not met at all | Not met at all | This would require special functionality in the filing system.
13. Make visible what information has to be dealt with and support the scheduling of dealing with it. | Not met at all | Not met at all | This would require special functionality in the filing system.
14. Enable information to be filed very quickly. | Fully met | Fully met |

Requirements to support information access
15. Enable information to be retrieved simply and quickly. | Fully met | Fully met |

Requirements to support the reuse of information
16. Enable templates, best practice and other reusable material to be identified, retrieved and reused. | Partially met | Fully met | This is now fully met because I now believe it is best addressed by including appropriate wording in the Index Title field.
17. Enable existing material to be copied and modified, and to be stored as new material. | Fully met | Fully met |

Requirements to support knowledge acquisition and development
18. Identify, store and retrieve the marks highlighting key text. | Not met at all | Not met at all | This would require special functionality in the filing system.
19. Collect together all important points and present them as a coherent set of information. | Not met at all | Not met at all | This would require special functionality in the filing system.
20. Enable the user to relate all important points together in such a way that concepts can be developed as new material is acquired. | Not met at all | Not met at all | This would require special functionality in the filing system.
21. Assist the user to identify knowledge developments that are occurring and to choose what areas to focus on. | Not met at all | Not met at all | This would require special functionality in the filing system.

Technology support
22. The technology must be cheap enough for an organization not to quibble over, and for individuals to buy for themselves. | Partially met | Fully met | This is now fully met because costs of scanners have dropped; and it has been established that a Document Management System is not required.
23. The technology must be reliable enough not to require expensive maintenance contracts or multiple one-off repairs. | Partially met | Fully met | This is now fully met because the technology required is standard and now very reliable.

Two things are clear from the comparison of the status in 2001 and 2019: the technology for a personal electronic filing system has now become commonplace and relatively cheap; and, no developments have occurred to support the particular requirements of ‘coping with information and communication overload’, and of ‘supporting knowledge acquisition and development’. I believe the latter is probably true not just for the PAWDOC system but generally – I have not heard of any work going on in these areas.
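The comparison can be made concrete by tallying the statuses. The Python sketch below transcribes the 2001 and 2019 status of each of the 23 requirements from the table above and counts them; it is only a convenience for checking the totals, and the variable names are illustrative.

```python
from collections import Counter

F, P, N = "Fully met", "Partially met", "Not met at all"

# requirement number -> (2001 status, 2019 status), transcribed from the table.
status = {
    1: (F, F), 2: (N, N), 3: (P, F), 4: (P, F), 5: (P, F), 6: (N, F),
    7: (F, F), 8: (P, F), 9: (P, F), 10: (P, F), 11: (F, F), 12: (N, N),
    13: (N, N), 14: (F, F), 15: (F, F), 16: (P, F), 17: (F, F), 18: (N, N),
    19: (N, N), 20: (N, N), 21: (N, N), 22: (P, F), 23: (P, F),
}

tally_2001 = Counter(s2001 for s2001, _ in status.values())
tally_2019 = Counter(s2019 for _, s2019 in status.values())

# Requirements that are still not met at all in 2019.
still_unmet = sorted(num for num, (_, s2019) in status.items() if s2019 == N)

print(dict(tally_2001))
print(dict(tally_2019))
print(still_unmet)
```

On this tally, fully met requirements rise from 6 to 16, nothing remains partially met, and the seven requirements still unmet are number 2 plus the overload and knowledge-development items 12, 13 and 18 to 21.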

The 2001 paper also described three objectives for the work:

  • To provide practical feedback to product developers, system designers, and other (potential) users regarding the real day-to-day requirements of individuals using personal electronic filing systems.
  • To establish an office document test set.
  • My own personal need to stay organised and efficient in my day-to-day work.

Of the three, only the final one – to keep myself organised – has been fully achieved.

Regarding the first objective, I have done my best to document my experiences and make that information freely available to product developers, system designers and other users, but I have seen no evidence to suggest much interest – it seems there is not a consumer-led demand for this capability. Certainly, I have come across very few people, if any, who have been operating an all-inclusive PAWDOC-type system without any direct contact with me.

I believe the second objective – to establish an office document test set – has simply been overtaken by events; the huge strides made in search & retrieval algorithms by internet search engines such as Google over the last 30 years have simply removed the need for such test sets.

Over the last 15 years or so, two more potential objectives – or at least potential uses – have come to mind. First, as huge changes continue to occur as a result of ever-increasing computing power, the universality of the mobile phone, and the ubiquity of the internet, it has occurred to me that the PAWDOC collection provides a unique insight into the early stages of this massive transition in business and society in general. Therefore, I continue to seek a permanent repository for the collection, in the belief that it may hold some value for future researchers.

The second additional objective concerns digital preservation. In order to ensure PAWDOC’s future accessibility, I have had to develop suitable digital preservation processes and documentation and to apply them to the very diverse range of material in the collection. In the course of this work it has occurred to me that the collection might be useful to the digital preservation community as a test bed and training tool. If I have no success in finding a permanent destination for the collection as a research tool, I may then try to find a home for it within the digital preservation community.

This entry brings to an end my own personal final review of the PAWDOC personal electronic filing system. The remaining work to be done is to assemble a set of Conclusions. However, this will be a joint effort between Peter Tolmie, an independent researcher, and myself.

PAWDOC: Architecture

Four types of prototype solution have been used in the course of this work: card index, electronic index, electronic index and document management system, and electronic index and file storage in Windows 10 folders. The timeline of development is shown below and the architecture of each solution is described in the subsequent text.

1980      Visit to Amoco Research Centre, Tulsa: first electronic office filing system seen
1981      First entry placed in card index (1 June 1981)
1987      Implementation of a computer-based index – Filemaker software on a Macintosh
1993      Movement of Filemaker index from Macintosh to a Compaq LTE Elite laptop running MS Windows
1994      Scanner and Magneto-Optical drive loaned by Fujitsu
1995      PaperClip Document Management software loaned by DDS
2000      PaperClip upgraded to FISH product and loaned by Ringwood Software (which was subsequently taken over by Azur Group, which was taken over by Maxima which was taken over by m-Hance)
2018      FISH document management system removed and replaced with Windows 10 folders

1. Architecture of the card index system, 01Jun1981 – 30Dec1987

 This consisted of paper documents stored in an upright cabinet in serial number order, and an index on 6×4 inch cards. The cards were initially held in a small plastic box with a lid, and later on in a metal drawer. The index consisted of a series of title cards with an entry for each document (serial number, title, keywords) and a set of keyword cards – each one having a keyword on the top and then all the entries possessing that keyword listed underneath. The rationale for this filing schema was worked out with my colleague John Pritchard, and it continues to form the basis for the system as it stands today. The Card Index Architecture is illustrated below.
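In data-structure terms, the keyword cards act as an inverted index over the title cards. The Python sketch below shows that relationship under simple assumptions: the class and function names are hypothetical, the serial numbers and titles are invented, and keywords are matched case-insensitively, which the card system itself may not have done.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TitleCard:
    serial_number: str
    title: str
    keywords: tuple[str, ...]


def build_keyword_cards(title_cards: list[TitleCard]) -> dict[str, list[str]]:
    """One 'card' per keyword, listing every serial number that carries it."""
    cards: dict[str, list[str]] = defaultdict(list)
    for card in title_cards:
        for keyword in card.keywords:
            cards[keyword.lower()].append(card.serial_number)
    return dict(cards)


# Invented entries, for illustration only.
title_cards = [
    TitleCard("0001", "Office automation survey", ("OA", "survey")),
    TitleCard("0002", "Electronic filing at work", ("filing", "OA")),
    TitleCard("0003", "Keyword indexing notes", ("filing", "indexing")),
]
keyword_cards = build_keyword_cards(title_cards)
print(keyword_cards["oa"])
print(keyword_cards["filing"])
```

Looking something up then takes two steps, just as with the physical cards: find the keyword card, then pull the documents whose serial numbers are listed on it.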

2. Architecture of the electronic index, Jan1988 – Dec1994

After being provided with a Macintosh computer at work, I selected Filemaker, a general purpose database with great flexibility and ease of use, to hold the Index. The fields used for the Filemaker index are:

  • Reference number.
  • Title/keywords.
  • Movement status (to record the location of items borrowed by people, archived, lost, etc.).
  • Publication date (to indicate when an individual item was first published).
  • Creation date (to record when an item was catalogued into the filing system).
  • Date last accessed (to record when an item has last been retrieved thereby enabling items that have never been accessed, or least recently accessed, to be identified as most suitable for archiving).
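The six fields above map naturally onto a simple record type. The following Python sketch is an illustration only: the attribute names and types are assumptions rather than the actual Filemaker field definitions, and the sample records are invented. It also shows how the Date last accessed field could be used to pick out archiving candidates, as the final bullet describes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class IndexRecord:
    reference_number: str
    title_keywords: str
    movement_status: str
    publication_date: Optional[date]
    creation_date: date
    date_last_accessed: Optional[date] = None  # None means never retrieved


def archive_candidates(records: list[IndexRecord], limit: int) -> list[IndexRecord]:
    """Never-accessed records first, then the least recently accessed."""
    ordered = sorted(
        records,
        key=lambda r: (r.date_last_accessed is not None,
                       r.date_last_accessed or date.min),
    )
    return ordered[:limit]


# Invented records, for illustration only.
records = [
    IndexRecord("PAW/DOC/0001/01", "Survey report", "", None,
                date(1981, 6, 1), date(2010, 1, 1)),
    IndexRecord("PAW/DOC/0002/01", "Meeting notes", "", None,
                date(1982, 1, 1), None),
    IndexRecord("PAW/DOC/0003/01", "Conference paper", "", None,
                date(1983, 1, 1), date(1995, 5, 5)),
]
candidates = [r.reference_number for r in archive_candidates(records, 2)]
print(candidates)
```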

By 1988, the amount of paper had overflowed the upright filing cabinet and I had started to archive less frequently accessed documents to cardboard liquor boxes, which were stored around me in my office. The architecture of the Electronic Index system is illustrated below.

3. Architecture of the electronic index and document management system, Jan1995 – Feb2018

In 1994 Fujitsu offered to support the work by providing, on long-term loan, a scanner and a magneto-optical storage system; and in 1995 DDS, the European distributors of the PaperClip document management system, provided PaperClip on a long-term loan basis. By the beginning of 1996, these components had all been trialled and tested and were operational. PaperClip provides drawers, folders within drawers, and documents within folders. All the PAWDOC folders were contained in a single drawer; and a single folder was allocated to each PAWDOC Reference Number. Multiple documents of one or many formats could be contained within a folder.

PaperClip interworked with Filemaker through control key combinations which, when pressed within Filemaker, copied specified information from the screen, passed it to PaperClip and triggered a PaperClip action. Two interworking functions were set up:

  • Search (search for a folder in PaperClip based on a reference number picked up in Filemaker);
  • Create (create a new folder in PaperClip and insert the Filemaker reference number into the folder index field).
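The two interworking functions amount to a lookup and an insert keyed on the reference number. The Python sketch below models that idea with a toy in-memory store; it is not the document management system’s interface, and the class and method names are hypothetical.

```python
from typing import Optional


class FolderStore:
    """Toy stand-in for the document store: one folder per reference number."""

    def __init__(self) -> None:
        self._folders: dict[str, list[str]] = {}

    def create(self, reference_number: str) -> list[str]:
        """Create an empty folder indexed by the reference number."""
        if reference_number in self._folders:
            raise ValueError(f"folder already exists: {reference_number}")
        self._folders[reference_number] = []
        return self._folders[reference_number]

    def search(self, reference_number: str) -> Optional[list[str]]:
        """Return the folder's document list, or None if there is no folder."""
        return self._folders.get(reference_number)


store = FolderStore()
# The reference number would be picked up from the index record on screen.
store.create("PAW/DOC/0110/145").append("scan-page-1.tif")  # invented file name
print(store.search("PAW/DOC/0110/145"))
print(store.search("PAW/DOC/9999/01"))
```

Keeping the reference number as the only link between index and store is what allowed the store to be swapped out later without changing the index.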

PaperClip was upgraded to FISH in 2000. The architecture of the combined Index and document management system is illustrated below.

4. Architecture of the electronic index and Windows 10 folders, Mar2018 – present

The companies that owned the European rights to the FISH document management system were successively taken over in the first 15 years of the new millennium, and, by 2017, FISH was owned and supported by m-Hance. However, the digital preservation exercise undertaken on the PAWDOC collection in 2017 established that there were no plans to develop FISH any further, so it was decided to remove FISH and replace it with Windows 10 folders. The conversion was undertaken successfully in 2018. The Electronic Index and Windows 10 Folders architecture is illustrated below.
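With plain folders, the link between an index entry and its files can be as simple as a folder named after the reference number. The Python sketch below illustrates one way that mapping might work. The naming convention is hypothetical, not the one actually used in PAWDOC: reference numbers contain "/", which is not allowed in a Windows folder name, so the sketch swaps it for "-". The root path is also invented.

```python
from pathlib import PureWindowsPath


def folder_for(root: PureWindowsPath, reference_number: str) -> PureWindowsPath:
    """Map a reference number such as PAW/DOC/0110/145 to a folder path.

    Replacing '/' with '-' is a hypothetical convention chosen for this
    sketch, because '/' cannot appear in a Windows folder name.
    """
    return root / reference_number.replace("/", "-")


root = PureWindowsPath("C:/PAWDOC")  # invented location
print(folder_for(root, "PAW/DOC/0110/145"))
```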

PAWDOC: Costs and benefits

Operating a Personal Electronic Filing System is no trivial matter – it takes determination and an appreciable amount of extra time. So, is it worth it? The answer depends very much on the type of person you are. Those who like to be organised will find it helps them to be even more organised, as well as keeping their documentation in order and enabling them to find things when they need them. There is the added benefit of it gradually building a complete collection over a period of time, which can be referred back to at will. Those who are inclined to operate with a little less order may be disinclined to spend their time on what can be perceived as a purely administrative activity.

The costs involved are relatively low. Laptop, scanner and storage technologies are now all sufficiently well developed as to be more than able to support a Personal Electronic Filing System. The additional cost of the PAWDOC system, over and above the laptop and its operating system, is around £900.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q58. How much does a personal electronic filing system cost?

2001 Answer: Experience gained: The costs are falling. Approximate 2001 prices of the components are:

  • Filemaker Pro £220
  • FISH Document Management System and Sybase SQL Anywhere single user combination £2095 (a 10-user networked version of FISH costs about £10 000)
  • Fujitsu Scanpartner 10 Sheetfeed and Flatbed Scanner £821
  • Fujitsu MO Drive (currently 1.3 GB) £388
  • CD Writer hardware £82
  • Adobe CD Creator (often bundled with CD writers) £45

Total price (excluding the laptop PC) for the configuration used in this study is £3650. However lower priced solutions with broadly similar capabilities can certainly be assembled from alternative products (for a list of low cost Personal Knowledge Management Tools see http://www.destinationcrm.com/articles_-images/km/pkmtchart.htm)

2019 Answer: Fully answered: After removing the document management system from the PAWDOC architecture in 2018 and replacing it with Windows 10 folders, the 2019 approximate costs of the current PAWDOC system are as follows:

  • Filemaker Pro 18: new £520; upgrade from Filemaker Pro 15 £190
  • Canon DR-2020U scanner: £660 – but this model is probably discontinued: the current model with equivalent functionality can be bought for about £290.
  • Cloud service for ongoing backup: free with BT Broadband; equivalent standalone service – Apple iCloud, for example, £8/month.
  • 1 Tb External hard drive for local backup: Seagate £40
  • 500Gb External hard drive for remote in-country backup: Maxtor £30
  • 128Gb memory stick flash drive for out of country backup: Kingston £15

The total price (excluding the laptop and the Windows 10 operating system which came with the laptop) is £895 + £8 / month cloud backup.
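The £895 figure appears to correspond to buying Filemaker Pro 18 new and the current-model scanner, rather than the upgrade price or the discontinued DR-2020U. A minimal Python sketch of that arithmetic, using only the prices listed above (the variable names are illustrative):

```python
# One-off 2019 costs in GBP, taken from the list above.
# Assumption: the total uses the new Filemaker Pro 18 licence (£520) and the
# current-model scanner (about £290), not the discontinued DR-2020U (£660).
one_off_costs = {
    "Filemaker Pro 18 (new)": 520,
    "Scanner (current equivalent model)": 290,
    "1Tb external drive, local backup": 40,
    "500Gb external drive, remote in-country backup": 30,
    "128Gb memory stick, out-of-country backup": 15,
}
monthly_cloud_cost = 8  # standalone cloud service; free with BT Broadband

total_one_off = sum(one_off_costs.values())
print(f"One-off total: £{total_one_off}")
print(f"Cloud backup: £{monthly_cloud_cost}/month (£{monthly_cloud_cost * 12}/year)")
```

Swapping in the upgrade price for Filemaker, or the free BT cloud service, lowers the figure accordingly.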

Q59. Is it worth spending the time and money on a personal electronic filing system?

2001 Answer: Ideas formed: Yes, the core benefits of an improved ability to find documents and files – faster retrieval and space reduction – are achievable and do make a difference. Furthermore, these benefits continue to be achievable over many, many years. However, the desirability of these benefits, and the way the filing system is operated is highly dependent on individual preferences and work style.

2019 Answer: Fully answered: Before answering this question, there needs to be clarity on what is meant by a Personal Electronic Filing System (PEFS) because it may mean different things to different people. At its most fundamental level, it’s something most people who have computers do – they put electronic files into folders provided by the operating system. However, the PEFS that is being discussed here is something much more than that. It has three characteristics:

  • Operates as a single system: All files, regardless of size, file type and creating application, are stored in the same single system with the same standards, structure and organisation.
  • Digitises hardcopy: Hardcopies are incorporated into the same single system by digitising them. Any hardcopies that remain are retained either because they are artefacts with special value in the physical form, or to act as temporary working documents.
  • Controls documents: Files are not just put into folders willy-nilly. Their existence is recorded and they are given a unique reference in their title so that they can be identified.

These special characteristics mean that an individual has to apply an amount of extra  effort, and to have a certain amount of determination, to operate a PEFS. Some people may not want to work in such a structured way; and other people may not want to expend that time and effort. Such people will feel that it just isn’t worth doing. If you are prepared to do it, however, my experience is that it is definitely worth it. It organises documents and enables you to find them again. It helps you to organise your work generally; and, by its very nature, it automatically builds a long term collection which can be accessed at will. I personally found it so useful that it just became an integral part of my normal day to day work.

Regarding cost, I believe this to be much more reasonable these days and well within the reach of an individual.

Q60. What do other people think about this approach to electronic filing?

2001 Answer: Not started: Although at least three other people/groups (John Pritchard, Dave Harris and the CSC UK Consulting & Systems Integration Information Centre) have tried out this approach, no detailed work has been done to establish their views on its effectiveness and desirability.

2019 Answer: Fully answered: I have done no systematic work to establish people’s views about this type of Personal Electronic Filing System (PEFS). However, I have assisted at least 4 different people / organisations to implement such a system, so their willingness to try it out indicates that they could at least perceive that there might be some potential benefits. The four instances are:

  • My colleague at NCC, John Pritchard, who designed the PAWDOC schema with me. He used the approach until he left NCC in 1990.
  • The CSC UK Technical Library at Slough which applied the approach for about 18 months in 1988-90 using the Aquila application for the Index. The general approach was taken up in the reincarnation of the library in the form of CSC UK’s Consulting & Systems Integration Information Centre in Farnborough which was certainly using the same Reference Number schema around 1995-96.
  • A colleague at CSC who worked for me, who used the approach for about 8 months in 1988-9 before leaving the company.
  • Another CSC colleague who used the approach for 5 years in 1985-90 before leaving the company.

I don’t recall any of these people/organisations saying that the approach was not workable or worthwhile. On the contrary, the person who used the system for 5 years said in the summary of his experience with the approach that he “found it to be a very effective way of controlling my own documents”.

Of course these instances and anecdotes provide almost no hard evidence at all. It is probably only my long and highly documented use that gives any detailed insight. But at least the combination of the two sets of material may provide the basis for readers to form their own views.

PAWDOC: Reliability and Longevity

Operating a Personal Electronic Filing System is just an adjunct to one’s main work, and, consequently, it’s at the bottom of the pecking order when it comes to an individual’s time and attention. This, combined with the fact that we humans do make mistakes, means that filing tasks may build up, documents may get lost, scans may miss out pages, file titles may include incorrect Reference Numbers, etc. Despite all these problems, experience with the PAWDOC system has shown that it is possible to operate such a system successfully over the long term. It has also demonstrated very clearly that it would be almost impossible to maintain a hardcopy-based personal filing system across a lifetime of work; but that it is certainly possible to do so with a digitised version. The reason is simply that the volume of paper is overwhelming, whereas an equivalent digital collection is eminently manageable.

The very intangibility of a digital collection does, however, present dangers which need to be addressed if it is to survive. Backing-up is essential, and creating multiple backups placed on various media in different and distant locations is a wise move. Technology’s current incessant charge of development also presents challenges to a collection’s long term readability, and owners must be prepared to perform digital preservation work periodically to keep their hardware and applications operational and up to date.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q55. Will human errors make the filing system unworkable?

2001 Answer: Experience gained: No, because the errors are relatively few and fall mainly into the following categories:

  • Duplicate reference numbers on physical documents (the indexing system precludes duplicate reference numbers in the index) (Wilson 1992a: 2, 1992b: 2.10).
  • Hardcopy documents out of order in the cabinet/box.
  • Archived items not marked as archived in the index, and vice-versa (Wilson 1992b: 2.10).

2019 Answer: Fully answered: Human error will creep into most systems which have human operators – and probably even more so in personal filing systems which have to be managed alongside heavy workloads. The 2001 answer identified three types of human error discovered in the PAWDOC collection (duplicate Reference Numbers on hardcopies, misfiled documents, and errors in the Index Movements field). In addition to these, the recent checking and Digital Preservation work that has been undertaken on the PAWDOC collection identified several other types of error including:

  • 285 items have been lost over the years – about 1.6% of the total.
  • 33 instances of missing pages have been identified in scanned documents – probably caused by human errors in the course of scanning.
  • 9 records in which text was not copied correctly from emails into a Word document (which was my preferred approach to capturing email text for inclusion in the collection) have been identified. This probably occurred because I failed to check that all the text had been pasted in.
  • 5 records where index entries have been inadvertently left empty – probably caused by a mix up in the course of creating new records.
  • 2 instances in which the wrong document was scanned so that the digitised document is not the document that is specified in the relevant index entry.

No doubt there are others. However, despite this, the filing system continues to work successfully, and, over the years, I have rarely come up against such errors when I have been searching for documents.

Q56. What backup arrangements should be put in place to protect the integrity or sheer existence of the filing system?

2001 Answer: Partially answered: A comprehensive collection of all one’s files becomes a unique irreplaceable entity over a period of years. To ensure its availability and existence the following measures need to be taken (Wilson 1992b: 2.13):

  • Regular backup of the index – daily is preferable, weekly is realistic, monthly is essential.
  • Regular backup of the electronic files and scanned images – daily is preferable, weekly is realistic, monthly is essential.
  • Printout of the index in KWIC (Keyword in Context) format – every six months or yearly (though I have never had software able to do this).
  • Secondary backups of index, electronic files, scanned images and KWIC index stored in a location different to the location of the primary backup media – every six months or yearly.
  • Tertiary backup in a secure environment such as a bank – every six months or yearly (I have not done this yet but am seriously contemplating it).

2019 Answer: Fully answered: Backing up is an essential element in any computer system. It is advisable to have at least two copies, one of which is held some miles away from the master. The PAWDOC backup regime is clearly described in the PAWDOC User Guide; and, to prompt me to actually perform the backups, I have a table with upcoming backup dates in a frame on the wall that is directly in front of me when I sit at my desk. The backup regime that is applied to the PAWDOC collection is as follows:

  • Cloud: Ongoing backup of new files and changes to files are made to a cloud service.
  • Offline backup to an external drive at home: New copies of the whole collection are taken once a year.
  • Copy on other laptop at home: The back up on the external drive described above is copied to the other laptop in the house immediately after the new copy has been acquired, i.e. once a year.
  • Remote UK external drive: The whole collection is copied onto this hard drive once every two years and it is stored at least 10 miles away from the master laptop.
  • Remote out of country backup: The whole collection is copied to a 128Gb memory stick and given to the person who lives in the country concerned, whenever I meet up with that person.
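A table of upcoming backup dates like the one on my wall could be generated rather than written out by hand. The following is only a sketch in Python: the intervals come from the regime above, but the function names and the "last done" dates are hypothetical placeholders.

```python
from datetime import date

# Intervals in years, from the regime described above. The cloud backup is
# ongoing and the out-of-country copy happens opportunistically, so neither
# has a fixed schedule here.
BACKUP_INTERVAL_YEARS = {
    "External drive at home": 1,
    "Copy on other laptop at home": 1,
    "Remote UK external drive": 2,
}

def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def upcoming_backups(last_done: dict, count: int = 3) -> dict:
    """Return the next `count` due dates for each scheduled backup."""
    return {
        name: [add_years(last_done[name], interval * n) for n in range(1, count + 1)]
        for name, interval in BACKUP_INTERVAL_YEARS.items()
    }

# Hypothetical dates on which each backup was last taken.
last_done = {
    "External drive at home": date(2019, 6, 1),
    "Copy on other laptop at home": date(2019, 6, 1),
    "Remote UK external drive": date(2019, 6, 1),
}

for name, dates in upcoming_backups(last_done).items():
    print(name + ": " + ", ".join(d.isoformat() for d in dates))
```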

Q57. Are electronic filing systems reliable over very long periods?

2001 Answer: Partially answered: Over the 20 years of this project, the system has been very reliable. However, the following problems have been experienced or are anticipated as the system gets older:

  • Crashes of the index database – recovered either by functionality in the software or by using backups.
  • Magneto-Optical disk corruption (has happened to just one disk) – recovered by using backups.
  • Document management system has lost about 30 files – not sure how this happened and it was not recoverable.
  • Longevity of other people’s files – I am sure I could not now obtain some of the items belonging to other people to which I put a reference in my index 10 or 15 years ago.
  • Longevity of web addresses – I do not think that some of the web addresses the index points to will be still live after several years. We have yet to see whether web addresses of journal contents will be reliable over long periods (Wilson 1996a).
  • Electronic files stored in old versions of software when the original application software may no longer exist on your PC, or may have been upgraded beyond recognition. This is a potentially very serious problem over periods of 10 or 20 years or more (Wilson 1997: 1).

2019 Answer: Fully answered: The fact that the PAWDOC system is still fully operational after 38 years does demonstrate that such systems can be reliable in the long term, despite the inevitable loss of some documents or pages within documents. However, in practice much depends on the diligence of owners and whether they are sufficiently motivated to take regular backups and to perform digital preservation activities on their collections. Taking an overall very long term view, the longevity of such systems relies on the following 4 characteristics:

  • Visibility: Because an electronic filing system (EFS) is, by its nature, intangible and locked away somewhere inside a computer, the first essential requirement for it to survive is for one or more people to be aware of its existence. This can, of course, be achieved by simply telling people. However, PAWDOC’s existence is also fully documented in a Hardcopy User’s Guide which is contained in one of the two archive boxes in my study.
  • Accessibility: Knowing that an EFS exists isn’t the same as being able to get at it; over a period of years laptops become defunct and inaccessible; and backup technologies may cease to work. Therefore, for EFSs to continue to work long term, the platforms they run on must be kept up to date.
  • Integrity: For an EFS to work properly it is necessary to have all the software and data that it uses in place. Missing data can be very annoying and even disastrous; whilst missing or corrupt application software can preclude the system working at all. Effective backup regimes can help to alleviate this problem.
  • Readability: The data files in an EFS can’t be read unless there is an application that can open them up and display them. Over time, applications get upgraded or may become defunct. Therefore, it is essential to implement a Digital Preservation routine that identifies files in danger of no longer being accessible and that takes steps to rectify the problem.

If all these aspects are addressed, an EFS should be able to survive for many, many years.
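The first step of a Digital Preservation routine, identifying which files may be in danger, can be approximated by surveying the file types in the collection. This Python sketch is illustrative only: the watch list of extensions is a hypothetical example, not the list used for PAWDOC, and each owner would need to decide their own.

```python
from collections import Counter
from pathlib import Path

# Hypothetical watch list of extensions the owner considers hard to open in future.
AT_RISK_EXTENSIONS = {".wk1", ".wpd", ".mdi"}

def survey_formats(root: Path) -> Counter:
    """Count the files under `root` by lower-cased extension."""
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())

def at_risk_files(root: Path) -> list:
    """List the files under `root` whose extension is on the watch list."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AT_RISK_EXTENSIONS
    )
```

Running `survey_formats` over the whole collection gives a picture of what is there; `at_risk_files` then gives the list of files to convert or otherwise deal with.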

PAWDOC: Confidentiality, ownership and intellectual property rights

When working for an organisation it is widely recognised that you will need a collection of documents to support your work. In building this collection, any explicit IPR regulations, or stipulations imposed by document owners, should always be complied with; and this includes very highly confidential documents which are probably best left out of the collection. However, if you operate your personal filing system on the basis that it is only for your own personal use, many of the IPR and Ownership issues may be overcome.

When leaving an organisation, individuals may or may not want to take some documents with them depending on the type of profession, stage in their career, their next job, etc. However, in deciding what to take, they must follow explicit rules and regulations; and they should leave behind any material which is likely to give a significant competitive advantage to competitors.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q52. How should confidential documents be handled in an electronic filing system?

2001 Answer: Not started: Hardcopy documents can be stored in a vault and the associated reference numbers can point to the vault (Wilson 1992: 2.8).

2019 Answer: Partially answered: The most effective way of dealing with very highly confidential documents is not to include them at all in your personal electronic filing system. However, if you do want to record their existence, they could be included in your index in such a way that the Reference Number or Movement Status field points to their residence in a secure location. If you must have possession of a document itself, then it will be necessary to secure its existence on your laptop or other personal device with a combination of passwords and encryption – provided that satisfies any rules or regulations associated with it.

Q53. How do ownership and IPR issues constrain the operation of a personal electronic filing system?

2001 Answer: Not started

2019 Answer: Partially answered: One of the most important principles to apply when assembling a personal document collection is that you deliberately restrict it to your own personal use. This probably overcomes many of the IPR and Ownership issues related to documents that you haven’t created yourself. With that principle in place, I believe it is recognised that, while you are working for an organisation, you are entitled to have a collection of documents relevant to your work. Having said all that, it is essential to comply with any particular IPR regulations, or constraints stipulated by specific document owners, that you become aware of.

Q54. What IPR considerations have to be taken into account when moving from one organization to another?

2001 Answer: Not started

2019 Answer: Partially answered: While you are working for an organisation, you are usually entitled to have a collection of documents relevant to your work. When leaving an organisation, I always felt that, as a consultant, there was a tacit understanding that you were able to keep copies of the documents you had created; but that there was a grey area in which the removal of large quantities of documents only partially related to one’s work would cause concerns – probably because there was a fear that they might give some advantage to competitors. Herein lies a dilemma for the diligent personal filer: on the one hand it seems a shame to waste all the effort one has put into assembling and indexing a collection of documents, and on the other hand there is the knowledge that ostensibly taking it away when you leave an organisation might encourage a knee-jerk reaction to prevent you removing it. To address this issue, two principles should be applied: first, comply with any explicit regulations concerning specific documents or information; and second, leave behind any documents or information that you believe would give a competitor a significant advantage.

PAWDOC: Sharing files

A personal filing system enables the individual to decide what to include in it, what to specify in the Index, and what to include in the file titles. Hence, a personal file system is very specific to an individual, and it is this very specificity that would make it difficult to share an index and the items it represents over the long term.

To address this problem the Reference Numbers in the PAWDOC system were designed to distinguish between different owners and different sets of information within each owner. The thinking was to enable entries for documents in other collections to be included in one’s own Index. It certainly did enable such entries to be included in the Index; however experience over many years showed that it was almost impossible to get access to such documents after one had changed jobs or after many years had passed. Consequently, I stopped including such entries in the PAWDOC system.

Of course, the sharing of a small specific set of files for a relatively short period is much more feasible. Collaborative file sharing has been explored by the CSCW (Computer Supported Cooperative Work) community for many years; and services such as Dropbox are widely available on the net. However, this is not the same as sharing thousands of files on numerous different subjects over many years.

Corporate Document Management Systems have also been widely implemented in recent years and these do enable whole workforces to share an index and the associated files. However, these too cannot be considered to be personal filing systems since they have very specific corporate objectives with supporting regulations and constraints.

In short, the long term sharing of personal indexes and files does not seem feasible or demanded at present.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q50. Can two or more people share the same filing index?

2001 Answer: Ideas formed: Only with great difficulty, unless deterministic indexing rules are adhered to and controlled descriptors are used. Unfortunately, this probably reduces the effectiveness of the index for all parties.

2019 Answer: Ideas formed: I have come across no evidence to suggest this is currently feasible. Corporate document management systems do enable individuals to share the same index – but such systems do not provide the longevity, portability from job to job, and freedom of choice about what to include, that a personal filing system offers. If such a system were to be developed it would need to cater for three different types of requirements:

  • Objectives: the ability to satisfy the different approaches and preferences of the individuals concerned while providing a coherent uniform system.
  • Individual characteristics: the ability to cater for things such as the different terminology used by each individual, and for the fact that each individual will only have a mental imprint of the documents they themselves have included in the system.
  • Technology: the use of compatible technology by all individuals including things like software applications to open documents, file formats, application versions, and a commonly accessible file store such as the cloud.

Q51. What is the most effective way to enable two or more people to share the same files?

2001 Answer: Experience gained: Make entries in your own index in your own words but use the other person’s reference number in your index instead of creating your own (Wilson 1992: 2.4)

2019 Answer: Experience gained: Two or more people can share files very effectively using a cloud-based service such as Dropbox. However, this is probably used mostly for a limited number of files on a specific subject for a few months or years. I have not heard of large ongoing collections of personal files being shared in this way – though I suppose it might be feasible with some sort of shared index which is also held in the cloud. Large Corporate Document Management systems also enable people to share files, but such systems have different objectives, regulations and constraints and cannot be considered to be personal filing systems.

PAWDOC: Portability

Since the early 1980s, office work has become increasingly decoupled from a single static location. Most people no longer have their own offices – in fact, many people no longer have their own desks. Hot desking and working from home are commonplace. Business travel by car, train and plane is widespread; and working away from the base location in hotels or remote offices is a regular experience for some people. This mobility, and the fact that most office work is now undertaken electronically, continues to drive the development of powerful portable computers. The modern laptop is very light and powerful, and can have a huge storage capacity. The PAWDOC collection is stored on a machine weighing 1.3 kg and takes up just under one twentieth of the total 1Tb capacity of its solid state drive.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q48. Are electronic filing systems portable?

2001 Answer: Experience gained: Yes, except for the scanner. The software systems run very well on a laptop computer. Most, if not all the electronic files and scanned images can be held on a laptop’s hard disk (which in 2001 would normally have between 5 GB and 10 GB available). If additional storage is required, CDs will meet the requirement.

2019 Answer: Fully answered: Yes – modern laptops have more than enough storage capacity to store all the digitised contents of a personal filing collection; and they have the power and display technology to enable fast searching and easy reading of the selected documents. Such laptops are lightweight and eminently portable. Even portable scanners have been available for some time. However, scanning technology is now commonplace and is likely to be available in most offices (often through the office photocopier), so carrying around a portable scanner may not be worth bothering with.

Q49. Under what circumstances does a filing system need to get used away from the base office?

2001 Answer: Experience gained:

  • While travelling in trains, planes, hotels and hot desks.
  • While working for periods of days, weeks or months away from the base office. (Wilson 1992b: 2.9).
  • While at home.

2019 Answer: Fully answered: In addition to the circumstances listed in the 2001 answer (while travelling, while working away from the base office, while at home) I would add the following:

  • While attending meetings.
  • Any place and situation in which the owner is working.

PAWDOC: Technology requirements and problems

To operate a personal electronic filing system you need a computer with a screen, a scanner, software to manage an Index and the documents, and a general approach. My colleague, John Pritchard, and I decided to explore what it would be like to operate such a system after visiting Amoco in the USA, and we followed the approach that we had seen there: every document was given a reference number and an index entry and was then stored in reference number order. Searches were performed on the index and retrieval was achieved by using the reference number.

We were able to apply the approach immediately using index cards. However, the technology to support the approach took a long time to become powerful and cheap enough for an individual to apply it; and it took many more years before it could be considered to fully support personal electronic filing systems. Consequently much of the experience gained in using hardware and software to support PAWDOC has been in how to manage imperfect technology solutions. This has been particularly the case with computer storage which was insufficient and expensive when I first started scanning PAWDOC documents in 1996. The bulk of scanned documents had to be held offline on Magneto-Optical disks and this not only imposed a whole set of management requirements but also constrained the portability of the system. Today, however, storage is plentiful and cheap and the whole of the digitised PAWDOC collection is held on my laptop.

Scanners too have become better and cheaper since 1996. The first one I had was only capable of scanning in Black & White and one side of the paper at a time. Consequently, scanning large documents took a long time, and any colour on documents I scanned at that time has been lost. The scanner I have today takes less time to scan a page despite the fact that it is also scanning in colour and both sides of the paper as it goes through the machine.

In many ways the software to support personal filing has always been in place, but its performance has been constrained by computing power. For example, the indexing software I use took over three minutes to conduct a complex search on fewer than 4,000 records in 1988, whilst my current version of the same software takes less than one second to conduct the same search on over 17,000 records.

The software to manage the stored documents has also been constrained by computer power – but in a rather unexpected way. In the 1980s and 90s when I first started using the PAWDOC system the conventional thinking was that a dedicated Document Management System was needed for the purpose. Such software applications were large complex beasts with numerous features and they relied on an underlying database application. Today, PAWDOC documents are stored in Windows folders labelled with a Reference Number. My laptop and the Windows 10 operating system are more than powerful enough to be able to display and search over 17,000 folders in just a few seconds. Such a solution would not have been feasible in the mid 90s, but today’s power has enabled a very complicated and constraining element of the personal electronic filing system architecture to be dispensed with.
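Retrieval from folders labelled with a Reference Number needs very little code. The Python sketch below assumes a hypothetical folder-naming rule in which the slashes in a Reference Number are replaced by hyphens (Windows does not allow "/" in folder names); the actual PAWDOC folder naming may well differ, and the function names are illustrative.

```python
from pathlib import Path

def folder_name_for(reference: str) -> str:
    """Hypothetical naming rule: replace '/' with '-' to get a valid folder name."""
    return reference.replace("/", "-")

def files_for_reference(collection_root: Path, reference: str) -> list:
    """Return the sorted names of the files in the folder labelled with `reference`."""
    folder = collection_root / folder_name_for(reference)
    if not folder.is_dir():
        return []  # no folder for this Reference Number
    return sorted(p.name for p in folder.iterdir() if p.is_file())
```

Because the lookup is just a path join, it stays fast however many folders the collection holds; it is the index search that does the real work of deciding which Reference Number to ask for.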

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q38. What additional software functionality is required?

2001 Answer: Partially answered: A system which eliminates the need for two systems by combining simple and flexible indexing and searching functionality, and file management functionality which keeps track of the thousands of electronic files (Wilson 1996a: 3).

  • Facilities to detect low usage and to automatically recommend the destruction of paper (after scanning).
  • Intelligent synonym functionality that can recognize relationships between frequently used abbreviations and terms, and which requests the user to confirm possible synonym relationships (Wilson 1990: 96–97).
  • The ability to automatically manage multi-part reference numbers of the type PAW/DOC/7653/01 and to be able to present the next unused number.
  • The ability to produce a KWIC (Key Words In Context) or KWOC (Key Words Out of Context) index (Wilson 1992a: 29,30).
  • The ability to store a set of web pages without losing the links between them (the FISH Document Management System is unable to do this because it stores each individual file with a new file name consisting of a combination of alphanumerics) (Wilson 1995b: 131)
  • Functionality to support the assembly, development and use of knowledge (Wilson 1997: 3–4).

2019 Answer: Fully answered: My current views on the additional functionality listed in the 2001 answer are as follows:

  • Combined Indexing and file storage: Now that I have eliminated the Document Management System and replaced it with Windows folders, I no longer feel this is needed. However, despite retrieval being simple and quick, it could be made even more effective if the files associated with a particular Reference Number could be automatically listed under the Index entry for that number; and if the file you require could be selected and opened from that list.
  • Low usage detection: Now that all documents are digitised and paper is no longer taking up valuable space, there is no need to identify which hardcopies are not being accessed and could therefore be digitised and removed. Consequently this requirement is no longer needed.
  • Intelligent synonym functionality: Terminology continues to change, so this is still required.
  • Management of multi-part Reference Numbers: This is still a requirement. It would make it quicker and easier to create new index entries.
  • Production of a KWIC index: I no longer produce paper backups of the index, so this is no longer required.
  • Store web pages without losing links: I now use zip functionality to combine and store the multiple files making up a single web site, so this is no longer required.
  • Nugget/knowledge management: I never clearly ascertained if this would be worthwhile or not (see more detailed discussion in the answers to questions 27 – 29, and also in the topic ‘Knowledge Development’ elsewhere in this web site).
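To illustrate the multi-part Reference Number requirement in the list above, here is a rough Python sketch of presenting the next unused number. It assumes that references have the four-part form OWNER/SET/SERIAL/PART seen in the 2001 example PAW/DOC/7653/01, that "next unused" means one more than the highest serial in use for that owner and set, and that a new document starts at part 01; the function name is hypothetical.

```python
def next_reference(existing: list, owner: str = "PAW", doc_set: str = "DOC") -> str:
    """Return the next unused reference for the given owner and set."""
    prefix = f"{owner}/{doc_set}/"
    serials = []
    for ref in existing:
        if not ref.startswith(prefix):
            continue  # belongs to another owner or set
        serial = ref[len(prefix):].split("/")[0]
        if serial.isdigit():
            serials.append(int(serial))
    next_serial = max(serials, default=0) + 1
    return f"{prefix}{next_serial}/01"

# Example index contents (made up for illustration).
existing = ["PAW/DOC/7652/01", "PAW/DOC/7653/01", "PAW/DOC/7653/02", "XYZ/DOC/9000/01"]
print(next_reference(existing))
```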

In addition I would add the following:

  • Use of flexible Date formats: This is required to be able to specify BOTH exact dates (for, say, the date a document gets created or a letter is sent – dd/mm/yyyy); AND partial dates (for, say, the year a book is published – yyyy – or the month and year of publication of a journal or magazine – mm/yyyy)
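
The flexible date requirement can be sketched in Python as follows. This is only one possible approach, not a feature of any existing Index software: the class and function names are hypothetical, and the rule that a partial date sorts before exact dates in the same period is an assumption made for the example.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlexibleDate:
    """A date that may be exact (day, month, year) or partial (month/year or year)."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def sort_key(self):
        # Assumption: missing parts count as 0, so a partial date sorts
        # before any exact date within the same month or year.
        return (self.year, self.month or 0, self.day or 0)

    def __str__(self):
        if self.day is not None:
            return f"{self.day:02d}/{self.month:02d}/{self.year:04d}"
        if self.month is not None:
            return f"{self.month:02d}/{self.year:04d}"
        return f"{self.year:04d}"


def parse_flexible_date(text):
    """Parse 'dd/mm/yyyy', 'mm/yyyy' or 'yyyy'; raise ValueError otherwise."""
    parts = text.strip().split("/")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"Unrecognised date: {text!r}")
    if len(parts[-1]) != 4:
        raise ValueError(f"Year must have four digits: {text!r}")
    numbers = [int(p) for p in parts]
    if len(numbers) == 3:
        day, month, year = numbers
        datetime.date(year, month, day)  # raises ValueError for impossible dates
        return FlexibleDate(year, month, day)
    if len(numbers) == 2:
        month, year = numbers
        if not 1 <= month <= 12:
            raise ValueError(f"Month out of range: {text!r}")
        return FlexibleDate(year, month)
    return FlexibleDate(numbers[0])
```

Storing the parts separately, rather than forcing every entry into a full date, means a book's publication year does not have to be given an invented day and month.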

Q39. What technology problems have been experienced while operating the electronic filing system?

2001 Answer: Experience gained:

  • Replacing the PC requires the re-installation of all the software, which has been problematic on the last three occasions.
  • Upgrades of software can require a complex conversion process.
  • The index software (Filemaker Pro) crashes from time to time, but the Filemaker recovery function has always been able to deal with it except for an occasion over 10 years ago when the backup file had to be used (Wilson 1992a: 65, 72, 79).
  • About 30 of the files were lost in the document management system – probably in the course of moving them to off-line storage or moving them back into the PC’s hard disk (Wilson 1995b: 137, 139).
  • One of the Magneto-Optical disks became corrupted and the backup files had to be used (Wilson 1995b: 141).

2019 Answer: Fully answered: Over the last 20 years most of the technology problems I’ve had seem to relate to four main areas – storage, specialist software, upgrades, and obsolescence:

  • Issues associated with the lack of cheap reliable storage: This problem has largely disappeared. When I first started scanning in 1996 I had to use external Magneto-Optical disks attached to my laptop, and I did suffer some data transfer and disk corruption problems. Today I have more than enough fast storage with the 1Tb SSD in my laptop.
  • The management and cost of specialist software: I had to deal with a wide variety of issues over the years with my document management software and its associated Sybase, and subsequently SQL, database. So much so that I have concluded that it is far better to avoid all specialist software if at all possible. It introduces complexity and is costly to buy, upgrade and support. While general purpose software may have fewer features, overall it is likely to be much easier to manage and use, and is likely to be a much more viable long term solution for the individual. I am very pleased to have eliminated the document management system and associated database from the PAWDOC architecture, and to now be using the much more familiar and straightforward Windows folders to store PAWDOC files in. I still use Filemaker for the Index but I regard this also as specialist software. Although it is very reliable and presents few management issues, it still has to be upgraded every three years at a cost of over £200 a time; whereas I know that I could still operate the Index if I exported the data to an Excel spreadsheet. In conclusion, I would recommend anyone setting up a personal electronic filing system to use standard multi-purpose software, preferably software which you are already using, and to avoid specialist software if at all possible.
  • The complexities associated with upgrading platforms and operating systems: Moving systems from old to new computers, or upgrading operating systems, are major changes with associated risks. That’s not to say that it will necessarily be difficult – but over the years I have encountered issues and have found the more complex the systems being used the greater the challenges. The document management system had to be totally reinstalled from scratch when it was moved to a new laptop and that was something I only ever achieved once by myself without any supplier support. Now that PAWDOC only uses a Filemaker Index and Windows folders, the risks and difficulties associated with upgrades are much lower.
  • Obsolescence: As files are accumulated over the years, they may become unreadable because you no longer have the appropriate application software running on your machine. When I conducted a Digital Preservation exercise on the PAWDOC system in 2016-2018 I discovered many examples of such files, and it took a considerable effort to deal with the problems and achieve readable files again. Similar problems can affect hardware such as disks and memory sticks – though I feel less vulnerable on this front as I have sufficient storage on my laptop to cope with all PAWDOC requirements. However, anyone operating a long term filing system is going to have to undertake periodic Digital Preservation work of one sort or another to ensure that their documents continue to be readable.

Q40. What contingency arrangements can be made to minimize and overcome technology problems?

2001 Answer: Ideas formed:

  • Make clear notes on little used technology procedures and fixes.
  • Document system components and configuration settings.
  • Assemble support phone numbers.
  • Keep all of the above in hardcopy and in a place that does not require the filing system to find them.

2019 Answer: Fully answered: In addition to the four points made in the 2001 answer (make notes on procedures and fixes, document components and configurations, document support numbers, keep such documentation outside the system in hardcopy), I would add:

  • Be diligent about regularly backing up.
  • Ensure you know how to use backup data to reinstall applications.
  • If you have a specialist Index application, consider regularly exporting the data to a spreadsheet application so that, if the application fails, you still have immediate access to the Index.
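
One way to support the points above about backing up is to check from time to time that a backup copy really matches the master collection. The Python sketch below is a minimal illustration, not the procedure used for PAWDOC: the function names are hypothetical, and it assumes the backup is a plain copy of the folder tree on an external drive.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_backup(master_dir, backup_dir):
    """Return (missing, changed): files absent from, or different in, the backup."""
    master = snapshot(master_dir)
    backup = snapshot(backup_dir)
    missing = sorted(name for name in master if name not in backup)
    changed = sorted(
        name for name in master
        if name in backup and master[name] != backup[name]
    )
    return missing, changed
```

If both returned lists are empty, every file in the master collection has an identical copy in the backup. Files that exist only in the backup are deliberately ignored in this sketch.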

Q41. What equipment is needed to operate a filing system and what are the key criteria by which it should be selected?

2001 Answer: Experience gained:

  • A high resolution monitor preferably capable of displaying a whole A4 page in a magnification you can read.
  • A laptop computer with sufficient hard disk to store all the electronic files and scanned images in the collection, and with room for the growth of the collection.
  • An off-line storage system that can be used to make backups of all the collection’s electronic files and scanned images, as well as the electronic filing software application’s configuration, control and data files.
  • The equipment should not be too noisy.

2019 Answer: Fully answered:

  • A large screen high resolution colour monitor big enough to display a whole portrait page sufficiently large as to be roughly readable without magnification, and/or capable of being turned into a portrait monitor as required.
  • A lightweight laptop computer small enough to be transported in hand luggage, with a high resolution colour screen, sufficient fast SSD storage to accommodate the whole of the personal filing collection, and which makes a minimal amount of noise.
  • A colour scanner with both a sheet feeder and a flatbed capable of scanning documents at least A4 in size, which is reasonably fast, and is small enough to fit on or next to your desk. Its software should be capable of automatically adjusting the scan to the size of the document and automatically adjusting sloping originals to produce a vertical scan. It should also provide easy to use facilities for adjusting contrast and brightness to deal with poor originals, and for resetting after sheet feeder jams.
  • Two or more external hard disks or flash drives with sufficient capacity to store the whole of the personal filing collection, for use as a) a local backup, b) a remote in-country backup, and c) if required, a remote out-of-country backup.

Q42. What considerations should be taken into account when physically laying out the filing system?

2001 Answer: Partially answered:

  • Paper files should be placed so they are accessible while sitting at the desk (Wilson 1990: 94)
  • The scanner should be placed so it can be operated while sitting at the desk.

2019 Answer: Fully answered: Over the years I’ve had to cope with a variety of company offices and a long period of operating out of my home study. In all these situations I have tried to arrange the physical layout so I could conduct all my filing activities while seated at my desk. I’ve found this to be feasible and effective.  Hardcopy files can be placed in an upright filing cabinet (or cardboard boxes) alongside or behind one’s desk; and a scanner can be placed on the right hand edge of the desk. Backup external drives can be placed in a pedestal drawer.

Q43. What criteria should be used to select an electronic filing system software package?

2001 Answer: Experience gained:

  • Ability to support the desired filing schema.
  • Ability to manage both hardcopy and electronic files.
  • Enables the rapid input of new items.
  • Enables easy and quick searching.

2019 Answer: Fully answered: In addition to the points made in the 2001 answer (support for the filing schema, management of both hardcopy and electronic files, rapid input of new items, and easy and quick searching), I would add:

  • Simplicity and understandability of the architecture of the system.
  • Ease of installation.

Q44. Is it feasible to construct a filing system out of multiple different software packages?

2001 Answer: Experience gained: Yes. However, provided all the requirements are met, it would be more efficient and easier to manage if only a single package was required.

2019 Answer: Fully answered: Yes, it is feasible, provided effective integration between the packages can be achieved, and provided not too much effort is required to set up and maintain the integration. However, it undoubtedly complicates matters and requires more effort to manage and maintain; therefore, the simpler the packages to be integrated the better. On balance, I would not recommend it if a single piece of software will do the job.

Q45. How much file space do you need to store an individual’s personal files?

2001 Answer: Experience gained: Assuming only black and white scanning, no digitizing of journals or books, and no video material, a collection built up over 70 years would require approximately 53 GB (Wilson 2001a). Until experience is gained of colour scanning and digitized video, a more realistic figure cannot be estimated.

2019 Answer: Partially answered: The current PAWDOC collection can’t be considered a total lifetime collection because:

  • A substantial number of the colour hardcopies were scanned in B&W.
  • For about 10 years when I was working in Bid Management with highly confidential and fast moving documents, the number of documents I was putting into the collection was much reduced.
  • The collection only includes about 30 years of my 40 years of work.

It should also be remembered that about half the collection was assembled under business conditions that were in transition from paper only to paper + electronic – very different from today’s environment. Furthermore, the type of work I did and my overlapping interest in technology research dictated my coming into contact with a particular range of documents; different types of jobs and interests will dictate different numbers of documents of different types.

Having said that, all the items in PAWDOC have been digitised and the overall digital collection takes up about 46Gb.

Q46. How much file space is taken up by the average document?

2001 Answer: Partially answered: Chan’s results showed the following sizes for an A4 page: line art 87 kb; black and white 91 kb; halftone 181 kb; and colour 3347 kb (Chan 1993: 28). In practice, initial black and white scans at 240 dpi were producing an average file size of about 40 kb (Wilson 1995c: 1).

2019 Answer: Fully answered: File sizes vary depending on what application they have been created in and on whether they are scanned as colour or B&W documents. Therefore, file sizes for a number of these combinations were established using my current scanner (a Canon DR-2020U) to scan at 300dpi a single full page of typed text for the B&W document and a single page containing 5 colour photos of various sizes for the colour document.

  • B&W page created in Word 2007: 13 Kb
  • B&W page scanned in B&W to PDF: 105 Kb
  • B&W page scanned in Greyscale to JPG (the scanner would not scan in B&W to JPG): 579 Kb
  • B&W page scanned in 24 bit colour to JPG: 584 Kb
  • B&W page scanned in B&W to TIF: 69 Kb
  • Colour page created in Powerpoint 2007: 1,100 Kb
  • Colour page scanned in 24 bit colour to PDF: 808 Kb
  • Colour page scanned in 24 bit colour to JPG: 750 Kb
  • Colour page scanned in 24 bit colour to TIF: 25,389 Kb
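
The per-page figures above can be turned into a rough storage estimate for a planned collection. The Python sketch below is illustrative only: the page counts in the usage note are hypothetical, and each figure comes from a single test page, so real documents will vary. It treats "Kb" as kilobytes and assumes 1,024 × 1,024 Kb per Gb.

```python
# Measured sizes in Kb per page, copied from the list above.
PAGE_SIZE_KB = {
    "bw_pdf": 105,
    "bw_tif": 69,
    "colour_pdf": 808,
    "colour_jpg": 750,
    "colour_tif": 25389,
}


def estimate_storage_kb(page_counts):
    """Return the estimated total size in Kb for a mix of page types.

    page_counts maps a key of PAGE_SIZE_KB to a number of pages.
    """
    unknown = sorted(set(page_counts) - set(PAGE_SIZE_KB))
    if unknown:
        raise KeyError(f"No measured size for: {unknown}")
    return sum(PAGE_SIZE_KB[kind] * pages for kind, pages in page_counts.items())


def kb_to_gb(kilobytes):
    """Convert Kb to Gb, assuming 1,024 Kb per Mb and 1,024 Mb per Gb."""
    return kilobytes / (1024 * 1024)
```

For example, `kb_to_gb(estimate_storage_kb({"bw_pdf": 1000, "colour_pdf": 100}))` gives the approximate space needed for 1,000 B&W pages and 100 colour pages, all scanned to PDF.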

Q47. What’s the best type of storage media to keep electronic files on?

2001 Answer: Experience gained: A hard disk in the laptop is best because it is so quick and easy to use. CDs are good because CD writers are cheap and CD drives are available in most laptops. Having said that, this does not preclude other media with similar characteristics.

2019 Answer: Fully answered: It’s best to keep your files with you on a laptop – or on your mobile phone provided you have all the necessary applications on the phone and you feel the screen is big enough to be able to read the documents. However, since both laptops and phones are portable and therefore at higher risk of being lost or stolen, adequate measures must be taken to protect the data should the equipment fall into the wrong hands. Another possibility is to store the master set of files in a cloud-based service; however, I believe that would be unwise due to the risks of the service failing or being subject to viruses or hacking. A cloud-based service may be suitable for backup, though external hard drives or SSD flash drives are cheap and effective enough for the purpose.