Comments on reunions with old documents

My colleague, Clive Holtham, was instrumental in putting me in touch with suppliers who loaned me a scanner and document management software around 1995, to enable me to progress my mission to understand how personal electronic filing would work in practice. Some six years later, in February 2001, Clive and I met up for dinner and a catch-up on what we’d both been doing. I explained that as well as scanning new hardcopy as I acquired it, I was also trying to scan all the legacy documents I had acquired since 1981, when I started this electronic filing adventure. Clive pointed out that it would be interesting to see what I thought of each document in retrospect, as I carried out the scanning process. After all, the point of indexing and filing the documents was the assumption that some of them would have some value downstream. Here was an opportunity to get an insight into what their downstream value might be.

I took Clive’s suggestion on board towards the end of 2001; but, to minimise the effort required, I decided I would only comment on those documents which prompted some particular thoughts. The comments would be recorded at the end of the Title field in my filing index; and they would be identifiable by being placed within a special set of characters in the following format: <<! Date: Comment Text here !>>. To make it easier, I created a script in my Indexing software to automatically place the delimiter characters with the current date at the end of the Title field, and assigned it the keyboard shortcut CTRL-8. This seemed to work in practice, and I got into the habit of making my comments in real time as they occurred to me. After a while, I started to use the facility to record other information, such as a document being duplicated in another Index entry, or problems I had had with scanning a document. Now, in 2021, 20 years after starting to record these comments, I find that 584 records within my filing index possess such comments; and 20 of those have two comments.
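Because the comments sit inside fixed delimiters, they can be added and pulled back out mechanically. The following is only a minimal sketch in Python of that idea: the function names, the sample title and the date style used in the usage note are hypothetical, and it is not the script that runs inside the Indexing software.

```python
import re

# One comment looks like:  <<! Date: Comment Text here !>>
# The date is taken to be everything up to the first colon (an assumption).
COMMENT_RE = re.compile(
    r"<<!\s*(?P<date>[^:]+?)\s*:\s*(?P<text>.*?)\s*!>>", re.DOTALL
)


def add_comment(title, stamp, text):
    """Append a delimited, dated comment to the end of a Title field value."""
    return f"{title} <<! {stamp}: {text} !>>"


def extract_comments(title):
    """Split a Title field value into the plain title and its comments.

    Returns (plain_title, [(date_string, comment_text), ...]).
    """
    comments = [
        (m.group("date"), m.group("text")) for m in COMMENT_RE.finditer(title)
    ]
    plain_title = COMMENT_RE.sub("", title).strip()
    return plain_title, comments
```

For instance, calling add_comment twice on a title and then passing the result to extract_comments would give back the original title plus a list of two (date, text) pairs, which is the shape needed to count records with one or two comments.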

This is an analysis of what those comments say. They have been placed into one or more of 5 categories:

  • Comments on the impact of the material (7% of the 584 records with comments)
  • Comments on the contents of the material (32%)
  • Comments which prompted questions and thoughts (23%)
  • Comments about memories forgotten and/or remembered (17%)
  • Comments about filing, indexing and scanning activities (46%)
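The percentages can be checked from the category counts given in the section headings further down. This is a small Python sketch of that arithmetic; the dictionary keys are shortened labels chosen for illustration. It also shows that the counts add up to more than 584, which is what you would expect when a comment can fall into more than one category.

```python
RECORDS_WITH_COMMENTS = 584

# Counts taken from the section headings in the discussion below.
category_counts = {
    "impact": 43,
    "contents": 188,
    "questions and thoughts": 135,
    "memories": 100,
    "filing, indexing and scanning": 266,
}

# Share of the 584 commented records falling in each category, to the nearest percent.
shares = {
    name: round(100 * count / RECORDS_WITH_COMMENTS)
    for name, count in category_counts.items()
}

# Total allocations exceed the number of records because categories overlap.
total_allocations = sum(category_counts.values())

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"allocations: {total_allocations} across {RECORDS_WITH_COMMENTS} records")
```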

The full list of comments and the categories to which they have been allocated is provided in this link. The comments have been further allocated into sub-categories which are used in the discussion below. However, the following two salient points need to be borne in mind when considering the results of this investigation:

Scale: Although 584 records with comments may sound a large number, in fact comments have only been made on a small subset of the contents of the filing system: 584 is only about 3% of the 17,350 records in the index. This could indicate that the sample is too small to generalise from, though I believe it is more likely to indicate that relatively few documents merited a comment. Unfortunately, there is no data to investigate which of these two possibilities is the case – the decisions to include comments were made in an arbitrary manner over many years.

Lag: The time between a document being included in the filing system and a comment being made about it has almost certainly affected many of the comments. Presumably, the more time that passes, the less likely the contents of a document are to be remembered, and this may make them more remarkable when they are encountered again. The actual lags that occurred have been calculated, in years, as the difference between the Creation Date field in the Index and the date recorded at the beginning of each comment. This shows that over 93% of the comments were made more than 10 years after the documents were included in the filing system; and over 50% had comments with a lag of over 20 years. Only 12 items had comments with a lag of less than 5 years.
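A lag calculation of this kind could be sketched in Python as follows. The assumptions are loud ones: the lag is measured as elapsed days divided by 365.25 (the text does not say exactly how the years were worked out), and the function and key names are hypothetical.

```python
from datetime import date


def lag_in_years(created, commented):
    """Elapsed time between the Creation Date and the comment date, in years.

    Assumption: fractional years, counted as days / 365.25.
    """
    return (commented - created).days / 365.25


def lag_summary(records):
    """Summarise lags for a list of (creation_date, comment_date) pairs."""
    lags = [lag_in_years(created, commented) for created, commented in records]
    n = len(lags)
    return {
        "pct_over_10_years": 100 * sum(lag > 10 for lag in lags) / n,
        "pct_over_20_years": 100 * sum(lag > 20 for lag in lags) / n,
        "count_under_5_years": sum(lag < 5 for lag in lags),
    }
```

Fed with the Creation Date and comment date of every commented record, a summary like this would produce the three figures quoted above; with whole-year rounding instead of fractional years the boundaries at 5, 10 and 20 years would shift slightly.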

Comments on the impact of the material (43)

These comments include remarks about documents which have influenced my thinking (8). For example, “This is a most important paper because it alerted me to the key insight that to get the most out of an OA investment the organisation must change the way it does business”. A further 9 comments relate to documents which were more generally important to my work, for example, “This was an important edition of EDP analyser and highly relevant to NCC’s OA team of which I was a part”. Finally, 26 comments were made about documents that are special in a variety of other ways, for example, “This is a great example of how to do brainstorming”, and “This is an interesting document to have from the early days of the net”.

Comments on the contents of the material (188)

Just over a third of this category is concerned with comments about a document I wrote or activity I was involved with. This is hardly unexpected given my intimate relationship with the events. For example, “Have just read the suggestions I made to Esprit about its CSCW program. I wonder if they made any kind of difference”; and “This was my one not very successful claim to broadcast fame – and I’m not even sure it got broadcast”. A quarter of the comments just remark on ‘interesting content’, for example, “This is a fascinating article because it represents a twilight period in the change from old style typists to individuals doing the typing all themselves”; and “This was worth another read – definitely food for thought…”. The remainder include comments in a range of other sub-categories – listed below together with an example for each one.

  • Comments on technology developments (16) – “Seems very advanced for 1978”
  • Assessments of predictions (8) – “The prediction of a day in the life of the CEO in 2013 didn’t get it quite right”
  • Comments on the author or other people (8) – “I’ve been thinking about getting in touch with X again”
  • Comments on photos in documents (7) – “While scanning this I discovered that it contains a photo of X”
  • Content which I thought I might find useful (26) – “This document is highly relevant to the assignment I am about to start”
  • Comments which provide a critique of content (5) – “I think this process missed out the key element of Improvement by Learning by Doing”.

Comments which prompted questions and thoughts (135)

The majority of these comments – some 60% – were general reflections and musings prompted by the documents concerned. For example, “I think it demonstrates that prior to the internet and the web there was a different way of thinking about information: in those days having the information meant having the actual item, whereas today, in the internet/web/mobile era, having the information is all about having a device and knowing where to look”; and “It would be interesting – amazing – to re-run this event with the same people”. The other six sub-categories are all specific questions:

  • Is this still around/available/the case today? (10 comments – for example “I don’t hear the term ‘Groupware’ much these days – I wonder if it has fallen out of use”)
  • What’s a person doing today? (15 – “I wonder If X is doing anything related to this now – haven’t seen him for about 20 years”)
  • Is this still relevant today? (12 – “This might be interesting to read to see if 25-year-old advice about dealing with Info overload still applies”)
  • How does this look in retrospect? (4 – “There was a big fuss about X’s thinking on this – would be interesting to see how it all looks in retrospect”)
  • What was the impact of this? (5 – “This work on Teletel was ground breaking and was subsequently successful. How it affected the French use and take-up of the web I don’t know”)
  • How did these predictions fare? (6 – “The Booze-Allen Hamilton report was very influential. It would be interesting to see how its predictions fared”)

Memories forgotten and/or remembered (100)

70% of these comments are about things I’d forgotten either partially or wholly; and 30% about things I remembered about associated aspects, or about people. Examples of each are provided below:

  • Forgetting something about a document or a related activity (28 comments), for example, “I’d forgotten these details and didn’t know I had these notes”
  • Forgetting about the document or activity altogether (41): “Can’t remember giving this talk”
  • Remembering associated aspects or it prompted memories (19); “This was a pioneering machine – we really liked the Snake game, and the early type of remote access mail through the phone lines was relatively quite advanced”.
  • Remembering the author/other person (12); “That’s a name I haven’t thought about for years! – think I met him”

Filing, indexing and scanning activities (266)

Over a third of all these comments concern filing practicalities – not an aspect which was envisaged when I established this comment facility. Recording information about the operation of a filing system is definitely an overhead, so there is a natural tendency to minimise the effort spent on it. Consequently, the fact that it was quick and simple to create comments in a form which was tightly coupled with individual documents and their index entries made this facility an obvious choice for quickly documenting issues or important observations. The 22 separate sub-categories of comment listed below, together with an example for each one, illustrate the extensive range of topics that were encountered as the PAWDOC collection grew and aged (note that over 93% of these comments were made more than 10 years after the document concerned had been included in the collection).

  • Practicalities of using PAWDOC (5) “Must force myself to search for stuff even if I don’t think it’s in this index!”
  • Deciding what to include/remove (5) “Artefact removed for inclusion in PAW personal collection”
  • Notes about where items originated (6) “The Quick Reference Card was included in Nov2018 when I found it inside the WGEM starter pack”
  • Notes about what version is filed (8) “This Aug86 version must have replaced an earlier version in my collection”
  • Notes about artefacts (6) “Specified this as an artefact at this late date because it’s the first issue I have in this new format”
  • Notes about cross-references in the collection (7) “See also PAW/DOC/0110/145”
  • Notes about duplicates in the collection (87) “Some of these documents are duplicated in PAW/DOC/7971/01”
  • Notes about Archiving (6) “This was in an archive box but archive status had not been specified in the Movement field”
  • Comments on Reference Number (16) “This document has the number PAW/DOC/0005/03 at the top – but that number is for something else”
  • Comments on Title field (9) “Inserted the info about the abstract when I was scanning because there was no reference to it in the title”
  • Comments on Creation date (12) “Don’t understand how the date on this paper is 1986 but the record was created in 1984”
  • Comments on Publication date (3) “2019 properties of the word doc say this was modified on 31May1985 so this was probably the publication date”
  • Comments on Movement field (10) “Don’t know why this says it was scanned and paper destroyed in 2004 – in Feb 2006 there was a full envelope of material in the box”
  • Losing/deleting index information (4) “I deleted the title text of this accidentally when scanning so this is a replacement title text”
  • Lost or misplaced documents (17) “Found the electronic version of this filed in FISH under PAW/DOC/4052/01”
  • Relationship with Personal files (8) “I found these PAW/DOC papers in one of my personal home files”
  • Notes about physical characteristics of items (17) “This printout had almost completely faded so it was a challenge to see if the scanner would bring the text to light – and it didn’t do a bad job!”
  • Notes about disks in the collection (6) “This included a disk containing a DOS version of the ITSforGKProposal”
  • Management of the FISH DMS (8) “This seemed very necessary at the time when disk space was short – and very complicated. Now in 2006 with 40Gb on my PC it doesn’t seem to be an imperative at all”
  • File formats & Digital Preservation (7) “No longer able to read the floppy disk when it came to take this material out of archive to scan it in 2006”
  • Notes about loading electronic files to FISH (17) “The Word version doesn’t have the appendices so I PDF’d the Word version and then scanned the appendix pages from the hardcopy. Unfortunately, the pagination of the Word document is slightly different from that of the hardcopy – but the words are all the same”
  • Notes about Scans and Scanning (49) “These pages were too thick to go through the duplex scanning process so I had to do one side first and then the other side”.

Conclusions

No great revelations have emerged from this investigation. However, it’s clear that reviewing old material in this way provides an opportunity to reflect, and perhaps to rediscover potentially useful material. These are luxuries that are hard to come by amidst the pace of modern life. Whether such activities actually provide any tangible benefits is hard to say: I can’t remember if any of the rediscovered documents made a difference in my subsequent assignments; and the benefits of reflection are difficult to pin down at the best of times (though I personally feel it is always worthwhile).

The one practical finding that has emerged from this exercise is that there are significant advantages in being able to quickly and easily annotate a filing index with any relevant additional information, be that extra detail about content, or factual information about the way that content has been filed. The former augments the information provided by the filing system, and the latter assists in its smooth operation. In fact, the latter is more than a mere nicety. My experience has shown that, as this type of personal filing system grows and ages, the number of imperfections it possesses increases substantially. The long list above of sub-categories of ‘Filing, indexing and scanning activities’, and their associated examples, provides an indication of the range of issues that can arise. Having the ability to quickly note details of those issues in a place where they are likely to be immediately visible to the user is of great benefit.

Rethinking the Table Present

It’s been a tradition in our family to have table presents at the Christmas lunch, but this year we didn’t; it had all become a bit difficult and expensive, and, in this year of pandemic lockdowns, there were only three of us at the table. However, it’s quite a nice thing to do, so I got to thinking there might be an easier and cheaper way. Maybe the present could just contain a piece of paper describing something you think the person concerned might like but didn’t know about. For example, a holiday destination, or a hotel, or a book, or a hobby, or a restaurant, or a walking trail, or a type of pet, or a band, or a piece of clothing, or a voluntary job with a particular charity… or almost anything really that you think the person might enjoy. It might work for New Year meals as well.

Getting a dry grip

During a wet round of golf last Wednesday, I was reminded again of the problems of slippery wet golf club grips. In a previous wet round, I’d tried putting the club handle up inside the front of my waterproof jacket: it kept the handle dry but was fiddly. Last Wednesday, however, I tried putting the handle underneath my arm on the outside of my waterproof jacket, which I found much easier, and just as effective at keeping the rain off the grip. Now, if waterproof jacket manufacturers could put some towelling or other drying device on the underside of one of the arms, which would dry already wet handles, I think we might have a solution to the problem.

Time for Structure Substitution

The TED talk I’ve just listened to by Yaël Eisenstat (Dear Facebook, this is how you’re breaking democracy, Aug2020) is important because it explains how Facebook’s business model is dependent on creating constant interest and emotion in its users. This ultimately leads to the system essentially promoting extremism. As I was listening, it occurred to me that it is Facebook’s structures (the extra functionality provided around a simple messaging system – such as adding a ‘like’ button) that dictate this result. A Social Media system with a different set of structures could avoid such harmful effects. Perhaps it’s time for competitors, or an Open Source operation, to create a messaging system with structures that promote a society with people who listen to each other and work together; and to draw users away from Facebook. In the meantime, the more people who listen to Ms. Eisenstat’s talk the better.

Self-publishing a Photobook

To get an idea of the possibilities for photobooks, just take a look at the Blurb bookstore; there’s a huge diversity of subject matter, and the books look great. It’s clear that anyone who has a passion can create a permanent record which will sit handsomely on a bookshelf for around the cost of a meal out or less. Furthermore, authors can elect to sell their book in the Blurb bookstore and/or through Amazon; and they can specify how much money they want to make on the sale of each copy. Blurb will keep track of sales and remit the income due to the author each month.

I’d already had a go back in 2012 – but with a service designed more for the presentation of photographs than for discursive text. The result was pleasing but not brilliant. I’d heard there were more appropriate online printing operations – and I determined to try one out sometime. My opportunity came last summer when I decided that I might have more success finding a permanent repository for my work document collection if I had a book of memorable experiences based on the contents of the documents. I decided to use the Blurb service for no better reason than I’d had a brief look at it a few years ago after seeing it get a good rating in a review of self-publishing services. There are many other such services available on the net today and I don’t know how they currently compare to Blurb. You should check them out.

I decided that my book would consist of one page write ups of particular events, each one accompanied by a page of images. I opted to create the text first in Microsoft Word and then to decide what images to include when I imported each piece of text into Blurb’s BookWright page layout package.

I started writing the text in September 2019. It was mostly done by the end of January 2020, at which point I downloaded the BookWright software. Although it took a bit of getting used to, it wasn’t too difficult, and I found the functionality quite good. There were a couple of minor problems: first, the software closed abruptly, without notice, five or six times – but each time it fired up again and opened up the book’s contents successfully without having lost any data. Second, typing was sometimes slow to reproduce on screen. Exchanges with Blurb Support suggested it was due to a lack of virtual memory – which didn’t surprise me because I was using Word, Excel, PowerPoint, Filemaker, and a PDF package all at once to create the contents while I was using BookWright. Closing some of these seemed to resolve the issue.

The biggest issue I faced was with the resolution of the images I was including. The Blurb Help files warn against grainy, blurry or pixelated images, but, of course, you can only be absolutely sure you have avoided this pitfall when you get the printed book. BookWright itself provides a warning when it thinks an image will not be up to standard (which typically occurred when I was trying to expand an image to make it easily readable or to fill a page). I took notice of these warnings and either made the image smaller or found a way of increasing its resolution. I achieved the latter by either rescanning a physical document at a higher resolution, or printing out an electronic document in high quality and then scanning at a high resolution. Although these two approaches did seem to improve the quality of many of the images, they also substantially increased the file size of the book (about 4.2Gb at that point). A search on the net about the size of BookWright files reassured me that uploads of that size and more were not unusual – but I did discover that eBooks cannot be produced for files over 2Gb. I also discovered – rather too late in the day – the BookWright advice to use the png lossless format in preference to jpg. I guess this just highlights the fact that I really don’t know too much about image formats and resolutions. Nevertheless, most of the images seemed to turn out OK in the finished book. The key seems to be to keep image sizes below the threshold of the BookWright warning messages.
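The reason that enlarging an image triggers a warning can be shown with a little arithmetic: the same number of pixels spread over more inches of paper gives fewer pixels per inch. The Python sketch below illustrates this. The 300 pixels-per-inch target is a commonly quoted rule of thumb for print and is an assumption here; I don’t know what threshold BookWright actually applies, and the function names are hypothetical.

```python
def effective_ppi(pixel_width, pixel_height, print_width_in, print_height_in):
    """Pixels per inch an image ends up with at a given printed size.

    The lower of the horizontal and vertical figures is returned, since
    that is the one that limits print quality.
    """
    return min(pixel_width / print_width_in, pixel_height / print_height_in)


def max_print_size(pixel_width, pixel_height, target_ppi=300):
    """Largest printed size (width, height) in inches at the target resolution.

    target_ppi=300 is an assumed rule of thumb, not a BookWright setting.
    """
    return pixel_width / target_ppi, pixel_height / target_ppi


# A 1200 x 1800 pixel scan stretched to fill an 8 x 12 inch area
# drops to 150 ppi, half the assumed target.
print(effective_ppi(1200, 1800, 8, 12))
print(max_print_size(1200, 1800))
```

This is also why rescanning at a higher resolution helps: it raises the pixel counts, so the image can be printed larger before the pixels-per-inch figure falls too low, at the cost of a bigger file.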

I had 195 separate stories, so there was at least one image to find and import for each one – and, in some cases, several images. It was a long haul and took me until the 19th March before I’d finished the first pass through in BookWright, and could start the final edit.

I’d elected to subdivide the stories into nineteen short stories – each one labelled with an icon comprising a unique set of different shapes and including the page number of the next story. The idea was that readers of a particular short story could find the next instalment at the specified page number. The page numbers went into the Contents list, and into the icons, on 27th March, and then it was on to creating the dust jacket and doing final checks.

On 30th March, I was ready to submit the 4.75Gb file using Blurb’s Upload facility. First the system ‘rendered’ the file down to 492Mb; and then it did the Upload. The whole process took about 37 minutes. I was all set to order a copy, but found that the discount code I’d planned to use didn’t work. I searched the Blurb site and the net for 45 minutes and tried lots of codes – but none were current. I decided to wait – the full price of £103.59 was too much to ignore the possibility of a substantial reduction. It was worth the wait – on 1st April Blurb advertised a 41% discount code, so I paid the overall cost of £73.70 (which included a £2.99 PDF copy, £8.99 delivery, and 60p tax), and was told to expect delivery by 14th April.
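The figures hang together if the 41% discount is taken off the £103.59 book price only, with the PDF copy, delivery and tax added afterwards at full cost. That reading is an assumption on my part about how the total was built up, but it does reproduce the £73.70 exactly, as this short Python sketch shows (the function name is hypothetical):

```python
from decimal import Decimal, ROUND_HALF_UP


def order_total(list_price, discount_pct, extras):
    """Discount the book price, round to pence, then add undiscounted extras.

    Assumption: the discount applies to the book alone, not to the extras.
    """
    discounted = (list_price * (100 - discount_pct) / 100).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return discounted + sum(extras, Decimal("0"))


book_price = Decimal("103.59")
extras = [Decimal("2.99"), Decimal("8.99"), Decimal("0.60")]  # PDF, delivery, tax

print(order_total(book_price, 41, extras))  # 73.70
```

Decimal is used rather than floating point so that the pence come out exactly, which matters when checking a bill to the penny.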

The book arrived around 9am on 7th April. It exceeded my expectations, with a bold glossy cover, glossy pages, clear text, and bright images. I spent the rest of the day checking each page and noting the corrections needed; and then the next two days making final changes. On the morning of 10th April, I did a final preview of the book and this turned up about a dozen further changes. At around 3.30pm I started the Upload process. The system took about 10 minutes to render the 5Gb file down to 496Mb; and a further 27 minutes to upload it.

Putting the book into the bookstore was not particularly difficult – but it did take a little time. There was a book description to write, categories to select, and keywords to specify. Then I had to decide how much profit I wanted to add onto the price of the book; and finally there was the specification of which pages I wanted people to see in the preview. I completed the whole business by around 5.30pm – glad to be able to take a break from the perishing book.

Overall, I’ve found it to be a very effective and satisfying experience. It has been a long and demanding exercise – but that was to be expected with a 438-page book of this nature. I elected to produce a photobook on 118gsm standard semi-matte high quality paper – however, I’ve no reason to suppose that the results couldn’t be commensurately as good for the other types of book and paper that Blurb offers. The BookWright software provides very flexible text options and layout capabilities, and seems to be able to handle images very well; and the bookstore facility provides a ready-made distribution channel for the finished books.

However, there is one aspect that needs to be borne in mind. The price of a print-on-demand book is inevitably going to be greater than the price of mass-produced books in a physical bookshop. Blurb books give absolute control to the author – but may price the book out of the market. There are volume discounts to be had – but the demand for a bulk lot has to be created by the author. When authors get publishing deals they do, indeed, cede much power to the publishers; but, in return, the publishers establish markets for the books and keep their prices down. This trade-off becomes particularly apparent for large glossy books such as the one I have created. It is far less so for softback books with many fewer pages and of lower quality paper, of which many examples can be found on the Blurb bookstore.

Of course, these price concerns are of little consequence if all you are trying to do is to exploit some of the artefacts that you possess and make them visible. My experience with Blurb – and the huge range of examples in the Blurb bookstore – shows that using a self-publishing service provides ample opportunity to use your creativity and artefacts to bring to life your memories, ideas and passions.

Oh, and the book I created? Well here’s the cover. Clicking it will take you to the Blurb bookstore where some of its contents can be previewed.

The truth about truth

Maybe most people have already twigged this, but the BBC programme ‘The Capture’ has made me realise that we can no longer rely on videos for the truth. It illustrates how live camera feeds can be altered – dramatically. I believe sophisticated and moneyed organisations can do this today; and I think it will become easier as time goes by.

So, to the possibility of text being untrue, and of people’s accounts and memories being untrue, and of photos being faked, we must add that videos may be false. Is there anything left? Well, perhaps just our own internal thoughts and memories, but no doubt our race will get to manipulating those too.

So, I guess, we are back to a great truth that our enquirers and thinkers have known for hundreds of years: there is no substitute for diligence and multiplicity in our search for what is and what has been. Our modern technology has made us slack and gullible and persuaded us that we can nail down reality. In fact, reality has to be carefully investigated and checked and rechecked, and then still considered with a critical eye as we use it generously to develop our understanding and knowledge.

Power Booking

People in power a few hundred years ago just didn’t have access to up-to-date global information. These days such people have no excuse, as large numbers of diligent writers research global issues and publish up-to-the-minute resumes on a wide range of topics around the world. There is no excuse for failing to be aware of what humans have done, and continue to do, to each other; what effects we are having on the planet we live on and depend upon; what our universe might consist of; and what possible futures we might have within it. Even I, with just a few books I have read in the last few years, feel informed and broadened. If each world leader were to be given just ten or fifteen books to read at the start of their reigns, perhaps they would act rather more in the interests of all of us than they currently appear to.

PAWDOC: Requirements and Objectives

The 2001 paper reviewing the first 20 years of use of the PAWDOC system listed 23 requirements for a Personal Electronic Filing System and provided a status for each one. The table below reproduces that listing and also provides an updated status for PAWDOC in 2019.

Requirement 2001  Status 2019 Status Notes
Requirements dictated by The Job
1. Cope with large amounts of material, some of which becomes redundant very quickly. Fully met Fully met
2. Cope with changing terminology Not met at all Not met at all This would require specific functionality in the Index.
Requirements dictated by the physical environment in which the system is used
3. Be capable of being operated in temporary and limited office accommodation. Partially met Fully met Now fully met because the whole collection is now digitised on the laptop; and scanners are found in most offices.
4. Be easily portable. Partially met Fully met Now fully met because laptops are now small and powerful, and have more than enough storage.
Requirements to support the information sources used
5. Handle hardcopy material in a very wide range of physical sizes and formats. Partially met Fully met Hardcopy that is too large, or too difficult to scan, can now be photographed with a mobile phone at sufficiently high resolution for it to be read on screen.
6. Handle documents containing coloured text, backgrounds, diagrams and pictures.  

Not met at all

 

Fully met Now fully met because modern scanners handle colour; and they have brightness and contrast settings which can be adjusted to be able to scan most document contents.
7. Record references to material in other people’s filing systems. Fully met Fully met
8. Manage information received via email and computer conferencing systems (including Lotus Notes). Partially met Fully met Now fully met because any file formats can be handled, and getting email content into such files is not the concern of the filing system.
9. Record references to material in remote systems such as Lotus Notes databases and web sites, and access those remote systems and retrieve the relevant information. Partially met Fully met HTML references are included in Filemaker Index entries, and can be opened by selecting them and right clicking which presents an ‘Open’ menu option.
Requirements to manage information that is created by the filing system owner
10. Handle handwritten text and diagrams on paper. Partially met Fully met Now fully met because modern scanners have brightness and contrast settings which can be adjusted to be able to scan most document contents.
11. Handle electronic files Fully met Fully met
Requirements to help cope with information and communication overload
12. Support the rapid organization of information before it has been dealt with. Not met at all Not met at all This would require special functionality in the filing system.
13. Make visible what information has to be dealt with and support the scheduling of dealing with it. Not met at all Not met at all This would require special functionality in the filing system.
14. Enable information to be filed very quickly. Fully met Fully met
Requirements to support information access
15. Enable information to be retrieved simply and quickly. Fully met Fully met
Requirements to support the reuse of information
16. Enable templates, best practice and other reusable material to be identified, retrieved and reused. Partially met Fully met This is now fully met because I now believe it is best addressed by including appropriate wording in the Index Title field.
17. Enable existing material to be copied and modified, and to be stored as new material. Fully met

 

Fully met

 

Requirements to support knowledge acquisition and development
18. Identify, store and retrieve the marks highlighting key text. Not met at all Not met at all This would require special functionality in the filing system.
19. Collect together all important points and present them as a coherent set of information  

Not met at all

 

Not met at all This would require special functionality in the filing system.
20. Enable the user to relate all important points together in such a way that concepts can be developed as new material is acquired. Not met at all Not met at all This would require special functionality in the filing system.
21. Assist the user to identify knowledge developments that are occurring and to choose what areas to focus on. Not met at all Not met at all This would require special functionality in the filing system.
Technology support
22. The technology must be cheap enough for an organization not to quibble over, and for individuals to buy for themselves. Partially met Fully met This is now fully met because costs of scanners have dropped; and it has been established that a Document Management System is not required.
23. The technology must be reliable enough not to require expensive maintenance contracts or multiple one-off repairs. Partially met Fully met This is now fully met because the technology required is standard and now very reliable.

Two things are clear from comparing the status in 2001 with that in 2019: the technology for a personal electronic filing system has become commonplace and relatively cheap; and no developments have occurred to support the particular requirements of ‘coping with information and communication overload’ and ‘supporting knowledge acquisition and development’. I believe the latter is probably true not just for the PAWDOC system but generally – I have not heard of any work going on in these areas.

The 2001 paper also described three objectives for the work:

  • To provide practical feedback to product developers, system designers, and other (potential) users regarding the real day-to-day requirements of individuals using personal electronic filing systems.
  • To establish an office document test set.
  • My own personal need to stay organised and efficient in my day-to-day work.

Of the three, only the final one – to keep myself organised – has been fully achieved.

Regarding the first objective, I have done my best to document my experiences and make that information freely available to product developers, system designers and other users, but I have seen no evidence to suggest much interest – it seems there is no consumer-led demand for this capability. Certainly, I have come across very few people, if any, who have been operating an all-inclusive PAWDOC-type system without any direct contact with me.

I believe the second objective – to establish an office document test set – has simply been overtaken by events; the huge strides made in search & retrieval algorithms by internet search engines such as Google over the last 30 years have removed the need for such test sets.

Over the last 15 years or so, two more potential objectives – or at least potential uses – have come to mind. First, as huge changes continue to occur as a result of ever-increasing computing power, the universality of the mobile phone, and the ubiquity of the internet, it has occurred to me that the PAWDOC collection provides a unique insight into the early stages of this massive transition in business and society in general. Therefore, I continue to seek a permanent repository for the collection, in the belief that it may hold some value for future researchers.

The second additional objective concerns digital preservation. In order to ensure PAWDOC’s future accessibility, I have had to develop suitable digital preservation processes and documentation and to apply them to the very diverse range of material in the collection. In the course of this work, it has occurred to me that the collection might be useful to the digital preservation community as a test bed and training tool. If I have no success in finding a permanent destination for the collection as a research tool, I may then try to find a home for it within the digital preservation community.

This entry brings to an end my own personal final review of the PAWDOC personal electronic filing system. The remaining work to be done is to assemble a set of Conclusions. However, this will be a joint effort between Peter Tolmie, an independent researcher, and myself.

PAWDOC: Architecture

Four types of prototype solution have been used in the course of this work: card index; electronic index; electronic index and document management system; and electronic index and file storage in Windows 10 folders. The timeline of development is shown below and the architecture of each solution is described in the subsequent text.

1980      Visit to Amoco Research Centre, Tulsa: first electronic office filing system seen
1981      First entry placed in card index (1 June 1981)
1987      Implementation of a computer-based index – Filemaker software on a Macintosh
1993      Movement of Filemaker index from Macintosh to a Compaq LTE Elite laptop running MS Windows
1994      Scanner and Magneto-Optical drive loaned by Fujitsu
1995      Paperclip Document Management software loaned by DDS
2000      Paperclip upgraded to FISH product and loaned by Ringwood Software (which was subsequently taken over by Azur Group, which was taken over by Maxima which was taken over by m-Hance)
2018      FISH document management system removed and replaced with Windows 10 folders

1. Architecture of the card index system, 01Jun1981 – 30Dec1987

 This consisted of paper documents stored in an upright cabinet in serial number order, and an index on 6×4 inch cards. The cards were initially held in a small plastic box with a lid, and later on in a metal drawer. The index consisted of a series of title cards with an entry for each document (serial number, title, keywords) and a set of keyword cards – each one having a keyword on the top and then all the entries possessing that keyword listed underneath. The rationale for this filing schema was worked out with my colleague John Pritchard, and it continues to form the basis for the system as it stands today. The Card Index Architecture is illustrated below.
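The relationship between the title cards and the keyword cards can be sketched in a few lines of Python. This is only an illustration of the schema described above, not a record of how the cards were actually written: the serial numbers, titles and keywords are invented, and the function name `build_keyword_cards` is hypothetical.

```python
from collections import defaultdict

# Title cards: one entry per document (serial number, title, keywords).
# These three entries are invented purely for illustration.
title_cards = [
    {"serial": 1, "title": "Office filing survey", "keywords": ["filing", "survey"]},
    {"serial": 2, "title": "Electronic mail trial report", "keywords": ["email", "trial"]},
    {"serial": 3, "title": "Filing cabinet usage trial", "keywords": ["filing", "trial"]},
]


def build_keyword_cards(cards):
    """Return keyword -> list of serial numbers of the entries carrying that keyword."""
    keyword_cards = defaultdict(list)
    for card in cards:
        for keyword in card["keywords"]:
            # Each keyword card lists every entry that possesses the keyword.
            keyword_cards[keyword].append(card["serial"])
    return dict(keyword_cards)


keyword_cards = build_keyword_cards(title_cards)
print(keyword_cards["filing"])  # serial numbers of all entries filed under "filing"
```

Looking up a keyword card gives the serial numbers, and the serial numbers give the position of the paper in the cabinet, which is why the documents themselves can simply be stored in serial number order.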

2. Architecture of the electronic index, Jan1988 – Dec1994

After being provided with a Macintosh computer at work, I selected Filemaker, a general-purpose database with great flexibility and ease-of-use, to hold the Index. The fields used for the Filemaker index are:

  • Reference number.
  • Title/keywords.
  • Movement status (to record the location of items borrowed by people, archived, lost, etc.).
  • Publication date (to indicate when an individual item was first published).
  • Creation date (to record when an item was catalogued into the filing system).
  • Date last accessed (to record when an item has last been retrieved thereby enabling items that have never been accessed, or least recently accessed, to be identified as most suitable for archiving).
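As a rough illustration of how these fields fit together, the record below mirrors the six fields in Python, and a small helper orders records so that the most suitable archiving candidates come first. It is a sketch only: the class and function names, the reference number format and the sample dates are all assumptions, and it does not represent how Filemaker stores or sorts its data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class IndexRecord:
    reference_number: str
    title_keywords: str
    movement_status: str                 # e.g. borrowed, archived, lost
    publication_date: Optional[date]     # when the item was first published
    creation_date: date                  # when the item was catalogued
    date_last_accessed: Optional[date]   # None means never retrieved


def archiving_candidates(records: List[IndexRecord]) -> List[IndexRecord]:
    """Never-accessed items first, then the least recently accessed."""
    return sorted(
        records,
        key=lambda r: (r.date_last_accessed is not None,
                       r.date_last_accessed or date.min),
    )


# Invented sample records; "REF-0001" etc. are hypothetical reference numbers.
records = [
    IndexRecord("REF-0001", "Office filing survey", "", date(1981, 5, 1),
                date(1981, 6, 1), date(1990, 3, 5)),
    IndexRecord("REF-0002", "Electronic mail trial", "", None,
                date(1982, 2, 9), None),
    IndexRecord("REF-0003", "Scanner evaluation notes", "archived", None,
                date(1984, 7, 2), date(1985, 1, 10)),
]

ordered = [r.reference_number for r in archiving_candidates(records)]
print(ordered)
```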

By 1988, the amount of paper had overflowed the upright filing cabinet and I had started to archive less frequently accessed documents to cardboard liquor boxes, which were stored around me in my office. The architecture of the Electronic Index system is illustrated below.

3. Architecture of the electronic index and document management system, Jan1995 – Feb2018

In 1994, Fujitsu offered to support the work by providing, on long-term loan, a scanner and a magneto-optical storage system; and, in 1995, DDS, the European distributors of the PaperClip document management system, provided PaperClip on a long-term loan basis. By the beginning of 1996, these components had all been trialled and tested and were operational. Paperclip provides drawers, folders within drawers, and documents within folders. All the PAWDOC folders were contained in a single drawer; and a single folder was allocated to each PAWDOC Reference Number. Multiple documents of one or many formats could be contained within a folder.

Paperclip interworked with Filemaker by control key combinations which, when selected within Filemaker, copied specified information on the screen, imported it into Paperclip and enacted a Paperclip action. Two interworking functions were set up:

  • Search (search for a folder in Paperclip based on a reference number picked up in Filemaker);
  • Create (create a new folder in Paperclip and insert the Filemaker reference number into the folder index field).

PaperClip was upgraded to FISH in 2000. The architecture of the combined Index and document management system is illustrated below.

4. Architecture of the electronic index and Windows 10 folders, Mar2018 – present

The companies that owned the European rights to the FISH document management system were successively taken over in the first 15 years of the new millennium, and, by 2017, FISH was owned and supported by m-Hance. However, the digital preservation exercise undertaken on the PAWDOC collection in 2017 established that there were no plans to develop FISH any further, so it was decided to remove FISH and replace it with Windows 10 folders. The conversion was undertaken successfully in 2018. The Electronic Index and Windows 10 Folders architecture is illustrated below.
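The Search and Create functions described for the document management system have a natural equivalent when documents sit in ordinary folders. The sketch below assumes that the rule of one folder per Reference Number was carried over to the Windows 10 folders, which the text above does not state; the function names and the `REF-0001` style reference numbers are likewise hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def create_folder(root: Path, reference_number: str) -> Path:
    """Create the folder for a new reference number; fail if it already exists."""
    folder = root / reference_number
    folder.mkdir(parents=True, exist_ok=False)
    return folder


def search_folder(root: Path, reference_number: str) -> Optional[Path]:
    """Return the folder for a reference number, or None if there is none."""
    folder = root / reference_number
    return folder if folder.is_dir() else None


# Demonstration in a throwaway directory, so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    created = create_folder(root, "REF-0001")
    found = search_folder(root, "REF-0001")
    missing = search_folder(root, "REF-9999")
    found_matches_created = (found == created)
```

Refusing to create a folder that already exists (`exist_ok=False`) is a deliberate choice in this sketch: a clash would suggest that the same reference number had been issued twice in the index.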

PAWDOC: Costs and benefits

Operating a Personal Electronic Filing System is no trivial matter – it takes determination and an appreciable amount of extra time. So, is it worth it? The answer depends very much on the type of person you are. Those who like to be organised will find it helps them to be even more organised, as well as keeping their documentation in order and enabling them to find things when they need them. There is the added benefit of it gradually building a complete collection over a period of time, which can be referred back to at will. Those who are inclined to operate with a little less order may be disinclined to spend their time on what can be perceived as a purely administrative activity.

The costs involved are relatively low. Laptop, scanner and storage technologies are now all sufficiently well developed as to be more than able to support a Personal Electronic Filing System. The additional cost of the PAWDOC system, over and above the laptop and its operating system, is around £900.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q58. How much does a personal electronic filing system cost?

2001 Answer: Experience gained: The costs are falling. Approximate 2001 prices of the components are:

  • Filemaker Pro £220
  • FISH Document Management System and Sybase SQL Anywhere single user combination £2095 (a 10-user networked version of FISH costs about £10,000)
  • Fujitsu Scanpartner 10 Sheetfeed and Flatbed Scanner £821
  • Fujitsu MO Drive (currently 1.3 GB) £388
  • CD Writer hardware £82
  • Adobe CD Creator (often bundled with CD writers) £45

Total price (excluding the laptop PC) for the configuration used in this study is £3,650. However, lower-priced solutions with broadly similar capabilities can certainly be assembled from alternative products (here is a link to a list of low-cost Personal Knowledge Management Tools). [NB. The information was no longer available at this link in Jan2023.]

2019 Answer: Fully answered: After removing the document management system from the PAWDOC architecture in 2018 and replacing it with Windows 10 folders, the 2019 approximate costs of the current PAWDOC system are as follows:

  • Filemaker Pro 18: new £520; upgrade from Filemaker Pro 15 £190
  • Canon DR-2020U scanner: £660 – but this model is probably discontinued: the current model with equivalent functionality can be bought for about £290.
  • Cloud service for ongoing backup: free with BT Broadband; equivalent standalone service – Apple iCloud, for example, £8/month.
  • 1 TB external hard drive for local backup: Seagate £40
  • 500 GB external hard drive for remote in-country backup: Maxtor £30
  • 128 GB memory stick flash drive for out-of-country backup: Kingston £15

The total price (excluding the laptop and the Windows 10 operating system which came with the laptop) is £895 + £8 / month cloud backup.
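The two totals can be checked with a short calculation. The sketch below uses the prices listed in the 2001 and 2019 answers; it assumes that the £895 figure is made up of the new Filemaker Pro 18 licence (£520) and the current equivalent scanner (£290) rather than the upgrade price or the original Canon model, since that is the combination that reproduces the stated total. The dictionary keys are shortened labels, not official product names.

```python
# 2001 component prices in pounds, as listed in the 2001 answer.
costs_2001 = {
    "Filemaker Pro": 220,
    "FISH DMS with Sybase SQL Anywhere (single user)": 2095,
    "Fujitsu Scanpartner 10 scanner": 821,
    "Fujitsu MO drive": 388,
    "CD writer hardware": 82,
    "CD Creator software": 45,
}

# 2019 one-off prices in pounds (assumed combination, see the note above).
costs_2019 = {
    "Filemaker Pro 18 (new licence)": 520,
    "Scanner (current equivalent model)": 290,
    "1 TB external hard drive": 40,
    "500 GB external hard drive": 30,
    "128 GB memory stick": 15,
}

CLOUD_BACKUP_PER_MONTH = 8  # standalone cloud service; free with some broadband

total_2001 = sum(costs_2001.values())
total_2019 = sum(costs_2019.values())
cloud_per_year = CLOUD_BACKUP_PER_MONTH * 12

print(f"2001 total: £{total_2001}")  # 3651, quoted in the text as about £3,650
print(f"2019 total: £{total_2019} plus £{cloud_per_year} per year for cloud backup")
```

On these figures, removing the document management system accounts for most of the difference between the two totals.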

Q59. Is it worth spending the time and money on a personal electronic filing system?

2001 Answer: Ideas formed: Yes, the core benefits of an improved ability to find documents and files – faster retrieval and space reduction – are achievable and do make a difference. Furthermore, these benefits continue to be achievable over many, many years. However, the desirability of these benefits, and the way the filing system is operated, are highly dependent on individual preferences and work style.

2019 Answer: Fully answered: Before answering this question, there needs to be clarity on what is meant by a Personal Electronic Filing System (PEFS) because it may mean different things to different people. At its most fundamental level, it’s something most people who have computers do – they put electronic files into folders provided by the operating system. However, the PEFS that is being discussed here is something much more than that. It has three characteristics:

  • Operates as a single system: All files, regardless of size, file type and creating application, are stored in the same single system with the same standards and structure and organisation.
  • Digitises hardcopy: Hardcopies are incorporated into the same single system by digitising them. Any hardcopies that remain are retained either because they are artefacts with special value in the physical form, or to act as temporary working documents.
  • Controls documents: Files are not just put into folders willy-nilly. Their existence is recorded and they are given a unique reference in their title so that they can be identified.

These special characteristics mean that an individual has to apply a certain amount of extra effort, and to have a certain amount of determination, to operate a PEFS. Some people may not want to work in such a structured way; and other people may not want to expend that time and effort. Such people will feel that it just isn’t worth doing. If you are prepared to do it, however, my experience is that it is definitely worth it. It organises documents and enables you to find them again. It helps you to organise your work generally; and, by its very nature, it automatically builds a long-term collection which can be accessed at will. I personally found it so useful that it just became an integral part of my normal day-to-day work.

Regarding cost, I believe this to be much more reasonable these days and well within the reach of an individual.

Q60. What do other people think about this approach to electronic filing?

2001 Answer: Not started: Although at least three other people/groups (John Pritchard, Dave Harris and the CSC UK Consulting & Systems Integration Information Centre) have tried out this approach, no detailed work has been done to establish their views on its effectiveness and desirability.

2019 Answer: Fully answered: I have done no systematic work to establish people’s views about this type of Personal Electronic Filing System (PEFS). However, I have assisted at least 4 different people / organisations to implement such a system, so their willingness to try it out indicates that they could at least perceive that there might be some potential benefits. The four instances are:

  • My colleague at NCC, John Pritchard, who designed the PAWDOC schema with me. He used the approach until he left NCC in 1990.
  • The CSC UK Technical Library at Slough, which applied the approach for about 18 months in 1988-90 using the Aquila application for the Index. The general approach was taken up in the reincarnation of the library in the form of CSC UK’s Consulting & Systems Integration Information Centre in Farnborough, which was certainly using the same Reference Number schema around 1995-96.
  • A colleague at CSC who worked for me, who used the approach for about 8 months in 1988-9 before leaving the company.
  • Another CSC colleague who used the approach for 5 years in 1985-90 before leaving the company.

I don’t recall any of these people/organisations saying that the approach was not workable or worthwhile. On the contrary, the person who used the system for 5 years said in the summary of his experience with the approach that he “found it to be a very effective way of controlling my own documents”.

Of course, these instances and anecdotes provide almost no hard evidence at all. It is probably only my long and highly documented use that gives any detailed insight. But at least the combination of the two sets of material may provide the basis for readers to form their own views.