PAWDOC: Reliability and Longevity

Operating a Personal Electronic Filing System is just an adjunct to one’s main work, and, consequently, it’s at the bottom of the pecking order when it comes to an individual’s time and attention. This, combined with the fact that we humans do make mistakes, means that filing tasks may build up, documents may get lost, scans may miss pages, file titles may include incorrect Reference Numbers, etc. Despite all these problems, experience with the PAWDOC system has shown that it is possible to operate such a system successfully over the long term. It has also demonstrated very clearly that it would be almost impossible to maintain a hardcopy-based personal filing system across a lifetime of work, but that it is certainly possible to do so with a digitised version. The reason is simply that the volume of paper is overwhelming, whereas an equivalent digital collection is eminently manageable.

The very intangibility of a digital collection does, however, present dangers which need to be addressed if it is to survive. Backing up is essential, and creating multiple backups placed on various media in different and distant locations is a wise move. Technology’s incessant pace of development also presents challenges to a collection’s long-term readability, and owners must be prepared to perform digital preservation work periodically to keep their hardware and applications operational and up to date.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q55. Will human errors make the filing system unworkable?

2001 Answer: Experience gained: No, because the errors are relatively few and fall mainly into the following categories:

  • Duplicate reference numbers on physical documents (the indexing system precludes duplicate reference numbers in the index) (Wilson 1992a: 2; 1992b: 2.10).
  • Hardcopy documents out of order in the cabinet/box.
  • Archived items not marked as archived in the index, and vice-versa (Wilson 1992b: 2.10).

2019 Answer: Fully answered: Human error will creep into most systems which have human operators – and probably even more so in personal filing systems which have to be managed alongside heavy workloads. The 2001 answer identified three types of human error discovered in the PAWDOC collection (duplicate Reference Numbers on hardcopies, misfiled documents, and errors in the Index Movements field). In addition to these, the recent checking and Digital Preservation work that has been undertaken on the PAWDOC collection identified several other types of error including:

  • 285 items have been lost over the years – about 1.6% of the total.
  • 33 instances of missing pages have been identified in scanned documents – probably caused by human errors in the course of scanning.
  • 9 records in which text was not copied correctly from emails into a Word document (which was my preferred approach to capturing email text for inclusion in the collection) have been identified. This probably occurred because I failed to check that all the text had been pasted in.
  • 5 records where index entries have been inadvertently left empty – probably caused by a mix-up in the course of creating new records.
  • 2 instances in which the wrong document was scanned so that the digitised document is not the document that is specified in the relevant index entry.

No doubt there are others. However, despite this, the filing system continues to work successfully, and, over the years, I have rarely come up against such errors when I have been searching for documents.
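
Several of the error types listed above (lost items, empty index entries) are the kind of thing a periodic automated check could surface. The following is a minimal Python sketch of such a check, not a description of how the PAWDOC checking was actually done. It assumes the index has been exported to a list of Reference Numbers plus a mapping of Reference Number to title, and that each stored item lives in a folder named after its Reference Number; the function name and the report keys are hypothetical.

```python
def check_index(index_refs, folder_names, titles):
    """Report simple inconsistencies between an index and its folders.

    index_refs   -- Reference Numbers as they appear in the index
    folder_names -- folder names found on disk (assumed: one per Reference Number)
    titles       -- dict mapping Reference Number -> title text in the index
    """
    index_set = set(index_refs)
    folder_set = set(folder_names)
    return {
        # Indexed items with nothing stored: candidates for "lost" items.
        "indexed_but_no_folder": sorted(index_set - folder_set),
        # Stored items that the index does not know about.
        "folder_but_not_indexed": sorted(folder_set - index_set),
        # Index entries whose title was left blank.
        "empty_entries": sorted(
            ref for ref in index_set if not titles.get(ref, "").strip()
        ),
    }


report = check_index(
    ["A1", "A2", "A3"],
    ["A1", "A3", "A4"],
    {"A1": "Report", "A2": "", "A3": "  "},
)
for problem, refs in report.items():
    print(problem, refs)
```

A check like this cannot detect missing pages or a wrongly scanned document; those still need a human to open the file and look.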

Q56. What backup arrangements should be put in place to protect the integrity or sheer existence of the filing system?

2001 Answer: Partially answered: A comprehensive collection of all one’s files becomes a unique irreplaceable entity over a period of years. To ensure its availability and existence the following measures need to be taken (Wilson 1992b: 2.13):

  • Regular backup of the index – daily is preferable, weekly is realistic, monthly is essential.
  • Regular backup of the electronic files and scanned images – daily is preferable, weekly is realistic, monthly is essential.
  • Printout of the index in KWIC (Keyword in Context) format – every six months or yearly (though I have never had software able to do this).
  • Secondary backups of index, electronic files, scanned images and KWIC index stored in a location different to the location of the primary backup media – every six months or yearly.
  • Tertiary backup in a secure environment such as a bank – every six months or yearly (I have not done this yet but am seriously contemplating it).
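
For readers unfamiliar with the KWIC printout mentioned in the list above, the idea is that every significant word in a title becomes a sort key, printed in a fixed column with the rest of the title around it. The following is a minimal Python sketch under stated assumptions: the index is available as (Reference Number, title) pairs, the stop-word list is an arbitrary illustration, and the layout (30 characters of left context) is a choice made here rather than any standard.

```python
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic(entries, width=30):
    """Build Keyword in Context lines from (reference, title) pairs.

    Each significant word of each title becomes a keyword, shown with up to
    `width` characters of context on either side, sorted by keyword.
    """
    lines = []
    for ref, title in entries:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:()").lower()
            if not key or key in STOP_WORDS:
                continue
            left = " ".join(words[:i])[-width:]
            right = " ".join(words[i + 1:])[:width]
            text = f"{left:>{width}}  {word} {right}".rstrip() + f"  [{ref}]"
            lines.append((key, text))
    lines.sort(key=lambda pair: pair[0])
    return [text for _, text in lines]


entries = [
    ("PAW/DOC/0001", "Filing of office documents"),
    ("PAW/DOC/0002", "Office automation survey"),
]
for line in kwic(entries):
    print(line)
```

Because the keyword always starts in the same column, the eye can run down the printout alphabetically and still see each word in its original context.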

2019 Answer: Fully answered: Backing up is an essential element in any computer system. It is advisable to have at least two copies, one of which is held some miles away from the master. The PAWDOC backup regime is clearly described in the PAWDOC User Guide; and, to prompt me to actually perform the backups, I have a table with upcoming backup dates in a frame on the wall that is directly in front of me when I sit at my desk. The backup regime that is applied to the PAWDOC collection is as follows:

  • Cloud: Ongoing backup of new files and changes to files is made to a cloud service.
  • Offline backup to an external drive at home: New copies of the whole collection are taken once a year.
  • Copy on other laptop at home: The backup on the external drive described above is copied to the other laptop in the house immediately after the new copy has been acquired, i.e. once a year.
  • Remote UK external drive: The whole collection is copied onto this hard drive once every two years and it is stored at least 10 miles away from the master laptop.
  • Remote out of country backup: The whole collection is copied to a 128 GB memory stick and given to the person who lives in the country concerned, whenever I meet up with that person.
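
The table of upcoming backup dates could be produced by hand or generated. The following is a minimal Python sketch of the latter, assuming only the fixed intervals stated in the list above (one year and two years); the cloud and out of country backups have no fixed interval and are left out. The labels, function names and sample dates are illustrative, not taken from the PAWDOC User Guide.

```python
from datetime import date

# Intervals in years, as described in the regime above; labels are shorthand.
INTERVAL_YEARS = {
    "external drive at home": 1,
    "copy on other laptop": 1,
    "remote UK external drive": 2,
}


def add_years(d, years):
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def upcoming(last_done, today):
    """Return (due date, label, overdue?) rows sorted by due date."""
    rows = []
    for label, years in INTERVAL_YEARS.items():
        due = add_years(last_done[label], years)
        rows.append((due, label, due < today))
    return sorted(rows)


# Sample dates, invented for the example.
last_done = {
    "external drive at home": date(2019, 1, 10),
    "copy on other laptop": date(2019, 1, 10),
    "remote UK external drive": date(2018, 6, 1),
}
for due, label, overdue in upcoming(last_done, today=date(2020, 3, 1)):
    print(due.isoformat(), label, "OVERDUE" if overdue else "")
```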

Q57. Are electronic filing systems reliable over very long periods?

2001 Answer: Partially answered: Over the 20 years of this project, the system has been very reliable. However, the following problems have been experienced or are anticipated as the system gets older:

  • Crashes of the index database – recovered either by functionality in the software or by using backups.
  • Magneto-Optical disk corruption (has happened to just one disk) – recovered by using backups.
  • Document management system has lost about 30 files – not sure how this happened and it was not recoverable.
  • Longevity of other people’s files – I am sure I could not now obtain some of the items belonging to other people to which I put a reference in my index 10 or 15 years ago.
  • Longevity of web addresses – I do not think that some of the web addresses the index points to will be still live after several years. We have yet to see whether web addresses of journal contents will be reliable over long periods (Wilson 1996a).
  • Electronic files stored in old versions of software when the original application software may no longer exist on your PC, or may have been upgraded beyond recognition. This is a potentially very serious problem over periods of 10 or 20 years or more (Wilson 1997: 1).

2019 Answer: Fully answered: The fact that the PAWDOC system is still fully operational after 38 years does demonstrate that such systems can be reliable in the long term, despite the inevitable loss of some documents or pages within documents.  However, in practice much depends on the diligence of owners and whether they are sufficiently motivated to take regular backups and to perform digital preservation activities on their collections. Taking an overall very long-term view, the longevity of such systems relies on the following 4 characteristics:

  • Visibility: Because an electronic filing system (EFS) is, by its nature, intangible and locked away somewhere inside a computer, the first essential requirement for it to survive is for one or more people to be aware of its existence. This can, of course, be achieved by simply telling people. However, PAWDOC’s existence is also fully documented in a Hardcopy User’s Guide which is contained in one of the two archive boxes in my study.
  • Accessibility: Knowing that an EFS exists isn’t the same as being able to get at it; over a period of years, laptops become defunct and inaccessible; and backup technologies may cease to work. Therefore, for EFSs to continue to work long-term, the platforms they run on must be kept up to date.
  • Integrity: For an EFS to work properly it is necessary to have all the software and data that it uses in place. Missing data can be very annoying and even disastrous, whilst missing or corrupt application software can preclude the system working at all. Effective backup regimes can help to alleviate this problem.
  • Readability: The data files in an EFS can’t be read unless there is an application that can open them up and display them. Over time, applications get upgraded or may become defunct. Therefore, it is essential to implement a Digital Preservation routine that identifies files in danger of no longer being accessible and that takes steps to rectify the problem.

If all these aspects are addressed, an EFS should be able to survive for many, many years.
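
The first step of the Digital Preservation routine described under Readability, identifying files in danger, can be approximated by taking a census of file extensions and flagging those on a watch list. The following is a minimal Python sketch of that idea only. The watch list shown is a hypothetical example; which formats are actually at risk depends on the applications the owner still has, and an extension is only a rough proxy for a file's real format.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical watch list: extensions the owner can no longer open reliably.
AT_RISK = {".wk1", ".wps", ".mdi"}


def extension_census(paths):
    """Count files by lower-cased extension."""
    return Counter(PurePath(p).suffix.lower() for p in paths)


def files_at_risk(paths, at_risk=AT_RISK):
    """List the paths whose extension is on the watch list, in sorted order."""
    return sorted(str(p) for p in paths if PurePath(p).suffix.lower() in at_risk)


# Sample paths, invented for the example.
paths = [
    "PAW-DOC-0001/report.PDF",
    "PAW-DOC-0002/budget.wk1",
    "PAW-DOC-0003/notes.WPS",
    "PAW-DOC-0003/scan.pdf",
]
print(extension_census(paths).most_common())
print(files_at_risk(paths))
```

For a real collection the list of paths could be gathered with `[p for p in Path(root).rglob("*") if p.is_file()]` from `pathlib`, where `root` is the top-level folder of the collection. Deciding what to do with each flagged file (convert, keep the old application, or accept the loss) remains a manual judgement.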

PAWDOC: Confidentiality, ownership and intellectual property rights

When working for an organisation, it is widely recognised that you will need a collection of documents to support your work. In building this collection, any explicit IPR regulations, or stipulations imposed by document owners, should always be complied with; and this includes very highly confidential documents which are probably best left out of the collection. However, if you operate your personal filing system on the basis that it is only for your own personal use, many of the IPR and Ownership issues may be overcome.

When leaving an organisation, individuals may or may not want to take some documents with them depending on the type of profession, stage in their career, their next job, etc. However, in deciding what to take, they must follow explicit rules and regulations; and they should leave behind any material which is likely to give a significant competitive advantage to competitors.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q52. How should confidential documents be handled in an electronic filing system?

2001 Answer: Not started: Hardcopy documents can be stored in a vault and the associated reference numbers can point to the vault (Wilson 1992: 2.8).

2019 Answer: Partially answered: The most effective way of dealing with very highly confidential documents is not to include them at all in your personal electronic filing system. However, if you do want to record their existence, they could be included in your index in such a way that the Reference Number or Movement Status field points to their residence in a secure location. If you must have possession of a document itself, then it will be necessary to secure its existence on your laptop or other personal device with a combination of passwords and encryption – provided that satisfies any rules or regulations associated with it.

Q53. How do ownership and IPR issues constrain the operation of a personal electronic filing system?

2001 Answer: Not started

2019 Answer: Partially answered: One of the most important principles to apply when assembling a personal document collection is that you deliberately restrict it to your own personal use. This probably overcomes many of the IPR and Ownership issues related to documents that you haven’t created yourself. With that principle in place, I believe it is recognised that, while working for an organisation, you are entitled to have a collection of documents relevant to your work. Having said all that, it is essential to comply with any particular IPR regulations, or constraints stipulated by specific document owners, that you become aware of.

Q54. What IPR considerations have to be taken into account when moving from one organization to another?

2001 Answer: Not started

2019 Answer: Partially answered: While working for an organisation, you are usually entitled to have a collection of documents relevant to your work. When leaving an organisation, I always felt that, as a consultant, there was a tacit understanding that you were able to keep copies of the documents you had created; but that there was a grey area in which the removal of large quantities of documents only partially related to one’s work would cause concerns – probably because there was a fear that they might give some advantage to competitors. Herein lies a dilemma for the diligent personal filer: on the one hand it seems a shame to waste all the effort one has put into assembling and indexing a collection of documents, and on the other hand there is the knowledge that openly taking it away when you leave an organisation might encourage a knee-jerk reaction to prevent you removing it. To address this issue, two principles should be applied: first, comply with any explicit regulations concerning specific documents or information; and second, leave behind any documents or information that you believe would give a competitor a significant advantage.

PAWDOC: Sharing files

A personal filing system enables the individual to decide what to include in it, what to specify in the Index, and what to include in the file titles. Hence, a personal file system is very specific to an individual, and it is this very specificity that would make it difficult to share an index and the items it represents over the long term.

To address this problem the Reference Numbers in the PAWDOC system were designed to distinguish between different owners and different sets of information within each owner. The thinking was to enable entries for documents in other collections to be included in one’s own Index. It certainly did enable such entries to be included in the Index; however, experience over many years showed that it was almost impossible to get access to such documents after one had changed jobs or after many years had passed. Consequently, I stopped including such entries in the PAWDOC system.

Of course, the sharing of a small specific set of files for a relatively short period is much more feasible. Collaborative file sharing has been explored by the CSCW (Computer Supported Cooperative Work) community for many years; and services such as Dropbox are widely available on the net. However, this is not the same as sharing thousands of files on numerous different subjects over many years.

Corporate Document Management Systems have also been widely implemented in recent years and these do enable whole workforces to share an index and the associated files. However, these too cannot be considered to be personal filing systems since they have very specific corporate objectives with supporting regulations and constraints.

In short, the long-term sharing of personal indexes and files does not seem feasible or demanded at present.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q50. Can two or more people share the same filing index?

2001 Answer: Ideas formed: Only with great difficulty, unless deterministic indexing rules are adhered to and controlled descriptors are used. Unfortunately, this probably reduces the effectiveness of the index for all parties.

2019 Answer: Ideas formed: I have come across no evidence to suggest this is currently feasible. Corporate document management systems do enable individuals to share the same index – but such systems do not provide the longevity, portability from job to job, and freedom of choice about what to include, that a personal filing system offers. If such a system were to be developed it would need to cater for three different types of requirements:

  • Objectives: the ability to satisfy the different approaches and preferences of the individuals concerned while providing a coherent uniform system.
  • Individual characteristics: the ability to cater for things such as the different terminology used by each individual, and for the fact that each individual will only have a mental imprint of the documents they themselves have included in the system.
  • Technology: the use of compatible technology by all individuals including things like software applications to open documents, file formats, application versions, and a commonly accessible file store such as the cloud.

Q51. What is the most effective way to enable two or more people to share the same files?

2001 Answer: Experience gained: Make entries in your own index in your own words but use the other person’s reference number in your index instead of creating your own (Wilson 1992: 2.4).

2019 Answer: Experience gained: Two or more people can share files very effectively using a cloud-based service such as Dropbox. However, this is probably used mostly for a limited number of files on a specific subject for a few months or years. I have not heard of large ongoing collections of personal files being shared in this way – though I suppose it might be feasible with some sort of shared index which is also held in the cloud. Large Corporate Document Management systems also enable people to share files, but such systems have different objectives, regulations and constraints and cannot be considered to be personal filing systems.

PAWDOC: Portability

Since the early 1980s, office work has become increasingly decoupled from a single static location. Most people no longer have their own offices – in fact, many people no longer have their own desks. Hot desking and working from home are commonplace. Business travel by car, train and plane is widespread; and working away from the base location in hotels or remote offices is a regular experience for some people. This mobility, and the fact that most office work is now undertaken electronically, continues to drive the development of powerful portable computers. The modern laptop is very light and powerful, and can have a huge storage capacity. The PAWDOC collection is stored on a machine weighing 1.3 kg and takes up just under one twentieth of the total 1 TB capacity of its solid state drive (SSD).

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q48. Are electronic filing systems portable?

2001 Answer: Experience gained: Yes, except for the scanner. The software systems run very well on a laptop computer. Most, if not all, of the electronic files and scanned images can be held on a laptop’s hard disk (which in 2001 would normally have between 5 GB and 10 GB available). If additional storage is required, CDs will meet the requirement.

2019 Answer: Fully answered: Yes – modern laptops have more than enough storage capacity to store all the digitised contents of a personal filing collection; and they have the power and display technology to enable fast searching and easy reading of the selected documents. Such laptops are light-weight and eminently portable. Even portable scanners have been available for some time. However, scanning technology is now commonplace and is likely to be available in most offices (often through the office photocopier), so carrying around a portable scanner may not be worth bothering with.

Q49. Under what circumstances does a filing system need to get used away from the base office?

2001 Answer: Experience gained:

  • While travelling in trains, planes, hotels and hot desks.
  • While working for periods of days, weeks or months away from the base office (Wilson 1992b: 2.9).
  • While at home.

2019 Answer: Fully answered: In addition to the circumstances listed in the 2001 answer (while travelling, while working away from the base office, while at home) I would add the following:

  • While attending meetings.
  • Any place and situation in which the owner is working.

PAWDOC: Technology requirements and problems

To operate a personal electronic filing system, you need a computer with a screen, a scanner, software to manage an Index and the documents, and a general approach. My colleague, John Pritchard, and I decided to explore what it would be like to operate such a system after visiting Amoco in the USA, and we followed the approach that we had seen there: every document was given a reference number and an index entry and was then stored in reference number order. Searches were performed on the index and retrieval was achieved by using the reference number.

We were able to apply the approach immediately using index cards. However, the technology to support the approach took a long time to become sufficiently powerful and cheap for an individual to apply it; and it took many more years before it could be considered to fully support personal electronic filing systems. Consequently, much of the experience gained in using hardware and software to support PAWDOC has been in how to manage imperfect technology solutions. This has been particularly the case with computer storage, which was insufficient and expensive when I first started scanning PAWDOC documents in 1996. The bulk of scanned documents had to be held offline on Magneto-Optical disks and this not only imposed a whole set of management requirements but also constrained the portability of the system. Today, however, storage is plentiful and cheap and the whole of the digitised PAWDOC collection is held on my laptop.

Scanners too have become better and cheaper since 1996. The first one I had was only capable of scanning in Black & White and one side of the paper at a time. Consequently, scanning large documents took a long time, and any colour on documents I scanned at that time has been lost. The scanner I have today takes less time to scan a page despite the fact it is also scanning in colour and both sides of the paper as it goes through the machine.

In many ways the software to support personal filing has always been in place, but its performance has been constrained by computing power. For example, the indexing software I use took over three minutes to conduct a complex search on fewer than 4,000 records in 1988, whilst my current version of the same software takes less than one second to conduct the same search on over 17,000 records.

The software to manage the stored documents has also been constrained by computer power – but in a rather unexpected way. In the 1980s and 90s when I first started using the PAWDOC system the conventional thinking was that a dedicated Document Management System was needed for the purpose. Such software applications were large complex beasts with numerous features and they relied on an underlying database application. Today, PAWDOC documents are stored in Windows folders labelled with a Reference Number. My laptop and the Windows 10 operating system are more than powerful enough to be able to display and search over 17,000 folders in just a few seconds. Such a solution would not have been feasible in the mid 90s, but today’s power has enabled a very complicated and constraining element of the personal electronic filing system architecture to be dispensed with.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q38. What additional software functionality is required?

2001 Answer: Partially answered:

  • A system which eliminates the need for two systems by combining simple and flexible indexing and searching functionality, and file management functionality which keeps track of the thousands of electronic files (Wilson 1996a: 3).
  • Facilities to detect low usage and to automatically recommend the destruction of paper (after scanning).
  • Intelligent synonym functionality that can recognize relationships between frequently used abbreviations and terms, and which requests the user to confirm possible synonym relationships (Wilson 1990: 96–97).
  • The ability to automatically manage multi-part reference numbers of the type PAW/DOC/7653/01 and to be able to present the next unused number.
  • The ability to produce a KWIC (Key Words In Context) or KWOC (Key Words Out of Context) index (Wilson 1992a: 29, 30).
  • The ability to store a set of web pages without losing the links between them (the FISH Document Management System is unable to do this because it stores each individual file with a new file name consisting of a combination of alphanumerics) (Wilson 1995b: 131).
  • Functionality to support the assembly, development and use of knowledge (Wilson 1997: 3–4).
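
The requirement above for presenting the next unused multi-part reference number is simple to illustrate. The following is a minimal Python sketch that assumes, from the example PAW/DOC/7653/01, that a number has the shape owner/set/serial with an optional two-digit part suffix. That reading of the format, the four-digit zero padding and the function names are all assumptions made for the example, not a specification of the PAWDOC numbering rules.

```python
import re

# Assumed shape: OWNER/SET/serial[/part], e.g. PAW/DOC/7653 or PAW/DOC/7653/01
REF = re.compile(r"^([A-Z]+)/([A-Z]+)/(\d+)(?:/(\d+))?$")


def next_serial(existing, owner="PAW", series="DOC"):
    """Return the next unused top-level number for the given owner and set."""
    highest = 0
    for ref in existing:
        m = REF.match(ref)
        if m and m.group(1) == owner and m.group(2) == series:
            highest = max(highest, int(m.group(3)))
    serial = highest + 1
    return f"{owner}/{series}/{serial:04d}"


def next_part(existing, base):
    """Return the next unused part number under `base`, e.g. PAW/DOC/7653/02."""
    parts = []
    for ref in existing:
        m = REF.match(ref)
        if m and m.group(4) and f"{m.group(1)}/{m.group(2)}/{m.group(3)}" == base:
            parts.append(int(m.group(4)))
    part = max(parts, default=0) + 1
    return f"{base}/{part:02d}"


refs = ["PAW/DOC/7652", "PAW/DOC/7653/01", "PAW/DOC/7653/02", "JP/DOC/9000"]
print(next_serial(refs))                 # PAW/DOC/7654
print(next_part(refs, "PAW/DOC/7653"))   # PAW/DOC/7653/03
```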

2019 Answer: Fully answered: My current views on the additional functionality listed in the 2001 answer are as follows:

  • Combined Indexing and file storage: Now that I have eliminated the Document Management System and replaced it with Windows folders, I no longer feel this is needed. However, despite retrieval being simple and quick, it could be made even more effective if the files associated with a particular Reference Number could be automatically listed under the Index entry for that number; and if the file you require could be selected and opened from that list.
  • Low usage detection: Now that all documents are digitised and paper is no longer taking up valuable space, there is no need to identify which hardcopies are not being accessed and could therefore be digitised and removed. Consequently, this requirement is no longer needed.
  • Intelligent synonym functionality: Terminology continues to change, so this is still required.
  • Management of multi-part Reference Numbers: This is still a requirement. It would make it quicker and easier to create new index entries.
  • Production of a KWIC index: I no longer produce paper backups of the index, so this is no longer required.
  • Store web pages without losing links: I now use zip functionality to combine and store the multiple files making up a single web site, so this is no longer required.
  • Nugget/knowledge management: I never clearly ascertained if this would be worthwhile or not (see more detailed discussion in the answers to questions 27 – 29, and also in the topic ‘Knowledge Development’ elsewhere in this web site).

In addition, I would add the following:

  • Use of flexible Date formats: This is required to be able to specify BOTH exact dates (for, say, the date a document gets created or a letter is sent – dd/mm/yyyy); AND partial dates (for, say, the year a book is published – yyyy – or the month and year of publication of a journal or magazine – mm/yyyy).
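
To make the flexible date requirement concrete, the following is a minimal Python sketch of a field that accepts dd/mm/yyyy, mm/yyyy or yyyy and still sorts sensibly. It is an illustration of the idea, not a feature of any particular index software; the `PartialDate` type, the function names and the choice to sort partial dates ahead of fuller ones in the same period are assumptions.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class PartialDate(NamedTuple):
    year: int
    month: Optional[int] = None
    day: Optional[int] = None


# Accepts dd/mm/yyyy, mm/yyyy or yyyy.
PATTERN = re.compile(r"^(?:(?:(\d{1,2})/)?(\d{1,2})/)?(\d{4})$")


def parse_partial(text):
    """Parse an exact or partial date, rejecting impossible values."""
    m = PATTERN.match(text.strip())
    if not m:
        raise ValueError(f"unrecognised date: {text!r}")
    day, month, year = (int(g) if g else None for g in m.groups())
    if month is not None and not 1 <= month <= 12:
        raise ValueError(f"month out of range: {text!r}")
    if day is not None:
        date(year, month, day)  # raises ValueError for dates such as 31/02/2001
    return PartialDate(year, month, day)


def sort_key(d):
    """Order by year, then month, then day; missing parts sort first."""
    return (d.year, d.month or 0, d.day or 0)


values = [parse_partial(t) for t in ["15/03/2001", "2001", "03/2001"]]
for d in sorted(values, key=sort_key):
    print(d)
```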

Q39. What technology problems have been experienced while operating the electronic filing system?

2001 Answer: Experience gained:

  • Replacing the PC requires the re-installation of all the software, which has been problematic on the last three occasions.
  • Upgrades of software can require a complex conversion process.
  • The index software (Filemaker Pro) crashes from time to time, but the Filemaker recovery function has always been able to deal with it except for an occasion over 10 years ago when the backup file had to be used (Wilson 1992a: 65, 72, 79).
  • About 30 of the files were lost in the document management system – probably in the course of moving them to off-line storage or moving them back into the PC’s hard disk (Wilson 1995b: 137, 139).
  • One of the Magneto-Optical disks became corrupted and the backup files had to be used (Wilson 1995b: 141).

2019 Answer: Fully answered: Over the last 20 years most of the technology problems I’ve had seem to relate to four main areas – storage, specialist software, upgrades, and obsolescence:

  • Issues associated with the lack of cheap reliable storage: This problem has largely disappeared. When I first started scanning in 1996, I had to use external Magneto-Optical disks attached to my laptop, and I did suffer some data transfer and disk corruption problems. Today I have more than enough fast storage with the 1 TB SSD in my laptop.
  • The management and cost of specialist software: I had to deal with a wide variety of issues over the years with my document management software and its associated Sybase, and subsequently SQL, database. So much so that I have concluded that it is far better to avoid all specialist software if at all possible. It introduces complexity and is costly to buy, upgrade and support. While general purpose software may have fewer features, overall it is likely to be much easier to manage and use, and is likely to be a much more viable long term solution for the individual. I am very pleased to have eliminated the document management system and associated database from the PAWDOC architecture, and to now be using the much more familiar and straightforward Windows folders to store PAWDOC files in. I still use Filemaker for the Index but I regard this also as specialist software. Although it is very reliable and presents few management issues, it still has to be upgraded every three years at a cost of over £200 a time; whereas I know that I could still operate the Index if I exported the data to an Excel spreadsheet. In conclusion, I would recommend that anyone setting up a personal electronic filing system use standard multi-purpose software, preferably software they are already using, and avoid specialist software if at all possible.
  • The complexities associated with upgrading platforms and operating systems: Moving systems from old to new computers, or upgrading operating systems, are major changes with associated risks. That’s not to say that it will necessarily be difficult – but over the years I have encountered issues and have found the more complex the systems being used the greater the challenges. The document management system had to be totally reinstalled from scratch when it was moved to a new laptop and that was something I only ever achieved once by myself without any supplier support. Now that PAWDOC only uses a Filemaker Index and Windows folders, the risks and difficulties associated with upgrades are much lower.
  • Obsolescence: As files are accumulated over the years, they may become unreadable because you no longer have the appropriate application software running on your machine. When I conducted a Digital Preservation exercise on the PAWDOC system in 2016-2018, I discovered many examples of such files, and it took a considerable effort to deal with the problems and achieve readable files again. Similar problems can affect hardware such as disks and memory sticks – though I feel less vulnerable on this front as I have sufficient storage on my laptop to cope with all PAWDOC requirements. However, anyone operating a long term filing system is going to have to undertake periodic Digital Preservation work of one sort or another to ensure that their documents continue to be readable.

Q40. What contingency arrangements can be made to minimize and overcome technology problems?

2001 Answer: Ideas formed:

  • Make clear notes on little used technology procedures and fixes.
  • Document system components and configuration settings.
  • Assemble support phone numbers.
  • Keep all of the above in hardcopy and in a place that does not require the filing system to find them.

2019 Answer: Fully answered: In addition to the four points made in the 2001 answer (make notes on procedures and fixes, document components and configurations, document support numbers, keep such documentation outside the system in hardcopy), I would add:

  • Be diligent about regularly backing up.
  • Ensure you know how to use backup data to reinstall applications.
  • If you have a specialist Index application, consider regularly exporting the data to a spreadsheet application so that, if the application fails, you still have immediate access to the Index.
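
The last point, exporting the Index so that it can be opened in a spreadsheet, can be sketched in a few lines of Python. This is only an illustration: it assumes the Index records have already been read out of the Index application as a list of dictionaries, and the field names (ref_no, title, keywords) and the function name index_to_csv are hypothetical rather than those used in the PAWDOC Index.

```python
import csv
import io

# Hypothetical field names; a real Index will have its own.
FIELDS = ["ref_no", "title", "keywords"]

def index_to_csv(records):
    """Return the index records (a list of dicts) as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        extrasaction="ignore",   # silently skip any fields not listed in FIELDS
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"ref_no": "REF-0001", "title": "Meeting notes", "keywords": "planning"},
        {"ref_no": "REF-0002", "title": "Expense claim, March", "keywords": "expenses"},
    ]
    print(index_to_csv(sample))
```

The returned text can be saved to a .csv file alongside the regular backups, so that a readable copy of the Index exists even if the Index application cannot be started.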

Q41. What equipment is needed to operate a filing system and what are the key criteria by which it should be selected?

2001 Answer: Experience gained:

  • A high-resolution monitor preferably capable of displaying a whole A4 page in a magnification you can read.
  • A laptop computer with sufficient hard disk to store all the electronic files and scanned images in the collection, and with room for the growth of the collection.
  • An off-line storage system that can be used to make backups of all the collection’s electronic files and scanned images, as well as the electronic filing software application’s configuration, control and data files.
  • The equipment should not be too noisy.

2019 Answer: Fully answered:

  • A large-screen, high-resolution colour monitor big enough to display a whole portrait page at a size that is readable without magnification, and/or capable of being turned into a portrait monitor as required.
  • A light-weight laptop computer small enough to be transported in hand luggage, with a high-resolution colour screen, sufficient fast SSD storage to accommodate the whole of the personal filing collection, and which makes a minimal amount of noise.
  • A colour scanner with both a sheet feeder and a flatbed capable of scanning documents at least A4 in size, which is reasonably fast, and is small enough to fit on or next to your desk. Its software should be capable of automatically adjusting the scan to the size of the document and automatically adjusting sloping originals to produce a vertical scan. It should also provide easy-to-use facilities for adjusting contrast and brightness to deal with poor originals, and for resetting after sheet feeder jams.
  • Two or more external hard disks or flash drives with sufficient capacity to store the whole of the personal filing collection, for use as a) a local backup, b) a remote in-country backup, and c) if required, a remote out-of-country backup.
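
As an illustration of how the external drives in the last point might be used, the following Python sketch copies a collection folder to a backup location and then checks every copy against its original. It assumes the whole collection lives under a single folder; the function names and the use of SHA-256 checksums are choices made for this example, not a description of the PAWDOC backup arrangements.

```python
import hashlib
import shutil
from pathlib import Path

def file_digest(path):
    """Return the SHA-256 digest of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_backup(source, destination):
    """List (as relative paths) the source files that are missing or different in the backup."""
    source, destination = Path(source), Path(destination)
    problems = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        copy = destination / relative
        if not copy.is_file() or file_digest(copy) != file_digest(src_file):
            problems.append(relative.as_posix())
    return problems

def backup_and_verify(source, destination):
    """Copy the source folder tree to the destination, then verify the copy."""
    shutil.copytree(source, destination, dirs_exist_ok=True)  # needs Python 3.8 or later
    return verify_backup(source, destination)
```

An empty list from backup_and_verify means every file in the collection has an identical copy on the backup drive. The same check can be repeated later with verify_backup to confirm that an older backup is still intact.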

Q42. What considerations should be taken into account when physically laying out the filing system?

2001 Answer: Partially answered:

  • Paper files should be placed so they are accessible while sitting at the desk (Wilson 1990: 94)
  • The scanner should be placed so it can be operated while sitting at the desk.

2019 Answer: Fully answered: Over the years I’ve had to cope with a variety of company offices and a long period of operating out of my home study. In all these situations I have tried to arrange the physical layout so I could conduct all my filing activities while seated at my desk. I’ve found this to be feasible and effective.  Hardcopy files can be placed in an upright filing cabinet (or cardboard boxes) alongside or behind one’s desk; and a scanner can be placed on the right-hand edge of the desk. Backup external drives can be placed in a pedestal drawer.

Q43. What criteria should be used to select an electronic filing system software package?

2001 Answer: Experience gained:

  • Ability to support the desired filing schema.
  • Ability to manage both hardcopy and electronic files.
  • Enables the rapid input of new items.
  • Enables easy and quick searching.

2019 Answer: Fully answered: In addition to the points made in the 2001 answer (support for the filing schema, management of both hardcopy and electronic files, rapid input of new items, and easy and quick searching), I would add:

  • Simplicity and understandability of the architecture of the system.
  • Ease of installation.

Q44. Is it feasible to construct a filing system out of multiple different software packages?

2001 Answer: Experience gained: Yes. However, provided all the requirements are met, it would be more efficient and easier to manage if only a single package was required.

2019 Answer: Fully answered: Yes, it is feasible, provided effective integration between the packages can be achieved without too much effort being required to set it up and maintain it. However, it undoubtedly complicates matters and requires more effort to manage and maintain, so the simpler the packages to be integrated the better. On balance, I would not recommend it if a single piece of software will do the job.

Q45. How much file space do you need to store an individual’s personal files?

2001 Answer: Experience gained: Assuming only black and white scanning, no digitizing of journals or books, and no video material, a collection built up over 70 years would require approximately 53 GB (Wilson 2001a). Until experience is gained of colour scanning and digitized video, a more realistic figure cannot be estimated.

2019 Answer: Partially answered: The current PAWDOC collection can’t be considered a total lifetime collection because:

  • A substantial number of the colour hardcopies were scanned in B&W.
  • For about 10 years when I was working in Bid Management with highly confidential and fast-moving documents, the number of documents I was putting into the collection was much reduced.
  • The collection only includes about 30 years of my 40 years of work.

It should also be remembered that about half the collection was assembled under business conditions that were in transition from paper only to paper + electronic – very different from today’s environment. Furthermore, the type of work I did and my overlapping interest in technology research dictated my coming into contact with a particular range of documents; different types of jobs and interests will dictate different numbers of documents of different types.

Having said that, all the items in PAWDOC have been digitised and the overall digital collection takes up about 46 GB.

Q46. How much file space is taken up by the average document?

2001 Answer: Partially answered: Chan’s results showed the following sizes for an A4 page: line art 87 KB; black and white 91 KB; halftone 181 KB; and colour 3347 KB (Chan 1993: 28). In practice, initial black and white scans at 240 dpi were producing an average file size of about 40 KB (Wilson 1995c: 1).

2019 Answer: Fully answered: File sizes vary depending on the application in which the files were created and on whether they were scanned as colour or B&W documents. Therefore, file sizes for a number of these combinations were established using my current scanner (a Canon DR-2020U) to scan at 300 dpi a single full page of typed text for the B&W document and a single page containing 5 colour photos of various sizes for the colour document.

  • B&W page created in Word 2007: 13 KB
  • B&W page scanned in B&W to PDF: 105 KB
  • B&W page scanned in Greyscale to JPG (the scanner would not scan in B&W to JPG): 579 KB
  • B&W page scanned in 24 bit colour to JPG: 584 KB
  • B&W page scanned in B&W to TIF: 69 KB
  • Colour page created in PowerPoint 2007: 1,100 KB
  • Colour page scanned in 24 bit colour to PDF: 808 KB
  • Colour page scanned in 24 bit colour to JPG: 750 KB
  • Colour page scanned in 24 bit colour to TIF: 25,389 KB
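
To put the figures above into perspective, the following short Python calculation works out roughly how many pages of each scanned type would fit into one gigabyte. It is a rough illustration only: it assumes decimal units (1 GB taken as 1,000,000 kilobytes) and treats each single test page as if it were typical, which real documents will not be.

```python
# Measured single-page sizes from the list above, in kilobytes.
PAGE_SIZE_KB = {
    "B&W scan to PDF": 105,
    "B&W scan to TIF": 69,
    "Colour scan to PDF": 808,
    "Colour scan to JPG": 750,
    "Colour scan to TIF": 25_389,
}

KB_PER_GB = 1_000_000  # assumption: decimal units

def pages_per_gb(size_kb):
    """Whole number of pages of the given size that fit in one gigabyte."""
    return KB_PER_GB // size_kb

if __name__ == "__main__":
    for label, size_kb in PAGE_SIZE_KB.items():
        print(f"{label}: about {pages_per_gb(size_kb):,} pages per GB")
```

On these assumptions a gigabyte holds several thousand B&W pages scanned to PDF but only a few dozen colour pages scanned to TIF, which shows how strongly the choice of scan format affects the storage required.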

Q47. What’s the best type of storage media to keep electronic files on?

2001 Answer: Experience gained: A hard disk in the laptop is best because it is so quick and easy to use. CDs are good because CD writers are cheap and CD drives are available in most laptops. Having said that, this does not preclude other media with similar characteristics.

2019 Answer: Fully answered: It’s best to keep your files with you on a laptop – or on your mobile phone, provided you have all the necessary applications on the phone and you feel the screen is big enough to read the documents. However, since both laptops and phones are portable and therefore at higher risk of being lost or stolen, adequate measures must be taken to protect the data should the equipment fall into the wrong hands. Another possibility is to store the master set of files in a cloud-based service; however, I believe that would be unwise due to the risks of the service failing or being subject to viruses or hacking. A cloud-based service may be suitable for backup, though external hard drives or SSD flash drives are cheap and effective enough for the purpose.

PAWDOC: Relationship between Work Patterns and Filing Activities

Dealing with documents is an inevitable part of office work, so filing work practices are not unusual – everyone has them. It’s just that if you operate a Reference Number-based filing system, those work practices may be a little different. In particular, every document has to be digitised, recorded in the Index, and placed in a single store.

Getting digitised documents is very much easier than it used to be. To start with, most documents are created and distributed electronically, so the amount of hardcopy is much reduced; and scanners these days are much cheaper and faster – and available in most of today’s offices. So, even if you are away from base, it will usually be possible to digitise hardcopy.

Recording a document in the Index is relatively quick and simple to do, provided that the number of Index fields is kept to a minimum. Then it is just a matter of creating a folder for the newly created Index entry; renaming the file to include the Reference Number and a short descriptive title similar to that specified in the Index Title field; and moving the renamed file to the newly created folder.
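
The folder, rename and move steps just described could be automated along the following lines. This Python sketch is an assumption-laden illustration rather than the procedure actually used in PAWDOC: the folder is simply named after the Reference Number, the file name pattern "Reference Number - title" is invented for the example, and the function name file_document is hypothetical.

```python
import shutil
from pathlib import Path

def file_document(source_file, store_root, ref_no, title):
    """Create a folder for the Reference Number and move the renamed file into it."""
    source_file = Path(source_file)

    # Step 1: create a folder for the newly created Index entry.
    folder = Path(store_root) / ref_no
    folder.mkdir(parents=True, exist_ok=True)

    # Step 2: build a new name from the Reference Number and a short title,
    # keeping the original file extension.
    target = folder / f"{ref_no} - {title}{source_file.suffix}"
    if target.exists():
        raise FileExistsError(f"{target} already exists")

    # Step 3: move the renamed file into the new folder.
    shutil.move(str(source_file), str(target))
    return target
```

For example, file_document("scan001.pdf", "store", "REF-0001", "Meeting notes") would leave the file at store/REF-0001/REF-0001 - Meeting notes.pdf (the Reference Number shown is made up). The title is used as given, so it must not contain characters that the operating system forbids in file names.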

As with most regularly occurring activities, the less often you do it, the more the backlog builds up. I am firmly of the opinion that to operate this kind of filing system effectively, new documents should be put into the system as soon as they are received – or at least as soon as possible after that. However, whatever approach is taken, you will inevitably have to spend some time on the filing activity. If you want to reduce the time you spend, two possible strategies are: a) decide not to collect everything but only to collect documents on certain topics, or from certain people etc., so that the number of documents you need to file and manage is reduced; and b) expand the scope of some index entries so that they will accommodate a greater number of documents, thereby reducing the number of new index entries and new folders that have to be created.

The benefit of having a digital filing collection is that, with today’s laptops and cheap, high-capacity storage, you can carry your information around with you and access it wherever and whenever you want. The commensurate downside is that your digital file store becomes a very precious commodity which needs to be protected against loss or theft and backed up regularly.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q36. How does this approach to filing affect work patterns?

2001 Answer: Experience gained: There are two main impacts. Firstly, it entails the regular indexing of new items as they are received and/or read. However, this is not an absolute necessity since, as with any type of filing system, the new items can be piled up to be input in bulk sometime later. It is just much easier and more effective to do a little often (Wilson 1990: 97). Secondly, it offers the opportunity to be much more sophisticated about capturing nuggets and developing knowledge. This almost certainly would affect work patterns, though it is not known how at present (Wilson 1997: 3 – 4).

2019 Answer: Fully answered: I am even more convinced than I was in 2001, that it is far better to include new items in the collection as soon as they are received and not to let documents accumulate. Scanning is no longer such a problem if you are away from your home base – a networked scanning capability can probably be found in most offices. If not, however, hardcopy documents have to be kept in a folder until you get back to your scanner.

It is probably best to keep a stock of the most recently acquired hardcopy (which you will already have digitised) in case you need them as working documents. This will entail having a designated box or drawer and managing it by eliminating the oldest when space becomes short. It is probably not worth recording in the index the existence of such a small specific stock of hardcopy.

Given that every item in the filing system will be digitised and stored on your laptop, it will be possible to carry around your entire collection and access it in any location. Such an accessible and useful store will inevitably become very precious, so you may need to take special measures to protect access in the case of loss or theft of the laptop, and you will need to maintain constant and effective backup arrangements.

Q37. What strategies can be employed to minimize user effort and maximize user motivation?

2001 Answer: Experience gained:

  • Don’t attempt backfile conversion.
  • File little and often, not lots infrequently.
  • Minimize the number of fields to input when creating index entries.

2019 Answer: Fully answered: In addition to the strategies identified in 2001 (no backfile conversion, file little and often, and minimize the number of fields), I would add the following:

  • Identify the categories of the documents you receive that you could do without and don’t file them.
  • Expand the scope of your Reference Numbers so that you can put more files in those folders without having to create new index entries.
  • Store just URL references for certain less important material so that you don’t have to copy text, create documents and save them to the digital store.

Note that the three strategies above involve making reductions in the overall set of documents that you file in the course of your work. This is dangerous because it’s difficult to predict which documents you will or won’t need to refer to in the long term. However, it’s a calculated risk based on your knowledge of what information you want and need to keep; and it may be a worthwhile risk if it reduces the filing load and gives you more motivation to keep abreast of the filing work.

PAWDOC: Hardcopy/Electronic mix

The intention of my colleague, John Pritchard, and myself when setting up our filing systems to be Reference Number-based, was to explore what it would be like to operate in an electronic office. Unfortunately, we didn’t have the electronic tools – indexing application, scanner, and document management system – to do the job properly. So, my PAWDOC filing system started off being totally paper-based.  I moved the Index into a database in 1986, but it wasn’t until 1996 that I acquired a scanner and a Document Management System (DMS); so, up until then, my operating practices had been moulded around the needs of paper, and even the electronic files I was creating and receiving were being indexed and stored in paper form (though usually with an electronic copy being kept in the Windows Explorer folder system).

This equilibrium started to change when I got the scanner and DMS. Gradually the emphasis started to move towards digital files. Whereas before I wanted to put paper into PAWDOC (giving rise to significant space problems), now I increasingly wanted to digitise and destroy paper. This change was also stimulated by the growing use of computers in the office, the rise in production of born-digital documents, and their distribution by increasingly popular email systems.

In both these eras, however, I had had to manage a mix of paper and electronic files. Pre-1996 before I got the DMS, I mainly used paper and the electronic files were held in the background. From 1996, when I started using the DMS, I used a field in my Index to indicate whether I had a paper or electronic file, which enabled me to look in the correct place for one or the other. This worked well and I did operate effectively using a mix of the two media.

By the time I retired in 2012, I considered that the way paper was being used had changed completely. It was fast becoming just a secondary working medium, with the primary medium for creating and storing documents being electronic. I believe the transition to this third era is now just about complete. It looks as though we’ll be using paper for the foreseeable future, and we will need to be capable of working with and managing both electronic and hardcopy documents; but, for the purposes of filing, there are only three types of material that need to be catered for:

  • Electronic copies of every item in the filing system
  • Hardcopy working documents – kept for a relatively short period
  • Hardcopy artefacts – significant or unusual documents that are wanted in their original physical form

The latter two categories are relatively small subsets of material – the actual size of which will be dictated by the inclinations of the filing system owner.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q33. How do you manage items that can exist in both hardcopy and electronic form?

2001 Answer: Experience gained:

  • Ensure there is an explicit marker in the index that indicates if there is an electronic file or hardcopy or both of a particular item.
  • Before throwing the paper away, check that any annotations, post-it notes etc are already included in the electronic document. If they are not, then either type or scan them in (Wilson 1995c).

2019 Answer: Fully answered: Electronic and hardcopy files can be managed in the same filing system by diligent use of the Index. In the PAWDOC system, those documents that are being retained long-term in hardcopy form (because they are significant or have unusual characteristics) have the abbreviation PHYS (for physical) in the ‘Movement Status’ field. It is also advisable to include a similar indicator in the file title of the equivalent digital document so that there is clarity throughout the system about what hardcopies exist.

Should an individual wish to keep some working documents in hardcopy form for a short period, for use in meetings or for annotation, it is feasible simply to keep them in a file or box after digitising them. This can be done either without making any entry in the index (because they should only be a small subset of material with which the individual is familiar), or with an indication in the index that a hardcopy is also being retained (though this imposes an additional management overhead to insert the indicator and to remove it when the hardcopy is destroyed).

Q34. Is it effective to manage electronic and paper files together?

2001 Answer: Partially answered: Definitely. It is much easier and faster than having separate indexes for each medium (Wilson 1995c). In any case, since an individual has only one overall knowledge base, integrated support for that knowledge base should be provided regardless of the different media that parts of it are stored on.

2019 Answer: Fully answered: Yes, electronic and hardcopy files can be effectively managed in the same filing system. I have had extensive experience of doing so over the last 20+ years; it just requires diligent use of the Index. Everything is digitised so the default status is that there is a digital file but no hardcopy. If a hardcopy is retained as well, that information is recorded in the index.

Q35. Is it necessary to keep paper if an equivalent electronic file is available?

2001 Answer: Partially answered: Paper is still needed – even if an equivalent electronic file exists – in at least two circumstances. First, when the paper is to be used in meetings or in other situations in which it is not convenient to use a laptop computer (Wilson 1995b: 113, 114); and second, when you want to keep artefacts in their original form – be that paper, CD, videotape etc. If neither of these reasons applies, the paper can be destroyed and the ultimate scanning prize can be won – the freeing up of a large amount of physical space (Wilson 1997: 3).

2019 Answer: Fully answered: It isn’t absolutely necessary to keep paper if an electronic version is available, but individuals may wish to have both versions for two types of material: documents that you may need to use in the near future and that you prefer to work with in hardcopy format; and documents which are not easily replicated in the electronic environment or which you believe are so significant or unusual as to merit being worth keeping in their original format. The specific documents that fall into each category (if any), and the size of the resulting subsets, will depend on the needs and inclinations of each individual.

PAWDOC: Archiving

For the first fifteen years of using the PAWDOC system, I didn’t have the capability to scan documents or to manage electronic documents. Hence, in those years the filing system was oriented around hardcopy which is inherently bulky when it builds up over time. It wasn’t long before I ran out of space in the upright filing cabinet next to my desk and I was forced to select a subset of documents to put in boxes and store elsewhere in my office. As time went by, I began to run out of space for the archive boxes, so I started putting the oldest ones into the company store.

Over this period, I established an archiving routine and honed it until I had got it down to a standard procedure. I used a field in the Index to record when I accessed a document, and, if there was no entry in that field, I considered that document to be a candidate for archiving. I also put an indicator in the Index when a document was archived so that I knew where to look if I needed it. In the same way that documents were stored in the filing cabinet in Reference Number order, the archive documents were also stored in the boxes in Reference Number order to provide a reliable way of finding an archived document.

When I got a scanner and a Document Management System in 1996, my modus operandi changed. I started to scan every new document as I included it in the PAWDOC system at the same time as attempting to scan the huge backlog of hardcopy documents. After each scan I took a decision as to whether to destroy the hardcopy or to keep it – and more often than not I chose to destroy it. This then was a significant turning point when archiving was replaced by digitising. Of course, the digital route was still not that straightforward because computer systems were relatively slow, and digital storage was limited and expensive. However, as time went by and technology improved, these shortcomings were minimised. Today, it’s possible to buy a memory stick with sufficient storage for the whole of the PAWDOC collection – a lifetime of work documents – for less than £15.

Today there is still a need for hardcopy documents, but only for special or working documents. Digital versions are sufficient for the bulk of a personal collection. Consequently, there is no longer a need for bulky and growing sets of hardcopy material, and no longer a need to do any more archiving.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q31. Is it necessary to archive paper documents?

2001 Answer: Experience gained: It is only necessary to archive paper documents if:

  • You cannot scan them
  • You have not got enough space for the artefacts you want to keep in their original form.

2019 Answer: Fully answered: For paper-based collections, archiving was essential unless one had a very, very large office or study. However, digitisation now provides a store of virtually unlimited size, and, in my experience, only a very small subset of documents will be deemed to be sufficiently significant or unusual as to require being kept for the long term in both hardcopy and digital form. These subsets of precious hardcopy should be small enough to be kept in the average office or study. In practice, therefore, with modern systems, archiving is no longer necessary.

Q32. If it is necessary to archive, how do you do it and how long does it take?

2001 Answer: Fully answered: Implement a ‘date last accessed’ facility, whereby whenever an index entry reference number is accessed (to obtain the hardcopy or electronic document) you automatically record the current date. Archiving can then be performed on those index entries that have no date in the ‘date last accessed’ field. The whole process of using the index to select items to be archived, marking the selected index entries as ‘archived’, removing the items from the physical file and boxing them up takes approximately two minutes per item archived (Wilson 1992b: 2.10).

2019 Answer: Fully answered: With modern systems, I believe there is no longer a need to archive. However, if I did have to do so, I would probably use the method I developed in the 1980s and 90s – maintain a Date Last Accessed field and use that to identify documents I’m not using and which can therefore be digitised and/or archived.
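
The Date Last Accessed method described in the two answers above can be expressed as a small Python sketch. It is illustrative only: index entries are represented as dictionaries, and the field names (date_last_accessed, archived) and function names are hypothetical stand-ins for whatever the Index application actually provides.

```python
from datetime import date

def record_access(entry, when=None):
    """Stamp an index entry with the date on which it was accessed."""
    entry["date_last_accessed"] = when or date.today()

def archive_candidates(index):
    """Return entries with no Date Last Accessed that are not already archived."""
    return [
        entry for entry in index
        if entry.get("date_last_accessed") is None and not entry.get("archived")
    ]

def mark_archived(entries):
    """Flag the selected entries so that it is clear where to look for them later."""
    for entry in entries:
        entry["archived"] = True
```

In use, record_access would be called whenever a document is retrieved; at archiving time, archive_candidates lists the entries that have never been accessed, and mark_archived records in the Index that those items have been boxed up.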

PAWDOC: Use of Information

The practice of sidelining text in articles, papers, and books is not uncommon and is something I started doing in the late 1970s – primarily to assist the writing of technical books when I was working at the National Computing Centre. I started to use my PAWDOC filing system in 1981, so, by the late 80s I was aware that a) there were a lot of documents in my filing system that I hadn’t looked at for several years, and b) inside all these documents were a large number of sidelined significant points. I started to think about these points as nuggets of information, some of which perhaps were the bedrock of my developing ideas, but others of which perhaps I had simply forgotten. I wondered also if an explicit examination of all these nuggets might prompt inter-relationships to be identified and new ideas to be developed. I considered this to be a potentially valuable spin-off from all the effort expended in operating a very comprehensive personal filing system. The notion of actually making use of the information that PAWDOC contained instead of having most of it just lie there statically, was very attractive.

Consequently, I looked for some simple and inexpensive tools to try out these ideas, and came across Mind Mapping software on the free disks issued with PC magazines in the late 1990s. I started experimenting with one of them but found it was too bitty just pulling nuggets out of the odd paper – I felt I needed a whole set of material to work with. So I created Mind Maps for 19 esoteric books (on subjects such as The Great Pyramid), but found it too difficult to inter-relate the different Mind Maps. That’s as far as I got.

I’m still not sure if there is any merit in explicitly managing nuggets to either just cement them in one’s mind, or to inter-relate them and develop new ideas; and, I must admit, I’ve never done any serious book review to see what other work, if any, has been done on this subject. The retrospective work that Peter Tolmie is planning to do with me may throw a little light on the impact that such nuggets may have had on me – but that’s as far as it goes. My current view is that explicitly managing nuggets is probably not worth doing, and that adding extra tasks to the job of managing a personal filing system may well be the straw that breaks the camel’s back.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q27. How can an electronic filing system be used to develop and use knowledge?

2001 Answer: Ideas formed:

  • Include substantive information in the index entries, for example phone numbers, book references, and expense claim amounts.
  • Identify the nuggets of information (i.e. the valuable bits) when you first read a document (Wilson 1997: 3 – 4).
  • Capture and structure the nuggets into the overall nugget-base at the same time as indexing the item (Wilson 1997: 3 – 4).

2019 Answer: Ideas formed: The first point in the 2001 answer to this question (‘include substantive information in the index entries’) has proved useful: so much so that I started including substantive information in the file names of digital documents (for example, total amounts claimed in the file names of expense claim spreadsheets).

With respect to nuggets, those ideas emerged from my practice of sidelining text that I thought significant. I developed the idea that these pieces of information (which I called nuggets) might be picked out, recorded and combined with other nuggets to produce novel ideas and concepts. I explored technology options that might assist in this process and decided Mind Mapping software might be worth trying, and tried it out with a variety of esoteric books that I was reading at the time. Although I Mind Mapped 19 books, I never took it to the next stage of combining them to see if I could develop any new concepts – there seemed to be no easy and effective way of doing so. That is as far as I got with this notion. I’m hoping that the experimental work I’m planning to do with Peter Tolmie on this subject might indicate if there is any merit in exploring these notions further or not.

Q28. What is the best way to capture and structure information nuggets?

2001 Answer: Ideas formed: By using a Concept Development tool. Some initial prototyping has been done using the Visual Concepts package and the eMindMaps package.

2019 Answer: Experience gained: I explored the use of Concept Development tools for this purpose by using the eMindMaps tool to capture nuggets from 19 separate esoteric books (on subjects such as The Great Pyramid). I found that, although it was probably quite a good way of summarising a book (or article or paper) on one page, there was no easy or effective way to combine several mind maps together or to relate an item on one mind map to an item on another mind map. So I concluded that such tools were not going to be an effective nugget management solution. I’m not intending to explore this any further; however, if I did, I would look into the collaborative concept development tools that I know were being explored by the CSCW community in the 1990s and from which commercially available software might have emerged by now. An alternative, much more feasible, solution might be simply to accumulate the nuggets in a spreadsheet. I guess the question of what tool to use is very much tied up to what one wants to do with the nuggets and what benefits can be achieved by working with them.

Q29. Is it feasible and practical to capture and structure information nuggets as well as indexing items?

2001 Answer: Not started

2019 Answer: Not started

Q30. Is it worthwhile building and developing an information nugget base?

2001 Answer: Not started

2019 Answer: Ideas formed: I’ve always thought there was value in the nuggets I sidelined in articles and papers – which is why I started exploring this topic in the first place. However, whether there is any value in working with them in any way at all (either just accumulating them in a spreadsheet to cement each point in one’s mind, or inter-relating them in a specialist tool to develop new ideas) is still an unknown.

PAWDOC: Searching

The ability to search for and find a document is just about the most important aspect of a filing system, and that capability has undoubtedly been improved by increases in the power of the modern computer. For example, when the index for the PAWDOC collection was computerised in 1986 it took 228 seconds to conduct a standard search on 3200 records. By 2001, that standard search conducted on over 14,000 records took 7 seconds. That same search, now performed on over 17,000 records, takes less than a second on my current laptop – in fact it’s virtually instantaneous.

Of course, search speed is only half the story, since a targeted document also has to be selected and retrieved. The fact that whole digitised collections can be held on a modern laptop means that this second part of the process can also be very quick. In fact, the total end-to-end search and retrieval time for the current PAWDOC system is typically between 15 and 30 seconds.

However, speed is not the most important element of a successful search. Instead, it is the ability to find what you are looking for – whether you are after a specific document or just doing a general search to see what you have on a specific subject. In the PAWDOC system, searches are conducted on the collection’s Index, so success is critically dependent on there being a match between the words specified in the search query and the words in the index entry of the targeted document(s). In a personal system, both elements – index entry and search query – come from a single individual’s mind, so more often than not a match is achieved. However, inevitably there are cases where a match isn’t achieved first time – and, sometimes, not at all. There are a variety of reasons for this, including the passage of time (I placed my first document into PAWDOC some 38 years ago) and changes in terminology as one becomes more familiar with a topic or as technology develops. To cope with these problems, it can be helpful to try terms that appear in the results of unsuccessful searches, and to add keywords to an index entry when the item is eventually found.

Specific questions relating to this aspect are answered below. Note that the status of each answer will fall into one of the following 5 categories: Not Started, Ideas Formed, Experience Gained, Partially Answered, Fully Answered.

Q22. How long does it take to find items in the filing system?

2001 Answer: Partially answered:

  • Using the ‘t f o a d r’ test (find every record with the combination of t, f, o, a, d and r somewhere in it) on the 1988 Filemaker Index with 3,200 records running on a Macintosh computer, 27 hits were found in 228 seconds. The equivalent test on the 2001 Filemaker Index with 14,111 records running under Windows ’98 on a Pentium II PC took seven seconds to identify 612 hits. A search for all records containing the syllable ‘man’ across the same two systems took four seconds to identify 211 hits in the 1988 system, and one second to identify 1,137 records in the 2001 system (Wilson 1992a: 38, 85).
  • Having identified the correct index entry (in the 2001 system), it takes between 10 and 20 seconds to obtain the reference number, go to the hardcopy cabinet/box, find the item and pull it out.
  • Should the index entry concerned refer to an electronic file, it takes about 6 – 10 seconds in the 2001 system to have the document management software display the relevant folder, to double click the required item and to have it open up in the relevant application.
  • The total end-to-end retrieval time for the 2001 system is about 10 – 30 seconds for hardcopy and 10 – 20 seconds for electronic files.

2019 Answer: Fully Answered: Using the ‘t f o a d r’ test (find every record with the combination of t, f, o, a, d and r somewhere in it) on the 2019 Filemaker Pro 15 Index with 17,294 records running on a Chillblast Intel i7 computer with 8GB of RAM, 1,224 hits were found in less than a second – in fact, almost instantaneously. A search for all records containing the syllable ‘man’ across the same system took less than a second to identify 1,538 hits (in fact it too was almost instantaneous).
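As a rough illustration of what such a test involves, the sketch below assumes that the ‘t f o a d r’ test means "every one of the letters appears somewhere in the record", matched without regard to case. It is not the Filemaker implementation; the function names and the sample index entries are invented purely for illustration.

```python
def contains_all(record_text, fragments):
    """Return True if every fragment occurs somewhere in the record (case-insensitive)."""
    text = record_text.lower()
    return all(fragment.lower() in text for fragment in fragments)


def search(records, fragments):
    """Return the records in which all of the fragments appear."""
    return [record for record in records if contains_all(record, fragments)]


# Hypothetical index entries, invented for illustration only.
sample_records = [
    "Draft report on office automation",
    "Management meeting notes",
    "Filing cabinet order form and receipt",
]

# The 't f o a d r' test: each letter must appear somewhere in the record.
hits = search(sample_records, list("tfoadr"))

# The 'man' test: a single syllable must appear somewhere in the record.
man_hits = search(sample_records, ["man"])
```

On this tiny sample the first test matches the first and third entries, and the ‘man’ test matches only the second. A linear scan like this is enough to show the logic; a database product may well use its own indexes to go faster.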

Having identified the correct index entry, it takes between 10 and 15 seconds to copy the reference number, go to the Windows File Explorer screen, select the main PAWDOC folder, paste the number into the search field, press enter, and double click the folder when it appears. Since there may or may not be multiple files in the folder, and since different files may require different applications which probably open at different speeds, it is difficult to provide a reliable figure for selecting a file and opening it. However, as a very rough guide it is likely to take between 3 and 15 seconds.

Therefore, the total end-to-end retrieval time in the current system is approximately 13 – 30 seconds.
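The arithmetic behind that total can be sketched as the sum of the two ranges quoted above (locating the folder, then selecting and opening a file). The index search itself is treated as taking no time, since it is described as almost instantaneous. The function and variable names are illustrative only.

```python
def add_ranges(*ranges):
    """Add (low, high) ranges term by term to give an overall (low, high) range."""
    return (
        sum(low for low, _ in ranges),
        sum(high for _, high in ranges),
    )


locate_folder = (10, 15)  # seconds to copy the reference number and open its folder
open_file = (3, 15)       # seconds to select a file and open it (very rough guide)

total = add_ranges(locate_folder, open_file)  # (13, 30)
```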

Q23. What can you do to speed up retrieval?

2001 Answer: Not started

2019 Answer: Fully Answered: There are two key factors that affect retrieval times – Physical Proximity and System Integration. Physical Proximity relates to how close you are physically to the system and the hardcopy and/or digital documents. If you haven’t got the system with you then you can’t even identify the document you require, let alone retrieve it, regardless of whether it is a hardcopy or digital document. If you are able to identify the document you require, and it is a hardcopy document, then the closer you are to the hardcopy documents the faster retrieval is likely to be (for example, retrieval will be faster if the hardcopies are in the same room that you are in as opposed to in a room down the corridor). If the document you require is an electronic document, then it will probably be a little faster to retrieve if it is on the same system you are using to search for documents than if it is on some remote server elsewhere. Therefore, from a Physical Proximity perspective, retrieval can be speeded up by making sure that the Index, the digital store and any hardcopies are all as close to the user as possible.

System Integration refers to the linkage between the searchable Index and a collection’s digital files. Zero integration requires the user to remember the Reference Number selected in the search process, to go to the store of digital files, and to use the Reference Number to open the relevant folder. In contrast, a very high level of integration might be achieved by having the files stored under a particular Reference Number appear somewhere in the Index screen for that Reference Number, so that a particular file can be opened directly from there. A halfway house might be to have a macro which uses a Reference Number identified in the Index to open the folder for that Reference Number automatically. Therefore, from a System Integration perspective, retrieval can be speeded up by reducing the keystrokes required to go from selecting an Index entry to viewing the files associated with that Index entry.
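A minimal sketch of the halfway-house idea follows. It assumes that each Reference Number has a folder of exactly that name somewhere beneath a single collection folder. The function names, the example path and the example Reference Number format are all hypothetical, and this is not the macro facility of any particular database product.

```python
import os
import subprocess
import sys
from pathlib import Path


def find_reference_folder(collection_root, reference_number):
    """Return the first folder under collection_root named exactly reference_number, or None."""
    for path in Path(collection_root).rglob("*"):
        if path.is_dir() and path.name == reference_number:
            return path
    return None


def open_folder(folder):
    """Open a folder in the platform's file manager."""
    if sys.platform.startswith("win"):
        os.startfile(str(folder))  # available on Windows only
    elif sys.platform == "darwin":
        subprocess.run(["open", str(folder)], check=False)
    else:
        subprocess.run(["xdg-open", str(folder)], check=False)


def open_reference(collection_root, reference_number):
    """Find the folder for a Reference Number and open it; return the folder, or None if absent."""
    folder = find_reference_folder(collection_root, reference_number)
    if folder is not None:
        open_folder(folder)
    return folder


# Example call (hypothetical path and Reference Number):
# open_reference(r"C:\PAWDOC", "REF-0042")
```

Bound to a single key or button in the Index, a routine like this would replace the copy, switch window, paste and double-click steps with one action.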

Q24. In what circumstances are searches conducted?

2001 Answer: Partially answered:

  • ‘Start Work’: focused assembly of information while under no pressure.
  • ‘Mid Work’: a search for a specific piece of information while pre-occupied with the interrupted activity.
  • ‘Visitor’: a search while you are talking to someone at your desk.
  • ‘Phone Call’: a search while you are on the phone to someone.

2019 Answer: Fully Answered: There are probably three intersecting dimensions to the circumstances in which searches are conducted: Activity (what you’re doing at the time you conduct the search); Work Content (the topic you are working on when the search is conducted); and Location (the type of place in which the search is conducted).

Four different types of Activity are described in the original BIT answer in 2001 (Start Work, Mid Work, Visitor, Phone Call) – though I have no data on the relative frequency of each of those. However, from experience I would guess that Mid Work occurred most often, followed in descending order of frequency by Start Work, Phone Call and Visitor.

Work Content might be the subject you are looking into, or the project you are working on, or the organisation you are working for, or any other categorisation that summarises the type of work being undertaken. Again, no data is available to identify what types of work content have been most associated with the searches made on the PAWDOC collection.

Location can be categorised as Employer’s Office, Other Organisation’s Premises, Travelling, and Home. I know that, over the years, I have indeed conducted many searches at all these types of location.

Q25. What are the most common types of searches?

2001 Answer: Partially answered:

  • The ‘Familiar Item’ search for an item that you have accessed several times before.
  • The ‘Long Lost Friend’ search for an item you are sure is there but have not accessed recently.
  • The ‘Shot in the Dark’ search to see if there is any material on a subject.
  • The ‘Literature Search’ to find everything you have on a subject.

2019 Answer: Partially answered: The 2001 BIT answer provides one perspective on the most common types of searches conducted on the PAWDOC collection (the ‘Familiar Item’ search; the ‘Long Lost Friend’ search; the ‘Shot in the Dark’ search; and the ‘Literature Search’). However, another perspective might be to categorise the types of document content most frequently searched for. This analysis might be feasible to perform using the ‘Date Last Accessed’ field in the Filemaker Index. Although this may not be an entirely accurate record of which items have or haven’t ever been searched for, it nevertheless does provide some sort of indication. Therefore, by categorising the 4,551 records which have an entry in the Date Last Accessed field (out of a total of 17,294 records) and ranking the categories by number of occurrences, some indication would be gained of the types of documents that have been searched for and their relative frequency.
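The proposed analysis could be sketched along the following lines, assuming the Index has been exported as a list of records and that each record has been given a category by hand. The field names `category` and `date_last_accessed`, the category labels and the sample records are all hypothetical; no such analysis has been carried out on the real Index.

```python
from collections import Counter


def rank_accessed_categories(records):
    """Count categories among records with a non-empty date_last_accessed, most frequent first."""
    counts = Counter(
        record["category"]
        for record in records
        if record.get("date_last_accessed")
    )
    return counts.most_common()


# Hypothetical exported records, invented for illustration only.
sample_records = [
    {"title": "A", "category": "Project report", "date_last_accessed": "2015-03-02"},
    {"title": "B", "category": "Conference paper", "date_last_accessed": ""},
    {"title": "C", "category": "Project report", "date_last_accessed": "2018-11-20"},
    {"title": "D", "category": "Product brochure", "date_last_accessed": "2012-06-14"},
]

ranking = rank_accessed_categories(sample_records)
```

Records with an empty Date Last Accessed field are simply left out of the count, so the ranking reflects only items that have been accessed at least once since the field was introduced.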

Q26. What are the most effective search strategies?

2001 Answer: Ideas formed:

  • Get into the habit of searching the filing system when you need some information even when you don’t think you have anything relevant; after several years you forget what you have (Wilson 1992a: 4, 25).
  • Let your mind roam freely when selecting search words; you are more likely to come up with words you originally specified as keywords (Wilson 1992a: 4).
  • Specify searches with minimal parts of words to avoid problems where spelling errors have been made.

2019 Answer: Fully Answered: In addition to the suggestions made in the 2001 BIT article (search just in case, let your mind roam freely, and use minimal parts of search words), I would add:

  • Use any older terminology you can think of if your current terminology isn’t coming up with the goods;
  • Check the results of unsuccessful searches to see if there are any terms which you might try in subsequent searches;
  • If eventually you are successful in a search that has taken some time, consider adding some additional search terms to the index entry to give yourself a better chance of quicker success in the future.
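Two of these strategies, searching on minimal parts of words and adding terms to an index entry after a hard-won success, can be sketched as follows. The index is modelled as a simple dictionary from Reference Number to entry text; the function names, Reference Numbers and entries are invented for illustration and do not reflect how the Filemaker Index is actually structured.

```python
def find_entries(index, fragment):
    """Return the Reference Numbers whose entry text contains the fragment (case-insensitive)."""
    fragment = fragment.lower()
    return [ref for ref, entry in index.items() if fragment in entry.lower()]


def add_keywords(index, ref, keywords):
    """Append any keywords not already present in the entry, and return the updated entry."""
    entry = index[ref]
    missing = [k for k in keywords if k.lower() not in entry.lower()]
    if missing:
        index[ref] = entry + " " + " ".join(missing)
    return index[ref]


# Hypothetical index, invented for illustration only.
index = {
    "REF-0001": "Organisational behaviour survey",
    "REF-0002": "Teleworking pilot study",
}

full_word = find_entries(index, "organizational")  # spelling differs, so no match
minimal = find_entries(index, "organi")            # the shorter fragment still matches

# After eventually finding REF-0002, add the newer terminology to its entry.
before = find_entries(index, "remote")
add_keywords(index, "REF-0002", ["remote working", "home working"])
after = find_entries(index, "remote")
```

Here the full word fails because of a spelling variant while the shorter fragment succeeds, and the search for ‘remote’ only finds the teleworking item once the newer terms have been added to its entry.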