Four approaches and some decisions

I set out on this journey to find a way of archiving this pwofc.com web site. I’ve explored four different approaches, each with its own distinctive characteristics, as summarised below:

Hosting Backup Facilities: No doubt these differ for each hosting operation; my experience is limited to the hosting package that I use (which does not include a backup service).

  • Provides functions to download collections of files.
  • Creates a point-in-time replica of the content files of the web site.
  • The backup replica cannot be viewed on its own – it requires other facilities such as underlying database software to generate the web site.

HTTrack: This is a free software package that operates with a GNU General Public Licence. It downloads an internet web site such that it can be read locally in a browser.

  • Creates a point-in-time local replica which can be read offline with a browser, with PC response times.
  • It did not replicate the search facility available in my online web site.
  • It has complex configuration options and limited documentation and help – but these were not needed to undertake a simple mirror of the web site.
  • If the capture configuration specifies that external sites should not be captured, it provides URLs which can be clicked to go to those sites. (I did attempt to capture external web pages, but the first time it failed after 2 minutes; and the second time, when I thought I had specified that it should collect just 1 specific external web page, it was collecting so much that I had to stop it running. I clearly didn’t configure it correctly…)
  • The files that HTTrack produces can be zipped up into a single file and archived.
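
Zipping a completed mirror into a single archive file can be done with Python’s standard library. A minimal sketch, assuming the mirror was written to a local directory (the directory and output names below are illustrative, not anything HTTrack itself produces):

```python
import shutil
from pathlib import Path

def archive_mirror(mirror_dir, out_stem):
    """Zip the whole mirror directory tree into <out_stem>.zip."""
    mirror = Path(mirror_dir)
    if not mirror.is_dir():
        raise FileNotFoundError(f"no mirror directory at {mirror}")
    # make_archive walks the tree and returns the path of the zip it wrote
    return shutil.make_archive(out_stem, "zip", root_dir=mirror)

# e.g. archive_mirror("./mirror", "pwofc-mirror-2019-04-30")
```

The resulting single `.zip` file is then easy to date-stamp and move into whatever archive location is being used.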

UK Web Archive (UKWA): A British Library service that stores selected web sites permanently, captures updated versions on a yearly basis, and makes all copies freely available on the net.

  • Requires that a web site is proposed for inclusion in the UKWA, and that approval is given.
  • Creates dated replicas of a web site which can be selected and viewed online in a browser.
  • Does not include the contents of external pages, and, in some cases, does not even provide a clickable URL of the external page.
  • In the replica site, the Home link doesn’t work on pages that have been linked to.
  • In the replica site, two images with embedded links are not displayed and are replaced with just the text titles of the images.

Book: A copy of the web site printed on paper and bound in a book.

  • Creates a point-in-time replica of the web site on paper with some formatting adjustments to accommodate the different medium.
  • Produces a copy in a format which is very familiar to humans and which can be easily accommodated on a shelf in a house.
  • Cross referencing links work but are slower to follow than the digital equivalent.
  • May have better longevity than a digital equivalent.

As a result of these investigations, I’ve decided to:

  • Continue to use HTTrack to create mirrors of pwofc.com periodically.
  • Continue to create a book of pwofc.com every five years: the next one is due in 2022 and will be called ‘Feel the Join’ (as opposed to the 2017 version which was called ‘Touch the Join’).
  • Be thankful that the British Library is archiving pwofc.com.

I think that just about wraps up all I’m prepared to do on this subject, so this journey is now complete.

The UK Web Archive

It’s been over a year since I wrote about this journey, so I’ll start this entry with a short recap of where I’m up to. Back in March 2019, I decided I would explore three different ways of archiving this pwofc website: first, by using tools provided by the company I pay to host the site; second, by using a tool called HTTrack; and third, by submitting the site for inclusion in the British Library’s UK Web Archive (UKWA).

My experiences with the hosting site tools were less than satisfactory, and are documented in a post on 28 April 2019 entitled ‘A Backup Hosting Story’. My use of HTTrack was much more rewarding: it produced a complete backup of the whole site which could be navigated on my laptop screen with near-instantaneous movement between pages, and which could easily be zipped into a single file for archiving. This is written up in the 30 April 2019 post titled ‘Getting an HTTrack copy’.

I’ve had to wait till now to relate my experience of submitting the site to the British Library’s UK Web Archive (UKWA), because the inclusion in the archive has been a little problematic. Here’s what happened: following a suggestion from Sara Thomson of the DPC, I filled in the form at https://beta.webarchive.org.uk/en/ukwa/info/nominate offering pwofc.com for archiving. Within about three weeks I received an email saying that the British Library would like to archive the site and requesting that I fill in the on-line licence form which I duly completed. A couple of days later, on 16th March 2019, I got an email confirming that the licence form had been submitted successfully and advising that: “Your website may not be available to view in the public archive for some time as we archive many thousands of websites and perform quality assurance checks on each instance. Due to the high number of submissions we receive, regrettably we cannot inform you when individual websites will be available to view in the archive at http://www.webarchive.org.uk/ but please do check the archive regularly as new sites are added every day.”

From then on I used the search facility at http://www.webarchive.org.uk/ every month or so to look for pwofc.com but with no success. Over a year later, on 21st April 2020, I replied to the licence confirmation email and asked if it was normal to wait for over a year for a site to be archived or if something had gone wrong. The very prompt reply said, “Unfortunately there is a delay between the time we index our content and when it can be searched through the public interface. We aim to update our indexes as soon as possible and this is an issue we are trying to fix, please bear with us as we do have limited resources. Your site has been archived and it can be accessed through this link: https://www.webarchive.org.uk/wayback/archive/*/http://www.pwofc.com/.”

Sure enough, the link took me to a calendar of archiving activity, which showed that the site had been archived three times – twice on 01July2019 (both of which seemed to be complete and to work OK); and once on 13Mar2020 (which, when clicked, seemed to produce an endless cycle of reloading). I reported this back to the Archivist, who scheduled some further runs, and who, after these too were unsuccessful, asked if I could supply a site map. I duly installed the Google XML Sitemaps plugin on my pwofc.com WordPress site, provided the Archivist with the sitemap URL, https://www.pwofc.com/ofc/sitemap.xml, and the archive crawler conducted some more runs. The 13th run of 2020, on 22nd June, seemed to have been successful: the archived site looked just as it should. I then set about doing a full check of the archived site against the current live site to ensure that all the images were present, and that the links were all in place and working. The findings are listed below:

  • External links not collected: Generally speaking, the UKWA archive had not included web pages external to pwofc.com. Instead, when such a link is selected in the archive one of the following two messages is displayed: either “The url XXX could not be found in this collection” (where XXX is the URL of the external site); or “Available in Legal Deposit Library Reading Rooms only”. However, in at least two instances the link does actually open the live external web page. I don’t know what parameters produce these different results.
  • Link doesn’t work: For one particular link (with the URL ‘http://www.dpconline.org/advice/case-notes’), which appears in two separate places in the archive, there is no response at all when the link is clicked.
  • Home link doesn’t work on linked internal pages: links to internal pages within pwofc.com all work fine in the archive. However, the Home button on the pages that are displayed after selecting such links doesn’t produce any response.
  • Image with a link on it not displayed: The pwofc.com site has two instances of an image with a link overlaid on it. The archive displays the title of the image instead of the image itself.

On the whole, the archive provides quite a faithful reproduction of the site. However, the fact that no information was collected for most external web pages, and no link to the external live web pages is provided either, is quite a serious shortcoming for a site like pwofc.com which has at least 26 such links. Having said that, the archive aims to collect all the web sites on its books at least once a year, and all the different versions appear to be accessible from a calendared list of copies; so, should one be able to get on the UKWA roster, this would appear to be quite an effective way to back up or archive a blog.
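
Incidentally, once a sitemap exists, checking an archived copy against the live site can be partly automated by walking the sitemap and looking up each URL in the archive. A minimal sketch in Python, assuming a sitemap in the standard sitemap-protocol format; the embedded XML fragment and the archive URL pattern (modelled on the Wayback link quoted earlier) are illustrative:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of a sitemap in the standard sitemap-protocol format.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.pwofc.com/ofc/</loc></url>
  <url><loc>https://www.pwofc.com/ofc/?page_id=2</loc></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Extract every <loc> entry from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]

def archive_equivalents(urls,
        archive_prefix="https://www.webarchive.org.uk/wayback/archive/*/"):
    """Pair each live URL with the corresponding archive lookup URL."""
    return {u: archive_prefix + u for u in urls}

for live, archived in archive_equivalents(sitemap_urls(SITEMAP_XML)).items():
    print(live, "->", archived)
```

In practice each pair would then be fetched and compared by hand or by script; the sketch only shows how the list of URLs to check is derived.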

A Story Board a Day Evaluation

Yesterday I started an evaluation of my Electronic Story Boards. It’s been over a year and a half since I first put them together, and since then I’ve looked at them occasionally; referred to them when I needed some specific information; and even forgotten that some information I knew I had was actually on one of them. However, I haven’t yet made a methodical assessment of how interesting, useful or effective they are. I’m going to try and do that by looking at a different story board every day, starting with No 1 and working my way through to the final one, No 35.

No 1 is the Levinson book on Pragmatics, and its story board effectively summarises my involvement in the Cosmos project. After looking at it, two words immediately came to mind – Rich, and Personal. That one single page is Rich in content – every element bringing back powerful memories; and Personal – because all the content is to do with me.

Later on yesterday, I took a look at the electronic version on the iPad. It was simple to find – all 35 story boards are represented as thumbnails on a single Sidebooks screen on the iPad. Selecting the Pragmatics Story Board brought up a full screen image that looked exactly like the laminated version I’d been looking at on the side of my bookcase. It was just as rich and personal, and it also enabled me to click the arrows and bring up further pages of related material. But, interestingly, those further pages didn’t add a great deal to the experience. The sense of wonder and powerful feelings that I felt, were generated by the material on the main story board: the additional material didn’t really augment them. However, I thought, those supporting pages would certainly be useful if you were specifically looking for detailed information.

That was my initial experience in this 35 day evaluation. I’ll make notes as I go, and summarise my conclusions in 5 or 6 weeks’ time.

New version 2.5 of the Maintenance Plan Template

A couple of days ago I completed an experiment to use the Maintenance Plan template to undertake initial Digital Preservation work on a collection instead of using the Scoping document. It proved to be very successful. The collection is relatively small with only 840 digital files of either jpg, pdf or MS Office format, so there were few complications and I was able to proceed through the Maintenance Plan process steps without any serious holdups. The whole exercise took just over a week with the majority of the time being taken up by the inventory check of the digital files and of about 300 associated physical artefacts. I used the structure of the Maintenance Plan to document what I was doing and to keep a handle on where I was up to.

As a result of this exercise I’ve now added the following guidance to the beginning of the Maintenance Plan template, and equivalent text to the beginning of the Scoping document template:

If this is the first time that Digital Preservation work has been done on a collection

EITHER use the Scoping template to get started (best for large, complex collections)

OR use this Maintenance Plan template to get started (can be effective for smaller, simpler collections – retitle it to ‘Initial Digital Preservation work on the @@@ collection’ and ignore sections Schedule, 3, 4 and 7)

This concludes the interim testing and revision of the Maintenance Plan template. It has resulted in some substantial changes to the latest version 2.5 of the document (an equivalent version 2.5 of the SCOPING Document Template has also been produced). The final and most substantial test of the Maintenance Plan template will take place in September 2021, when the large and complex PAWDOC collection is due to undergo its first maintenance exercise.

More than a Maintenance Plan?

Yesterday I finished the maintenance work on my PAW-PERS collection and so now have a refined version of the Maintenance Plan template based on two real-world trials. However, before publishing it, I’m going to take the opportunity to see if it could be used to start every Preservation Planning project. I’m able to do this because I have one other collection which has, as yet, had no preservation work done on it. It is the memorabilia that my wife and I have accumulated since we were married, and it is called SP-PERS.

Each of the three collections that I have subjected to Digital Preservation (DP) measures so far has been through the process of creating a Scoping document, followed by the production and implementation of a DP Plan, and finally the creation of a DP Maintenance Plan specifying works a number of years hence. However, my recent implementation of Maintenance Plans has led me to believe they might provide a structured immediate starting point for any preservation planning project. They do not preclude Scoping documents etc. – indeed they explicitly discuss the possible use of those other tools halfway through the process. So, the opportunity to try using the Maintenance Plan template as a way in to every DP project is too good to miss. I’m starting on it today.

First trial of the Maintenance Plan

Today I completed the first real trial of a Maintenance Plan using the Plan I created for my Photos collection in 2015. It was one of the first Plans I’d put together so is slightly different from the current template (version 2.0 dated 2018). However, both have the same broad structure so the exercise I’ve just completed does constitute a real test of the general approach.

Overall, it went well. In particular, having a step by step process to follow was very helpful; and I found it particularly useful to write down a summary of what I’d done in each step. This helped me to check that I’d dealt with all aspects, and gave me a mechanism to actively finish work on one step and to start on the next. I found this to be such an effective mechanism that I modified the current Maintenance Plan Template to include specific guidance to ‘create a document in which you will summarise the actions you take, and which will refer out to the detailed analysis documents’. It’s worth noting that I was able to include this document as another worksheet in the collection’s Index spreadsheet, along with the Maintenance Plan constructed in 2015 and the Maintenance Plan I have just constructed for 2025. Being able to have all these sub-documents together in one place makes life a whole lot easier.

The exercise also identified another significant shortcoming of the template – it includes no details about the collection’s contents and their location(s). Consequently, an additional ‘Contents & Location’ section has been included at the beginning of the template.

The Photos collection has certainly benefited from the exercise; and the experience has enabled me to make some useful modifications to the template. I intend to tackle the second test of the Maintenance Plan (for the PAW-PERS collection) in the next few weeks, and will then publish an updated version 2.5 of the Maintenance Plan template which will include all the refinements made in the course of these two trials.

Maintenance Plan Template Refinement

The final piece of this Digital Preservation work is to test and refine the Maintenance Plan template. I’ll be doing this by implementing plans drawn up in earlier stages of this preservation journey.

I’m late in starting the PAW-PERS maintenance work because earlier this year I was focused on completing the ‘Sorties into the IT Hurricane’ book. Now that’s out of the way, I plan to complete the PAW-PERS and PHOTO maintenance during May and to use that experience to update the Preservation MAINTENANCE PLAN Template from version 2.0 to version 2.5. The insights gained in the major maintenance exercise on the PAWDOC collection in Sep 2021 will be used to produce version 3.0 of the Maintenance Plan template. Updates to the other templates (SCOPING Document, and Project Plan DESCRIPTION and CHART) may also be made at that point if necessary. I shall offer the revised templates to the DPC for inclusion in their website. These will be the final activities in the Digital Preservation work being documented in this journey.

A subjective halfway view

I’ve just acted as subject in our first investigation into the memorability and impact of information nuggets. The nugget material, in this case, was mindmaps of key points in nineteen esoteric-type books which explore perceived unresolved mysteries from ancient Egyptology to modern secret societies.  I discovered that I could remember almost none of the points presented to me and was unable to link any of them to a particular development in my thinking. My immediate reaction to this disappointing – but probably to be expected – finding was that these are not actually nuggets of information but instead are just parts of a summary of each book.

However, on reflection, I’ve reversed that view. After all, when I was picking out the points as I read, I must have thought each of them to be significant – otherwise I wouldn’t have picked them out. So, how is a key point in a book different from a key point in, say, a five page article? Well, there are some obvious differences, like the book is a lot bigger and has a lot more stuff in it – most of which I’m not familiar with AT ALL. Unless you have a photographic or otherwise superb memory, you wouldn’t expect to remember everything in such a book after one quick casual read. Of course, I have the books on my bookshelf and have the look of each one locked in my memory with some ideas of what it’s about. However, this is the case because there are just a few hundred of them, they have a rich content, and the covers and spines usually have distinctively memorable images. In contrast, the articles and documents in my work collection (which are due to be investigated next) are much more numerous; are hidden away in my computer (with just a few in my physical archive box); and they all look very similar and have very few distinctive markings.

I guess I’ve expanded my thinking this morning about all this. However, I’m only the subject and we’re only halfway through the overall exercise. The interesting bit will be what the researcher concludes from it all.

Nugget Investigation Plans

This entry has been jointly authored by Peter Tolmie and Paul Wilson

A key part of the motivation for keeping texts is that they contain information – nuggets – that have some kind of future value. It’s rare for a whole document to be seen as having such value in its entirety. Nuggets are more often certain sentences or paragraphs. The question is, what happens to that value over the lifespan of an archive? Does the value of specific nuggets persist? Does it change? Does it grow or reduce in relevance? Does it become eroded to the point of obsolescence? And, given the same documents at some later point in time, would the same nuggets be identified, or would something else stand out as being more important?

To assess the use and impact of identifying nuggets, three separate investigations will be conducted using documents containing nuggets that were collected over the last 38 years. In all cases, investigations will be undertaken with reference to the individual who originally identified the nuggets. Two of the investigations will focus on the individual himself reflecting upon his use of nuggets. The third investigation will focus upon extracting information about the nuggets and their use through discussion.

The investigations will be performed using two separate sets of material:

Set 1: A set of 19 MindMaps relating to esoteric books.

Set 2: Documents from the PAWDOC collection.

Approach

Investigation 1: The first investigation will be a written exercise using the set of Mindmaps and will attempt to assess:

a) Whether individual nuggets can still be recalled;
b) Whether the sources of those individual nuggets can still be recalled;
c) How significant specific nuggets are considered to be by the individual concerned;
d) What other nuggets, if any, are associated with the one in question;
e) What specific concept(s) individual nuggets are believed to have contributed towards.

Investigation 2: The second investigation will also be a written exercise and will explore what nuggets, if any, can be identified by the individual concerned in unmarked versions of randomly selected documents from the PAWDOC collection where nuggets have previously been identified. From this exercise it is hoped to deduce:

i) Whether nuggets lose their vitality over time, with new concepts being derived and becoming established.
ii) Whether looking at the documents anew, within a distinct context, will lead to the identification of different nuggets with different relevance to the individual.

Note that the first two investigations are interleaved so as to maximise the amount of discussion possible in the concluding interview.

Investigation 3: This investigation will be conducted as an interview and will explore similar themes to those tackled in the prior investigations but in a more open-ended way, so that the reasoning involved in retaining documents for the sake of specific features can be examined in greater detail. The interview will also seek to examine in detail topics that have spanned all three investigations.

Method

Investigation 1: Nineteen nuggets, each from separate MindMaps, will be randomly selected from the MindMaps of books on esoteric subjects. Randomisation will be achieved by creating a template that divides an A4 page into 24 numbered areas of equal size. A random number generator will then be used to create nineteen separate numbers ranging from 1 to 24. The Researcher, Peter Tolmie, will generate the numbers and use each one in conjunction with one of the MindMaps and the template to identify the nugget(s) present in that location. If more than one nugget occurs in that location, the topmost one will be selected.
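
The randomisation described above can be sketched in a few lines of Python. The 24-area grid and the count of nineteen come from the method description; duplicates across draws are allowed, since each number is applied to a different MindMap:

```python
import random

AREAS = 24       # the A4 template is divided into 24 numbered areas
MINDMAPS = 19    # one random area number per MindMap

rng = random.Random()  # could be seeded to make the draw repeatable
area_numbers = [rng.randint(1, AREAS) for _ in range(MINDMAPS)]

print(area_numbers)
```

Each number is then matched against one MindMap overlaid with the template to find the nugget(s) in that area.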

The Researcher will assemble the nineteen nuggets and send them to the PAWDOC Owner and Subject, Paul Wilson, who will be asked to provide a written answer to the following questions in relation to each one:

  • Do you remember where this nugget came from?
  • Do you remember why you might have marked this out as a nugget?
  • How important do you consider this nugget to be now?
  • Do you remember what other nuggets were associated with this one?
  • How did you use this nugget and what other things did you develop on the back of it?

The Subject will then be sent the original MindMaps in which the nuggets appeared, and will be asked the following questions for each one:

  • Do you remember more about why you marked this as a nugget now?
  • Now you see the MindMap it came from and the other nuggets it was associated with, do you see it as more or less important?
  • Do you remember anything more now about how you used it or what other ideas it may have contributed to?

The Subject will return his responses to the Researcher, who will categorise them and place the results in an analysis spreadsheet. The overall analysis and specific instances regarding the Subject’s responses and reactions will be written up as the findings for Investigation 1.

Investigation 2: Nineteen separate documents, each containing nuggets, will be randomly selected from documents included in the PAWDOC collection between 1981 and 2011. Randomisation will be achieved by using a random number generator to create numbers relating to the 16925 entries in the PAWDOC Index sequenced in the order in which they were created.  The Researcher will generate the numbers and use his copy of the Excel version of the Index to identify the first nineteen of the entries for which an electronic file exists (some entries just contain the information they relate to and have no associated documents – these are identifiable by the contents of the Movement Status field); and for which the associated electronic files are likely to contain highlighted nuggets (items such as, for example, Health & Safety booklets, are unlikely to contain highlighted nuggets). He will send the list of numbers to the Subject who will open the folder labelled with each particular number, take a copy of the first file that appears in each folder and will send the files back to the Researcher. The Subject will take as little notice of the file titles as possible by setting up the Windows File Explorer window to display only the beginning of the file name so that only the Reference Number, which is always at the start of the file name, is visible. Should the Researcher deem any of these files to be unsuitable, he will send additional numbers to the Subject until a satisfactory set of nineteen documents that contain nuggets has been obtained. Then, using a cropping tool or by obtaining a clean copy of the document from elsewhere, he will send clean unmarked copies back to the Subject. The Subject will read the documents, mark up any text that he considers to be nuggets, and will send the documents back to the Researcher.
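
The selection procedure above (draw random index positions, then keep the first nineteen that pass the suitability checks) can be sketched as follows. The `is_suitable` predicate is a hypothetical stand-in for the manual checks on the Movement Status field and document type; the entry count of 16925 comes from the text:

```python
import random

TOTAL_ENTRIES = 16925   # entries in the PAWDOC Index, in creation order
NEEDED = 19

def is_suitable(entry_number):
    """Hypothetical stand-in for the manual checks: an electronic file
    exists and the document is likely to contain highlighted nuggets."""
    return entry_number % 3 != 0   # placeholder rule, for illustration only

def select_documents(rng, total=TOTAL_ENTRIES, needed=NEEDED):
    """Draw random index positions until `needed` suitable ones are found."""
    chosen = []
    seen = set()
    while len(chosen) < needed:
        n = rng.randint(1, total)
        if n in seen:
            continue            # never consider the same entry twice
        seen.add(n)
        if is_suitable(n):
            chosen.append(n)
    return chosen

print(select_documents(random.Random(0)))
```

In the actual procedure the suitability judgement is made by the Researcher inspecting each candidate, so the predicate here only marks where that human decision slots into the loop.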

The Researcher will then make a comparison between the original nuggets identified and the new nuggets identified and record the results in the analysis spreadsheet. The overall analysis will be written up as the findings for Investigation 2.

Investigation 3: Nineteen nuggets, each from a separate randomly selected document placed in the PAWDOC collection between 1981 and 2011, will be identified. Randomisation will be achieved by using the same procedure employed in Investigation 2. The Researcher will take the first nineteen suitable documents that contain nuggets and randomly select one specific nugget from each document using a random number generator. The nineteen nuggets will be assembled together and presented to the Subject, who will be asked to answer a similar set of questions to the first set of questions in Investigation 1. These questions will form the rough frame for the first part of an interview in which the various nuggets will be discussed.

The Subject will then be shown the original documents in which the nuggets appeared, and will be asked the following questions for each one, which will form the second part of the interview mentioned above:

  • Do you remember more about why you marked this as a nugget now?
  • Now you see the document it came from and the other nuggets it was associated with, do you see it as more or less important?
  • Do you remember anything more now about how you used it or what other ideas it may have contributed to?
  • Are there things in the original document that you didn’t mark as a nugget at the time that you would mark as a nugget now? If so, why?

A third and final part of the interview will explore the responses from across all of the assessments in a more open-ended fashion to generate deeper insights and discussion.

The Researcher will then transcribe the interview, categorise the responses and place the results in the analysis spreadsheet. The overall analysis and specific instances regarding the Subject’s responses and reactions will be written up as the findings for Investigation 3.

Conclusions: The Researcher will use the findings from all three Investigations to write the overall conclusions of the investigation.

Nuggets about Nuggets across 38 years

To review what I’ve done on the topic of information nuggets, I’ve been trawling through the PAWDOC Index and files. The earliest example of sidelined text that I can find in my document collection was from October 1981 when I was working at the National Computing Centre. I can’t remember why I started to do it – but it may well have been prompted by the method that NCC’s Chief Editor, Geoff Simons, used to construct his books. He explained to me that he read everything about a subject, identified key points and put them on Post-it notes which he stuck on the wall. When he was ready to write, he rearranged the Post-its into separate sections and in sequence within the sections – and I saw examples of this in his office. At some point I started to employ this technique to construct the best practice books I wrote at NCC – but using the word processor on our new Zynar Office System to assemble and organise the key points. Sidelining text was an obvious way to identify key material to feed into that process.

Around 1994 I started talking with City University academics Clive Holtham and David Bawdon with a view to undertaking a joint project on ‘The Paperless Office Worker’. A key strand of this work would involve me digitising my PAWDOC collection. Extracts from emails between us in the early part of 1994 included the following:

Email from Clive Holtham: ‘….I don’t record each document as Paul does, but file at a quite detailed level. What I am conscious of is how much I forget about what I already have. The equivalent of underlining is important – we need to consider something more than keywords to store with each piece.’

Reply from Paul Wilson ‘…I agree with needing to deal with the underlining problem. I have sidebars on most of my material – they are the information nuggets; but I don’t know how much use they would be out of context.’

Email from Paul Wilson: ‘… with reports, papers etc. I usually mark the nuggets of info within them – presumably these are the bits of information I really want.’

The collaboration with City University proved very productive: Clive Holtham introduced me to a Product Manager at Fujitsu who loaned me a scanner, and to the owner of a small company called DDS who loaned me the Paperclip document management software. Soon I was scanning my existing paper documents and new ones as they arrived. In January 1997 I issued my 3rd briefing note on these activities and included the following towards the end of the four page document:

Despite the close relationship between filing and information use, contemporary filing systems provide little other than title and index fields to support the knowledge acquisition, synthesis and use process. Filing systems, it seems, are there just to store items and to aid their retrieval. Unfortunately, the personal knowledge acquisition, synthesis and use process is not supported adequately outside filing systems either. Some standalone packages do exist, but they are not intended to be used in a day to day manner for personal knowledge acquired in documents, electronic files and other artefacts. In fact, even the need for such support is not widely recognised.

It is not yet clear to me what support could be most beneficial. However the clues are littered throughout the practices of knowledge workers like myself. For example, whenever I read articles, papers and reports I always mark the good bits – the nuggets of information. These are key points which I particularly want to augment the knowledge in my brain. Sometimes when I have been researching a topic I collect together all the nuggets I can find, categorise them and reorder them, and synthesise a new view of the topic in question. Unfortunately, like the magnesium nodules on the floors of the deep ocean, huge numbers of nuggets now litter my filing system unseen and inaccessible. I hope they are in my brain and that they have been used to develop my current state of thinking – but I’m not so sure that their huge potential has been fully exploited.

For my filing activities to really start adding value I need tools which can record those nuggets as I consume and index each item, and which can enable me to reorganise those nuggets, add more nuggets, and synthesise new nuggets, in the process of actively developing my ideas. Such tools would, of course, maintain the links to the original source material stored in my files. And my files would become a combination of original source material and the representations of my developing thoughts and ideas.

Now that I am confident that I have the paper scanning and electronic file indexing activities under reasonable control, it seems high time to start addressing the critical area of information use and its role in knowledge management.

These are the earliest mentions I can find of information nuggets in the PAWDOC collection – and they give no indication of where I picked up the concept. In fact I can only find one published mention of the term, and that was in the Lotus Notes-oriented magazine ‘Groupware and Communications Newsletter’ from April 1998, in which one Ted Howard-Jones gave a brief description of a service implemented by a major financial institution to capture competitive information. He wrote:

‘Called Report-It!, this service captures knowledge using a secure voice-mail system and delivers categorised information directly to the desktops of office-bound managers and competitive information professionals. These nuggets of professional information are disseminated via Notes.’

The use of the term Knowledge in this article, and in my briefing paper mentioned above, reflected the fact that, in the late 1980s, the term Knowledge Management started to become fashionable, and by the late 1990s it had become a holy grail for IT Professionals, Management Consultants, and Academics. The first mention of the term in the PAWDOC Index appears in 1990, and it occurs in a further 137 Index entries from then to 2016. The company I worked for (Computer Sciences Corporation – CSC) was a global computer services organisation with tens of thousands of employees worldwide. Its involvement in the Knowledge Management topic came from three angles: first, its clients started to ask about it and how to do it; second, its consultants and salesmen saw it as a potential source of revenue; and third, its employees and management began to think that they needed it internally to improve the effectiveness of the business. Hence, as a consultant in the UK end of the business, I was aware of or got involved in:

  • A number of initiatives to develop offerings or information for clients, including:
    • the development of a KM service by CSC Netherlands in 1990;
    • discussions with CSC UK Management Consultants who were developing KM propositions, in 1996-8;
    • the definition and design of a KM service by CSC UK personnel to address opportunities in a number of clients including ICI Paints, LUCAS Engineering, John Menzies and United Distillers, in 1996-7;
    • the publication of a CSC Research Services Foundation report on KM in 1998;
    • news of CSC’s Global Knowledge Management services, in 2001.
  • The development of such systems for clients (primarily using web pages on intranets), including:
    • a presentation to ICI Paints in 1992;
    • the development of systems for KM and for a web-based ‘Gazetteer’ of all KM, architecture and other organisational information, for the Nokia SCC – a new organisation being set up by CSC UK for Nokia, in 1997;
    • the design and population of a web-based KM system for Dupont Agriculture’s architectural components, in 1998.
  • the development and use of internal CSC solutions, including:
    • attendance at internal CSC workshops on developing an organisational learning infrastructure in 1996;
    • the development of an improvement process for CSC’s new application development organisation that was being designed and built from scratch, in 1998;
    • the design of a practical KM programme for CSC UK’s reorganised Consulting & Systems Integration unit, in 1999;
    • knowledge management work being done for BAE, in 1999;
    • an internal Community of Interest on the subject of Personal Knowledge Management, in 2001.

Of course, my rather lowly personal filing perspective had to be rapidly expanded as I entered the Knowledge Management (KM) arena, to accommodate the high-level Management Consultancy notions of ‘Intellectual Capital’ and the distinction between Knowledge and Information, as well as the practical need to derive benefits from an investment in KM by effectively sharing the knowledge that had been acquired. Indeed, one of my contributions to the internal Community of Interest mentioned in the last bullet point above seems to herald a change in my thinking. My opening sentence reads:

‘Since collaborating with everyone in this shared space I’ve had my eyes opened to the concept that KM is all about enabling people to find things and work together, as opposed to the idea that KM is all about nailing down bits of knowledge and providing it to people. I realise that there is significant crossover between the two approaches – but nevertheless giving priority to one or the other will result in significantly different activities.’

During the 1990s I learnt a great deal about what people thought Knowledge Management was, and also about the powerful potential of the new web technology to support KM. However, throughout this period I don’t recall any specific conversations or documents about dealing with underlined, sidelined, or highlighted text. Indeed, by the time I completed the draft of the paper summarising my PAWDOC findings in June 2001, it seems that my ideas on the subject had advanced no further than what is reported above. The paper was published in the journal Behaviour & Information Technology (BIT), and addressed the subject as follows in the section on ‘Areas of Investigation and Summary Findings’:

Q27. How can an electronic filing system be used to develop and use knowledge?

  • Include substantive information in the index entries, for example phone numbers, book references, and expense claim amounts.
  • Identify the nuggets of information (i.e. the valuable bits) when you first read a document.
  • Capture and structure the nuggets into the overall nugget-base at the same time as indexing the item.

Status: ideas formed
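The capture-and-structure idea in Q27 can be sketched as a simple data structure: each nugget carries its text, a reference back to the indexed source item, and one or more topic labels so the nuggets can later be regrouped for synthesis. This is purely an illustrative sketch of the concept – the class names, the `PAWDOC-…` identifiers and the topic labels below are my own assumptions, not part of any system described in the source.

```python
# Hypothetical sketch of a "nugget-base": nuggets are captured at indexing
# time, keep a link to their source item, and can be regrouped by topic.
from dataclasses import dataclass, field
from collections import defaultdict

@dataclass
class Nugget:
    text: str                      # the valuable fragment itself
    source_id: str                 # index reference of the originating document
    topics: list = field(default_factory=list)

class NuggetBase:
    def __init__(self):
        self.nuggets = []

    def capture(self, text, source_id, topics=()):
        """Record a nugget at the same time as the item is indexed."""
        nugget = Nugget(text, source_id, list(topics))
        self.nuggets.append(nugget)
        return nugget

    def by_topic(self):
        """Reorganise the nuggets by topic, ready for synthesis."""
        grouped = defaultdict(list)
        for nugget in self.nuggets:
            for topic in nugget.topics:
                grouped[topic].append(nugget)
        return grouped

base = NuggetBase()
base.capture("Filing systems aid retrieval, not use", "PAWDOC-1234", ["KM"])
base.capture("Nuggets need links to source items", "PAWDOC-5678", ["KM", "filing"])
print(sorted(base.by_topic()))  # → ['KM', 'filing']
```

The key design point, as the 1997 briefing note argued, is that the nugget never loses its link to the original source material, so a synthesised view can always be traced back to the documents it came from.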

Q28. What is the best way to capture and structure information nuggets?

Probably by using a Concept Development tool. Some initial prototyping has been done using the Visual Concepts package and the eMindMaps package.

Status: ideas formed

Q29. Is it feasible and practical to capture and structure information nuggets as well as indexing items?

Status: not started

Q30. Is it worthwhile building and developing an information nugget base?

Status: not started

The Concept Development prototyping mentioned in the answer to Q28 above probably started in April 2001 when I acquired a free copy of the eMindMaps software, and by the end of 2001 I had started making MindMaps of books on esoteric topics such as the Egyptian pyramids and the origin of Atlantis. I have no record of my detailed intentions in doing this, but I guess I wanted to experience the process of recording all the nuggets I found in a book – and then to explore what could be done to integrate and exploit the material from several different MindMaps. In all, I made MindMaps of 19 books over a two-year period; but that’s as far as I got. At the end of 2001 I started a new job in bid management and my energies were increasingly taken up with managing very intensive bids, with documenting the bid process, and with operating a Lessons Learned programme. I had no time to pursue these information nugget ideas any further.

This concludes my review of my previous activities in the use of information nuggets. The questions I posed in the 2001 BIT paper still remain largely unanswered, and the operation of the PAWDOC system has not provided any further insights on the subject since then. However, the existence of the sidelined documents, and of the 19 MindMaps, does provide an opportunity to undertake some rudimentary practical work to explore whether the information nuggets identified were memorable and of any use. Subsequent entries will outline the methods that will be used to undertake these investigations, and will report on the results.