Getting a dry grip

During a wet round of golf last Wednesday, I was reminded again of the problems of slippery wet golf club grips. In a previous wet round, I’d tried putting the club handle up inside the front of my waterproof jacket: it kept the handle dry but was fiddly. Last Wednesday, however, I tried putting the handle underneath my arm on the outside of my waterproof jacket which I found much easier, and just as effective at keeping the rain off the grip. Now, if waterproof jacket manufacturers could put some towelling or other drying device on the underside of one of the arms, which would dry already wet handles, I think we might have a solution to the problem.

Plans for an expanded ESB trial

The first iteration of this Electronic Story Board (ESB) work indicated that the concept might work for other types of items than books. So, I am planning to undertake another trial using mementos, photos, letters, household files, music, and a few more books. My intention is to explore how to include these different types in ESBs, and to see how they might inter-relate.

I shall continue to use the physical apparatus from the first ESB trial (designed to hold and display A4 laminated sheets); and to create self-contained ESBs in A4 PDF files for use within SideBooks on my iPad. These ESBs have to be self-contained, with subsets of Supporting Information included behind the first page, as I am not able to create PDFs with links to other files within SideBooks.

However, it is possible to create PDFs with links to other files within my laptop. Therefore, in addition to the set of PDFs for SideBooks, I will create an extra set with links to files of other items, for use on the laptop. For example, instead of including the first few chapters of a book within a self-contained ESB PDF, I will just include a link to the file containing the whole of the book’s contents; or, instead of including a photo within the PDF, I’ll just include a link to the relevant jpg file. The result will be smaller ESB files and, where appropriate, all the contents of each piece of Supporting Information will be accessible. This will provide a much closer simulation of the ESB system that I envisage – albeit without the immediacy of being able to manipulate a large wall display in front of you, and/or of interacting with a portable iPad. The possibilities of interacting with the ESB using voice commands will also be explored by using the Amazon Echo device in my study to call up music.

The physical apparatus being used will limit the number of ESBs to about 35 – around 5 of each type. To select the items concerned, I intend to use a random number generator to choose the first two or three mementos, and then to use any items (of any of the types being investigated) that emerge in the process of recording Associated Information. I will continue to apply this approach for each type of item until enough instances of each type have been identified. The aim is to produce both a random selection of items, and at least a few inter-relationships between the items.
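The selection approach described above can be sketched loosely in code. This is only an illustration under assumptions of my own: the `catalogue` structure (item id mapped to a type and a list of related item ids), the function name and the `per_type` limit are all hypothetical, and the sketch draws random seeds one at a time for each type rather than exactly "two or three mementos" up front.

```python
import random
from collections import deque


def select_items(catalogue, type_order, per_type=5, rng=None):
    """Pick items by random seeding, then follow their relationships.

    catalogue  -- hypothetical dict: item id -> (item type, [related item ids])
    type_order -- item types in the order they should be seeded
    per_type   -- how many items of each type are wanted
    """
    rng = rng or random.Random()
    chosen = {}   # item id -> item type, in order of selection
    counts = {}   # item type -> number chosen so far

    def take(item_id):
        item_type = catalogue[item_id][0]
        if item_id in chosen or counts.get(item_type, 0) >= per_type:
            return False
        chosen[item_id] = item_type
        counts[item_type] = counts.get(item_type, 0) + 1
        return True

    for item_type in type_order:
        pool = [i for i, (t, _) in catalogue.items() if t == item_type]
        rng.shuffle(pool)
        # Keep drawing random seeds until this type has enough instances.
        while pool and counts.get(item_type, 0) < per_type:
            seed = pool.pop()
            if not take(seed):
                continue
            # Follow relationships outwards from the seed, whatever their type.
            queue = deque([seed])
            while queue:
                for related in catalogue[queue.popleft()][1]:
                    if related in catalogue and take(related):
                        queue.append(related)
    return chosen
```

Because related items are taken as they emerge, the result should contain both randomly chosen items and at least some items that are linked to one another.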

To establish the Associated Information for each item, an initial assessment will be made and written up in free text form. When all items of a particular type have been assessed, a set of Category Prompts for that type will be derived from the set of free texts, and then applied to each item of that type (this process can be short-circuited for the Book items since Category Prompts for Books have already been identified in the first ESB trial). The Category Prompts will always include an ‘other comments’ section to ensure that all the points in the free text can be captured within one or other of the responses to the Category Prompts.

Shortly after creating this new set of ESBs, I’ll post a summary of the experience and of my initial impressions, here in this blog. A more detailed evaluation will then be conducted after the ESBs have been in place on the side of my study bookcase for about 15 months.

Time for Structure Substitution

The TED talk I’ve just listened to by Yaël Eisenstat (Dear Facebook, this is how you’re breaking democracy, Aug2020) is important because it explains how Facebook’s business model is dependent on creating constant interest and emotion in its users. This ultimately leads to the system essentially promoting extremism. As I was listening, it occurred to me that it is Facebook’s structures (the extra functionality provided around a simple messaging system – such as adding a ‘like’ button) that dictate this result. A Social Media system with a different set of structures could avoid such harmful effects. Perhaps it’s time for competitors, or an Open Source operation, to create a messaging system with structures that promote a society with people who listen to each other and work together; and to draw users away from Facebook. In the meantime, the more people who listen to Ms. Eisenstat’s talk the better.

ESB Uses and Supporting Model

The investigation described in previous entries used Books as the type of collection to explore the concept of the electronic story board. It showed that the collected material on an ESB tends to expand the reader’s attention beyond the particular item concerned; and that the set of information that the reader brings to mind when thinking about the item concerned becomes an entity in its own right when assembled together on an ESB. It also showed that a reader often explores the item itself, and some of these additional facets, when looking at an ESB.

If this is the case for books, it seems likely that the same effect can be achieved for other types of items – Mementos, Photos, Letters, Files – perhaps even Music. Mementos would seem to be particularly amenable to this kind of treatment since they are likely to evoke even more and stronger memories and feelings than Books. However, it must be remembered that, in the investigation using Books just completed, a particular set of categories was identified to help the owner bring to mind the related information; and it may well be that different sets of categories will be required for other types of items.

If ESBs can be used for different types of collections, each with a different set of category prompts, it may be worthwhile creating a model of all the key components with standardised terminology. I propose the following:

  • Collection Type (books, mementos, photos, letters, files etc.)
  • Recorder (the person who is identifying the associated information)
  • Category Prompts (the set of categories used to prompt the Recorder to identify associated information)
  • Item (the particular item within a collection that is the main focus of a particular ESB)
  • Associated Information (the information generated by a Recorder to surround a particular Item)
  • Rig (a standard layout for the way Items and their Associated Information are arranged on a collection of ESBs)
  • Supporting Information (additional material related to Items and to Associated Information, which can be accessed by links from an ESB)
  • ESB (a numbered display of an item and its Associated Information, with links to their Supporting Information)
  • Reader (the person who looks at an ESB and who may also access its links – this may be a different individual from the Recorder)
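One way to see how these components might relate is to write them down as simple data structures. The sketch below is only an illustration: the class and field names are my own assumptions based on the terms in the list, the Rig is reduced to a plain label, and the Reader is left out because a Reader uses an ESB rather than being stored in it.

```python
from dataclasses import dataclass, field


@dataclass
class SupportingInformation:
    title: str
    link: str  # file path or URL that can be reached from an ESB


@dataclass
class AssociatedInformation:
    category_prompt: str  # the prompt that led the Recorder to this information
    text: str
    supporting: list[SupportingInformation] = field(default_factory=list)


@dataclass
class Item:
    name: str
    collection_type: str  # e.g. "books", "mementos", "photos"
    supporting: list[SupportingInformation] = field(default_factory=list)


@dataclass
class ESB:
    number: int
    item: Item
    recorder: str
    rig: str = "standard"  # assumed label for the layout being used
    associated: list[AssociatedInformation] = field(default_factory=list)

    def links(self):
        """All Supporting Information reachable from this ESB."""
        found = list(self.item.supporting)
        for info in self.associated:
            found.extend(info.supporting)
        return found
```

For example, an ESB for a book would hold one Item with a collection type of "books", plus one Associated Information entry for each Category Prompt the Recorder responded to.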

The diagram below illustrates how these components might fit together.

ESB Evaluation Results

I’ve now been through all 34 ESBs and made notes of between 30 and 300 words on my interaction with each one. This entry analyses those notes and derives some implications for the design of ESBs. The analysis assessed each part of the notes text and identified specific actions or observations as an itemised list. After completing this exercise for all 34 books, the itemised lists were inspected and generic statements derived for each item. For example, specific item d) for Book No 17 was ‘Read press release of merger’, and from this the generic statement ‘Prompted me to look at the related facts in the iPad version’ was derived. The generic statements were gradually standardised as the analysis proceeded and during a subsequent refinement process. The standardised generic statements were then grouped into two main sets (ESB Composition, and Reader Behaviour) and placed into one of seven categories – Layout, Content, Impact, Information access/search prompted, Facts discovered/re-discovered, Thoughts generated, and Reflections about the book. The generic statements, and the number of books for which a particular statement occurred, are shown in the following table.

Observations relating to ESB composition

The observations relating to the design of the ESB fall into the following categories: Layout, Content, and Impact.

  • Layout: All the ESBs were assembled using a standard template in which the book’s spine was placed in the centre of the page with the front cover immediately underneath it. Related points were placed around these two elements with those more intimately related to the book being closest to them. However, some of the spines and covers were smaller than others, and this clearly made a difference. In one case the spine was not recognisable, and in another it was mistaken for the wrong book. Another observation recorded that a cover was particularly noticeable. A related observation noted that some text on one of the related facts on an ESB was too small to read.
  • Content: There were several remarks about the range of material on the ESBs such as ‘Lot in the ESB’, ‘very interesting ESB’, ‘ESB seems so complete’, and ‘the range of topics on this ESB is relatively narrow’. One observation pointed out that some information on the ESBs is more familiar than other information. In three instances the presence of photos was remarked upon in a positive way, for example ‘has photos of people I know’. A feature which was not explicitly remarked upon, but which was identified during the analysis process, was that there were four instances of two ESBs which were related in some way or other.
  • Impact: Some remarks made it clear that some ESBs distracted attention from the book and appeared to be texts in their own right. For example, ‘With the ESBs you no longer focus on the book (which is what you do with a physical bookshelf) but on all the other info around it’, and ‘ESBs have become entities in their own right and the books are fading into the background’.

Observations relating to Reader Behaviour with ESBs

The observations relating to reader interaction behaviour with ESBs fall into the following categories: Information access and search, Facts discovered/re-discovered, Thoughts generated, and Reflections about the book.

  • Information Access and Search: Quite often, a particular element on the ESB seemed to catch the eye (8 specific instances were noted). For example, ‘Noted that Forbes in 2002 voted it one of 3 most important business books in the last 20 years’, and ‘Noted that though it is the 53rd edition it was still fetching £10 on eBay’. Following an initial look at the ESB, I typically sought additional information by following the link to the book itself (15 instances noted), by following the links to the related information (another 15 instances noted – not necessarily the same 15), or by conducting a search on the net (four instances). It is striking that several of the cases in which additional information was sought involved reading texts I had written (6 instances) or reading documents related to work I had done (8 instances).
  • Facts Discovered/Re-discovered: In the course of seeking out additional information, I noted 13 instances in which I rediscovered information I’d forgotten – nine items I’d forgotten since producing the ESBs, and 4 items I’d forgotten a long time previously. For example, ‘The ESB confirmed I visited the Media Lab twice and with whom’, and ‘Noted that at least one was written while I was at NCC’. Furthermore, there were eight instances in which I discovered new facts from within the material that the ESBs were linked to, or from the searches I conducted on the net, for example, ‘Last page of the book refers to collaboration between NCC and CIMTECH which I’m not sure I heard about’, and ‘Read Bell’s Wikipedia entry and found he was involved in the design of the Vax computer which DEC gave us for Hicom’.
  • Thoughts Generated: As one would expect, reading the ESBs and the linked material prompted a whole raft of thoughts. The majority of those noted were related to something I had observed, experienced, or done (17 instances). For example, ‘Reflected that NCC’s demise is a sad story – but not, of course for the commercial operation NCC Group’, and ‘reflected on how right the Future Shock predictions were’, and ‘Read the last page of the first chapter and thought that the Harry Potter books might have been a more pleasurable experience than the films – perhaps true for many books’. In another case, the experiences generated a simmering emotion within me which was re-ignited on reading an ESB. Another set of thoughts were about people I was reminded of – six instances of these were noted.
  • Reflections about the Book: The notes made on the ESBs included several reflections on the books themselves. Many of these (9 instances) were compliments about the books, for example, ‘Hardcopy was a nice design and had lots of useful info – summed up technology and capabilities of the time’, and ‘Was reminded that this is a great read’. A further two instances recorded a desire to re-read the books concerned. There were seven observations about my relationship with the books, for example, ‘Realised I hadn’t looked at the contents of this book for a long time’, and ‘Book didn’t live up to my expectations’, and ‘Don’t think I ever read this book but watched the film’. In two cases I reflected on the physical characteristics of the book, for example, ‘Glad I kept hardcopy since tabbed books are hard to represent in scans’. Finally, for one of the ESBs, I wondered what had happened to the topics covered in the book.

Implications for ESB design

The amount of material to include on an ESB is totally dependent on the analysis of the owner’s thoughts about the book. Some books will stimulate the owner more than others. Consequently, some ESBs will inevitably contain more information than others, and be more interesting to the owner than others. However, the most significant finding from the observations about ESB composition is that the ESBs become entities in their own right, and that attention is drawn away from the books around which they are structured. Consequently, the fact that some of the book spines and covers were too small to recognise and read, becomes even more significant. No matter how much material is available to include on the ESB, the book spine and cover must be easily readable.

Two other points regarding ESB content emerged from this investigation: photos of people were highlighted a few times, so it seems worthwhile including such items where possible; and it was noted that some ESBs were related to each other. This latter point could be simply dealt with in the physical versions of the ESBs by adding a note such as ‘See also ESB #’. However, with a large electronic display it may be more useful to link directly to the related background information rather than to another main ESB – this aspect has yet to be explored.

Other than these two points, the general design of the ESBs, with the book spine and cover in the centre and other material around it, seems to work well. Of course, with a large electronic display, the constraints of an A4 page would not apply, but the principle of book in the centre with material around it would still apply. However, if the display first presented a bookshelf display of all the spines, from which a book was selected, the ESB would not need the spine and could just display the folded-out dust jacket or the front and back covers – this aspect too has yet to be explored.

Reader behaviour observations indicated that the links to extracts from the books and to related material, were well used and useful. The fact that net searches were made for additional information, and that new facts were identified in some cases, indicates that a facility to enable a reader to add additional material to a fully electronic ESB might be useful. Readers might also use such a facility to record some of the many thoughts which the observations in this investigation make clear are occurring throughout the interaction with a particular ESB.

Finding Nuggets Now and Then

The second of our investigations into the memorability and impact of information nuggets focuses on work documents that I read decades ago. I used to draw lines next to text I thought significant, so our experiment has taken a random sample of nineteen such documents, removed the marks I drew back then, and had me re-read and re-mark them.

Identifying and preparing the documents was quite a demanding process in its own right. My collaborator, Peter Tolmie, sent me document reference numbers identified by using a random number generator to select items from the 17,000+ entries in my document index. I then used Windows Explorer to pick up the first file in each reference number folder (making sure that the descriptive part of the file name was not visible) and sent the files to Peter who assessed if they possessed any marks. It took us about 5 iterations of this process, working with around 200 Ref Nos in all, to obtain nineteen suitable documents. Peter then removed all marks from the digital documents by a combination of cropping and overlaying white boxes, and sent the finished files to me.

I marked up the documents over 14th-15th August, using the same reading approach I believe I have always used since those days, i.e. not a detailed line-by-line read but more of a rapid scan through to pick up the gist of the contents and to identify key text to which I pay more attention and from which nuggets are drawn. There were about 190 pages to get through across the nineteen documents, and I used my PDF application to mark up the nuggets in a tasteful shade of green.

I had vague recollections of some of the documents, and no recollection at all of others. However, I don’t feel this particularly affected my choice of nuggets. Nor do I think that the context in which I was re-reading the documents (i.e. my current retired state as opposed to the work I was doing at the time when I originally encountered the documents) was influencing my nugget selection. I started to think that perhaps my selections would be the same as the selections made by anyone – almost as though each document possesses some elements which are inherently nuggets in their own right regardless of reader. However, I did come across a few exceptions to this: for example, I marked up one short para simply because it mentioned the name of someone I knew. Another document was very specific to the organisation I worked for and I suspect my choice of nuggets was influenced by my own particular perspectives on the topics being addressed. This observation has made me muse about the possibility that each document may possess more or less ‘inherent’ nuggets depending on its place in a spectrum of document types ranging from general purpose article to company specific work text.

The exercise has also got me thinking about the difference between summary text and nuggets. Sometimes a short para summarising some key points is worth highlighting simply because it’s a quick route into the document. However, it raises the question of whether the points being made within the summarising para are of any great value.

These are all questions which I’m anticipating we will address at some point downstream. However, the immediate priority is to analyse the results of the exercises we have conducted. For this latter marking up exercise, my new mark-ups will be compared against the original mark-ups to see if there is any similarity. We’ll be posting the results here sometime in the next 12 months.
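How the comparison will actually be done has not been decided here, but one simple possibility is to treat each set of mark-ups as a set of marked line numbers and measure the overlap. The sketch below assumes that representation; the function name and the use of the Jaccard measure (shared lines divided by all marked lines) are my own choices for illustration, not an agreed method.

```python
def markup_overlap(original, new):
    """Compare two sets of marked line numbers for one document."""
    original, new = set(original), set(new)
    both = original & new
    union = original | new
    return {
        "both": len(both),
        "only_original": len(original - new),
        "only_new": len(new - original),
        # Jaccard similarity: 1.0 means identical mark-ups, 0.0 means no overlap.
        "jaccard": len(both) / len(union) if union else 1.0,
    }
```

A document where lines 1, 2 and 3 were marked originally and lines 2, 3 and 4 were marked this time would score 0.5 on this measure.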

Four approaches and some decisions

I set out on this journey to find a way of archiving this pwofc.com web site. I’ve explored four different approaches, each with their own distinctive characteristics as summarised below:

Hosting Backup Facilities: No doubt different for each hosting operation, therefore my experience is limited to the hosting package that I use (which does not include a backup service).

  • Provides functions to download collections of files.
  • Creates a point-in-time replica of the content files of the web site.
  • The backup replica cannot be viewed on its own – it requires other facilities such as underlying database software to generate the web site.

HTTrack: This is a free software package that operates with a GNU General Public Licence. It downloads an internet web site such that it can be read locally in a browser.

  • Creates a point-in-time local replica which can be read offline with a browser, with PC response times.
  • It did not replicate the search facility available in my online web site.
  • It has complex configuration options and limited documentation and help – but these were not needed to undertake a simple mirror of the web site.
  • If the capture configuration specifies that external sites should not be captured, it provides URLs which can be clicked to go to that site. (I did attempt to capture external web pages, but the first time it failed after 2 minutes, and the second time, when I thought I had specified that it should collect just 1 specific external web page, it was collecting so much that I had to stop it running – I clearly didn’t configure it correctly.)
  • The files that HTTrack produces can be zipped up into a single file and archived.
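The last point, zipping up a mirror for archiving, can be sketched with Python's standard library. This is a minimal illustration only: the function name, the folder arguments and the dated file name are assumptions of mine, and it simply zips whatever folder HTTrack (or anything else) has written to.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_mirror(mirror_dir, archive_dir, label="mirror", today=None):
    """Zip a mirror folder into a single dated file and return its path."""
    mirror_dir = Path(mirror_dir)
    if not mirror_dir.is_dir():
        raise FileNotFoundError(f"No mirror folder at {mirror_dir}")
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # A date in the file name keeps successive point-in-time copies apart.
    stamp = (today or date.today()).isoformat()
    base = archive_dir / f"{label}-{stamp}"
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(mirror_dir)))
```

Calling it with the mirror folder and a separate archive folder would produce a file named along the lines of `mirror-YYYY-MM-DD.zip`.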

UK Web Archive (UKWA): A British Library service that stores selected web sites permanently; which captures updated versions on a yearly basis; and which makes all copies freely available on the net.

  • Requires that a web site is proposed for inclusion in the UKWA, and that approval is given.
  • Creates dated replicas of a web site which can be selected and viewed online in a browser.
  • Does not include the contents of external pages, and, in some cases, does not even provide a clickable URL of the external page.
  • In the replica site, the Home link on a page that has been linked to, doesn’t work.
  • In the replica site, two images with embedded links are not displayed and are replaced with just the text titles of the images.

Book: A copy of the web site printed on paper and bound in a book.

  • Creates a point-in-time replica of the web site on paper with some formatting adjustments to accommodate the different medium.
  • Produces a copy in a format which is very familiar to humans and which can be easily accommodated on a shelf in a house.
  • Cross referencing links work but are slower to follow than the digital equivalent.
  • May have better longevity than a digital equivalent.

As a result of these investigations, I’ve decided to:

  • Continue to use HTTrack to create mirrors of pwofc.com periodically
  • Continue to create a book of pwofc.com every five years: the next one is due in 2022 and will be called ‘Feel the Join’ (as opposed to the 2017 version which was called ‘Touch the Join’).
  • Be thankful that the British Library is archiving pwofc.com.

I think that just about wraps up all I’m prepared to do on this subject, so this journey is now complete.

The UK Web Archive

It’s been over a year since I wrote about this journey, so I’ll start this entry with a short recap of where I’m up to. Back in March 2019, I decided I would explore three different ways of archiving this pwofc website. First, by using tools provided by the company I pay to host the site; second, by using a tool called HTTrack; and third, by submitting the site for inclusion in the British Library’s UK Web Archive (UKWA).

My experience with the hosting site tools was less than satisfactory, and is documented in a post on 28April2019 entitled ‘A Backup Hosting Story’. My use of HTTrack was much more rewarding; it produced a complete backup of the whole of the site which could be navigated on my laptop screen with near instantaneous movement between pages, and which could be easily zipped into a single file for archiving. This is written up in the 30Apr2019 post titled ‘Getting an HTTrack copy’.

I’ve had to wait till now to relate my experience of submitting the site to the British Library’s UK Web Archive (UKWA), because the inclusion in the archive has been a little problematic. Here’s what happened: following a suggestion from Sara Thomson of the DPC, I filled in the form at https://beta.webarchive.org.uk/en/ukwa/info/nominate offering pwofc.com for archiving. Within about three weeks I received an email saying that the British Library would like to archive the site and requesting that I fill in the on-line licence form which I duly completed. A couple of days later, on 16th March 2019, I got an email confirming that the licence form had been submitted successfully and advising that: “Your website may not be available to view in the public archive for some time as we archive many thousands of websites and perform quality assurance checks on each instance. Due to the high number of submissions we receive, regrettably we cannot inform you when individual websites will be available to view in the archive at http://www.webarchive.org.uk/ but please do check the archive regularly as new sites are added every day.”

From then on I used the search facility at http://www.webarchive.org.uk/ every month or so to look for pwofc.com but with no success. Over a year later, on 21st April 2020, I replied to the licence confirmation email and asked if it was normal to wait for over a year for a site to be archived or if something had gone wrong. The very prompt reply said, “Unfortunately there is a delay between the time we index our content and when it can be searched through the public interface. We aim to update our indexes as soon as possible and this is an issue we are trying to fix, please bear with us as we do have limited resources. Your site has been archived and it can be accessed through this link: https://www.webarchive.org.uk/wayback/archive/*/http://www.pwofc.com/.”

Sure enough, the link took me to a calendar of archiving activity, which showed that the site had been archived three times – twice on 01July2019 (both of which seemed to be complete and to work OK); and once on 13Mar2020 (which when clicked seemed to produce an endless cycle of uploadings). I reported this back to the Archivist who scheduled some further runs, and who, after these too were unsuccessful, asked if I could supply a site map. I duly installed the Google XML Sitemaps plugin on my pwofc.com WordPress site, provided the Archivist with the site map url, http://www.pwofc.com/ofc/sitemap.xml, and the archive crawler conducted some more runs. The 13th run of 2020, on 22nd June, seemed to have been successful: the archived site looked just as it should. I then set about doing a full check of the archived site against the current live site to ensure that all the images were present, and that the links were all in place and working. The findings are listed below:

  • External links not collected: Generally speaking, the UKWA archive had not included web pages external to pwofc.com. Instead, when such a link is selected in the archive one of the following two messages is displayed: either “The url XXX could not be found in this collection” (where XXX is the URL of the external site); or “Available in Legal Deposit Library Reading Rooms only”. However, in at least two instances the link does actually open the live external web page. I don’t know what parameters produce these different results.
  • Link doesn’t work: For one particular link (with the URL ‘http://www.dpconline.org/advice/case-notes’), which appears in two separate places in the archive, there is no response at all when the link is clicked.
  • Home link doesn’t work on linked internal pages: links to internal pages within pwofc.com all work fine in the archive. However, the Home button on the pages that are displayed after selecting such links, doesn’t produce any response.
  • Image with a link on it not displayed: The pwofc.com site has two instances of an image with a link overlaid on it. The archive displays the title of the image instead of the image itself.
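A check like this starts with knowing which links on a page are internal and which are external. I did the check by hand, but the sorting step could be sketched as follows; the class and function names are hypothetical, and the sketch only looks at ordinary `<a href>` links in a page's HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def split_links(html, base_url):
    """Return (internal, external) link lists for one page of a site."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    site = urlparse(base_url).hostname
    internal, external = [], []
    for link in collector.links:
        host = urlparse(link).hostname
        if host is None:
            continue  # e.g. mailto: links, which have no host
        (internal if host == site else external).append(link)
    return internal, external
```

The external list is the one of interest here, since those are the pages that the archive generally did not collect.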

On the whole, the archive provides quite a faithful reproduction of the site. However, the fact that no information was collected for most external web pages, and no link to the external live web pages is provided either, is quite a serious shortcoming for a site like pwofc.com which has at least 26 such links. Having said that, the archive aims to collect all the web sites on its books at least once a year; and all the different versions appear to be accessible from a calendared list of copies; so, should one be able to get on the UKWA roster, this would appear to be quite an effective way to back up or archive a blog.

A Story Board a Day Evaluation

Yesterday I started an evaluation of my Electronic Story Boards. It’s been over a year and a half since I first put them together, and in that time I’ve looked at them occasionally; referred to them when I needed some specific information; and even forgotten that some information I knew I had was actually on one of them. However, I haven’t yet made a methodical assessment of how interesting, useful or effective they are. I’m going to try and do that by looking at a different story board every day starting with No 1 and working my way through to the final one – No 35.

No 1 is the Levinson book on Pragmatics, and its story board effectively summarises my involvement in the Cosmos project. After looking at it, two words immediately came to mind – Rich, and Personal. That one single page is Rich in content – every element bringing back powerful memories; and Personal – because all the content is to do with me.

Later on yesterday, I took a look at the electronic version on the iPad. It was simple to find – all 35 story boards are represented as thumbnails on a single SideBooks screen on the iPad. Selecting the Pragmatics Story Board brought up a full screen image that looked exactly like the laminated version I’d been looking at on the side of my bookcase. It was just as rich and personal, and it also enabled me to click the arrows and bring up further pages of related material. But, interestingly, those further pages didn’t add a great deal to the experience. The sense of wonder and powerful feelings that I felt were generated by the material on the main story board: the additional material didn’t really augment them. However, I thought, those supporting pages would certainly be useful if you were specifically looking for detailed information.

That was my initial experience in this 35-day evaluation. I’ll make notes as I go, and summarise my conclusions in 5 or 6 weeks’ time.

New version 2.5 of the Maintenance Plan Template

A couple of days ago I completed an experiment to use the Maintenance Plan template to undertake initial Digital Preservation work on a collection instead of using the Scoping document. It proved to be very successful. The collection is relatively small with only 840 digital files of either jpg, pdf or MS Office format, so there were few complications and I was able to proceed through the Maintenance Plan process steps without any serious holdups. The whole exercise took just over a week with the majority of the time being taken up by the inventory check of the digital files and of about 300 associated physical artefacts. I used the structure of the Maintenance Plan to document what I was doing and to keep a handle on where I was up to.

As a result of this exercise I’ve now added the following guidance to the beginning of the Maintenance Plan template, and equivalent text to the beginning of the Scoping document template:

If this is the first time that Digital Preservation work has been done on a collection

EITHER use the Scoping template to get started (best for large, complex collections)

OR use this Maintenance Plan template to get started (can be effective for smaller, simpler collections – retitle it to ‘Initial Digital Preservation work on the @@@ collection’ and ignore sections Schedule, 3, 4 and 7)

This concludes the interim testing and revision of the Maintenance Plan template. It has resulted in some substantial changes to the latest version 2.5 of the document (an equivalent version 2.5 of the Scoping document template has also been produced). The final and most substantial test of the Maintenance Plan template will take place in September 2021 when the large and complex PAWDOC collection is due to undergo its first maintenance exercise.