Full Study Report Available

Jenny Bunn’s comments on the full draft were that she believed it was now good enough to submit to a journal but that at 24,000 words, it was probably too long and I would likely be asked to cut it down substantially. With that in mind, I decided to finalise the document as a “Full Study Report” before working on a shorter version for journal submission.

The work to finalise the Full Study Report was rather more intensive than I had envisaged. It entailed a 10-day slog working through the report from the beginning: making adjustments to ensure it all hung together, correcting grammar and typos, improving and numbering the tables and figures, and making sure the references were correct. Following that, I tidied up the large results spreadsheet and in so doing found a number of minor errors which entailed further changes in the document. Eventually, the IV in PIM Full Study Report was finished about a week ago. Then I set about identifying a journal that might accept it for publication.

One of the papers referenced in the IV report was by Steve Whittaker, a UK researcher previously at the University of Sheffield, now at the University of California and very prominent in the field of Personal Information Management (PIM). I was first given his name by Andrew Cox at the University of Sheffield’s Information School, so I emailed Steve Whittaker a copy of the abstract, with a copy to Andrew Cox by way of introduction, and asked him which journals he thought would be most appropriate to submit such a paper to. His response was that HCI journals were most appropriate for PIM papers but that a) he was not sure that this was a PIM paper as such, as opposed to an archiving/Library Science paper, and b) he thought it would be a stretch for HCI readers to have a one person case study. So he advised me to try the non-HCI journals.

This exchange has begun to open my eyes to an interesting issue which I think I keep on coming up against: although my collections of job documents, mementos, photos etc. are all highly personal and therefore definitely in the PIM domain, the indexing and management techniques I use to control them are far more structured and organised than is usually encountered in PIM; and these characteristics make people think they are not PIM collections but fall into the categories of archiving, records management or librarianship. This is an interesting insight and one which I think I shall explore further in the coming months.

Anyway, I decided to take Steve Whittaker’s advice and focus on non-HCI journals. I struck lucky with the first one I approached – JASIST – the Journal of the Association for Information Science and Technology. I sent the abstract and full report in an email to the Editor and asked if she thought it would be appropriate to submit a cut down version to the journal. Her response was that she thought it would be, so I now have a major précis job to cut the full report down to a third of its length, about 8000 words.

I must say, I have been impressed by the rapid responses I have received this week from both people I have emailed. Despite not knowing me they both responded – and the reply came within just a few hours in both cases. I’m quite sure that, like most professionals these days, their email load will be very high, so it is a testimony to their professionalism and to the power of email; and I count myself lucky that they were prepared to spend their time on my missives out of the blue.

Studies Finished! Paper Complete!

This morning I completed the IV in PIM paper – and I’m very pleased to be able to move on to something else. It’s consumed me over this last month – particularly the literature review which required much painstaking reading and analysis. Academics certainly earn their money when they write papers.

The final stage in the IV study involved analysing the reasons why 109 items in a collection of 400 mementos were retained. This exercise identified a need for the following changes to the Updated PIM Retention Criteria that emerged from the second study:

  • “Copying explicitly prevented by copyright” will be removed.
  • “Items relating to the legality of an institution” will be removed.
  • “Executive Policy document” will be removed.
  • “Other – specify reason” will be added.
  • “Does not belong to the Owner” will be added.
  • “For easy access and showing to others” will be added.
  • “Items that the Owner wants to keep as mementos of his/her life” will be added.
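The bookkeeping behind these changes can be checked with a small sketch. It is illustrative only and not part of the study: it takes the 17 Draft PIMRC from the first study, applies the two rewordings from the second study and the removals and additions listed above, and confirms that 18 criteria result. The variable and function names are hypothetical, and the wording in the paper itself may differ from the draft wording used here.

```python
# Draft PIM Retention Criteria from the first study, in their original order.
draft_pimrc = [
    "Digitisation to be performed later",
    "Items to be put to work in their original form",
    "Items for which only the originals confirm their validity",
    "Trophy items to be collected and enjoyed in the future",
    "Large documents which have particular qualities of impact and integrity",
    "Small publications of around A4 size or less with fixed spine bindings and/or special papers",
    "Publications which mention friends, colleagues or the owner",
    "Items published by an organisation or programme that the owner works/worked for",
    "Items that the owner has written, produced, assembled or made a significant contribution to",
    "Physical features which make it difficult to digitise the item and/or to reconstruct it from the digital copy",
    "Items illustrating a physical form due to a development in technology",
    "Age that provides a quality of uniqueness",
    "Copying explicitly prevented by copyright",
    "Aesthetic or artistic quality",
    "For use in exhibits",
    "Item relating to the legality of an institution",
    "Executive Policy document",
]

# Second study: two criteria were reworded (old wording -> new wording).
reworded = {
    "Small publications of around A4 size or less with fixed spine bindings and/or special papers":
        "Publications with fixed spine bindings and/or special papers",
    "Aesthetic or artistic quality":
        "Aesthetic or artistic quality including photos",
}

# Final stage: three criteria removed (draft wording used as the key).
removed = {
    "Copying explicitly prevented by copyright",
    "Item relating to the legality of an institution",
    "Executive Policy document",
}

# Final stage: four criteria added.
added = [
    "Other – specify reason",
    "Does not belong to the Owner",
    "For easy access and showing to others",
    "Items that the Owner wants to keep as mementos of his/her life",
]


def apply_changes(criteria, reworded, removed, added):
    """Drop removed criteria, apply rewordings, then append the new criteria."""
    kept = [reworded.get(c, c) for c in criteria if c not in removed]
    return kept + added


final_pimrc = apply_changes(draft_pimrc, reworded, removed, added)
print(len(draft_pimrc), "draft criteria ->", len(final_pimrc), "criteria")
```

Seventeen draft criteria less three removals plus four additions gives 18, which matches the number of PIMRCs quoted in the paper's conclusions.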

The paper comes to three main conclusions:

  • The NARS Intrinsic Value characteristics provide a useful starting point for considering the question of what originals to retain in PIM collections; but only seven of the nine IV characteristics are applicable within the PIM domain and some of those seven require adjustment to their scope and naming. Furthermore, they need to be accompanied by 12 additional criteria to make a comprehensive set of PIM Retention Criteria (PIMRC).
  • The set of 18 PIMRCs that emerged from this paper is unlikely to be definitive or complete, and consequently an “Other” criterion has been included as one of the 18. Nor are the PIMRCs mutually exclusive. The studies reported in the paper indicate a high occurrence of two or more criteria applying to any one item.
  • It is thought unlikely that individual Owners of PIM collections will want to apply a checklist of PIMRCs methodically; they are far more likely to use such information as background guidance. Owners who inherit or are given collections may be more inclined to use the PIMRCs, particularly for their initial assessment of a collection. It is believed that knowledge about PIMRCs will assist the general ongoing research into the PIM domain.

Jenny Bunn of UCL’s Department of Information Studies has been very helpful in commenting on elements of the study and on the draft paper as it has developed; and I now await her comments on the results of the final stage, and on the Discussion and Conclusions sections. Then will come the question of whether the paper is worth putting forward to a journal and if so, which one. So this journey is not quite complete yet.

The Customer-Developer Divide

I’ve had a frustrating time over the last few months because I lost access to the Behaviour & Information Technology journal (BIT). I have two equivalent email addresses (@btinternet.com and @btopenworld.com) and the professional body I pay my online access subscription to passed the other email address over to the Publishers. The result – I lost access to the journal – but confusingly some articles are free to all so it didn’t seem like I had lost total access. It’s all sorted now – but I’m still finding it torturously time consuming to manage the steady flow of new versions of articles as they pass through the review and refinement process. So much so, that I’ve decided to stop making temporary entries in my filing index. I shall just mark articles I’m interested in as ‘favourites’ in the Taylor & Francis (T&F) mobile app; and only make an entry in my index when the final published journal comes through. I guess only time will tell whether this makes things easier or not.

I’ve made a number of attempts to pass on my suggestion of providing functionality for users to manage articles as they flow through the process, but with no success to date. It seems that there is a huge divide between T&F customers and the developers of their mobile app. However, during my recent access difficulties I established contact with a second level support person called Vamshi Tulasi who said he would pass my suggestions on to the appropriate people. So, today I have sent Vamshi the following points and wait with interest to see if I do eventually manage to have a dialogue with the developers.

1. Management of articles that alerts are provided for. A single article appears several times – once for each stage in the process. However, I only want to have to deal with it once. Having decided I’m not interested in an article I don’t want to waste my time rediscovering that fact each time I am alerted to it as it goes through each separate stage. Likewise with articles I HAVE decided I’m interested in: I don’t want to have to keep looking at them at every stage – I want to look at the articles I AM interested in when they are actually published. However, there is no functionality to mark articles so every article has to be checked every time. I think this is probably an issue which most users are experiencing. Did you consider this when designing the application? Do you have any plans for individuals to mark items as they come through so that they know which articles they are or are not interested in?

2. Search function for journal list: When setting up the mobile app it is necessary to navigate to the Journal you have subscribed to. The journal “Behaviour & Information Technology” (BIT) is in a most illogical position and it took me about 15 minutes to find it – very annoying! Try finding it yourself before looking at the end of this message to see where it actually is. Unfortunately there is no search facility for journals – only for articles – so there is no option but to keep trying various categories until you find it. Are you planning to introduce a search function for the journal list? [answer to location of BIT: computer science – computer engineering]

Interestingly, I just got a very quick response from Vamshi saying “We are coming up with the new version LFM 3.1 , which will be going to release soon. In this version hope your all issues will be resolved.”. We shall see.

Second Study Done

The second IV in PIM study, involving the scanning of 745 items (13,500 pages), is complete. It largely confirmed the Draft PIM Retention Criteria identified in the first study with only two small changes in the wording being identified:

6. “Small publications of around A4 size or less with fixed spine bindings and/or special papers”  to be changed to “Publications with fixed spine bindings and/or special papers”

14. “Aesthetic or artistic quality” to be changed to “Aesthetic or artistic quality including photos”

This was a major milestone in the Document collection since it completed the scanning of some 32 archive boxes which was started in 1996 – a very long and arduous process. The collection now consists of an index and around 200,000 electronic files all on my laptop; and about 440 original documents (14,000 pages) retained in three filing boxes in my study. In principle I can get to any and every item in the collection very rapidly from my study desk.

The final stage of the IV in PIM study will investigate mementos – very different material from the office documents that have been investigated so far – and it is anticipated that this will provide a demanding test of the PIM Retention Criteria that have been identified to date.

First study done: Draft PIMRC here!

The first IV in PIM study, of 344 retained items in a large collection of Job documents, is complete and it produced some unexpected results. It established that IV characteristics were only applicable to 71 of the items, and analysis of the reasons for retaining the other items produced a set of Draft PIM Retention Criteria (Draft PIMRC) that was rather different to that which had been anticipated. Here they are:

  1. Digitisation to be performed later
  2. Items to be put to work in their original form
  3. Items for which only the originals confirm their validity
  4. Trophy items to be collected and enjoyed in the future.
  5. Large documents which have particular qualities of impact and integrity.
  6. Small publications of around A4 size or less with fixed spine bindings and/or special papers
  7. Publications which mention friends, colleagues or the owner
  8. Items published by an organisation or programme that the owner works/worked for
  9. Items that the owner has written, produced, assembled or made a significant contribution to
  10. Physical features which make it difficult to digitise the item and/or to reconstruct it from the digital copy
  11. Items illustrating a physical form due to a development in technology
  12. Age that provides a quality of uniqueness
  13. Copying explicitly prevented by copyright
  14. Aesthetic or artistic quality
  15. For use in exhibits
  16. Item relating to the legality of an institution
  17. Executive Policy document

The detailed results are included in IV in PIM, 26Jan2014 – v0.4. I’ll now get started on the second study which will use the Draft PIMRC to assist in making retain/destroy decisions for 4 boxes of Job documents that have yet to be digitised. I anticipate that the results from that work will probably be available in a couple of months.

The first of three studies is underway

Jenny Bunn of UCL provided excellent commentary on the draft Introduction and Methodology sections of the IV in PIM paper, and consequently I have completely revised the methodology [IV in PIM, 20Jan2014 – v0.3]. In summary, the approach I will take is to conduct three studies of retain/destroy decision making using two separate collections – a Job Documents collection and a collection of Mementos. For the first study (using the Job Documents collection), a previous categorisation of ‘Reasons for not destroying the paper’ (RFND criteria), made before I was aware of the NARS Intrinsic Value report, will be compared with the NARS Intrinsic Value (IV) characteristics, and a draft set of PIM Retention Criteria (PIMRC) will be derived from the results. The second study will try out and refine the draft PIMRC in the course of digitising those items in the Job Documents collection that have not already been digitised; and the third study will try out and refine the draft PIMRC by reviewing the retain/destroy decisions that have already been made when digitising the Mementos collection. The knowledge gained in each of the second and third studies will then be combined to produce a final set of PIMRC.

I have already started work on the first study and aim to have completed it, and to have derived the draft PIMRC, by the end of this week.

Ideas for exploring Intrinsic Value

In December 2013, Jenny Bunn of UCL’s Department of Information Studies alerted me to work in the Archival domain on the Intrinsic Value of documents. In particular, the US National Archives and Records Service (NARS) produced an influential report in 1980 titled “Intrinsic Value in Archival Material” which defined nine criteria for retaining an item in its original form after it had been digitised. It immediately occurred to me that my Document collection and Memento collection could be used to establish if the NARS Intrinsic Value (IV) criteria are applicable to Personal Information Management (PIM) practices. So, I defined three research studies in the form of Introduction and Methodology sections of a journal paper and sent them to three people for comment: Ann O’Brien (University of Loughborough), Jenny Bunn (University College London), and William Jones (University of Washington). William Jones has already responded and I have taken his comments into account in the latest draft of the paper – IV in PIM, 05Jan2014 – v0.2. Once I get the remaining two responses and take any feedback into account, I shall start work on the studies.

Intrinsic Value of Artefacts

One of the people Neil Beagrie suggested I get in touch with was Elizabeth Shepherd, an Archivist and Records Management specialist in UCL’s Department of Information Studies. I duly emailed her early in December 2013 and she asked Jenny Bunn, a Lecturer in the Department who is initiating a new teaching module on Digital Curation in January 2014, to contact me. Since then, Jenny and I have had a number of exchanges and we have agreed that there is potential for her students to make use of my document collection as a resource – though there is too little time to sandwich it into the early 2014 syllabus. Instead, I may go down to speak with her students in February or March.

Jenny also alerted me to a report on the Intrinsic Value of documents produced by the US National Archives and Records Service (NARS) in 1980. This is highly relevant to the work I am doing on the artefact in the digital age. So much so, that it has inspired me to define a clear set of research activities to establish if the NARS Intrinsic Value characteristics are relevant in Personal Information Management practices. Since this is now a distinct piece of work with clear objectives I shall continue to report on it under the separate heading of Digital Age Artefacts.

New Scanner – Canon DR-2020U

Last Friday my new scanner – a Canon DR-2020U ADF + Flatbed – was delivered, and I have spent the last few days trying to integrate it into my system and exploring its functions. I ordered it through Tradescanners who have an excellent web site enabling comparisons to be made between a wide range of products. The scanner arrived within 24 hours of my placing the order, which was excellent. Unfortunately, I’ve experienced two different sets of problems – first my BT Digital Vault software seems to interrupt the scanner software significantly (a problem widely reported on the net – the underlying software, FSHosting, just hogs the CPU); and secondly my existing scanner and Document Management software, which could use an ISIS driver but doesn’t because I haven’t got one for it, seems to interfere with the ISIS driver that came with the Canon scanner. Other than that, the new scanner seems to do everything it’s supposed to – full duplex scanning of both sides of the paper as it goes through, paper size detection, blank page detection and elimination, and saving to PDF, JPG or TIF as required. I’m pleased – but am having to work through the problems.

The field has exploded in the last 15 years

In an effort to understand what is going on in the world of Personal Electronic Filing, a few weeks ago I emailed some people I had identified from papers and web searches. The results have been very rewarding.

It is now clear to me that what was a niche area in the 1990s has expanded hugely to become a topic in its own right with a large body of literature and a worldwide community of interest. The rise of personal computing, email, social media and the mobile phone has effectively made most individuals – whether they know it or not – personal information managers; personal information is now considered to extend to photos, calendar entries, text messages, social media material etc.;  and the ubiquity of electronic media has necessitated the development of the field of data forensics to capture and identify evidence. The field of Data Preservation is of particular interest to Libraries and Museums which are grappling with the practical problems of curating collections which include digital material. There appear to be many initiatives underway in all these areas, of which various EEC-funded projects, the UK Data Preservation Coalition, the US Library of Congress guidance notes, and William Jones’ Personal Information Management workshops are probably just the tip of the iceberg. I’m grateful to Neil Beagrie for linking me into much of this material.

With this new awareness I have begun to try and understand the role that my personal collection might have. In particular, I’m wondering if it could become a Test Set for exploring Data Preservation issues rather than the original aim of being a Test Set for Personal Indexing and Retrieval (an objective which seems to have become defunct since the rise of the Search Engine). This could be a useful focal point in my continuing search to find people to collaborate with.