Repository sought

In the last few months I’ve been making good progress on figuring out how to undertake a Digital Preservation project. Since I’m getting close to being ready to undertake digital preservation work on the PAW/DOC collection, I decided to make an attempt to find a home for the collection before I start. That way, I can tailor the digital preservation work to the requirements of the receiving repository – should I be lucky enough to find anyone who is interested. Anyway, I now have a short two-pager to send to repositories which might be interested. This is the second version. Dave Thompson of the Wellcome Foundation (who I met on the UCL online Digital Curation course) was kind enough to comment on the first version and his observations resulted in a substantial rewrite. I’ve sent it to 6 organisations – Loughborough University’s Centre for Information Management, Manchester University’s Computer Science Dept, City University’s Cass Business School, UCL’s Dept of Information Studies, the National Archives, and The Science Museum Wroughton Library and Archives. If I get a positive response from any of these, all well and good. If I do not, I shall proceed with the Digital Preservation work as planned.

Some three years ago I made a list of activities I wanted to undertake with the PAW/DOC collection, and this seems a good moment to summarise where I’m up to – the activities and their status are described below:

  • Scan the remaining 4 boxes of paper. Take the opportunity to explore scanning in colour and using PDF. Possibly also using OCR – though this is of much lower priority. DONE (but not OCRd)
  • Write a paper on “The paper artefact in the digital age” using an analysis of the contents of PAW/DOC as the basis for the paper. DONE
  • Explore the issues of longevity and survivability of file formats and of digital indexing and file management systems, using PAWDOC as the basis for the work. This could also include moving the material from FISH and even Filemaker. STILL TO BE DONE
  • Revisit all the requirements listed in my 2001 BIT paper to identify current status and opportunities for further work. STILL TO BE DONE
  • Scan all remaining PAW/DOC paper i.e. all those items in the three archive boxes (most of which have been identified as artefacts to be retained in their physical form). STILL TO BE DONE – but next on list – I’m trying to find a binding machine to be able to sheet feed the documents with comb binding
  • Check that all index entries are valid (i.e. not blank and with an appropriate Movement Field entry) and have an associated populated FISH entry. STILL TO BE DONE
  • Write up a guide to the material and to the technology supporting it. STILL TO BE DONE
  • Hand over PAW/DOC and its supporting technology to the new owner and provide training for the people who will be managing it going forward. STILL TO BE DONE

An antidote for a brainwash

These two statements in William Keegan’s article in today’s Observer have prompted me to flesh out this idea: “it should never be forgotten that the coalition inherited a burgeoning economic recovery in the summer of 2010 and proceeded to bring it to a halt with its misguided programme of austerity” and “I think I heard the prime minister come out yet again on the wireless the other day with that pre-Keynesian howler – much in vogue with the German economic establishment – that when the private sector cuts back, it makes sense for the public sector to cut back too. On the contrary, it does not make sense, and was the reverse of what was needed after the depression which followed the financial crash of 2008-09”. I’m fed up with politicians, clerics, lobbyists and other people with an axe to grind, brainwashing us with stuff that I suspect they do not fully understand, or that they are twisting to their own ends, or, worse still, that they are simply lying about. I’d like to see these points tested in the media by sending them to the Telegraph, Times/Sunday Times, Financial Times, Guardian/Observer, and The Independent, and asking them to research the following aspects: a) was the quote an accurate record of the statement? b) was any additional meaning imparted by the context in which the statement was made? c) what evidence did the person making the statement base it on? d) what are the findings of the research that has been done on the subject? e) what experiments/empirical tests have been performed to validate each main set of findings? f) what is the broad consensus of the professionals in the field concerned regarding each of the main sets of findings?

An Update – This Work on Hold

This work has lain dormant for a little while now – but only because I’ve been focusing on other supporting activities. In particular, I’m exploring the field of Digital Preservation with the aim of undertaking work to ensure that the contents of my work document collection are long-lasting. In the process of doing that I’m also trying to publicise the existence of the collection in order to find someone who might be interested in giving it a long-term home. So, I don’t intend to do any further work on Personal Document Management until I’ve finished the Digital Preservation investigation.

For the record, I did actually go and talk to Jenny Bunn’s Digital Curation students at UCL on 27Feb2014. I talked for about 20 minutes, provided a handout (the odd layout is because it is designed to be printed double sided), and there was some Q&A at the end. I also had an interesting conversation afterwards with Jenny. However, it prompted no further interest in the work document collection.

Finally, a word about Ann O’Brien of Loughborough University who I started collaborating with on this topic in early 2013. The last contact I had with her was in September of that year, and I had heard nothing more from her or about her until I read in the November 2014 issue of the Loughborough University Alumni magazine that she had died in May 2014. Tom Jackson of Loughborough’s Centre for Information Management, where she worked, confirmed in an email that she had died of a heart attack and that her death had come as a huge shock. I’d like to record here that, in our brief collaboration, Ann was very helpful to me and gave me a number of substantial steers which moved the work I was doing forward both in terms of content and contacts.

Contents App for Different Types of Doc

It’s good to have a contents template when creating documents of a particular type – an audit report, an IT Architecture document, a Project Plan, or a Preservation Plan, for example – there must be hundreds of different types in use today. It would be very useful to have an iPad app which provides all the standard contents for different types of documents. In some instances, a few of the main headings may not be relevant, or you don’t want to go down to such great levels of detail. So the app could help you choose which subset of all the possibilities would be most useful for a particular set of circumstances. Content components could be suggested by users and moderated by the owner of the app.

Intrinsic Value of Artefacts

One of the people Neil Beagrie suggested I get in touch with was Elizabeth Shepherd, an Archivist and Records Management specialist in UCL’s Department of Information Studies. I duly emailed her early in Dec2013 and she asked Jenny Bunn, a Lecturer in the Department who is initiating a new teaching module on Digital Curation in January 2014, to contact me. Since then, Jenny and I have had a number of exchanges and we have agreed that there is potential for her students to make use of my document collection as a resource – though there is too little time to sandwich it into the early 2014 syllabus. Instead, I may go down to speak with her students in February or March.

Jenny also alerted me to a report on the Intrinsic Value of documents produced by the US National Archives and Records Service (NARS) in 1980. This is highly relevant to the work I am doing on the artefact in the digital age. So much so, that it has inspired me to define a clear set of research activities to establish if the NARS Intrinsic Value characteristics are relevant in Personal Information Management practices. Since this is now a distinct piece of work with clear objectives I shall continue to report on it under the separate heading of Digital Age Artefacts.

New Scanner – Canon DR-2020U

Last Friday my new scanner – a Canon DR-2020U ADF + Flatbed – was delivered, and I have spent the last few days trying to integrate it into my system and exploring its functions. I ordered it through Tradescanners who have an excellent web site enabling comparisons to be made between a wide range of products. The scanner arrived within 24 hours of me placing the order, which was excellent. Unfortunately, I’ve experienced two different sets of problems – first, my BT Digital Vault software seems to interrupt the scanner software significantly (a problem widely reported on the net – the underlying software, FSHosting, just hogs the CPU); and secondly, my existing scanner and Document Management software, which could use an ISIS driver but doesn’t because I haven’t got one for it, seems to interfere with the ISIS driver that came with the Canon scanner. Other than that, the new scanner seems to do everything it’s supposed to – full duplex scanning of both sides of the paper as it goes through, paper size detection, blank page detection and elimination, and saving to PDF, JPG or TIF as required. I’m pleased – but am having to work through the problems.

The field has exploded in the last 15 years

In an effort to understand what is going on in the world of Personal Electronic Filing, a few weeks ago I emailed some people I had identified from papers and web searches. The results have been very rewarding.

It is now clear to me that what was a niche area in the 1990s has expanded hugely to become a topic in its own right with a large body of literature and a worldwide community of interest. The rise of personal computing, email, social media and the mobile phone has effectively made most individuals – whether they know it or not – personal information managers; personal information is now considered to extend to photos, calendar entries, text messages, social media material etc.; and the ubiquity of electronic media has necessitated the development of the field of data forensics to capture and identify evidence. The field of Data Preservation is of particular interest to Libraries and Museums which are grappling with the practical problems of curating collections which include digital material. There appear to be many initiatives underway in all these areas, of which various EEC-funded projects, the UK Data Preservation Coalition, the US Library of Congress guidance notes, and William Jones’ Personal Information Management workshops are probably just the tip of the iceberg. I’m grateful to Neil Beagrie for linking me into much of this material.

With this new awareness I have begun to try and understand the role that my personal collection might have. In particular, I’m wondering if it could become a Test Set for exploring Data Preservation issues rather than the original aim of being a Test Set for Personal Indexing and Retrieval (an objective which seems to have become defunct since the rise of the Search Engine). This could be a useful focal point in my continuing search to find people to collaborate with.

A Second Column for Facets

I’ve been giving the Excel Index that I developed last year a lot of use – mainly for the Memento Management activity – and I’ve decided that having just one column for Facet is not enough. Inevitably there are cases where you want to specify two facets (for example, Loughborough and Rugby) and this is easily done by just putting one after the other, with a comma between them, in the single Excel cell. The trouble is that Excel’s filter facility lists things alphabetically so, in the example above, if you look for Loughborough the entry “Loughborough, Rugby” appears in the appropriate position. However, if you are looking up “Rugby” the “Loughborough, Rugby” entry does not appear in that position, so you may miss that particular item related to Rugby.

I’ve addressed the problem by including a second column for Facet, and by including both entries in both columns but with one in reverse order to the other, for example, in Column 1 “Loughborough, Rugby” and in column 2 “Rugby, Loughborough”. This ensures that, provided a search is done in both columns for a particular facet, you will find every instance of that facet and all secondary facets used with the facet being searched for.
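The two-column approach can be illustrated outside Excel with a small Python sketch. This is only an illustration under stated assumptions, not the actual index: the item names, the `facet1`/`facet2` keys and the helper functions are all hypothetical, and a row with a single facet is assumed to carry the same value in both columns.

```python
def reversed_facets(cell: str) -> str:
    """Return the comma-separated facets of a cell in reverse order."""
    parts = [part.strip() for part in cell.split(",") if part.strip()]
    return ", ".join(reversed(parts))


def add_second_facet_column(rows):
    """Add a hypothetical 'facet2' column holding the reversed facets."""
    return [{**row, "facet2": reversed_facets(row["facet1"])} for row in rows]


def find_by_facet(rows, facet):
    """Return items whose first OR second facet column starts with the facet.

    Matching on the start of the cell mimics looking for the facet at its
    alphabetical position in a filter list, done once per column.
    """
    wanted = facet.strip().lower()
    return [
        row["item"]
        for row in rows
        if row["facet1"].lower().startswith(wanted)
        or row["facet2"].lower().startswith(wanted)
    ]


# Hypothetical index entries, for illustration only.
index = add_second_facet_column([
    {"item": "Item A", "facet1": "Loughborough, Rugby"},
    {"item": "Item B", "facet1": "Rugby"},
    {"item": "Item C", "facet1": "Loughborough"},
])

print(find_by_facet(index, "Rugby"))         # ['Item A', 'Item B']
print(find_by_facet(index, "Loughborough"))  # ['Item A', 'Item C']
```

With only the first column, a lookup on “Rugby” would miss Item A; checking both columns finds it. Note that reversing only guarantees this for entries with one or two facets: with three or more, a middle facet would never lead either column.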

Replica Computer Collecting

As computer technology powers ahead, people look back with nostalgia on the earlier models that they used, so the time is ripe for the production of small scale replicas for collection and display. Of course, being computers, they might do a little more than just look good. Building in a chip holding information about the model, and a wifi capability, would enable each replica to display its details on a local screen; and, depending on the particular selection of models that you have collected together, particular functions and processes could be programmed to occur. For example, the display of footage of an early computer guru speaking, the ability to play an early game, or the ability to undertake the next stage of a complex puzzle.

Reasons for Keeping Hardcopy

I’ve been doing some preliminary practical work for the study of ‘the artefact in the digital age’ that I’m doing with Ann O’Brien. To gain an idea of the range of reasons for keeping hardcopy rather than just having a digitised version, I’ve reviewed the 357 items that I have chosen to keep rather than scan and throw away. Nineteen categories emerged. Ann and I will use this initial insight to plan in detail the practical work I am going to do in scanning four boxes of material that have not yet been scanned.