The field has exploded in the last 15 years

In an effort to understand what is going on in the world of Personal Electronic Filing, a few weeks ago I emailed some people I had identified from papers and web searches. The results have been very rewarding.

It is now clear to me that what was a niche area in the 1990s has expanded hugely to become a topic in its own right with a large body of literature and a worldwide community of interest. The rise of personal computing, email, social media and the mobile phone has effectively made most individuals – whether they know it or not – personal information managers; personal information is now considered to extend to photos, calendar entries, text messages, social media material etc.; and the ubiquity of electronic media has necessitated the development of the field of data forensics to capture and identify evidence. The field of Data Preservation is of particular interest to Libraries and Museums, which are grappling with the practical problems of curating collections that include digital material. There appear to be many initiatives underway in all these areas, of which various EEC-funded projects, the UK Digital Preservation Coalition, the US Library of Congress guidance notes, and William Jones’ Personal Information Management workshops are probably just the tip of the iceberg. I’m grateful to Neil Beagrie for linking me into much of this material.

With this new awareness I have begun to try to understand the role that my personal collection might have. In particular, I’m wondering if it could become a Test Set for exploring Data Preservation issues rather than serving its original aim of being a Test Set for Personal Indexing and Retrieval (an objective which seems to have become defunct since the rise of the Search Engine). This could be a useful focal point in my continuing search for people to collaborate with.

 

A Second Column for Facets

I’ve been giving the Excel Index that I developed last year a lot of use – mainly for the Memento Management activity – and I’ve decided that having just one column for Facet is not enough. Inevitably there are cases where you want to specify two facets (for example, Loughborough and Rugby), and this is easily done by putting one after the other, separated by a comma, in the single Excel cell. The trouble is that Excel’s filter facility lists cell contents alphabetically. So, in the example above, if you look for Loughborough, the entry “Loughborough, Rugby” appears among the L entries where you would expect it. However, if you look for Rugby, the same entry does not appear among the R entries, so you may miss that particular item related to Rugby.

I’ve addressed the problem by including a second column for Facet, and by including both entries in both columns but with one in reverse order to the other, for example, in Column 1 “Loughborough, Rugby” and in column 2 “Rugby, Loughborough”. This ensures that, provided a search is done in both columns for a particular facet, you will find every instance of that facet and all secondary facets used with the facet being searched for.
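The two-column arrangement can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the Excel Index itself: the function names, the row layout (`facet1`, `facet2`) and the item titles are all made up for the example.

```python
def facet_columns(facets):
    """Return the two facet cell values for one index row.

    Column 1 holds the facets in the order given; column 2 holds them
    reversed, so that each of two facets leads one of the columns.
    """
    first = ", ".join(facets)
    second = ", ".join(reversed(facets))
    return first, second


def rows_starting_with(rows, facet):
    """Find rows where either facet column begins with the facet sought."""
    return [
        row for row in rows
        if row["facet1"].startswith(facet) or row["facet2"].startswith(facet)
    ]


# Hypothetical index rows, invented purely for illustration.
rows = []
for title, facets in [
    ("Match programme", ["Loughborough", "Rugby"]),
    ("Degree certificate", ["Loughborough"]),
    ("Club tie", ["Rugby"]),
]:
    col1, col2 = facet_columns(facets)
    rows.append({"title": title, "facet1": col1, "facet2": col2})

rugby_titles = [row["title"] for row in rows_starting_with(rows, "Rugby")]
print(rugby_titles)
```

Looking in both columns finds the “Loughborough, Rugby” item under Rugby as well as under Loughborough. Note that a simple starts-with test would also match a longer facet beginning with the same word, which mirrors how such an entry would sit next to it in an alphabetical filter list.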

Replica Computer Collecting

As computer technology powers ahead, people look back with nostalgia on the earlier models that they used, so the time is ripe for the production of small scale replicas for collection and display. Of course, being computers, they might do a little more than just look good. Building in a chip holding information about the model, and a Wi-Fi capability, would enable a replica to display its details on a local screen; and, depending on the particular selection of models that you have collected together, particular functions and processes could be programmed to occur. For example, the display of footage of an early computer guru speaking, the ability to play an early game, or the ability to undertake the next stage of a complex puzzle.

Reasons for Keeping Hardcopy

I’ve been doing some preliminary practical work for the study of ‘the artefact in the digital age’ that I’m doing with Ann O’Brien. To gain an idea of the range of reasons for keeping hardcopy rather than just having a digitised version, I’ve reviewed the 357 items that I have chosen to keep rather than scan and throw away. Nineteen categories emerged. Ann and I will use this initial insight to plan in detail the practical work I am going to do in scanning four boxes of material that have not yet been scanned.

Underwater Treasure Hunts

When I swim I enjoy going underwater and sometimes picking things up from the bottom. The other day it occurred to me that, by integrating some electronics and chips into the tiles on the bottom of a pool, it might be possible to make them light up and to switch them off with a touch. This ability would enable a challenging underwater treasure hunt to be constructed. Tiles could be lit singly such that a new one is lit when the current one is touched. Alternatively, the lights could show numbers such that you have to switch them off in the correct order. Software would enable courses with different tile placements but the same route length to be constructed, so that individuals could compete to get the fastest time.

Memento metamorphosis

On our recent weekend in Zurich, I saw teacups doubling as lampshades and open books hanging as mobiles from string threaded through holes bored through the pages. Both made me think how everyday objects can be given other contexts and uses which, conversely, make their own essence more noticeable. I wondered if this might be a way of bringing some of those buried mementos and artefacts to life. Would it work, for example, to create a mobile with some of one’s keepsakes?

We need an ORI mark system

As I contemplated the display of digital versions of posters and pictures, my mind wandered to other aspects of digitisation such as the creation of objects with 3D printers. The slippage between real and digital will become increasingly easy and prevalent. Consequently the world of the future will need ways to easily, quickly and reliably differentiate between what is original and what is copy – an ORI mark system. Detailed forensic investigation will always be an option to get an absolute answer, but people will need a more immediate mechanism to help them navigate an increasingly mixed-reality world.

Storing Large Movie Files

In the last week or so I’ve been exploring the use of video editing and conversion tools – primarily to deal with the conversion and storage of personal cine film. However, since I also have about 15 pieces of video indexed in my document management system, I decided to try to use the same tools on them. The video is all on DVDs for two main reasons. First, most of it is in DVD video format (multiple files with extensions of either VOB, IFO or BUP) which is not conducive to storing in a document management system except in a zip file; and second, up to now I haven’t had sufficient storage in my PC to cope with the sizeable volumes of video files. Both of these constraints can now be overcome. I have the tools to convert multiple DVD video files into a single MP4 file; and my current laptop has 750 GB of storage, of which two thirds is currently empty.

Overall, the exercise of moving the material on DVDs into the FISH document management system has been successful. All but one item has been transferred, and all the movies play directly from FISH when selected. However, a number of experiences are worth recounting:

  • Inaccessible DVD files: One DVD was transferred from VHS video to DVD by IC Video Ltd in the same way as several of the others. However, although it usually plays OK when put into the DVD slot, neither the FreeStudio nor the Movie Maker software I’m using was able to convert successfully from the DVD. Indeed, I’m not even able to copy it from the DVD to my laptop. So that material will have to stay on the DVD until I find a solution.
  • Movie file formats: I also have a number of TV programmes downloaded from the net with .MPEG extensions, which play successfully on my laptop. However, my FISH document management system does not seem to support that extension. I did try changing the extension to MP4 (which FISH does support) but found the quality was much reduced. In the end I discovered that FISH does support a .M1V extension with which the files do play successfully. This prompted me to read up about MPEG and I discovered it has many versions, with .M1V predictably being MPEG-1. I don’t plan to spend a great deal of time trying to understand all about movie formats, and am just happy to have found an extension that plays on my laptop and can be stored in my document management system. However, it has reminded me that there is much greater complexity in all these standards than meets the eye, and consequently I shall be keeping the physical DVDs in case I encounter problems downstream.
  • DVDs with data files: A few of the physical DVDs contain not video files but collections of ordinary files. For example, one is the installation disk for an old version of my FISH document management system. Another is the installation disk for my Home Use versions of Microsoft Word and Excel. With these I simply zipped up the files and stored the zip file in FISH.
  • Large file sizes: Although file size is no longer such a problem on the laptop as previously, it still needs to be taken into account for backup purposes. All the material in my document management system is placed into so-called ‘bins’ – standard Windows folders specially configured by FISH. Over the years I have limited the size of the bins to around 200 MB to facilitate manageability and the taking of backups. These days I’m able to back up to 4.7 GB DVDs – though I still try to keep the bin size to around 200 MB. With these movie files, however, some of the file sizes are well over 1 GB, so I have created specific bins for them to go in and ensured that the total size of the files in each bin does not exceed 4.7 GB (which is the advertised size of the DVDs) so that I will still be able to take the necessary backups. Unfortunately, as I discovered when trying to burn to disk, the usable size of blank DVDs is somewhat less at 4.37 GB. So I then had to move some of the files around the bins I had created to keep each bin under the 4.37 GB limit. After over 30 years of personal computing, things are still never easy…
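The shuffling of files between bins described in the last point can be sketched as a small packing exercise. The gap between the two figures is probably a matter of units: 4.7 billion bytes comes to just under 4.38 when counted in units of 1024³ bytes, which is close to the 4.37 figure above. The sketch below uses a simple "largest file first, first bin with room" approach, which is a stand-in technique and not how FISH or the burning software works. The file names and sizes (in MB as a file manager would report them) are invented.

```python
# Usable space per disc in MB: 4.37 x 1024, rounded down (assumed figure,
# based on the 4.37 limit found when burning).
USABLE_MB = 4474
ADVERTISED_MB = 4700  # what planning against the label on the box assumes


def pack_into_bins(file_sizes, limit):
    """Place each file, largest first, in the first bin that has room."""
    bins = []
    by_size = sorted(file_sizes.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        if size > limit:
            raise ValueError(f"{name} will not fit on a single disc")
        for current in bins:
            if current["used"] + size <= limit:
                current["files"].append(name)
                current["used"] += size
                break
        else:
            # No existing bin had room, so start a new one.
            bins.append({"files": [name], "used": size})
    return bins


# Hypothetical movie files, sizes in MB.
movies = {
    "wedding.mp4": 2600,
    "holiday.mp4": 2000,
    "cine_reel_1.mp4": 1500,
    "cine_reel_2.mp4": 1200,
    "tv_programme.m1v": 900,
}

planned = pack_into_bins(movies, ADVERTISED_MB)
bins = pack_into_bins(movies, USABLE_MB)
for number, current in enumerate(bins, start=1):
    print(number, current["used"], current["files"])
```

With these made-up sizes, planning against the advertised figure puts the two largest files in one bin (4600 MB), which would not burn; packing against the usable figure moves files around so that both bins stay under the limit.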

 

A New, Simpler, Cheaper Filing System

Most of the electronic filing that I’ve done up to now has been for business documents, but now I’m starting to focus more on personal documents. The electronic system I’ve been using can cater for all kinds of documents, but I’ve decided to have a separate system for my personal material. This is mainly because the business documents may have research and historical uses and may be taken elsewhere, while the personal documents are for the use of myself and my family.

In setting up a filing system for personal documents I could re-use the one I’ve used for business documents since 1981. However, even though it uses relatively small scale software products, I have found it relatively costly and complex to maintain over the years as new versions have had to be obtained and installed to keep pace with new operating systems and other developments. Most upgrades to new versions have been disruptive and time consuming. Furthermore, the need to perform such upgrades to ensure continuity of operation imposes additional uncomfortable pressure. I want to minimise these problems in my personal filing system.

The solution I’m going to try out will have an index in Excel, and the base material will be stored in the Windows filing system. At present, Windows and Excel are ubiquitous and are integral parts of the basic computer system that I maintain for my own use. Hence the system will not incur additional cost, and there will be no special software functionality to make things more complicated. However, this simplicity is only achieved by sacrificing some flexibility in index searching. With the Filemaker index that I use for my business documents, multiple search terms and options (all, one or the other, not present etc.) will deliver a list of all records that match the criteria. It will not be possible to replicate this in Excel, as the FIND command simply steps through matching entries one at a time – though the filter facility in conjunction with a ‘Facet’ field will go some way towards it. I’m prepared to forgo this functionality to achieve the substantial long term benefits of simplicity.
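For comparison, the kind of multi-term matching being given up (all, one or the other, not present) can be sketched as follows. This is not something Excel’s FIND does, nor a description of how Filemaker works internally; it is just an illustration of the matching logic, run over index rows held as Python dictionaries. The function name, the parameter names and the sample rows are all hypothetical.

```python
def search(rows, all_of=(), any_of=(), none_of=()):
    """Return every row whose text satisfies all three kinds of criteria.

    all_of  - every one of these terms must be present
    any_of  - at least one of these terms must be present (if any are given)
    none_of - none of these terms may be present
    """
    results = []
    for row in rows:
        text = " ".join(str(value) for value in row.values()).lower()
        if not all(term.lower() in text for term in all_of):
            continue
        if any_of and not any(term.lower() in text for term in any_of):
            continue
        if any(term.lower() in text for term in none_of):
            continue
        results.append(row)
    return results


# Hypothetical index rows, invented for the example.
index_rows = [
    {"ref": "PAW-PERS-0001-01", "title": "School report... Loughborough", "facet": "School"},
    {"ref": "PAW-PERS-0002-01", "title": "Rugby club fixture list", "facet": "Rugby"},
    {"ref": "PAW-PERS-0003-01", "title": "School rugby team photo", "facet": "School"},
]

school_not_rugby = search(index_rows, all_of=["school"], none_of=["rugby"])
rugby_or_fixture = search(index_rows, any_of=["rugby", "fixture"])
print([row["ref"] for row in school_not_rugby])
print([row["ref"] for row in rugby_or_fixture])
```

The point of the sketch is the contrast: a single call returns the whole list of matching records, whereas FIND visits matches one at a time.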

I will use the same index fields as for my business documents, with a few additions to cope with the limitations of Excel. The full list of fields will be as follows:

Reference number: mandatory for each entry and having four parts: an Owner identifier (PAW for Paul Wilson), a Set identifier (PERS), a Serial number (e.g. 1817) and a Sub-Serial number (e.g. 01). So, a typical Reference No looks like this: PAW-PERS-1817-01. The purpose of the Serial number is to enable new documents to be given the next number on the list, i.e. the number signifies nothing other than the physical location of the document in the file. The purpose of the Sub-Serial number is to enable two or more documents to be kept physically together in a file even if the later one is logged after the subsequent serial number has been allocated. The Reference No will be written on or attached to the relevant physical documents/artefacts, and will be included at the beginning of the names of the equivalent electronic file(s). Note that, for my business documents, the separator in the Reference No is a slash (PAW/PERS/1817/01); however, a slash is not a valid character in Windows file names, so it has been replaced with a dash (PAW-PERS-1817-01).
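The four-part Reference No lends itself to a small sketch showing how one might build it, read it back, and find the next Serial number. The function names are invented, and two details are assumptions drawn only from the example above: that the Sub-Serial number is always two digits, and that the Serial number is not zero-padded.

```python
import re

# Owner-Set-Serial-SubSerial, e.g. PAW-PERS-1817-01.
# Assumes upper-case identifiers and a two-digit Sub-Serial number.
REF_PATTERN = re.compile(r"^([A-Z]+)-([A-Z]+)-(\d+)-(\d{2})$")


def make_ref(owner, set_id, serial, sub_serial=1):
    """Build a Reference No with dashes, which are safe in Windows file names."""
    return f"{owner}-{set_id}-{serial}-{sub_serial:02d}"


def parse_ref(ref):
    """Split a Reference No into its four parts.

    The slash-separated business form (PAW/PERS/1817/01) is accepted too.
    """
    match = REF_PATTERN.match(ref.replace("/", "-"))
    if match is None:
        raise ValueError(f"Not a valid Reference No: {ref!r}")
    owner, set_id, serial, sub_serial = match.groups()
    return owner, set_id, int(serial), int(sub_serial)


def next_serial(existing_refs):
    """The next Serial number is simply one more than the highest in use."""
    return max(parse_ref(ref)[2] for ref in existing_refs) + 1


print(make_ref("PAW", "PERS", 1817, 1))
print(parse_ref("PAW/PERS/1817/01"))
print(next_serial(["PAW-PERS-1817-01", "PAW-PERS-1818-01", "PAW-PERS-1817-02"]))
```

The last call illustrates the Sub-Serial idea: PAW-PERS-1817-02 can be logged after 1818 has been allocated without affecting the next Serial number.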

Title: mandatory for each entry, unlimited in length (subject to the size limitation of an Excel cell) and the contents to be at the owner’s discretion (i.e. it could be different to the actual title on a document). Note that up to the first 100 characters will be copied and pasted after the Ref No in the file names of associated electronic files; therefore Titles should be constructed to make these first 100 characters as informative as possible. The Title may also include Keywords or Phrases as described below.

Keywords or phrases: optional for each entry and unlimited in number, specified entirely at the owner’s discretion, separated by commas and added to the end of the Title after three dots.
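A minimal sketch of how the Title, Keywords and file name fit together is given below. Several details are assumptions rather than part of the scheme as described: the exact spacing around the three dots, the single space between the Ref No and the Title in the file name, and the choice to swap characters that Windows does not allow in file names for a dash. The function names and sample values are invented.

```python
# Characters that Windows does not allow in file names.
INVALID_CHARS = '\\/:*?"<>|'


def index_title(title, keywords=()):
    """Append any keywords to the Title after three dots, comma-separated."""
    if keywords:
        return f"{title}... {', '.join(keywords)}"
    return title


def file_name(ref, title, extension):
    """Build a file name from the Ref No and up to the first 100 Title characters."""
    stem = title[:100]
    cleaned = "".join("-" if char in INVALID_CHARS else char for char in stem)
    return f"{ref} {cleaned.strip()}.{extension}"


entry_title = index_title("Letter from bank", ["mortgage", "1998"])
print(entry_title)
print(file_name("PAW-PERS-1817-01", entry_title, "pdf"))
```

Because only the first 100 characters reach the file name, keywords placed after a long Title may be cut off there while remaining searchable in the index.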

Facet: optional for each entry and consisting of a single word or phrase which can be used as a broad search term by being selected from the Excel Filter list.

Physical Location: optional for each entry and used to indicate if there is a physical item associated with this entry and, if so, where it is located.

Electronic version: optional for each entry and used to indicate if there is an electronic item associated with this entry.

Publication Date: optional for each entry and used to specify the exact date (ddmmmyyyy) on which the material concerned came into being.

Year: mandatory for each entry in full form (yyyy) and used to indicate the year in which the material concerned first came into being (this is needed in addition to the Publication Date because it is not possible to specify an exact date for some items and an Excel field can only have a single date format).

Creation Date: mandatory for each entry and used to specify when the Index entry was created.
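The field list above can be summarised as a simple record with a check on the mandatory fields. This is a sketch only: the class and attribute names are invented, Keywords are taken to live inside the Title as described, the Creation Date is assumed to be held as a ddmmmyyyy string like the Publication Date, and the date check looks at the shape of the text rather than whether the day exists in that month.

```python
from dataclasses import dataclass
from typing import List, Optional

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def is_ddmmmyyyy(text: str) -> bool:
    """Check the shape of a date such as 05Mar1998 (not the calendar validity)."""
    return (
        len(text) == 9
        and text[:2].isdigit()
        and text[2:5].lower() in MONTHS
        and text[5:].isdigit()
    )


@dataclass
class IndexEntry:
    reference_no: str                          # mandatory
    title: str                                 # mandatory, may end with keywords
    year: int                                  # mandatory, yyyy
    creation_date: str                         # mandatory
    facet: Optional[str] = None
    physical_location: Optional[str] = None
    electronic_version: Optional[str] = None
    publication_date: Optional[str] = None     # ddmmmyyyy when known

    def problems(self) -> List[str]:
        """List anything that breaks the rules for mandatory fields and dates."""
        found = []
        if not self.reference_no:
            found.append("Reference number is mandatory")
        if not self.title:
            found.append("Title is mandatory")
        if not 1000 <= self.year <= 9999:
            found.append("Year must be in full yyyy form")
        if not self.creation_date:
            found.append("Creation Date is mandatory")
        if self.publication_date:
            if not is_ddmmmyyyy(self.publication_date):
                found.append("Publication Date is not in ddmmmyyyy form")
            elif int(self.publication_date[5:]) != self.year:
                found.append("Year does not match Publication Date")
        return found


good = IndexEntry("PAW-PERS-1817-01", "Letter from bank", 1998, "12Jan2013",
                  publication_date="05Mar1998")
mismatched = IndexEntry("PAW-PERS-1818-01", "Holiday postcard", 1997, "12Jan2013",
                        publication_date="05Mar1998")
print(good.problems())
print(mismatched.problems())
```

Keeping Year as a separate mandatory field means an entry with no exact date still sorts and filters by year, while the check catches the two fields drifting apart when both are filled in.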

It is the scanning of various documents for a photobook of my work experiences that has prompted me to do this right now. So I shall start using this new system for those items immediately. After it has accumulated a substantial amount of material I will report again on how effective it is.

HomeScale

Sometimes it’s just hard to imagine the size of things when they are written about in the newspapers – or even when they are shown in pictures. A recent TV programme about glaciers and icebergs showed a huge glacier front many kilometres long and very high, but it was impossible to grasp its scale until the research vessel came into shot at the foot of this monster. Of course, the one place whose size we do fully grasp is our home and its environs. So maybe some innovative software company could create an app which would enable you to select an object or geographic place and have it inserted into a picture of your house. With appropriate accompanying sounds, the insertion of the Grand Canyon into the back garden (and surrounding land for many miles each way) would be both dramatic and instructive.