{"id":2722,"date":"2026-02-14T07:49:59","date_gmt":"2026-02-14T07:49:59","guid":{"rendered":"https:\/\/www.pwofc.com\/ofc\/?p=2722"},"modified":"2026-02-14T08:07:35","modified_gmt":"2026-02-14T08:07:35","slug":"an-ai-roadmap","status":"publish","type":"post","link":"https:\/\/www.pwofc.com\/ofc\/2026\/02\/14\/an-ai-roadmap\/","title":{"rendered":"An AI Roadmap"},"content":{"rendered":"<p>Having got the <a href=\"https:\/\/link.springer.com\/book\/10.1007\/978-3-031-86470-4\">Collecting book<\/a> published, I\u2019ve been wondering what\u2019s left to do in this OFC journey. Clearly, I need to update the <a href=\"https:\/\/www.pwofc.com\/ofc\/2017\/09\/01\/u1-the-ofc-online-tutorial-welcome-and-contents\/\">OFC tutorial<\/a> which is now some 8 years old. However, it was an email some ten days ago announcing a report on <a href=\"https:\/\/www.archives.org.uk\/ai-preparedness-guidelines-for-archivists\">AI Preparedness for Archives<\/a> that started me on a mini-voyage of realisation, and that prompted me to write this post.<\/p>\n<p>I took each piece of the report\u2019s guidance and wrote notes of how I would apply it to my <a href=\"https:\/\/www.pwofc.com\/ofc\/2019\/08\/31\/pawdoc-architecture\/\">PAWDOC<\/a> document collection. Then I asked ChatGPT how I could focus an AI chatbot onto a specific archive of data. In the <a href=\"https:\/\/www.pwofc.com\/ofc\/wp-content\/uploads\/2026\/02\/2026-02-Exchanges-with-ChatGPT-re-building-an-AI-interrogation-capability-for-private-collections.docx\">follow-up Q&amp;A<\/a>, and after providing a 10-line description of PAWDOC, ChatGPT designed a production architecture for such a research grade archive followed by a grant-fundable academic architecture and a cost estimate. 
At this point, I resisted ChatGPT\u2019s offer to create a \u2018ready-to-copy grant proposal document (including abstract, methodology, outcomes, evaluation plan)\u2019 and went for a walk, stunned BOTH by ChatGPT\u2019s capabilities AND by the potential for enhancing PAWDOC with an AI interrogation capability.<\/p>\n<p>I should say at this point that the ChatGPT descriptions I had read were very general in nature and assumed an understanding of many activities \u2013 they were most definitely not a cookbook with \u2018do this then that\u2019 instructions. I was aware that my actual knowledge and understanding of what would be required was pretty much zero, and actually doing it for real would be a steep learning curve.<\/p>\n<p>Having mulled all this around in my head for a few days, it seems clear to me that closure of this OFC journey cannot occur until I understand and experience how AI can be used to augment the interrogation of two types of private collection:<\/p>\n<ol>\n<li>primarily text-based archives; and<\/li>\n<li>collections more focused on objects.<\/li>\n<\/ol>\n<p>These are the types of collections that, in my experience, are most likely to be possessed by private individuals. Note that I am explicitly focusing on \u2018private\u2019 collections, because institutions undoubtedly manage their archives and collections differently from private individuals: processes are formally designed to ensure effectiveness and longevity; tasks get done because staff are paid to do them; and IT support is usually at a far greater scale and complexity. Much work is underway to apply AI to institutional collections; however, <em>my<\/em> focus is to understand how individuals can apply it to their own private collections. With my current level of understanding, I believe I need to investigate the following specific aspects:<\/p>\n<ol>\n<li><strong>The practicalities of preparing a private archive for AI<\/strong>. 
To explore this, I would most likely use a. <a href=\"https:\/\/www.pwofc.com\/ofc\/2019\/08\/31\/pawdoc-architecture\/\">PAWDOC<\/a> (a primarily text-based archive), and b. my <a href=\"https:\/\/www.pwofc.com\/ofc\/2014\/11\/23\/done-and-digitised-1980-2011\/\">Mementos<\/a> collection (more focused on objects).<br \/>\n\u223c<\/li>\n<li><strong>Researching whether AI is capable of accessing and understanding the contents of files other than text &#8211; sound, image, and video (Large Language Models, or LLMs, deal specifically with text).<\/strong> This is particularly relevant to collections more focused on objects.<br \/>\n\u223c<\/li>\n<li><strong>The practicalities of building an AI interrogation capability for a private archive which has<\/strong> <strong>only an index and information within its digital file names<\/strong>. This would probably be the simplest implementation and so the best one to do first to learn some basics. I would use my Mementos collection to investigate this.<br \/>\n\u223c<\/li>\n<li><strong>The practicalities of building an AI interrogation capability for a private archive which does have machine-readable text content.<\/strong> To investigate this, I could just use the app-generated content within PAWDOC (for example, all the Microsoft Office documents within PAWDOC). Alternatively, and more ambitiously, I could try to OCR some or all of the PAWDOC scanned documents and include them in the investigation.<br \/>\n\u223c<\/li>\n<li><strong>The practicalities of building an AI interrogation facility for a private collection containing a combination of text, sound, image, and video.<\/strong> The viability of this investigation would depend on the outcome of 2. above. 
It would probably involve extending one or both of the implementations described in 3 and 4.<\/li>\n<\/ol>\n<p>If I get round to any or all of these journeys, they would be recorded in their own separate spaces within this pwofc.com website; and, should they get completed, their results would be used to update the OFC tutorial. Only then would I consider closing this OFC journey.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Having got the Collecting book published, I\u2019ve been wondering what\u2019s left to do in this OFC journey. Clearly, I need to update the OFC tutorial which is now some 8 years old. However, it was an email some ten days &hellip; <a href=\"https:\/\/www.pwofc.com\/ofc\/2026\/02\/14\/an-ai-roadmap\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-2722","post","type-post","status-publish","format-standard","hentry","category-order-from-chaos"],"_links":{"self":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts\/2722","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/comments?post=2722"}],"version-history":[{"count":8,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts\/2722\/revisions"}],"predecessor-version":[{"id":2731,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts\/2722\/revisions\/2731"}],"wp:attachment":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/media?parent=2722"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/
\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/categories?post=2722"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/tags?post=2722"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}