Phase 5 Completed

Phase 5 in this journey was to assess the possibility of investigating the contents of a large archive by providing an AI with not only the Index, but also the actual digital files relating to the Index entries. As explained in the previous post, AIs are only able to hold a limited amount of information in their memory; so, for large archives I’m employing a strategy whereby the AI is asked to provide an answer to a subset of the archive’s information; and once that answer has been provided, another subset is presented for an answer. When all the subsets have been exhausted, the AI is then asked to stitch all the answers together to provide an overall answer for the archive. That is the approach I’ve taken here: I assembled a whole set of digital files, divided them up into groups which the AI is capable of handling, got answers from the AI for each group, and then got the AI to stitch the answers together. To highlight the practicalities of this exercise, the whole process is described in some detail below.
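The subset-then-stitch strategy described above can be sketched in a few lines of Python. This is an illustration only, not a record of how the tests were actually run: `investigate_archive` and `ask_ai` are hypothetical names, and `ask_ai` stands in for whatever means is used to send a prompt and a group of files to the AI and get its answer back. Passing the earlier answers back as text in the final prompt is also an assumption; they could equally be supplied as uploaded documents.

```python
from typing import Callable, List, Sequence


def investigate_archive(
    subsets: Sequence[Sequence[str]],
    question: str,
    ask_ai: Callable[[str, Sequence[str]], str],  # hypothetical: (prompt, files) -> answer
) -> str:
    """Ask the same question of each subset, then ask for one merged answer."""
    partial_answers: List[str] = []
    for files in subsets:
        # One request per subset: the AI only sees this group of files.
        partial_answers.append(ask_ai(question, files))

    # Final request: no archive files, only the earlier answers to stitch together.
    merge_prompt = (
        "Merge the following partial answers into one overall answer to: "
        + question
        + "\n\n"
        + "\n\n---\n\n".join(partial_answers)
    )
    return ask_ai(merge_prompt, [])
```

With five subsets this makes six requests in all: five subset answers and one merge.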

As in the previous phase, the archive I’ve used for this test is my PAWDOC work archive of some 17,000 Index entries and 31,000 associated files. I decided I’d do two distinct tests: one with files about a very specific subject with a limited life-span; and one with files related to a very general topic which might appear anywhere in the lifetime of the archive. For the first I chose to investigate the two years (1977-78) that I worked for a company called ‘CPC’; and for the second I chose the word ‘measurement’. In each case, I performed a search for the relevant term (‘CPC’ or ‘measurement’) on my PAWDOC index in the Filemaker application. These searches produced lists of 21 and 40 index entries respectively. PAWDOC index entries can have any number of files associated with them, and in these cases it turned out that ‘CPC’ had 48 files, and ‘measurement’ had 72 files.

The next part of the process was to split the files into sufficiently small subsets such that each subset would be accepted in full in a single request to the AI. The two AIs I used were Claude and Copilot, both of which limit file uploads to 20 per conversation; Claude limits an individual file to 30 Mb and Copilot to 50 Mb. So, I chose to limit my subsets to the following:

Number of files: 19 (this to include the ‘PAWDOC Guide’ file and a file of the Index entries related to the files in this subset: leaving room for 17 content files)
Max individual file size: 28 Mb
Max total file size: 40 Mb
Text files to be no bigger than 18k characters
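As a rough illustration of how these criteria could be applied mechanically, the sketch below packs a list of files into subsets in the order given. It is a simple greedy grouping under stated assumptions, not the method actually used to build the subsets: the function name `pack_subsets` is hypothetical, the 40 Mb total is assumed to apply to the content files only, and the 18k character limit on text files is not checked.

```python
from typing import List, Tuple

MAX_CONTENT_FILES = 17  # 19 files less the Guide and Index Entries files
MAX_FILE_MB = 28.0      # largest individual file allowed
MAX_TOTAL_MB = 40.0     # assumed to cover content files only

FileInfo = Tuple[str, float]  # (file name, size in Mb)


def pack_subsets(files: List[FileInfo]) -> List[List[FileInfo]]:
    """Group files, in the order given, into subsets that meet the limits."""
    subsets: List[List[FileInfo]] = []
    current: List[FileInfo] = []
    current_mb = 0.0
    for name, size_mb in files:
        if size_mb > MAX_FILE_MB:
            # Oversized files have to be split by hand before packing.
            raise ValueError(f"{name} is {size_mb} Mb; split it first")
        full = len(current) == MAX_CONTENT_FILES
        too_big = current_mb + size_mb > MAX_TOTAL_MB
        if current and (full or too_big):
            # Close the current subset and start a new one.
            subsets.append(current)
            current, current_mb = [], 0.0
        current.append((name, size_mb))
        current_mb += size_mb
    if current:
        subsets.append(current)
    return subsets
```

Because the files are taken in order, related documents stay together, at the cost of some subsets being much smaller than the limits allow.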

Armed with these criteria I set about dividing up the content files into their subsets and ended up with five subsets for each test:

Subset	No of files (less Guide & Index Entries files)	Largest file	Total subset file size
A (CPC)	16	1.4 Mb	7.7 Mb
B (CPC)	15	4.6 Mb	15.4 Mb
C (CPC)	13	5.2 Mb	20.3 Mb
D (CPC)	2	25 Mb	35 Mb
E (CPC)	3	20.8 Mb	37.9 Mb
F (Measurement)	17	5.6 Mb	14.5 Mb
G (Measurement)	17	2.8 Mb	14.1 Mb
H (Measurement)	16	20 Mb	33.4 Mb
I (Measurement)	17	7.2 Mb	18 Mb
J (Measurement)	5	19.5 Mb	25.1 Mb

I encountered three issues in the course of creating the subsets:

TIF files: Most of the PAWDOC files are scans of hardcopy documents in multi-page TIF files. However, the free version of Copilot does not support TIF files, and Claude only supports single-page TIF files, so I had to turn all the TIF files into PDFs. As it turned out, that was a quick process: select a file; right click the mouse and choose ‘Open with [PDF App]’; when it opens in the PDF app, select ‘Save As’ and save it as a PDF in the relevant Subset folder.
One of the files (a scan of a desk diary) was 35.7 Mb – a little over my individual file size limit of 28 Mb. So, I split it into two files of 21 Mb and 15 Mb respectively.
One of the files was an old PowerPoint file with a ‘ppt’ extension. I believe Claude and Copilot only accept pptx extensions so I produced a pptx version for the subset.
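The three issues above could be caught before any subsets are built with a small pre-flight check along the following lines. This is a sketch only: `preflight` and the `CONVERT` mapping are hypothetical, the mapping simply reflects the conversions described above rather than any official list of supported formats, and the file names in the usage example are made up. A file's size may also change once it has been converted.

```python
import math
from pathlib import PurePath
from typing import List

MAX_FILE_MB = 28.0  # individual file size limit chosen for the subsets

# Conversions described above; illustrative, not an official list.
CONVERT = {".tif": ".pdf", ".tiff": ".pdf", ".ppt": ".pptx"}


def preflight(name: str, size_mb: float) -> List[str]:
    """Return the actions a file needs before it can go into a subset."""
    actions: List[str] = []
    ext = PurePath(name).suffix.lower()
    if ext in CONVERT:
        actions.append(f"convert {ext} to {CONVERT[ext]}")
    if size_mb > MAX_FILE_MB:
        # Minimum number of pieces if each piece must fit under the limit.
        parts = math.ceil(size_mb / MAX_FILE_MB)
        actions.append(f"split into at least {parts} parts")
    return actions
```

For example, `preflight("diary.tif", 35.7)` reports both a conversion and a split, while a 5 Mb PDF returns an empty list and can be used as it stands.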

With the subsets prepared, the final thing to do was to specify the following prompts:

For the CPC subsets: PAWDOC is a work document collection built up from 1981 to the present day. The attached files include a Guide outlining how PAWDOC is constructed and what it contains. Also attached is a subset of PAWDOC’s Index and the actual digital files associated with those index entries. The following request is to be undertaken using the information in the attached files: Outline everything that you can find out about CPC in the attached files, including its origins, operations, locations, products, finances, people and culture; and describe the contributions that Paul Wilson made while he worked for the company. Present the results in a word document with an executive summary at the beginning.

For the Measurement subsets: PAWDOC is a work document collection built up from 1981 to the present day. The attached files include a Guide outlining how PAWDOC is constructed and what it contains. Also attached is a subset of PAWDOC’s Index, and the actual digital files associated with those index entries. The following request is to be undertaken using the information in the attached files: Describe everything that you can find out about ‘Measurement’ in the attached files, under a set of appropriate category headings which should include the philosophy of measurement, attitudes towards measurement, and the pros and cons of measurement. In a final section suggest interesting further research that could be undertaken to extend the findings reported here. Present the results in a word document with an executive summary at the beginning.

Running the subsets through the AIs was relatively quick for Claude. Indeed, I’d completed the whole exercise – definition of the questions, creation of the subsets and runs through both the CPC and Measurement tests – within the space of two days. I’m not sure whether or not that was helped by the fact that I was still operating under the Claude Pro plan for which I had paid £18 for a one-month subscription. For Copilot, however, it was a different kettle of fish. I got through the CPC tests and the first of the Measurement subsets quickly enough. But at that point I came up against something I hadn’t encountered before – Copilot’s daily upload quota. Apparently, this has nothing to do with the number of files you upload, but rather the number of upload events you initiate in a 24-hour period (an event can be an upload of anything from 1 to 20 files). When you exceed this, you have to wait a rolling 24 hours before you can upload anything again – but there is nowhere to look to see when you can start uploading again. So, I ground to a halt on the Copilot Measurement tests, and wasted a lot of time trying to find out what the problem was. Having found out, I decided to abandon the Copilot Measurement tests for reasons that will become obvious below.

Copilot’s individual subset answers to the CPC question were all reasonably good, 6-8 page answers with a fair amount of detail – though four out of the five were mainly in bullet point format within headings and sub-headings. The merged document was in similar style and I gave it 7 out of 10 with the following comments:

This is a 10-page report with 10 separate sections and lots of sub-headings with the text being primarily in the form of bullet points. There’s plenty of detail in the bullet points, but no citations back to the specific Reference Numbers from which they came. The bullet point format means that there is little additional commentary or embellishment and inevitably makes it a less informative read. It took Copilot just 8 seconds to produce this merged report as compared to an average of 38 seconds to produce the subset answers.

However, the Claude answers to the CPC question were a significant cut above the Copilot versions, with 8-13 pages of discursive text, and a great deal of detail. I scored Claude’s CPC merged document 9.7 out of 10 with these comments:

This 21-page merged document is of a very high standard with the following contents: Exec Summary; 1. Origins and Corporate Structure; 2. Locations; 3. Products; 4. Finances; 5. People; 6. Culture and Working Environment; 7. Paul Wilson’s Contributions at CPC; 8. Quick Reference Summary; 9. Further Work Recommended for Researchers. In each of these sections there is a huge amount of detail described in discursive text. Claude has clearly inspected and interpreted many, if not all, of the 49 files provided – all of which were scans in PDF documents. I noticed a few doubtful assumptions: a) In section 2.4 it is suggested that Perivale was a CPC location (I don’t think it was – though I couldn’t find the ‘Wray memorandum of December 1976’ to check, which is interesting); b) section 6.5 suggests that ‘Wilson’s formal 11-stage systems methodology paper, produced c.1978’ was actually in operation (but it was only a proposal); and c) some items in the Project Portfolio list in section 7.5 were taken from a scan of my desk diary and may not have actually been as significant as they sound, e.g. ‘Corn Shipment Simulation’, ‘Forecasting’ and ‘Factory Open Day Coordination’. There may well be other misinterpretations I didn’t spot, but despite these, this is a hugely impressive, comprehensive, thorough, and highly detailed report. It took Claude just over 6 minutes to produce as compared to an average of 7 and a half minutes to produce the subset answers.

The Claude Measurement answers were of the same ilk. In fact, because the topic is so broad, each of the individual answers made good reading in their own right. I scored Claude’s merged Measurement document 9.8 out of 10 with the following comments:

This is a very comprehensive 23-page report on a wide variety of aspects of ‘measurement’. There is a 4-page discussion on ‘The philosophy of measurement’ under 6 sub-headings. The ‘Attitudes to measurement’ section has 5 sub-headings; and the pros and cons of measurement are addressed in similar detail (10 pros and 11 cons). The section on ‘Measurement Instruments and Frameworks’ lists 22 different mechanisms. Section 7 provides 11 suggestions for further work; and a full listing of all the documents used in the analysis rounds off the report. As with the CPC report, it appears that Claude inspected and interpreted most, if not all, of the 79 files provided, of which 75 were scans in PDF documents, three were Word documents, and one was a PowerPoint file. This merged report is excellent. How accurate it is, is something I wouldn’t be able to say without doing many days if not weeks of work. It took Claude 8 minutes to produce this report as compared to an average of 5 and a half minutes to produce the subset answers.

One very practical point emerges from these answers: the difference in the time it took for Copilot and Claude to arrive at their answers is striking. Copilot operates in seconds, whilst Claude operates in minutes. There is a distinct possibility that this has something to do with the quality of the results they produced; and it is another reminder that some LLMs are better than others for particular tasks.

Regarding the overall objective of these Phase 5 tests, these results clearly indicate that it is indeed feasible to have AI investigate an archive through its digital files. Furthermore, the results are likely to be even more impressive than those that can be obtained when providing the AI with just index entries and file names. Consequently, there is even greater likelihood that researchers will employ these techniques to explore archives, and even less likelihood that they will spend time verifying the results. This is a serious long-term issue for archives, for researchers, and for the integrity of the global information canon.

In these tests I did try to get the AI to suggest further work to mitigate these problems, and, indeed, several suggestions were forthcoming. Although I’m not convinced that they directly addressed the accuracy issue, I do think they indicate that better prompts, more focused on identifying potential issues and mitigations, could be developed.

The table below records the breakdown of the time I spent on Phase 5 and across all phases.

Activity	No of Tasks or task breakdown	Elapsed time	Time spent
Phase 1	70 (started 05Mar2026)	43 days	105 hrs
Phase 2	8	4 days	11 hrs
Phase 3	Create test files, test, analyse results	3 days	15 hrs
Phase 3	Research & draft pwofc.com posts	4 days	12 hrs
Phase 4	Create test files, test, analyse results	14 days	80 hrs
Phase 4	Research & draft pwofc.com posts	11 days	31 hrs
Phase 5	Create test files, test, analyse results	4 days	13 hrs
Phase 5	Research & draft pwofc.com posts	2 days	8 hrs
Totals		85 days	275 hrs
