{"id":1610,"date":"2019-04-30T08:17:38","date_gmt":"2019-04-30T07:17:38","guid":{"rendered":"http:\/\/www.pwofc.com\/ofc\/?p=1610"},"modified":"2019-04-30T08:17:38","modified_gmt":"2019-04-30T07:17:38","slug":"getting-an-httrack-copy","status":"publish","type":"post","link":"https:\/\/www.pwofc.com\/ofc\/2019\/04\/30\/getting-an-httrack-copy\/","title":{"rendered":"Getting an HTTrack Copy"},"content":{"rendered":"<p>HTTrack is a free-to-use website copier. Its web site provides the following description: \u00a0\u201c<em>It allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site&#8217;s relative link-structure. Simply open a page of the &#8220;mirrored&#8221; website in your browser, and you can browse the site from link to link, as if you were viewing it online<\/em>.\u201d<\/p>\n<p>I downloaded and installed HTTrack very quickly and without any difficulty, then I set about configuring the tool to mirror pwofc.com. This involved simply specifying a project name, the name of the web site to be copied, and a destination folder. The Options were more complicated and, for the most part, I just left the default settings before pressing \u2018Finish\u2019 on the final screen. There was an immediate glitch when I discovered that I had not provided the full web address (I\u2019d specified pwofc.com instead of <a href=\"https:\/\/www.pwofc.com\/ofc\/\">https:\/\/www.pwofc.com\/ofc\/<\/a>); but having made that change, I pressed \u2018Finish\u2019 again and HTTrack got on with its mirroring. 
Some 2 hours 23 minutes and 48 seconds later, HTTrack completed the job, having scanned 1827 links and copied 1538 files with a total size of 212 MB.<\/p>\n<p>The mirroring had produced seven components: two folders (hts-cache and <a href=\"http:\/\/www.pwofc.com\">www.pwofc.com<\/a>) and five files (index, external, hts-log, backblue and fade). The hts-cache folder is generated by HTTrack to enable future updates to the mirrored web site; the external file is a template page for displaying external links which have not been copied; backblue and fade are small GIF images used in such templates; and the log file records what happened in the mirroring session. The remaining www.pwofc.com folder and index file contain the actual contents of the mirror.<\/p>\n<p>On double-clicking the index file, the pwofc.com home page sprang to life in my browser, looking exactly the same as it does when I access it over the net. As I navigated around the site the internal links all seemed to work and all the pictures were in place, though the search facility didn\u2019t work. External links produced a standard HTTrack page headed by \u201c<em>Oops!&#8230; This page has not been retrieved by HTTrack Website Copier. Clic to the link below to go to the online location!<\/em>\u201d \u2013 and indeed clicking the link did take me to the correct location. (I believe it is possible to have external links copied as well by setting the \u2018Limit\u2019 option \u2018maximum external depth\u2019 to one, but my subsequent attempt to do so ended with errors after just two minutes, so I abandoned it.) The only other noticeable difference was the speed with which one could navigate around the pages \u2013 it was just about instantaneous. 
From this cursory examination I was satisfied that the mirror had accurately captured most, if not all, of the website.<\/p>\n<p>An inspection of the log file, however, revealed one error \u2013 &#8220;<em>Method Not Allowed (405) at link www.pwofc.com\/ofc\/xmlrpc.php (from <a href=\"https:\/\/www.pwofc.com\/ofc\/\">www.pwofc.com\/ofc\/<\/a>)<\/em>\u201d. According to the net, a PHP file \u2018is a webpage that contains <strong>PHP<\/strong> (Hypertext Preprocessor) code. &#8230; The <strong>PHP<\/strong> code within the webpage is processed (parsed) by a <strong>PHP<\/strong> engine on the web server, which dynamically generates HTML\u2019. Interestingly, I wasn\u2019t aware of having any content with such characteristics but, on closer inspection of the files in my hosting folder, I found I had lots of them \u2013 probably hundreds. I tried to figure out what the file named in the error related to, but had no clue other than its rather striking creation date \u2013 23\/12\/2016 at 00:00:00 \u2013 the same date as several of the other PHP files. I had not created any blog entries on that day, so my investigation ground to a halt. I don\u2019t have the knowledge to explore this, and I\u2019m not prepared to spend the time to find out. My guess is that the PHP files do the work of translating the base content stored in the SQL database into the structured web pages that appear on the screen. I\u2019m just glad that there was only one error \u2013 and that its occurrence isn\u2019t obviously noticeable in the locally produced web pages.<\/p>\n<p>The log file also reported 574 warnings, which came in the form of 287 pairs. 
A typical example pair is shown below:<\/p>\n<p>19:31:13 Warning: Moved Permanently for www.pwofc.com\/ofc\/?p=987<br \/>\n19:31:13 Warning: File has moved from www.pwofc.com\/ofc\/?p=987 to https:\/\/www.pwofc.com\/ofc\/2017\/06\/29\/an-ofc-model\/<\/p>\n<p>I tried to find a Help list of all the Warning and Error messages in the HTTrack documentation, but it seems that such a list doesn\u2019t exist. Instead there is a Help forum which has several entries relating to such warning messages \u2013 but none that I could relate to the occurrences in my log. As far as I can see, all of the pages mentioned in the warnings (in the above instance the title of the page is \u2018an-OFC-Model\u2019) have been copied successfully, so I decided that it wasn\u2019t worth spending any further time on it.<\/p>\n<p>All in all, I judge my use of HTTrack to have been a success. It has delivered a backup of my (relatively simple) site which I can actually see and navigate around, and which can easily be zipped up into a single file and stored.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>HTTrack is a free-to-use website copier. 
Its web site provides the following description: \u00a0\u201cIt allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files &hellip; <a href=\"https:\/\/www.pwofc.com\/ofc\/2019\/04\/30\/getting-an-httrack-copy\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[28],"tags":[],"class_list":["post-1610","post","type-post","status-publish","format-standard","hentry","category-blog-archiving"],"_links":{"self":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts\/1610","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/comments?post=1610"}],"version-history":[{"count":1,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts\/1610\/revisions"}],"predecessor-version":[{"id":1611,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/posts\/1610\/revisions\/1611"}],"wp:attachment":[{"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/media?parent=1610"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/categories?post=1610"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pwofc.com\/ofc\/wp-json\/wp\/v2\/tags?post=1610"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}