Hints on Using the London Gazette search engine

 

By Forrest Anderson

FINDING TEXT ON A PAGE AND COPYING AND PASTING INFORMATION

The pages of the printed London Gazette have been scanned and then put through an Optical Character Recognition (OCR) process. This gives the best of both worlds - not only can the original document be seen on the screen, but the text can also be searched for particular words. Indeed, for WW1 and WW2, the complete text is meant to be searchable, which is wonderful.

Go to http://www.gazettes-online.co.uk/archiveSearch.asp?WebType=0&Referer=WW1

and enter "gerald patrick heffernan" into the "Find:" box (I prefer to use lower-case for all my searches). In this example and throughout this posting, do not enter any quotation marks - I'm only using them to make it clear what words should be entered into the box or what appears on the screen.

Make sure "World War 1 Records" is chosen, and press the search button.

You should get two hits:

Gazette Edition, Issue 31370 30-May-1919

Gazette Edition, Issue 30111 1-June-1917

Click "View Edition in PDF format" for the first of these hits - the 30th May 1919. Another window will open up to show the page.

Heffernan is here somewhere, but where? One thing you can do is to look at the Acrobat toolbar immediately above the words SUPPLEMENT TO THE LONDON GAZETTE at the top of the page. The 6th icon from the right shows a full page of paper (not in a box), and if you click on that, you'll fit the whole page on your monitor, not just the top part of the page. This *may* make searching the page easier, depending on your monitor's resolution. Click on the 4th icon from the right to revert to what it was (fit width).

A better way to search a page is to use the "Find" icon on the Acrobat toolbar, which looks like a large pair of binoculars. If you click this icon, enter "heffernan" into the search box, and click the Find button, you'll be taken straight to Heffernan's entry in the right-hand column. You can click the "Find Again" icon in the Acrobat toolbar to see if the name appears again on this page (but it'll only search this particular page).

OK, we've found Major Heffernan. We can now copy and paste his entry into a word processor or another program. To do this, look for the "T" icon on the toolbar - it's just to the right of the + and - magnifying glass icons. Click on the arrow to the right of the "T" and select the "Column Select Tool". Now take the I-beam cursor and draw a rectangle round the text you want to copy, and when you release the mouse button the text should be highlighted. Press CTRL-C (hold down the CTRL key on your keyboard, and tap the "C" key) to copy the text. Start up or switch to your word processor or whatever program you want to use, position the cursor where you want the text to go, and press CTRL-V to paste his entry. The text will appear in your word processor as:

T./Maj. James Gerald Patrick Heffernan, M.C., 1st Bn., R. Dub. Fus.

The ability to copy and paste text is a *very* useful facility, and large lists of men can be copied quickly.

RECOGNITION ERRORS

Now to demonstrate the first problem! Go up the column and draw a rectangle round the entries from Major Grasett down to Captain Henry Ronald Hall. Using CTRL-C and CTRL-V, copy and paste the entries to Notepad or your favourite word-processor. You should get:

Capt. and Bt. Maj. Arthur Edward Grasett, M.a, R.E.

T./Oapt. (T./Maj.) Frederick Buss Graystone, M.C., R.A.

T./Lt.-C'ol. James McGavin Greig, W. York.R., attd. 18th Bn., York, and Lanes. R.

Maj. Howard Charles Grabble, R.F.A., T.F.,attd. 523rd Sge. Bty., R.G.A.

Capt. Edward Jo>hns Grinling, M.C., I/4th Bn., Line. R., T.F.

Maj. Arthur Marjoribanks Guild, High. Cyc. Bin., attd. I/19th Bn., Lond. R.

Maj. Atthelstane Claud Gunter, 488th Sge. Bty., R.G.A.

Capt. (A./Maj.) Henry Ronal'd Hall, M.C., A/47th Bde., R.F.A.

Now we can see a few problems with the optical recognition process! Every single one of these entries has an error:

Capt. and Bt. Maj. Arthur Edward Grasett, M.a, R.E.

M.a should be M.C.

T./Oapt. (T./Maj.) Frederick Buss Graystone, M.C., R.A.

T./Oapt. should be T./Capt.

Buss should be Russ

T./Lt.-C'ol. James McGavin Greig, W. York. R., attd. 18th Bn., York, and Lanes. R.

T./Lt.-C'ol. should be T./Lt.-Col.

York, and Lanes. R. should be York. and Lancs. R.

Maj. Howard Charles Grabble, R.F.A., T.F., attd. 523rd Sge. Bty., R.G.A.

Grabble should be Gribble!

Capt. Edward Jo>hns Grinling, M.C., I/4th Bn., Line. R., T.F.

Jo>hns should be Johns

I/4th should be 1/4th

Line. R. should be Linc. R.

Maj. Arthur Marjoribanks Guild, High. Cyc. Bin., attd. I/19th Bn., Lond. R.

Bin. should be Bn.

I/19th should be 1/19th

Maj. Atthelstane Claud Gunter, 488th Sge. Bty., R.G.A.

Atthelstane should be Athelstane

Capt. (A./Maj.) Henry Ronal'd Hall, M.C., A/47th Bde., R.F.A.

Ronal'd should be Ronald

 

Having seen all the errors in the "translated" page, it is not surprising that there are going to be problems with searches. When you ask the London Gazette's search engine to look for Athelstane Claud Gunter, it won't find the entry above, because it thinks he's called Atthelstane Claud Gunter. Similarly, you'll never find Edward Johns Grinling's entry under his correct name, because the computer thinks his middle name is Jo>hns.

Let's test this by going to the initial search page and entering "john grinling" into the search engine. It finds only one entry - Gazette Edition, Issue 29608, 2-June-1916. It did not find the page we have been looking at, which was for 1919.

Now go back and search for "jo>hns grinling". This time it will give you another entry - the page for 1919 which we have just been looking at. Similarly, you need to search for Grabble if you want to find Major Gribble!
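As a rough illustration of the idea, the sketch below generates misread spellings of a name so that each one can be tried in the search box. The confusion table is an assumption built only from the handful of errors seen on the page above (Capt/Oapt, Russ/Buss, Gribble/Grabble, Lancs/Lanes, 1/I), and the function name is made up for the example. It only swaps letters, so it will not produce stray characters such as the ">" in Jo>hns or the doubled "t" in Atthelstane.

```python
from itertools import product

# Letter confusions seen in the pasted page above (correct -> misread).
# This table is an assumption built from those few examples only.
OCR_CONFUSIONS = {
    "c": ["e", "o"],   # Lancs -> Lanes, Capt -> Oapt
    "r": ["b"],        # Russ -> Buss
    "i": ["a"],        # Gribble -> Grabble
    "1": ["i"],        # 1/4th -> I/4th
}


def ocr_variants(word, max_changes=1):
    """Return the word plus misread spellings with at most max_changes swaps."""
    word = word.lower()
    # For each position, list the original character and any misreadings.
    choices = [[ch] + OCR_CONFUSIONS.get(ch, []) for ch in word]
    variants = []
    for combo in product(*choices):
        changes = sum(1 for a, b in zip(word, combo) if a != b)
        if changes <= max_changes:
            variants.append("".join(combo))
    return variants


print(ocr_variants("gribble"))  # ['gribble', 'grabble', 'gbibble']
```

Keeping max_changes at 1 keeps the list short; most of the variants will find nothing, but the occasional one (grabble here) is the spelling the search engine actually holds.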

 

PROBLEMS WITH WORDS AT THE END OF A LINE

Go back to the page with Major Heffernan and Major Gribble. The URL is http://www.gazettes-online.co.uk/archiveViewFrameSetup.asp?webType=0&PageDuplicate=n&issueNumber=31370&pageNumber=0&SearchFor=Gerald%20Patrick%20heffernan&selMedalType=&selHonourType=
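If you want to jump straight to a page for a known issue number, the address above can be rebuilt from its parts. This is only a sketch: the parameter names are copied from the URL quoted above, what each one does is my assumption, and the function name is invented for the example.

```python
from urllib.parse import urlencode, quote

# Base address and parameter names are copied from the URL quoted above.
VIEW_PAGE = "http://www.gazettes-online.co.uk/archiveViewFrameSetup.asp"


def gazette_page_url(issue_number, search_for, page_number=0):
    """Rebuild a page-view URL in the same shape as the one quoted above."""
    params = {
        "webType": "0",
        "PageDuplicate": "n",
        "issueNumber": str(issue_number),
        "pageNumber": str(page_number),
        "SearchFor": search_for,
        "selMedalType": "",
        "selHonourType": "",
    }
    # quote_via=quote encodes spaces as %20, matching the original URL.
    return VIEW_PAGE + "?" + urlencode(params, quote_via=quote)


print(gazette_page_url(31370, "Gerald Patrick heffernan"))
```

Called with issue 31370 and the search words used earlier, it reproduces the address given above character for character.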

Look at the top left of the page, and find the following entry, which I have copied and pasted below:

Capt. (T./Lt.-Col.) Victor Leopold Spencer Cowley, M.C., R. Iri Rif., attd. 31st Bn., M.G. Corps.

Apart from "Ir." having been translated as "Iri", the entry seems to be correct. We should therefore be able to do a search for "Victor Leopold Spencer Cowley" without any problems.

Go to the main search screen and feed that name into the search box. There are no hits.

Let's try "Leopold Spencer Cowley" instead. No, it doesn't find him using that name either.

Let's try "Spencer Cowley". No, still no luck.

How about "Spencer"? *Surely* it must find him? If you want to try this, it will say "We found 1239 Gazette Editions that contain "Spencer"". It even finds a mention of the name in the same issue as we've been looking at, but it's on another page. Still we haven't found Lt Col Cowley's entry...

This is how the end-of-line problem shows itself. The search engine does not seem to be able to find some words that occur at the end of a line. Instead, it treats the last word of one line as being part of the first word on the next line. In our example...

Capt. (T./Lt.-Col.) Victor Leopold Spencer

Cowley, M.C., R. Iri Rif., attd. 31st Bn., M.G. Corps.

...the search engine joins Spencer to Cowley to make one word - spencercowley, and just searching for plain spencer won't pick him up. Similarly, searching for Cowley won't find him either.

Now go back and run a search for "spencercowley". It brings up two hits:

Issue 31370 30-May-1919

Issue 30450 28-December-1917

It's now found him! We've just been looking at the 1919 edition, and if we check out the 1917 edition, we get the following entry:

Capt. (A /Maj ) Victor Leopold Spencer

Cowley, R. Ir Rif., attd M.G. Corps

Note that Spencer appears at the end of one line, and Cowley at the beginning of the other, just like the 1919 example that we've been dealing with.

In order to get round this problem, one has to make multiple searches, joining names together in case one of the names appears at the end of a line, and is followed by another name at the beginning of the next.

For "Victor Leopold Spencer Cowley" you could search for "victor leopold spencercowley", "victorleopold spencer cowley" or "victor leopoldspencer cowley".

Another way round the problem is to put a wild-card character at the end of a forename or surname, and in this search engine it is an asterisk (*). We have already found that a search for "Victor Leopold Spencer" won't work, because Spencer is at the end of a line, but a search for "Victor Leopold Spencer*" *will* work.
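Typing out every joined-up and wild-card version of a name by hand gets tedious, so here is a minimal sketch that lists them. The function names are made up for the example, and it assumes (as the searches above suggest) that the asterisk is only useful at the end of the search words.

```python
def joined_variants(name):
    """Join each adjacent pair of words in turn, as a line break might."""
    words = name.lower().split()
    variants = []
    for i in range(len(words) - 1):
        merged = words[:i] + [words[i] + words[i + 1]] + words[i + 2:]
        variants.append(" ".join(merged))
    return variants


def wildcard_variants(name, min_words=2):
    """Cut the name short after each word and end it with an asterisk."""
    words = name.lower().split()
    return [
        " ".join(words[:i]) + "*"
        for i in range(min_words, len(words) + 1)
    ]


name = "Victor Leopold Spencer Cowley"
for query in joined_variants(name) + wildcard_variants(name):
    print(query)
```

For this name it prints the three joined searches listed above, followed by "victor leopold*", "victor leopold spencer*" and "victor leopold spencer cowley*". The min_words setting is there because a single forename with an asterisk would return far too many hits to be useful.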

Note that the end-of-line problem doesn't affect words which end in a comma or period. If the entry had been written as...

Capt. (A /Maj ) Victor Leopold Spencer Cowley,

Royal. Ir Rif., attd M.G. Corps

...then we would have found him under "Victor Leopold Spencer Cowley", and the "Cowley" wouldn't have been joined to the "Royal." on the next line. A punctuation mark at the end of the line acts as a terminator and stops the words joining together.

 

PROBLEMS WITH WORDS AT THE BEGINNING OF A LINE

This is very much related to the above-mentioned problem, but it is more serious as there often isn't a solution. We now know that the first word on a line is often regarded by the computer as being the last part of the last word on the line above. So not only do we have problems searching for the last word on a line, we also have problems searching for the first word on a line.

Go to http://www.gazettes-online.co.uk/archiveViewFrameSetup.asp?webType=0&PageDuplicate=n&issueNumber=28981&pageNumber=0&SearchFor=jones&selMedalType=&selHonourType=

which is Gazette Issue 28981, published on the 20 November 1914, Page 20 of 128. Half-way down the page in the right-hand column is:

The undermentioned to be Second Lieutenants

(on probation) : —

Dated 21st November, 1914 (unless otherwise stated).

Richard Arthur Joseph Corballis, 3rd

Battalion, Dorset Regiment.

Harold Walter Edmund Crouchley, 3rd

Battalion, Lancashire Fusiliers.

Ronald Andrew Douglas, 3rd Battalion,

Royal Highlanders.

Reginald Leyland Heney, 4th Battalion,

South Staffordshire Regiment. Dated 3rd

October, 1914.

Charles Curetou Herbert Jones, 3rd Battalion,

Royal Warwick Regiment.

Let's see if we can find one of these men. If you do a search for "reginald leyland heney", you'll only get one hit - Issue 28928 6-October-1914 - and not the November issue that we are looking for.

Why can't it find him? Because the search engine has combined the last word of the previous line with his first forename. It thinks he is called:

"Highlanders.Reginald Leyland Heney"

Do a search for that name (without the quotation marks), and you'll find him! Because Leyland is not at the start of the line, and Heney is not at the end, searching for "leyland heney" also works just fine. Dropping the first forename like this is the work-around, although it is a poor one.
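Since you cannot know in advance which word fell at the start or end of a line, one approach is to try every run of consecutive words from the name. The sketch below lists them, longest first; the function name is invented for the example, and it assumes the search engine treats several words as a phrase, which is how it has behaved in the searches above.

```python
def sub_phrases(name, min_words=2):
    """List every run of consecutive words in the name, longest first."""
    words = name.lower().split()
    phrases = []
    for length in range(len(words), min_words - 1, -1):
        for start in range(len(words) - length + 1):
            phrases.append(" ".join(words[start:start + length]))
    return phrases


print(sub_phrases("Reginald Leyland Heney"))
# ['reginald leyland heney', 'reginald leyland', 'leyland heney']
```

The last search in that list is the one that finds the November 1914 entry. Shorter phrases bring back more unrelated hits, which is why they come last.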

 

DIFFERENT WAYS OF WRITING A NAME

 

If you read my message in the following thread:

Re: [britregiments] [Fwd: [Mon] London Gazette On-Line] Heffernan

...you'll see that the entries relating to James Heffernan were presented in a variety of ways:

Temporary Second Lieutenant G. Heffer-man

Temporary Lieutenant J. G. P. Heffernan

T./Maj. James Gerald Patrick Heffernan

I think this is just one of these things you have to live with, and you need to experiment by searching for an officer or soldier using different styles and combinations of his name.
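To make that experimenting a little more systematic, here is a small sketch that writes one man's name in the styles seen above. The list of styles is my own assumption based on those three entries only, the function name is invented, and I have not checked how the search engine treats the full stops after initials.

```python
def name_styles(forenames, surname):
    """Return a few ways the same man might be written in the Gazette."""
    initials = [f[0].upper() + "." for f in forenames]
    styles = [
        " ".join(forenames + [surname]),   # James Gerald Patrick Heffernan
        " ".join(initials + [surname]),    # J. G. P. Heffernan
    ]
    # A single initial, in the style of the "G." entry above.
    styles += [f"{initial} {surname}" for initial in initials]
    # Surname alone, as a last resort.
    styles.append(surname)
    return styles


for style in name_styles(["James", "Gerald", "Patrick"], "Heffernan"):
    print(style)
```

This prints six versions, from the full name down to the bare surname, each of which can be tried in turn.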

HYPHENS

Again, in the Heffernan thread, I found the following entry:

Gazette Issue 29001 published on the 8 December 1914. Page 19 of 36

The Royal Dublin Fusiliers.

9th Battalion—

Temporary Second Lieutenant G. Heffer-man to be temporary Lieutenant. Dated 19th November, 1914.

If you search on "hefferman" (note that it *should* be heffernan, but the original printed London Gazette spelt it incorrectly), you'll get 15 hits, but not a hit for this issue of the 8th December.

However, if you search for "heffer-man", you'll get 16 hits, and the 16th will be the one we are wanting. Note that it doesn't mean that the other 15 are also written as heffer-man. If you want to be exhaustive with a search, try inserting a hyphen into the word you are searching for!
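A quick way to be exhaustive is to generate every hyphenated version of the word and try each one. This is a minimal sketch with a made-up function name; the rule of keeping at least two letters on each side of the hyphen is my own assumption, chosen to keep the list short.

```python
def hyphen_variants(word, min_part=2):
    """Insert one hyphen at each position, keeping min_part letters each side."""
    word = word.lower()
    return [
        word[:i] + "-" + word[i:]
        for i in range(min_part, len(word) - min_part + 1)
    ]


print(hyphen_variants("hefferman"))
```

For "hefferman" this gives six versions, from "he-fferman" to "hefferm-an", including the "heffer-man" that finds the December 1914 entry.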

AND FINALLY...

Lastly, for those who like homework(!), see if you can find the announcement of the Victoria Cross to Lt Col William Herbert Anderson. It's a rather tragic story, as he was the eldest of four sons who were *all* killed in the Great War. It *is* possible to find him, using the techniques above, and to give you a little help, the announcement appeared in 1918...!

CAVEAT

Once they learn about the end-of-line and beginning-of-line searching problem, the techs at the London Gazette will probably change the way their search engine works, which will affect what I've written.

I have used Adobe Acrobat Reader 5.0.5. Other people might be using older versions of Acrobat which might not act in quite the same way or have the same appearance (e.g. where I mention the 4th icon from the right in the Acrobat toolbar).

 

Forrest Anderson

 

http://www.military-researcher.com/

Edinburgh

Scotland

 

Email him at

Forrest Anderson forrest@military-researcher.com