
    Places you won't find in any dictionary

    January 12th, 2010

    Tobar an Dualchais is an amazing archive of Gaelic and Scots speech and song samples. Under the hood, each of their records is annotated with places – the names of the village, or island, or parish, where the speaker came from.

    We’ve been trying to Unlock their placename data, so the names can be given map coordinates, and the recordings searched by location. Also, I wanted to see how much difference it would make if the Ordnance Survey 50K gazetteer were open licensed, thus enabling us to use it for this (non-research) project.

    Out of 1628 placenames, we found 851 exact matches in the 50K gazetteer and 1031 in the geonames.org gazetteer. Just 90 placenames were in the 50K but not in geonames. There’s a group of 296 placenames that we couldn’t find in any of our gazetteer data sources. Note that this is an unusual sample, focused on remote and infrequently surveyed places in the Highlands and Islands, but I had hoped for more from the 50K coverage.
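
A minimal sketch of how a comparison like this could be tallied, assuming each gazetteer has already been reduced to a plain set of names and that matching is exact. The function names, the source labels and the toy data are all made up for illustration; this is not the code behind the figures above.

```python
def coverage_report(placenames, gazetteers):
    """Count exact matches per gazetteer and list names found in none.

    placenames: iterable of names taken from the archive records.
    gazetteers: dict mapping a source label to a set of known names.
    """
    names = set(placenames)
    # Exact-match hits for each source, kept separately.
    matched = {label: names & known for label, known in gazetteers.items()}
    found_anywhere = set().union(*matched.values())
    return {
        "total": len(names),
        "matched": {label: len(hits) for label, hits in matched.items()},
        "unmatched": sorted(names - found_anywhere),
    }


def only_in(placenames, gazetteers, label):
    """Names matched in gazetteers[label] but in no other source."""
    names = set(placenames)
    others = set()
    for other_label, known in gazetteers.items():
        if other_label != label:
            others |= known
    return sorted((names & gazetteers[label]) - others)


# Toy data only: these sets do not reflect the real gazetteer contents.
toy_names = ["Stornoway", "Portree", "Terregles", "Clanyard"]
toy_gazetteers = {
    "os50k": {"Stornoway", "Portree", "Terregles"},
    "geonames": {"Stornoway", "Portree"},
}

report = coverage_report(toy_names, toy_gazetteers)
print(report["matched"])
print(report["unmatched"])
print(only_in(toy_names, toy_gazetteers, "os50k"))
```

Keeping the per-source hits as sets rather than bare counts makes it easy to ask follow-up questions, such as which names appear in one source only.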

    There are quite a few fun reasons why there are so many placenames that you won’t find in any dictionary:

    • Places that are historic don’t appear in our contemporary OS sources. Many administrative areas in Scotland changed in 1974, and current OS data does not have the old names or boundaries. Geonames has some locations for historic places (e.g. approximate centroids for the old counties) though without time ranges.
    • Typographical errors in data entry. E.g. “Stornooway” and “Stornaway” – using the gazetteer web service at the content creation stage would help with this.
    • Listings for places that are too small to be in a mid-scale gazetteer. For example, TAD data includes placenames for buildings belonging to clubs and societies where Gaelic sound recordings were made. Likely enough, some small settlements have escaped the notice of surveyors for OS and contributors to geonames.
    • Some places exist socially but not administratively. For example, our MasterMap gazetteer has records for a “Clanyard Bay”, “Clanyard House”, “Clanyard Mill” but not Clanyard itself. The Gazetteer for Scotland describes Clanyard as “a locality, made up of settlements” – High, Low and Middle Clanyards.
    • Geonames has local variant spellings as alternative names, and these show up in our gazetteer search, returning the more “authoritative” name.
    • Limitations in automated search for descriptions of names. For example, some placenames look like “Terregles (DFS) see also Kirkcudbrightshire”. I’m hoping the new work on fulltext search will help to address this – but there will always need to be a human confirmation stage, and fixes to the original records.
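
Two of the cases above, misspellings and annotated names, lend themselves to a small illustration. The sketch below is hypothetical: it uses Python's standard difflib to suggest near matches from a local list of known names (standing in for a gazetteer lookup), and a regular expression that assumes annotated records follow the "Name (CODE) see also Other" shape of the Terregles example. The helper names and the 0.8 cutoff are assumptions, and any suggestion would still need a human to confirm it.

```python
import difflib
import re

# Assumed shape: a name, an optional "(CODE)", an optional "see also X".
ANNOTATED = re.compile(
    r"(?P<name>.+?)\s*(?:\((?P<code>[^)]+)\))?\s*(?:see also\s+(?P<see_also>.+))?",
    re.IGNORECASE,
)


def split_annotated_name(raw):
    """Split a free-text record into name, bracketed code and cross-reference."""
    text = raw.strip()
    match = ANNOTATED.fullmatch(text)
    if match is None:
        # Leave anything unexpected untouched for manual review.
        return {"name": text, "code": None, "see_also": None}
    return match.groupdict()


def suggest_corrections(name, known_names, cutoff=0.8):
    """Return up to three known names that closely resemble `name`."""
    return difflib.get_close_matches(name, known_names, n=3, cutoff=cutoff)


# Toy list standing in for names returned by a gazetteer.
known = ["Stornoway", "Portree", "Terregles"]

print(suggest_corrections("Stornooway", known))
print(suggest_corrections("Stornaway", known))
print(split_annotated_name("Terregles (DFS) see also Kirkcudbrightshire"))
```

The cutoff trades missed typos against false suggestions; short names in particular can resemble each other closely, which is one more reason to keep the confirmation step.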

    It’s been invaluable to have a big set of known-to-be-placenames contributed in free-text fields by people who aren’t geographers. I would like to do more of this.

    I saw a beautiful transcript of an Ordnance Survey Object Name Book on a visit to RCAHMS. Apparently many of the English and Welsh ones were destroyed in the war, but the Scottish ones survived. But that is a story for another time.