Saturday, July 9, 2011

Let's consider the regions

In the last post I tried to make a start on ordering our data and representing it graphically. This helps define some jobs we have.

First, we need definitions of regions which are big enough to give statistically meaningful sample sizes. Second, we need to define haplogroups in the smallest possible clades that are still clear and meaningful. In this post I will just discuss the first challenge.

The obvious aim of the project is eventually to have a big data set for every traditional county in Britain and Ireland. But at this time, at least when using only our own data, we have very little for some counties.

Therefore, I have adapted some fairly standard ways of defining regions, joining together counties until our data is rich enough. Unfortunately, for now this means Wales is one region, and Ireland and Scotland are defined reasonably broadly. It will be interesting, of course, to compare our results in these regions to those of the Wales DNA Project, the Ireland Y DNA project, and the Scottish DNA project. Eventually maybe we can work with them and other projects to develop bigger and better descriptions of the genetic diversity of Britain and Ireland.
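
To illustrate the idea of joining counties together until the data is rich enough, here is a minimal Python sketch. It is only an illustration: the sample counts, the threshold of 20, and the function name `merge_counties` are all invented for the example, and the ordering of the list is a crude stand-in for "these counties are neighbours". The regions in this post were drawn by hand around standard region definitions, not generated this way.

```python
from typing import Dict, List


def merge_counties(ordered_counties: List[str],
                   counts: Dict[str, int],
                   min_samples: int) -> List[List[str]]:
    """Greedily group neighbouring counties until each group has
    at least `min_samples` samples (hypothetical helper)."""
    regions: List[List[str]] = []
    current: List[str] = []
    total = 0
    for county in ordered_counties:
        current.append(county)
        total += counts.get(county, 0)
        if total >= min_samples:
            # This group is now big enough, so close it off.
            regions.append(current)
            current, total = [], 0
    if current:
        # Leftover counties that never reached the threshold are
        # folded into the last finished region, if there is one.
        if regions:
            regions[-1].extend(current)
        else:
            regions.append(current)
    return regions


# Made-up sample counts, purely for illustration.
example_counts = {
    "Galway": 12, "Mayo": 9, "Roscommon": 4,
    "Sligo": 6, "Leitrim": 3, "Donegal": 15,
}
example_order = ["Galway", "Mayo", "Roscommon", "Sligo", "Leitrim", "Donegal"]
example_regions = merge_counties(example_order, example_counts, min_samples=20)
print(example_regions)
```

With these invented numbers Galway and Mayo together pass the threshold, while the four smaller counts have to be pooled to form a second group. A real version would need proper adjacency information and some historical judgement about which counties belong together.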

I've used the so-called "NUTS 3" regions of the Republic of Ireland,

...but the only one I did not merge with several others was the Border Region, which surrounds modern Northern Ireland (itself of course part of the United Kingdom). Also, I re-united all of Tipperary, which today is split into North and South parts. Both parts are in my "Western Ireland".

For Scotland, the best data we have is for the counties south of the narrow part between the Firth of Forth (near Edinburgh) and the mouth of the River Clyde (near Glasgow), the area which includes the bulk of the modern population. I was able to split this into two neat and traditionally meaningful parts. On the east is the area between Edinburgh and England which was settled early by Northumbrian Anglo-Saxons. In modern parlance, this is called the region of Lothian and the Borders. On the west, from Glasgow down to England, was a Welsh-speaking kingdom. In modern terms this region is called Strathclyde (the northern part) and "Dumfries and Galloway".

For some modern Scottish region names see Wikipedia.

For the northern part of Scotland I could only get big data sets by uniting most of the Highlands into one region. Perthshire, and the "lowland" shires to its south and east, I have been able to separate out.

For England I've used modern definitions of regions,
...except that I have taken the opportunity, given our data, of inventing one new region which I call "East and North of London". This includes the Thames Valley counties of the South East England Region, and the inland part of the East of England region. This neatly allows us to split out the coastal region of East Anglia, which is of course interesting for anyone interested in trying to find signs of ancient movements of people.

There is also a good East Anglian DNA project we can compare to. I invite help and comments concerning how this project's data compares to those of other projects.

For now I have not attempted to develop anything with the various remote islands. We have some data but not much yet. (The Isle of Wight is, however, part of the region south of London. It is very close to the mainland.)

So here is a map:-


  1. I think historically Gloucestershire belongs with Herefordshire and Worcestershire rather than the south west.


  2. It would make sense to divide Ireland into its historic provinces. This would include the nine-county Ulster. Both Donegal and Cavan were part of the "Ulster plantation". Likewise it makes more sense to include Waterford within a "Munster" grouping, given that it is part of that province.

    The provinces are:
    Connacht: Galway, Mayo, Roscommon, Sligo, Leitrim

    Munster: Clare, Limerick, Kerry, Cork, Tipperary, Waterford

    Ulster: Donegal, Cavan, Monaghan, Fermanagh, Tyrone, Derry, Antrim, Down, Armagh

    Leinster: Dublin, Meath, Louth, Longford, Westmeath, Offaly, Laois, Kildare, Carlow, Kilkenny, Wexford, Wicklow

    Of course it would probably be a good idea to remove Dublin from Leinster. Given the massive growth of Dublin in the last 90 years, it's basically made up of people from all over the island (capital city syndrome), so it could skew the Leinster results.

  3. Thank you for that feedback!

    1. Gareth, I have also been thinking about this. In the posts previous to this one you can see I was playing with the idea of having an "East of the SW" region, but it was a bit small, and arbitrary. Simply slipping Gloucestershire into the East Midlands might be a better compromise.

    2. Paul, I had also been thinking of the old provinces. I guess one reason I have avoided it so far is that it won't show a split between the Republic and UK sections of Ulster, which may be of interest. I will look. I also want to make sure each region has enough data.

  4. I am only interested in the maternal U5 haplomap. I mean, it's of course interesting to see the emerging map of male haplotypes. But it's my U5b2b2 haplotype that came from the UK (probably England). The earliest HVR1+HVR2 match is in early Massachusetts. Another tree with only an HVR1 match also goes back there, although that one continues on back to Norfolk, England. As for my earliest known female, she was born in North Carolina (or VA?). Oh well, don't pay attention to me.