Genes, Romano-British history and bullshit: analysis of a dispute

In July/August 2017 there was a bit of ‘disagreement’ on the social media platform Twitter, with some right-wing users attacking the BBC over an educational cartoon about Roman Britain, because some of the characters were shown with darker skin than others. Sides were taken in articles and postings by experts and others on the question of ethnic diversity in Roman Britain. The same issue has popped up again on occasion since.

Mary Beard, the classical historian, emphasised what is known about the diversity of the Roman presence in Britain, and came in for considerable abuse from the far right. There was an intervention on the other side from Nassim Nicholas Taleb, the statistician, who made various assertions based on genetics. It’s not my purpose to discuss these arguments here: it does seem that both sides talked past each other to some extent, especially with regard to the meaning of ‘diversity’. You can find more about this here and here and here (and other articles).

There is just one particular point I am interested in here, and it arises from an intervention on Twitter from Taleb, which I responded to at the time.

@wmarybeard: this is indeed pretty accurate, there’s plenty of firm evidence for ethnic diversity in Roman Britain

@nntaleb: Historians believe their own BS. Where did the subsaharan genes evaporate? NorthAfricans were lightskinned.
Only “Aethiopians”, even then

@nntaleb: We have a clear idea of genetic distributions hence backward composition; genes better statisticians than historian hearsay bullshit

Trying to leave aside the left-right political positions and the racist motivations involved in the wider discussion, this particular argument seems to reflect an idea that supposed ‘hard science’ (in this case, statistical genetics) trumps ‘soft’ (history/archaeology). I do not want to get into discussion of ‘hardness’ and ‘softness’ here, but I shall try to analyse Taleb’s specific argument about genetic distributions.

There are three sets of data in this problem. One is the set of data on the genes of the living population of interest, and another is the set of genetic data on the historic (or prehistoric) population that the living population is being compared with. There is also another set of data: the written records, archaeological finds and all other items that comprise what Taleb dismisses as ‘hearsay bullshit’.

All of these sets have their own specific technical problems. The historical data certainly have problems of interpretation amongst themselves: dating of objects and authorship or precise meaning of documents, for example, may be uncertain.

But collecting genetic data also has considerable technical problems: for example, obtaining reliable data from living populations is much easier than it is from fossils or human remains, as there are typically fewer remains than there are available living humans. Also the DNA from human remains may have degraded over time or been contaminated with other DNA, such as from microbes, and needs careful separation.

There is an additional problem with the data from both living and past populations: we have to be reasonably sure that the sets we have are representative of the populations they are taken from, since in neither case can we analyse the DNA of every individual. The genome of an individual is ‘data’ or an ‘anecdote’ to the same extent as a single archaeological find or written record is. It doesn’t tell us anything about the population until it is put into context with all the other similar data. This is a particular case where statistics is used: it is a mathematical tool to help us make a (probabilistic) estimate of the genetic composition of the population that our data sets come from. This is not an automatic process: it involves making assumptions and using the right statistical method, so it has its own issues and uncertainties.
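
To make the sampling point concrete, here is a minimal sketch using only textbook probability. The 5% frequency and the sample sizes of 20 and 2000 are made-up numbers chosen for illustration, not real genetic data, and the function names are hypothetical. It assumes individuals are sampled independently, which real remains from one burial site often are not.

```python
import math

def prob_variant_unseen(true_freq, sample_size):
    """Chance that a variant carried by `true_freq` of the population
    appears in none of `sample_size` independently sampled individuals."""
    return (1.0 - true_freq) ** sample_size

def standard_error(true_freq, sample_size):
    """Standard error of the sample proportion (binomial model)."""
    return math.sqrt(true_freq * (1.0 - true_freq) / sample_size)

# Hypothetical numbers: a variant carried by 5% of a population,
# sampled from a handful of ancient remains versus a large modern survey.
ASSUMED_FREQ = 0.05
for label, n in (("ancient remains", 20), ("modern survey", 2000)):
    missed = prob_variant_unseen(ASSUMED_FREQ, n)
    se = standard_error(ASSUMED_FREQ, n)
    print(f"{label}: n={n}, P(variant not seen)={missed:.3f}, std. error={se:.4f}")
```

Under these assumptions, a sample of 20 misses the variant altogether roughly a third of the time, so its absence from a small set of remains says little about its absence from the population.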

We can think of the problem we are trying to solve here as a theory: how can we explain the genetic makeup of the modern population in terms of the genetic makeup of the past population at the period of history that is of interest? Looking at it this way, we can see that there is no ‘backward composition’ we can automatically use to derive one from the other, whether it uses statistics or any other mathematical techniques. The genetics of the modern population must depend on its history, as well as on the scientific principles of genetics. We need to know what happened to the population in the time that elapsed between past and present. Did certain groups migrate in or out of the population? Was there mixing of different populations? Was the population subjected to ethnic cleansing or genocide? These questions are very hard to settle. For example, to what extent did invading Anglo-Saxons displace the British then settled in what is now England? Historians have good reason to believe that the Anglo-Saxons formed a new ruling class, and that some of the existing British were displaced by migration, but there is still dispute as to what proportions of the original population were killed or displaced.

So clearly we can’t ignore history here. The current genetic composition of a population is a result of both genetics and history. That history is part of the problem to be solved. Some of it will be attested by artefacts and documentation, some of it is purely hypothetical (and might be solved with assistance from the genetics). The point here is that the historical evidence cannot be dismissed: any theory, including reconstructing ancient populations, must reconcile all the relevant evidence, or it is simply inadequate. If it is contradicted by the evidence (and the evidence is not found to be defective), then the theory is false, and needs modification or replacement. The evidence here includes (at least) the available genetic data and the historical evidence, including written records and archaeological finds.

Note that I am not making claims about the actual history of Roman Britain here.  I don’t have the necessary expertise in the technical fields. I am simply trying to analyse the problem, to see why genetics is not in itself adequate to the problem, and the history cannot be dismissed. Solving the problem necessarily needs the technical expertise of the geneticists, historians and any others with relevant knowledge.

To see why the history is essential, consider the following scenarios (not an exhaustive list) that might happen to a population of interest:

  1. The population remains isolated from any other.
  2. A small number of immigrants arrives, they seize power and become the ruling class, and eventually merge with the main population through interbreeding.
  3. A large number of immigrants arrives and merges with the main population through interbreeding.
  4. A foreign power occupies the country, using troops from other populations, but remains largely separate from the main population. In the end it (largely) leaves.

Each of these scenarios will leave different genetic traces and different historical and archaeological data. Given the inadequacy of the evidence, the problem will probably never be finally settled, as there will always be anomalies, gaps in the data and unsolved questions.
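
The scenarios above can be turned into a toy calculation. This is a deliberately crude sketch and not a real population-genetics model: it tracks a single number, the share of ‘incomer’ ancestry, through a list of events, and every figure in it is invented for illustration. The function `apply_history` and the scenario labels are hypothetical names.

```python
def apply_history(events, start=0.0):
    """Track the share of 'incomer' ancestry through a sequence of events.

    Each event is (fraction_replaced, newcomer_share): the fraction of the
    population replaced by newcomers, and the share of incomer ancestry
    those newcomers carry. Interbreeding is assumed to mix ancestry evenly.
    """
    share = start
    for fraction_replaced, newcomer_share in events:
        share = share * (1.0 - fraction_replaced) + fraction_replaced * newcomer_share
    return share

# Invented histories, loosely following the numbered scenarios in the text.
scenarios = {
    "1. isolated population": [],
    "3. one influx of 10% that merges": [(0.10, 1.0)],
    "3 then displacement: 30% influx, later two-thirds replaced by others":
        [(0.30, 1.0), (2 / 3, 0.0)],
    "4. garrison that mostly leaves, 1% stays and merges": [(0.01, 1.0)],
}

for label, events in scenarios.items():
    print(f"{label}: modern incomer ancestry = {apply_history(events):.3f}")
```

In this sketch the second and third histories end at the same modern figure, so the modern number alone cannot say which of them happened. The garrison history leaves only a faint genetic trace even though the occupation itself would be conspicuous in the written and archaeological record.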


Statistics is a branch of mathematics. Strictly speaking, it is fit only for analysing distributions of pure numbers. Whenever statistics is used as a mathematical tool in solving questions about the real world, other restrictions apply. We are then not dealing with pure numbers but with physical entities (and the concepts we use to understand those entities). In the particular problem above, these entities include people and their genes. In the physical world, these entities are subject to other principles, including the human lifecycle and the physical processes of genetic combination.

Sadly, statisticians, however good they are at statistics, can lose sight of this fact, and claim authority for statistics that is simply not justified. A very good example of this is in the supposed debate over human-caused global warming, where statisticians have weighed in on the ‘sceptic’ side with statistical analyses that simply ignore the laws of physics. One statistician has called this ‘mathturbation’. Climate measurements are not simply numbers, they are properties (such as temperature or carbon dioxide concentration) of physical entities such as air or oceanic water, and our theories about them are part of physics.

In trying to understand things that happen in the real world, outside mathematical textbooks, you can’t ignore the technical experts and their knowledge: statistics may prove them wrong, but only when correctly applied to the subject data.

Nassim Nicholas Taleb

The involvement of Taleb in this debate was very strange. He seemed to glory in taking the part of extreme right-wing participants in (sometimes vicious) attacks on historians on Twitter. He derided Mary Beard’s academic credentials, and he frequently calls experts in other fields ‘bullshitters’. Yet his claim about genes and statistics reproduced above is itself bullshit, where bullshit means not actually lying, but giving the impression of having knowledge one doesn’t actually have.

Taleb’s background is as a statistician and a trader on financial markets. Financial markets are about the nearest thing in the real world to pure numbers, and mistakenly thinking that climate data are in some way similar to markets has led many people into error. Perhaps this confusion applies to other areas as well.

Taleb also seems to have a very thin skin, and has blocked me, and apparently many others who disagreed with him, on Twitter.

Alfred Wegener and continental drift: Crackpot or heretic?

It is not uncommon for writers who wish to disparage science to refer to Alfred Wegener and his theory of ‘continental drift’. People laughed at him, they say, and ridiculed his ideas, and they are not laughing now. The establishment saw continental drift as a crackpot theory, or a threat to some existing theory. He was a heretic against the scientific establishment, and did not live to see his ideas triumph. Here is another example of how science is only, or only a little better than, a set of opinions of scientists that can be overthrown at any time. What is called ‘science’ is just the opinion of the majority of scientists in a field, and a plucky loner (Wegener was not a geologist) may eventually overthrow the established opinion and receive the credit he deserves.

This view of science is particularly comforting to religious extremists, postmodernist philosophers and science-deniers of all stripes (climate change, AIDS, vaccines, and so on).

But it’s false (and could not be true in its full-blown postmodernist form in any case, because if it was true, why should the heretic be any more right than the established views?)

Quite recently, Matt Ridley, who used to be an admired science writer in his own field of biology, invoked Wegener (amongst others) as an example of a heretic who was persecuted by scientists but eventually triumphed. This is by way of lauding a ‘sceptic’ who Ridley thinks (without presenting any evidence) will one day show that humans are not causing global warming.

Perhaps I’ll come back to Ridley later. For this occasion I want to comment on Wegener, and I’ll start by stating some facts.

  • Wegener’s theory was taken seriously by geologists, even though they were rightly sceptical.
  • Wegener was not a heretic, because he had nothing to be heretical against.
  • Wegener was not the father of plate tectonics, which is not the same thing as ‘continental drift’.
  • Science was working pretty much as it should in his case. (And I dare say this was probably the case for most of the ‘heretics’ Ridley mentions.)

Let’s consider the state of geology in Wegener’s day, around 1915-30. The world had been mostly mapped, and many geological structures around the world, particularly those of potential economic value, had been mapped too. The main geological periods had been identified, many rock strata had been placed in their correct order and some absolute dates had been obtained using radioisotopes, showing that the world was much older than previously thought, even though the dates were not as accurate as those we have now. The geological discoveries tied in with the paleontological (fossil) discoveries, which were explained by evolutionary theory.

But there were lots of puzzling observations of the earth that could not easily be explained. The apparent ‘fit’ of the outlines of some continents – and, particularly, the rock formations on each side – was just one of them, the one that engaged Wegener.  But there was much more.

  • Why were there mountains, if the earth is as old as was now known? It was known that the processes of erosion of rocks would remove mountain ranges in tens or hundreds of millions of years.
  • Why are the largest mountains in the huge Alpine-Himalayan and Andes-Rockies ranges? Why are these ranges made up of sediments – as identified by the fossils in them – that had apparently been deposited in submarine trenches called ‘geosynclines’ tens of kilometres deep? And where are the geosynclines of the present day, and if there are none, why not?
  • Why are there volcanoes and earthquakes, and why are they located where they are?
  • Why do some rocks show glaciation in the tropics and others show tropical life in the polar regions? Did the rocks move, or did the climatic zones?

And so on and on. It’s important to remember that a lot of details we now know were not available then and did not become available till the 1950s and 1960s. One particularly important clue that was missing was that the ocean floors are very much younger than most of the continental rocks, less than about 200 million years old, and were formed by spreading from ridges of volcanic activity, such as the Atlantic mid-ocean ridge. Another important detail that needed to be understood is the structural relationship between the continental shields, the oceans and the mantle beneath  them.

There were actually many theories for some of the phenomena, but nothing that explained it all. In such a case, scientists are right to be sceptical. We tend to consider theories better if they bring together lots of isolated observations into one consistent body of explanation, as evolution does in biology and quantum theory does in physics and chemistry. There was no such thing proposed at that time for geology (evolution dates of course from Darwin’s time and the foundations of quantum mechanics were mostly laid in the 1910s/20s).

Some years ago I was on one of many field trips in the mountains of northern Oman. These offer some of the most spectacular (and visible, given the desert climate) geological displays in the world. Vast sheets of rocks, many of them from the bed of a long-gone ocean, and some of them from deep in the volcanic ocean crust, have been thrust far inland over an older land surface, in some places rucking up  the older rock into mountains thousands of metres high, like a vast carpet on a slippery floor. One of the other participants, an FRS in geology, commented that until the coming of plate tectonic theory the only available explanation for this, and all the rest of geology, was magic.

One thing to remember is that a theory must explain those observations that are, on the face of it, inconsistent with the theory. For example, ‘continental drift’ explains why some facing shorelines approximately fit (for example, eastern South America and south-western Africa) but what about those many shorelines that don’t fit?

Another thing that was missing was a mechanism for continental drift. To accept causal relationships, scientists want to know the exact mechanisms by which one thing causes or relates to another. In Wegener’s own field of meteorology, the underlying physical mechanisms of the weather were already known. In the early 20th century, lots of fundamental work was going on into how chemical reactions occur (their mechanisms). Nowadays, there are scientists studying the mechanisms of genetics and how organisms develop. Wegener had nothing to offer on these lines regarding how continental drift occurred.

Crucially, there was at least one other theory that seemed to explain the observations and it was probably more acceptable at first than continental drift, although it faded as more evidence came in. That is, that the continents were originally connected, but that the land between them had foundered beneath the sea – perhaps more plausible, in the absence of relevant evidence, than moving continents!  This other theory eventually was disproved by the finding that the ocean floors are of very different material from the continental shelves.

There is a book reviewing the state of earth science around the time of Wegener’s death (J A Steers, The Unstable Earth, 3rd ed 1942, originally published 1932) which devotes many pages to discussing continental drift and the evidence relating to it. Clearly the theory had been taken very seriously, but as it was incomplete and had at least one rival theory, geologists were right to be sceptical. In fact (as Karl Popper explained) it is right and proper to be as critical as possible of any theory, as it is the one that survives criticism best that eventually prevails. No doubt Wegener experienced personal remarks and academic bitchiness, but he probably didn’t receive much worse than other proponents of conjectural theories receive. (Incidentally, the objections to Semmelweis – another of Ridley’s ‘heretics’ – probably had much to do with his attitude and behaviour towards other physicians.)

And what theories the Steers book contains! There are many, covering different aspects of geology, and some of them seem pretty strange to us now. For example, there was a theory that the earth had a tendency to collapse into a tetrahedral form, at the same time creating the force that pushed up mountains. This was based on the observation that the main continental shields of the earth form approximately the corners of a tetrahedron, which we now know (whether it is true or not) is no more than a coincidence and a red herring. Much of the theorising in the book  is based on the suggestion that the earth is contracting through cooling.

The book makes it clear that at that time the evidence was stacking up in favour of continental drift and the theory of land bridges was losing favour. Wegener’s theory is given at least the same prominence as that of a leading expert on earthquakes, Harold Jeffreys, who proposed that the earth was undergoing thermal contraction.

The mechanism of what would later be called plate tectonics (attributed to Arthur Holmes) is also discussed in the book in rudimentary form – the idea that continental plates are mobile on the mantle beneath them.

Before plate tectonics, geology was a mass of unexplained and puzzling phenomena and various theories were widely debated. It was plate tectonics that brought them all together in one wide-ranging and unifying and satisfactory explanation. This happened in the 1950s and 1960s. I won’t go into it here as there are plenty of places where you can read about it. But plate tectonics is much, much more than just ‘continental drift’. Wegener made a contribution to our later understanding – which he did not live to see as he died on an expedition to Greenland studying Arctic weather (his own research field). But there is no reason to regard him either as a heretic or a victim of unreasonable doubt.

Drama in the Karakoram

[BPSDB] A drama, largely unnoticed in the rest of the world, has been going on in the Hunza valley, in the beautiful and terrible mountains of northern Pakistan. In January, a large landslide killed about 20 people, cut off the Karakoram Highway between Pakistan and China, and blocked the Hunza river, creating a large lake which has been steadily growing in size.

Pakistan Army engineers have been trying to create a spillway to drain the lake, but overtopping of the natural dam seems imminent. There is a danger that the dam may give way, creating a flood that will threaten tens of thousands of people living downstream. The situation is made more dangerous by further rock falls and the summer melting of the snow on the mountains.

A number of bloggers have been covering events, including Dave Petley of Durham University: here and here.

The biggest control knob: Carbon dioxide in earth’s climate history

[BPSDB] By Richard B Alley of Penn State University.

This was the keynote lecture at the American Geophysical Union meeting (a vast conspiracy of scientists to find out all they can about how the Earth works) last year.

It’s a good summary of what we know about the role of CO2 in the Earth’s climate, a proof that climate scientists really do take into account all the climate changes that have happened over the Earth’s history, and why that knowledge is still bad news for us as we belch huge amounts of buried carbon into the atmosphere.

Countering disinformation on climate

[BPSDB] In the wake of the latest outrageously dishonest headlines misquoting Phil Jones, the excellent  Open Mind blog presented a good account of the error, and also initiated a civilised and productive discussion on how to present the facts to the general public. I urge you to read it. I hope to post some thoughts of my own on this in the near future.

Why should I make the data available to you

[BPSDB] In many comments on the CRU hack I’ve seen it alleged that Professor Phil Jones of the University of East Anglia Climate Research Unit denied his data to another researcher with the words, “Why should I make the data available to you, when your aim is to try and find something wrong with it?”

Whenever I’ve seen it quoted, it’s implied that  Jones made the comment in one of the emails. I finally got around to looking for it, in the file I downloaded soon after the hack was made public – and it ain’t there. Not perhaps surprising, as it really doesn’t sound like the sort of thing an academic would say, except jokingly or sarcastically.

The words do appear in the emails, but only at second hand. In August 2007, a fellow researcher warns Jones that the words are being attributed to him by someone else. In October 2009, another colleague sends Jones a copy of the text of an article in the National Review of 23 September 2009. In this, Patrick Michaels quotes Warwick Hughes as alleging that Phil Jones said, “We have 25 years or so invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it?”

So there doesn’t appear to be any evidence that Jones actually wrote it. Given the unreliability and political commitment of all the links in this chain, and that this was one of the main pieces of evidence for the supposed ‘conspiracy’, I think there is even less evidence of wrong-doing.

Conspiracy theories have a tendency to spawn new conspiracies: here’s a climate ‘sceptic’ who thinks the CRU staff may have leaked the emails themselves to make ‘sceptics’ look stupid. If so, they’ve succeeded.