Last time we started to look at the way that Proto-Indo-European split up into ten different branches, and how some of those branches (including Germanic) may have continued to interact in what we might call a ‘European dialect area’ even as they were also going their own ways linguistically. So far, I’ve been talking about what’s often called the internal history of Indo-European, dealing with changes to the languages themselves (sound changes, analogies, coining new words, and other such things). Today, I want to start looking at the geography, and dates, of early Indo-European – its external history, which is basically looking at the speakers of languages in time and space.

A time machine would make this all a lot easier.

There are at least three major questions here, though they all kind of get tangled up with one another: where was Proto-Indo-European proper spoken, when was it spoken there, and why/how did the ‘explosion’ into various sub-branches happen? More generally, what did the early history of Indo-European look like on the ground? These are major questions that have been debated for a long time (pretty much as long as the idea of ‘Indo-European’ has been around), and they remain extremely controversial. I will only be able to give the barest summary here of what I think the really important points are. This post will mostly focus on the when part of the problem.

First of all, let’s take note of some basic dates we can take for granted. The earliest written evidence we have for any Indo-European language comes from Hittite names recorded by Assyrians. These include personal names from around 1900 BC, though there are some place names apparently in some form of Anatolian that are found around half a millennium before that. Records of two other Indo-European branches occur a few centuries later. We get direct records of Greek and Indic both from around 1400 BC, and some of the Vedic Sanskrit hymns, though only written down much later, probably date to a similar period (though a precise dating of these is difficult). So the Indo-European languages must have split up sometime before 2000 BC – and probably a good while before, to give enough time for each branch to have developed in its own way by the time we get records of it. How long would this kind of differentiation have taken? That’s a million-dollar question, but unfortunately languages don’t change at a fixed, strictly measurable rate. We’re probably talking on the order of maybe one to five thousand years, but that... doesn’t exactly pin things down very well.

A tablet from a horse-training treatise, by a certain Kikkuli.
The majority of the text is in Hittite, but a number of technical terms, such as pa-an-za-u̯a-ar-ta-an-na (panza-u̯artanna) 'five turnings', are clearly borrowed from Indic (the Sanskrit counterpart to this would be *pañca-vartana-).

Geographically, the earliest records of Indo-European languages are all from central, northern, or western Eurasia, so we tend to think Proto-Indo-European was probably in one of these regions. Even this isn’t an entirely sure thing, though. The Turkic languages, for instance, are often thought to have originated in or near Mongolia, which is no longer primarily Turkic speaking, and today Turkic languages are concentrated much more in Central and Western Asia. Still, it’s at least safe to rule out the Americas and Australia entirely, and it would seem unlikely that Proto-Indo-European came from sub-Saharan Africa or Southeast Asia. But even if we stick with ‘central or western Eurasia’, that’s still a huge expanse of territory, and we’d really like to narrow things down a lot more.

To get more precise than this, for either time or place, we need to turn to other kinds of evidence. One of the more useful tools in our box is ‘linguistic palaeontology’. The principle here is simple: if we can securely reconstruct a word for ‘dog’ for Proto-Indo-European (and we can: it’s *ḱwon-, from which hound, canine, and, believe it or not, cynic all ultimately come), then we would infer that the speakers of Proto-Indo-European belonged to a society that had domesticated dogs. This is a safe assumption anyway, since dogs have been widely domesticated for well over ten millennia, long before even the earliest suggested dates for Proto-Indo-European, but you get the idea of how linguistic palaeontology might work.

A modern day *ḱwon-.

Things get more interesting when we turn to vocabulary that has a more specific reference, either to some kind of animal or plant that lives in a certain area (which might help us anchor the language in space), or else to a bit of technology (which we might use to date the language to after the invention of said technology).

Unfortunately, very little useful information can be gotten from plants and animals, and for a good reason: usually the words we can reconstruct most confidently are those that are attested in lots of different branches, and where the meanings have remained nicely stable (or only shifting in ways we can clearly account for). But the only plant and animal names that are likely to be found without semantic oozing over such a large area are pretty general ones. So we can easily reconstruct words for things like wolves and bears (which are common across Eurasia), but it’s hard to reconstruct a word for lions (which are only found in the territories of a few ancient Indo-European languages). This might mean that Proto-Indo-European came from a lion-less place, but it also might mean that a bunch of branches of Indo-European lost their old word for ‘lion’ after leaving regions with lions. Sometimes we can make informed guesses, but flora and fauna can rarely be pinned down with nearly enough certainty to be overly helpful for hunting the Indo-Europeans. (There’s an additional complication that the current ranges of many life forms have changed in the past few millennia, so linguists need to make sure they’re talking to palaeobotanists and zoologists about these things.)

Ashurbanipal rejecting a reconstructed word for 'lion' in Proto-Indo-European (photograph, uncolourized).

Technology is a little more promising, and I’ll jump straight to the single word that’s attracted the most attention (and controversy): wheel. This word has lots of cognates throughout Indo-European: especially Greek kúklos (whence cycle), Sanskrit cakrá- (which you might know in its very extended religious-philosophical sense as chakra, particular points in the human body), and Tocharian A kukäl. These all go back to a Common Indo-European form *kʷekʷlós, which is supported by cognates from a wide geographical range of dialects.

Alongside wheels, we can similarly reconstruct Common Indo-European words for axle and a verbal root *weǵʰ- that (despite occasional objections) almost certainly had a Common Indo-European meaning of ‘convey in a wheeled vehicle’ – the first part of veh-icle in fact comes from this verb, as does that of wag-on, among many others in this large word-family. There’s also a second ‘wheel’ word, *rot-, which is reflected in languages from German to Latin – think rotate – to Sanskrit. All this suggests a Common Indo-European vocabulary for not just wheels in general (which could include things like pottery wheels), but specifically wheeled vehicles.

A wheeled vehicle.

Archaeologically, these words have attracted a good deal of attention, since wheels aren’t actually all that old in the grand scheme of things, and wheeled vehicles are apparently datable to around 4000-3500 BC (or so the archaeologists say). This makes it very likely that there was still some sort of ‘Common Indo-European’ after 4000 BC, when these words could acquire their reference to wheels and carts, so that the major breakup of Common Indo-European would have to fall within the window of 4000 BC to 2000 BC.

Some scholars, seeking an older date, have tried to dismiss the argument, pointing out that words like café are common in all Romance languages, but are obviously recent borrowings. But this kind of objection doesn’t work very well: in Romance, the coffee words stand out for being too similar to one another. They don’t show all the sound correspondences that a genuinely old word would, and this betrays them as being late borrowings. All of these wheel-and-cart words, by contrast, do show all the expected sound changes, and so the words themselves must be fairly old. If they were borrowings or ‘late’ coinages (‘late’ meaning after the stage of Proto-Indo-European proper), they would still have had to occur at a relatively early date, before the various branches of Indo-European had had much time to split up: they would have to belong to Common Indo-European.

A genuine Roman coffee.

Other scholars have accepted that the words themselves are old, but argue that the meanings are late. But this view either requires a lot of coincidence (the meaning of ‘axle’ would have had to arise, for the same word that originally meant something else, in at least five different branches of Indo-European), or else suggests that these early dialects were still in pretty close contact with one another so that the shift in meaning in one dialect could influence that in the other dialects. It’s hard to escape the conclusion that the wheel words were present at least in a ‘late Common Indo-European’ dialect continuum.

A more serious (and I think interesting) issue is that Hittite has an almost totally different vocabulary for wheeled vehicles. None of the words I’ve mentioned so far have any secure cognates in Hittite or any of the other languages in the Anatolian branch. In the previous post, I noted that Anatolian is often thought to be the first branch of Indo-European to start going its own way. A very natural suggestion (which has, in fact, been made by many scholars) is that Proto-Indo-European as such may not have known wheeled vehicles, but that this technology became a major part of speakers’ lives only after Anatolian was linguistically no longer interacting closely with the rest of Indo-European. But these remaining Common Indo-European dialects would not begin to diverge seriously until after their speakers had adopted wheel and cart technologies.

Drawing of a Hittite chariot (based on an Egyptian mural).
By the time the Hittites give us records, they were of course long familiar with wheels, and had a full vehicular vocabulary. It was just a different set of terms from what we find in the rest of Indo-European, as if both Anatolian and Common Indo-European had needed to separately devise a new wheel-based lexicon.

If we link that to the apparent archaeological dates for wheeled vehicles, we might infer that Anatolian split off between (using generously broad ranges) maybe 4500 and 3500 BC, while the rest of Common Indo-European stuck pretty closely together until sometime after the 4000-3500 BC range. This is not a perfectly secure model (what if Anatolian didn’t actually branch off first? what if there was a genuine Proto-Indo-European wheel vocabulary, which either Anatolian or the other dialects changed for some reason?), but it’s a reasonably plausible one, and I think the basic arguments are accepted by a majority of scholars now (though not by everyone, by any means!). At the least, the evidence of wheeled-vehicle words within Common Indo-European would seem to demonstrate pretty well that this dialect continuum had not yet significantly differentiated into the various sub-branches much before, say, 4000 BC at the earliest (since that’s the earliest date archaeologists seem to give for wheeled vehicles).

That might give us a rough time period – with uncertainties, and only in terms of ranges, but on the whole this is pretty good as far as arguments about prehistory go. But what about place? And just what was it that made Indo-European come to be spoken across such a large area, so that it could end up splitting into so many branches in so many different places? These are questions for the upcoming posts.


Further Reading

The literature on trying to locate the speakers of Proto-Indo-European in time and space is enormous. The whole debate is often called the ‘homeland’ question, though I’ve avoided using this term myself. It seems to me to give a wrong impression about what the debate is all about. It’s not as if the place where Proto-Indo-European happened to be spoken is particularly special. It’s entirely possible that Pre-Indo-European had only entered its ‘homeland’ a short time before it subsequently spread out further, and it’s a fairly popular idea that large chunks of Common Indo-European had already moved away from the ‘homeland’ before spreading out from this newer location. Asking about the ‘homeland’ also puts the focus on where, even though when is just as interesting a question (which is why I’ve started with that side of things here, and will only get to the where in the next couple of posts). But this is my particular bugbear, and if you read further on this topic you’ll find most people using the term ‘homeland’ pretty freely.

And if you do want to read further, there are a few excellent places to start. Jim Mallory’s book In Search of the Indo-Europeans has an entertaining overview of the history of the question going back to the nineteenth century. The importance of the wheeled vocabulary in dating (and partly in placing) Indo-European is stressed in David Anthony’s influential book The Horse, the Wheel, and Language, which is probably still the best book outlining the mainstream view of where and when Indo-European was spoken (which is, at least in broad terms, the view I’ll be taking in the next couple of posts).

Anthony also has a more recent paper, co-authored with Don Ringe, going over his ideas (including the wheel argument, which they elaborate in much more detail) much more succinctly, which is an excellent place to start if you want an overview of this line of thinking.

I should also say that some people would like to date Proto-Indo-European very much earlier. This forms an integral part of a hypothesis by Colin Renfrew, developed in his book Archaeology and Language: The Puzzle of Indo-European Origins, that the Indo-European languages originated in Anatolia (part of what’s now Turkey) and spread around with the Neolithic revolution, as the first farmers filtered slowly across Europe. For this to work, Proto-Indo-European would have to be dated to before 6000 BC. This approach runs into some major difficulties, not least of which is the wheel vocabulary: Renfrew has to suppose that the Indo-European dialects were still in contact with one another, and sufficiently mutually interpretable so that some 3000 years later a whole package of wheel terms could spread across a good deal of the family, with the words being adjusted in each language to look as if they’d come from Proto-Indo-European. Renfrew’s book is very interesting, and highly informative about early agriculture in Europe, but probably not a very compelling theory as far as Indo-European is concerned. Renfrew himself has at least partially walked back his theory in a recent lecture:

A different approach to dating Proto-Indo-European is something called glottochronology. This is the idea that languages change at broadly measurable rates – they don’t totally transform overnight, but they are also always changing at least a little bit. Specifically, the idea here is that vocabulary changes at a measurable rate, so that if you compare the amount of shared vocabulary in two related languages, you can at least roughly guess how long ago the two languages split apart. Traditional glottochronology was developed in the 1950s, and quickly faced a backlash on the grounds that, basically, it just doesn’t work (at least in Western scholarship; it retained much more respectability in Soviet academia, and to this day Russian scholars often give glottochronological dates a weight that can seem startling to European or American researchers).
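To give a feel for the arithmetic involved, here is a minimal sketch of the classic glottochronological calculation. It assumes the textbook formula t = ln(c) / (2 ln(r)), where c is the fraction of a basic word list still shared by two languages and r is the fraction of that list a single language is supposed to retain per thousand years. The function name is my own invention, and the default rate of 0.86 is just the figure commonly quoted for a 100-word list, used here purely for illustration.

```python
import math


def split_time_millennia(shared_fraction, retention_per_millennium=0.86):
    """Estimate how many millennia ago two languages split.

    Classic formula: t = ln(c) / (2 * ln(r)).
    The factor of 2 is there because both languages have been
    replacing vocabulary independently since the split.
    The default retention rate is an illustrative assumption.
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention_per_millennium < 1:
        raise ValueError("retention_per_millennium must be in (0, 1)")
    return math.log(shared_fraction) / (2 * math.log(retention_per_millennium))


# The same amount of shared vocabulary gives very different dates
# depending on which retention rate you assume (all rates hypothetical).
for rate in (0.80, 0.86, 0.90):
    years = 1000 * split_time_millennia(0.30, rate)
    print(f"retention {rate:.2f}: split roughly {years:.0f} years ago")
```

The loop at the end is the important part: since the result depends so heavily on an assumed rate, and real languages don't stick to one fixed rate, the output is only as trustworthy as that assumption.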

More recently, a series of studies have revived a form of glottochronology, using different mathematical models and claiming to better account for the range of variations in vocabulary replacements. Most of these construct trees of the Indo-European family, and then try to estimate the ages of various points on the tree, going back ultimately to an estimate for Proto-Indo-European. I’ve already mentioned several important studies along these lines in the previous post (not a comprehensive list): Gray & Atkinson 2003, Bouckaert et al. 2012, and Chang, Cathcart, Hall, & Garrett 2015. The first two conclude that Proto-Indo-European was probably spoken before 6000 BC (at least this is how the researchers characterize their findings – actually the 95% confidence interval includes a very wide range of dates, going as late as 3500 BC, which fits much more comfortably with the wheel word evidence), while the last averages considerably later (ranging from 5100-2800 BC). All the researchers involved accept the idea that older dates broadly support Renfrew’s farming hypothesis, while younger dates support Anthony’s approach, which I’ll discuss more in the upcoming posts.

There are at least two major problems with all studies of this sort. First, if the variability and ranges are taken seriously, and ranges rather than mean (average) dates are used, then the conclusions generally span too wide a time frame to be all that useful. Second, I suspect that all studies of this type come up with Proto-Indo-European as being too old: the spread of the Indo-European languages across such a wide area must have involved a great deal of linguistic upheaval and language contact, making it likely that the rates of vocabulary replacements were consistently on the faster end, compared to the rates calculated from the later histories of Indo-European languages. This reinforces the point that a mean date is liable to be misleading: if there’s anything to such studies at all, we should probably concentrate on the more recent ends of their proposed ranges.
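The point about ranges versus single dates can be made concrete with a small sketch. It takes the 5100-2800 BC range reported above for Chang et al. and the rough 4000-2000 BC window suggested by the wheel vocabulary, and checks where the two overlap. The `Span` type and `overlap` function are names I've made up for this illustration, and the second range in the usage lines is purely hypothetical, not a figure from any study.

```python
from typing import NamedTuple, Optional


class Span(NamedTuple):
    """A date range in years BC (a bigger number is further back in time)."""
    earliest_bc: int
    latest_bc: int


def overlap(a: Span, b: Span) -> Optional[Span]:
    """Return the stretch of time covered by both ranges, or None."""
    earliest = min(a.earliest_bc, b.earliest_bc)  # the later of the two starts
    latest = max(a.latest_bc, b.latest_bc)        # the earlier of the two ends
    return Span(earliest, latest) if earliest >= latest else None


# Rough window from the wheel argument in this post.
wheel_window = Span(4000, 2000)

# Range reported for Chang et al. 2015, as quoted above.
chang_range = Span(5100, 2800)

# A purely hypothetical range lying entirely before 6000 BC.
early_range = Span(7000, 6000)

print(overlap(chang_range, wheel_window))  # the shared part of the two ranges
print(overlap(early_range, wheel_window))  # no shared part at all
```

On these figures the shared stretch runs from 4000 to 2800 BC, which is the more recent end of the study's range, while a range that sits wholly before 6000 BC has nothing in common with the wheel window at all.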

If you want to read even more of my thoughts on all of this, I wrote up a response to some of the important research on all this published in 2015:

This is hardly it for a reasonable 'further reading' section, but I've already gone on long enough. I'll be mentioning more readings on placing Indo-European in the upcoming posts, many of which are also relevant to this post (and conversely, most of the stuff mentioned here has a bearing on that question too – as I said at the start, all of these questions get kind of tangled up with each other).

