I once knew a guy named Anatoly who had a bicycle… does that count?

This image search result may technically be for a bicycle itself named Anatoly or something, but what am I gonna do? Not use it? Don’t think so.

At one point in this blog, I talked about how historical linguists determine where and when the Indo-European languages were likely to have begun. Such methods usually put the Indo-European homeland somewhere in the Black Sea steppes around 3,500 B.C. What I didn’t know until just about now, however, is that this turns out not to be the end of the story. In a recent paper in Science (Bouckaert et al., 2012), a group of researchers attempts to use Bayesian phylogeographic methods to find an Indo-European homeland.

What are Bayesian phylogeographic methods, you may ask? As far as I understand, the approach starts out with the cladistic methods used to infer phylogenies (i.e. family trees of biological relationships) from a list of (usually genetic) traits, and then puts that together with locational information to chart geographic, as well as temporal, changes. Beyond that, what I know is only that this method is often used to chart disease spread: phylogeographic analysis of samples of a virus coming from a specific outbreak, for example, can help determine the time and location of the outbreak. And now, this technology is being brought to bear on the outbreak of Indo-European.

This would be all great and good, except that the homeland posited by this research is very much not the Black Sea steppes 5,500 years ago, but Anatolia, 9,000 years ago. By their model, Indo-European developed roughly contemporaneously with agriculture, and spread with agriculture for thousands of years before the steppe nomads showed up in the picture. Here is the New York Times’ Nicholas Wade on the two theories.

Historical linguists by and large seem to be skeptical of the new findings, largely because the evidence for a proto-Indo-European origin for horse-words and words for other steppe technologies and phenomena is so strong. How could the agrarian Anatolians have been spreading the word for “wheel” via the same processes as other proto-Indo-European words if the wheel itself hadn’t been invented yet? There is additional skepticism because not that many people in the field of historical linguistics understand how phylogeography works, and as a result the researchers’ tools remain somewhat opaque.

An obvious critique is that language spread is less “tree-like” than genetic spread is. Languages in the same area tend to take on the same characteristics and even share the same vocabulary, despite not being closely related. Therefore, the choice of traits (realistically the word-set) used as input to the phylogeographic machinery becomes crucial. What did the researchers use, and how did they control for potential non-tree-like developments? Find out by visiting their site, which has a layman’s introduction, an animated map of their proposed evolution of Indo-European, a response to critics, and a link to the paper itself. They note that their method did correctly predict the origin of the Romance languages to be Rome about 2,000 years ago. On the other hand, Razib Khan notes that it pegs Romani as the outgroup for the modern Indo-Aryan languages. One mistaken branching does not invalidate a phylogenetic method, of course. But the specific fact that it is Romani, associated with a nomadic population and a language with many loanwords, does suggest the method may have specific limitations.
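To make the “list of traits in, family tree out” step a bit more concrete, here is a toy sketch in Python. To be loud about the assumptions: this is not the Bayesian phylogeographic method of the paper, and it ignores geography and dating entirely. It swaps in plain average-linkage clustering on the fraction of differing traits, and the four “languages” and their presence/absence data are made up for illustration.

```python
from itertools import combinations

# Made-up presence/absence data: 1 means the language has a word in that
# (hypothetical) cognate set. These are not real languages or real cognates.
TRAITS = {
    "lang_a": [1, 1, 1, 0, 0, 0, 1, 0],
    "lang_b": [1, 1, 1, 0, 0, 1, 1, 0],
    "lang_c": [0, 1, 0, 1, 1, 0, 0, 1],
    "lang_d": [0, 1, 0, 1, 1, 0, 1, 1],
}


def hamming(u, v):
    """Fraction of traits on which two languages differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def leaves(tree):
    """Flatten a nested-tuple tree into the list of its leaf names."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def average_linkage_tree(traits):
    """Repeatedly merge the two clusters with the smallest mean distance."""
    clusters = list(traits)  # start with one cluster per language

    def mean_distance(pair):
        left, right = pair
        dists = [hamming(traits[x], traits[y])
                 for x in leaves(left) for y in leaves(right)]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=mean_distance)
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append((left, right))
    return clusters[0]


tree = average_linkage_tree(TRAITS)
print(tree)  # nested tuples; each pair is one branching of the tree
```

Note that a sketch like this always returns a tree, whether or not the history behind the data was tree-like. If two unrelated neighbours borrow enough words from each other, they simply end up looking like close relatives, which is exactly the worry about the choice of word-set.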

In general, the bringing of biological methods “back” to the study of linguistics is a topic that is very interesting, but fraught with disagreement. Don’t assume that something published in Science on the topic is incontrovertible (or, conversely, junk). For example, about two years ago, Quentin Atkinson, who is one of the researchers on the above paper, put out a blockbuster paper in Science (Atkinson, 2011) positing that the reduction in phonemic diversity with distance from the African heartland is analogous to the reduction in genetic diversity. He therefore proposed that all languages evolved out of Africa, and that the reduction in phonemes is due to a series of founder effect-like losses in diversity. That paper generated a lot of press and even more controversy. What measures of “phonemic diversity” were used, and are they appropriate? (Mark Liberman noted that tonal diversity seems to be over-weighted, and as a result places where tonal languages cluster may have their “phonemic diversity” over-estimated.) Are the effects of neighbouring languages influencing each other being taken into account? And is there even any reason to think anything like the founder effect occurs for phonemes? As far as I can tell, the question is far from settled, but many of the criticisms of the Atkinson paper seem valid. Perhaps the only lame-ass conclusion that I can give that would be undisputed here is that the importation of genetics methods into linguistics is an exciting new phenomenon that holds a bunch of promise, and new developments are being eagerly awaited.
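For anyone who hasn’t met the founder effect before: the idea is that a new population started by a small subgroup has less diversity than the larger population it came from, and that a chain of such moves compounds the loss. Here is a minimal Python sketch of that genetic-style mechanism. The function name and every parameter value (number of variants, population size, number of founders, number of moves) are arbitrary choices for illustration, not anything taken from the Atkinson paper.

```python
import random


def serial_founder_diversity(n_variants=50, pop_size=1000, founders=20,
                             n_moves=8, seed=0):
    """Track how many distinct variants survive a chain of founder events.

    All parameter values are arbitrary choices for illustration.
    """
    rng = random.Random(seed)
    # Source population: each individual carries one of n_variants variants.
    population = [rng.randrange(n_variants) for _ in range(pop_size)]
    diversity = [len(set(population))]
    for _ in range(n_moves):
        # A small subgroup leaves to settle somewhere new...
        founding_group = rng.sample(population, founders)
        # ...and the new population grows back from those founders only.
        population = [rng.choice(founding_group) for _ in range(pop_size)]
        diversity.append(len(set(population)))
    return diversity


counts = serial_founder_diversity()
print(counts)  # distinct variants remaining after each successive move
```

The count can only stay level or drop with each move, because a variant that the founders didn’t carry has no way of coming back. That works for genes because each person carries only a sliver of the population’s variation. Whether anything comparable holds for the sounds of a language is precisely what is in dispute, so the sketch shows the mechanism being borrowed, not that the borrowing is justified.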

This entry was posted in language, science.

3 Responses to I once knew a guy named Anatoly who had a bicycle… does that count?

  1. vj says:

    Interesting. But I’d be lying if I said I got more than 10-12% of that.

    • zolltan says:

      The paper or my post? ’cause if the second… I’m sorry dude. I didn’t mean to be confusing. Looking back, the last paragraph does use a bunch of jargon: the founder effect is the idea that a new population formed by a small subgroup moving to a new place has less diversity than the larger population it emerged from. So if some group is the result of many iterations of such moves (e.g. Polynesians), you would expect them to be less genetically diverse than another present-day group of the same size. In genetics, this is well-accepted. The question is, does this also happen to phonemes, i.e., roughly, the number of different sounds in a language? On one hand, Atkinson’s point is that it seems to. On the other, critics point out that the genetic founder effect has an obvious mechanism: the subgroup has less genetic diversity. But there’s no reasonable mechanism proposed for a founder effect for the number of sounds in a language. Or was something else confusing?

      Anyway, how’s Haifa?

  2. Anatoly says:

    That’s some authentic frontier gibberish.
