Correlation Discovered Between Genetic History, Grammar in Northeast Asia

Grammar reflects genetic history better than other cultural data in Northeast Asia. | Manuel Bovio and Peter Ranacher

Scientists have identified a significant correlation between the genetic history of populations in northeast Asia and the grammatical structure of their languages. The correlation does not extend to a language's words (lexicon) or sounds (phonology), or to music, according to their new study in the August 20 issue of Science Advances. The findings suggest that shared grammar may point to the existence of ancient relationships between certain languages that pre-date existing language families.

The analysis, which includes 14 populations encompassing 11 language families in the region, found that grammatical similarity predicted genetic similarity between populations, while genetic similarity also predicted grammatical similarity.

"It was surprising to see that grammar, the hidden layer in language, but not the words or sounds, best reflect genes, the hidden layer in heredity," said Peter Ranacher, a postdoc in the department of geography at the University of Zurich in Switzerland and a co-first author of the study.

"I was surprised at the lack of relationship between genes and music, as it contradicted our previous findings from a smaller-scale study in Taiwan," added Patrick Savage, an associate professor in the faculty of environment and information studies at Keio University in Japan and a corresponding author of the study. "But that's how science works. Sometimes your predictions are right, sometimes they're wrong, and that's what makes it interesting."

Cultural Connections

Scientists across disciplines have long been fascinated by the possibility of relationships between human genetic history, language, culture, and even music. Back in the late 1950s, the ethnomusicologist Alan Lomax launched the ambitious and controversial Cantometrics project, which sought to relate elements of 1,800 songs from 148 populations around the world to social structure and cultural patterns.

In 1988, the Italian geneticist Luigi Luca Cavalli-Sforza attempted to test the relationships between genes and languages in populations around the world, concluding that considerable parallels may exist between the evolution of genetics and the evolution of language. In 2003, Jared Diamond, the American geographer, historian, and author of the popular science book Guns, Germs, and Steel, and the Australian archaeologist Peter Bellwood published an analysis of shifts in 15 language families during the rise of the first farming societies. They concluded that the distribution of languages was not random, but tied to agriculture.

Most previous research, however, has examined similarities between cultural variation and genetic variation at small scales and within particular language families, with few studies looking at these parallels across language families using a diverse collection of cultural data.

"Relationships between genes and culture are one of the big questions in human history," said Hiromi Matsumae, an assistant professor in the department of molecular life sciences at Tokai University in Japan and a co-first author of the study. "But there is no previous study integrating genetic relationships with music and three markers of languages — lexicon, grammar, phonology — beyond language families."

Investigating Northeast Asia

To fill this research gap, Matsumae and colleagues decided to compare cultural and genomic data from populations in and around northeast Asia, a genetically and culturally diverse region with many small language families and languages unrelated to any others in the world. Their choice to focus on northeast Asia was calculated. The relationships between genes and language had been well studied within single language families in other parts of the globe, such as the Indo-European and Austronesian languages, and within a single family, grammar tends to differ only slightly. But northeast Asia is home to not one but 11 language families and two cultural zones: circumpolar culture in Siberia and Chinese-influenced East Asian culture. Unpacking the relationships in this region would call for more than conventional lexical analyses.

Along with this opportunity, studying relationships between genes and culture in northeast Asia offered special challenges. "Cultural analysis was a barrier for us because we were required to have multidisciplinary knowledge behind the problem," said Matsumae.

Nonetheless, Matsumae and colleagues gathered the data they needed and examined variation in language (including grammar, phonology, and lexicon), music (including song structure and performance style), and genomes (measured by genome-wide variants called single-nucleotide polymorphisms or SNPs) across northeast Asia.

"We decided to apply the analogy of genetics for language and musical data to see entire differences and similarities between groups rather than particular elements in songs and languages," said Matsumae. "We have created a format for song data in the same format as SNPs data, then calculated differentiation between populations with a conventional indicator used in population genetics. Then, we extracted similarities between languages, rather than focusing on particular elements of languages, like the similarity of word order."

To test for correlations, the researchers estimated how much of the variation in one set of variables could be explained by the variation in another set. They found that more than half the variation in grammar could be explained by genes, and vice versa.
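
One simple way to picture "variation explained" is the squared correlation between two sets of pairwise distances between populations. The article does not specify the statistical method, so the sketch below, including the helper names upper_triangle and r_squared and the toy matrices, is an assumption and a stand-in rather than the researchers' analysis.

```python
from itertools import combinations
from typing import List, Sequence


def upper_triangle(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Flatten the entries above the diagonal of a symmetric distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Share of the variance in y explained by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return (s_xy * s_xy) / (s_xx * s_yy)


if __name__ == "__main__":
    # Hypothetical pairwise distances among four populations (not study data).
    genetic = [
        [0.0, 0.1, 0.4, 0.5],
        [0.1, 0.0, 0.3, 0.6],
        [0.4, 0.3, 0.0, 0.2],
        [0.5, 0.6, 0.2, 0.0],
    ]
    grammar = [
        [0.0, 0.2, 0.5, 0.7],
        [0.2, 0.0, 0.4, 0.6],
        [0.5, 0.4, 0.0, 0.1],
        [0.7, 0.6, 0.1, 0.0],
    ]
    print(r_squared(upper_triangle(genetic), upper_triangle(grammar)))
```

Because the entries of a distance matrix are not independent of one another, the significance of a value like this is usually judged with permutation tests rather than a standard p-value.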

Matsumae and colleagues had identified a relationship between genetic history and the rules that govern word and sentence creation, but it was still unclear why this relationship existed. One possibility was recent contact — the current proximity of neighboring societies could lead them to have similar patterns in genes and grammar. Another possibility was shared descent — languages in the same language families could have inherited similar properties.

Or something much deeper could be at work — the relationship could trace far back into time, before modern language families even formed. Through further statistical analyses, the researchers excluded the first two possibilities, suggesting the relationship between grammar and genetics is embedded deep within our roots.
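
Ruling out an explanation such as recent contact generally means asking whether two variables remain correlated once a third, such as geographic distance, has been accounted for. The article does not describe the specific analyses, so the partial-correlation sketch below, with its function names and toy numbers, is an assumption meant only to illustrate that kind of control.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def partial_correlation(
    x: Sequence[float], y: Sequence[float], z: Sequence[float]
) -> float:
    """Correlation between x and y after removing the linear effect of z.

    ASSUMPTION: x, y, z stand for pairwise genetic, grammatical, and
    geographic distances; the study's real controls may differ.
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z perfectly predicts x or y; partial correlation is undefined")
    return (r_xy - r_xz * r_yz) / denominator


if __name__ == "__main__":
    # Hypothetical pairwise distances for six population pairs (not study data).
    genetic = [0.1, 0.4, 0.5, 0.3, 0.6, 0.2]
    grammar = [0.2, 0.5, 0.7, 0.4, 0.6, 0.1]
    geography = [0.3, 0.2, 0.6, 0.5, 0.4, 0.1]
    print(partial_correlation(genetic, grammar, geography))
```

If the partial correlation stays well above zero, geography alone cannot account for the link between the two distance sets, which is the logic behind excluding recent contact as the sole explanation.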

"The correlation we found might suggest unknown evolutionary mechanisms in grammar, such as a factor reflecting human migration history,” said Matsumae. 

The findings also point to grammar as a third indicator of language capable of offering insights into relationships between languages that cannot be inferred through lexicon and phonology.  

"Lexicon has been used to infer the relationships between languages, but it has a big restriction for application," said Matsumae. "It is not suitable to infer relationships beyond language families because they do not share a vocabulary. Phonology can do this, but there is a problem with its usage – language contact influences similarities of phonology, meaning that geographically neighboring languages are similar. We propose grammar as a new indicator of languages that changes more slowly over time than lexicon and phonology."

[Credit for associated image: MD111/Flickr]