
Cambridge Language Sciences

Interdisciplinary Research Centre
The missing structural pattern in human languages

Linguists working at Cambridge and Newcastle have discovered that a certain combination of words is impossible in all languages, despite it being quite easy to understand what it would mean.

Linguists have known for many years that, among the world’s languages, roughly half are like English in that they put the object of the verb after the verb in a basic, simple sentence, like “The cat ate (Verb) the mouse (Object)”. The other half are like Japanese, in putting the object before the verb: “Nekoga nezumio (Object) tabeta (Verb)” (literally “Cat mouse ate”). In general, languages like English also put an auxiliary verb, like “have”, before the main verb, giving the order AuxVO: “The cat has (Aux) eaten (Verb) the mouse (Object).” Japanese-type languages put the auxiliary after the verb, giving the order OVAux: “Nekoga nezumio (Object) tabete (Verb) iru (Aux)” (literally “Cat mouse eating is”). These kinds of pattern, which are very prevalent among the thousands of the world’s languages, are called harmonic.

But some languages are disharmonic, and the Germanic family, of which English is a member, includes several of them. English, as we saw, is harmonic (AuxVO). So are the Scandinavian languages. German is also harmonic (OVAux), with the complication that this order is only observed in subordinate clauses: “... dass die Katze eine Maus (Object) gefressen (Verb) hat (Aux)”. But Dutch shows a disharmonic OAuxV order: “... dat de kat een muis (O) heeft (Aux) gegeten (V).” West Flemish dialects allow a different disharmonic order, AuxOV: “... dat de kat heeft (Aux) een muis (O) gegeten (V).”

Surveying all the Germanic languages, the team noticed that all the possible harmonic and disharmonic combinations of Aux, V and O are found, except one: VOAux. Looking at as many varieties of Germanic as they could, including Old English (the language of the Anglo-Saxons) and Old Norse (the language of the Viking sagas), they found the same: across all these millions of words and all these centuries, VOAux order is absent. It remains absent even where languages in close contact have, between them, the ingredients to produce this order. English, which has VO order, and Afrikaans, which has VAux order, have been in close contact for well over a century in South Africa, with many speakers today regularly mixing the two languages. All the Germanic orders given above are possible in the most English-influenced varieties of Afrikaans, except one: VOAux.
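To make the six logically possible orders concrete, the short sketch below enumerates every ordering of Aux, V and O and labels each one harmonic or disharmonic. It is an illustration only, not the team's method. The function names (`classify`, `survey`) are hypothetical, and the rule used for "harmonic" (the auxiliary and the object sit on opposite sides of the verb, as in AuxVO and OVAux) is an assumption read off the examples in this article. The only attestation fact it encodes is the one reported here: VOAux is the single order not found.

```python
from itertools import permutations

# Per the article, VOAux is the one order that is never found.
UNATTESTED = {("V", "O", "Aux")}


def classify(order):
    """Label an ordering such as ("Aux", "V", "O") as harmonic or disharmonic.

    Assumption: an order counts as harmonic when the verb-object direction
    matches the auxiliary-verb direction (AuxVO or OVAux).
    """
    verb_before_object = order.index("V") < order.index("O")
    aux_before_verb = order.index("Aux") < order.index("V")
    return "harmonic" if verb_before_object == aux_before_verb else "disharmonic"


def survey():
    """Return (name, type, attested) for all six orders of Aux, V and O."""
    rows = []
    for order in permutations(("Aux", "V", "O")):
        rows.append(("".join(order), classify(order), order not in UNATTESTED))
    return rows


for name, kind, attested in survey():
    status = "attested" if attested else "missing"
    print(f"{name:6} {kind:12} {status}")
```

Under this rule only two of the six orders are harmonic, so the missing VOAux is one of four disharmonic orders. Being disharmonic is therefore not enough on its own to rule an order out.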

The team then observed that the same is true beyond the Germanic languages. Looking in particular at languages with freer word order, such as Basque and Finnish, which allow VO as well as OV order and AuxV as well as VAux order, they found that one and only one combination is mysteriously lacking: VOAux. They extended their survey across the languages of the world, and found that this pattern is consistent: one order is missing.

Furthermore, the missing word order is not just characteristic of combinations of a verb with an object and an auxiliary, but holds much more generally for analogous combinations of nouns, prepositions, adjectives, conjunctions, etc., in widely divergent languages: it is very generally a missing structural pattern in human languages.

The striking thing about this, the team points out, is that it has nothing to do with meaning. Why should “The cat eaten (Verb) the mouse (Object) has (Aux)” be any less comprehensible than “The cat the mouse eaten has” or “The cat has the mouse eaten”? Instead, the team believes that their generalisation sheds light on the interplay of Noam Chomsky’s concept of a universal grammar, including universal principles governing the structure of linguistic expressions, with principles of efficient processing of speech and efficient learning.

The research team are Anders Holmberg (Newcastle University), Ian Roberts, Theresa Biberauer, and Michelle Sheehan (all University of Cambridge). Their findings are reported in a volume recently published by Oxford University Press and an article soon to appear in Linguistic Inquiry.

