Giving back: Pronouncing English words that end with -ive

Paradoxically, the better your skill in a second language, the more your mistakes stick out.

I work with a couple of French folks whose English is so good that they are effectively native speakers, as far as I can tell.  It’s super-impressive—if my French were ever anywhere near as good as their English…

It’s their very skills themselves that make it obvious when they make a pronunciation error–it’s as if I were making a pronunciation error.  It is not at all the case that I don’t make pronunciation errors in my native language, and people most definitely do notice them–but, I suspect that they’re all the more obvious precisely because (a) I’m a native speaker, and (b) I’m an “educated native speaker” (sounds hoity-toity, but it’s a technical term in linguistics).  I would guess that many of my “smaller” mistakes in French go unnoticed because they get lost in the thick fog of all of my other mistakes–in my native language, though, they all stand out.

hoity-toity: pretentious.

So, when my French-speaking-colleagues-who-are-essentially-native-speakers-of-English-too make pronunciation errors in English, it is, indeed, noticeable.  Happily, their English-language pronunciation errors often fall into a single category, and that’s what we’re going to go after today–my little attempt to repay more hours than I even want to think of that they’ve spent hammering on my pronunciation/lexicon/syntax/politeness/EVERYTHING in French.

You may have noticed that written vowels in English are pronounced differently than those vowels would be pronounced in essentially every other written language on the planet.  (That’s just a fraction of all languages, by the way–the vast majority of languages have no writing system.)

The reason behind all of this English-versus-the-world divergence in vowel sound pronunciation is something called the Great Vowel Shift.  It changed the pronunciation of many vowel sounds, and it happened after English spelling was mostly established.  The result was that English vowel sounds didn’t line up with their spelling as well as they used to.

The Great Vowel Shift, with approximate dates–and yes, with some training in phonetics, it does make perfect sense.

One of the changes in pronunciation affected words that happen to be spelled with an e at the end.  It’s a silent e now, but it wasn’t always.  The preceding vowel sound changed–in a very systematic way that requires knowing a bit about what you do with your mouth to make sense of–and one of the consequences was that if that preceding vowel was i, it went from being pronounced like in most languages to being pronounced like the word eye is pronounced today.

So, today, if you’re an Anglophone kid, you grow up being taught that when a word ends in -iCe, where C means any consonant, the i indicates the sound of the word eye.  There are plenty of examples of this:

  • five
  • drive
  • dive
  • thrive
  • alive
  • hive
  • archive
  • strive

But–and this is a big “but” (which is why I italicized and underlined it)–iCe (i followed by a consonant followed by an e at the end of the word) is not always pronounced that way.  There are plenty of times when it is not, and those tend to be longer words that educated people would use, and my French co-workers are super-educated, so they use these words.  For some of the native speakers of French that I know, mis-pronouncing these words is essentially the only mistake that I ever hear them make in English.  So: let’s work through some of these.

You’ll notice something about the words that are pronounced the way that Anglophone kids are told you always pronounce -iCe: they tend to be single-syllable.  Consider:

  • five
  • drive
  • dive
  • thrive
  • live (the adjective only, as in live bait)
  • alive
  • hive

But, not all single-syllable words of this type are pronounced that way.  Here’s the one counter-example that I can think of:

  • give

And, not all of the words in which -iCe is pronounced like “eye” are single-syllable words.  The counter-examples that I can think of:

  • archive
  • derive
  • arrive
  • survive
  • revive
  • deprive

I know what you’re thinking now: Zipf, this is simple–regardless of the number of syllables, the i is pronounced as in five if it’s in a STRESSED syllable.  And, yes, that almost works–but, consider archive, which is stressed on the first syllable, but is still pronounced like five.

…and live is weird–when it’s a verb, it’s pronounced like give, but when it’s an adjective, it’s pronounced like five.  
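The stress rule and its exceptions can be checked mechanically.  What follows is only a minimal sketch, not a real pronunciation lexicon: the words come from this post, but the stress and rhyme annotations are hand-entered assumptions, and the names WORDS, stress_rule, and exceptions are made up for the illustration.

```python
# Each entry: (word, final syllable stressed?, rhymes with "five"?)
# The annotations are hand-entered assumptions, not dictionary data.
WORDS = [
    ("five", True, True),
    ("drive", True, True),
    ("alive", True, True),
    ("arrive", True, True),
    ("survive", True, True),
    ("give", True, False),
    ("archive", False, True),
    ("executive", False, False),
    ("positive", False, False),
    ("olive", False, False),
]


def stress_rule(final_stressed: bool) -> bool:
    """The proposed rule: -iCe sounds like 'eye' exactly when it is stressed."""
    return final_stressed


def exceptions(words):
    """Return the words whose pronunciation the stress rule gets wrong."""
    return [w for w, stressed, like_five in words
            if stress_rule(stressed) != like_five]


print(exceptions(WORDS))  # ['give', 'archive']
```

On this small sample the rule fails in both directions: give is stressed but does not rhyme with five, and archive rhymes with five without final stress.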

OK, we’re more or less good with the words that end in iCe and get pronounced like five.  What about the words that don’t get pronounced like five?  Let’s take a look at some.  Now, I’m not going to select these randomly.  I went to this web page on the web site.  What it gave me is a list of words that end in -ive, sorted by how frequent they are.  Here’s what the output looks like.  You’ll notice that every word is followed by two numbers.  The first one is the length of the word in letters, while the second one is how many times the word occurs in every million words of text.  (What collection of texts did they do their counts in?  They don’t say.)  So, give is 4 letters long and occurs 1735 times per million words, executive is 9 letters long and occurs 171 times per million words, and so on.
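If you wanted to work with a list like that yourself, here is a minimal sketch of reading it.  It assumes each entry is a line of the form “word length per-million”; that layout is a guess based on the description above, not the site’s documented format, and parse_entry and expected_count are hypothetical helper names.  The only data used are the two entries quoted above.

```python
def parse_entry(line: str) -> tuple[str, int, int]:
    """Split a 'word length per_million' line into typed fields."""
    word, length, per_million = line.split()
    return word, int(length), int(per_million)


def expected_count(per_million: int, text_length: int) -> float:
    """Scale a per-million frequency to a text of text_length words."""
    return per_million * text_length / 1_000_000


# The two entries quoted in the post; the line layout is an assumption.
entries = [parse_entry(line) for line in ["give 4 1735", "executive 9 171"]]

for word, length, per_million in entries:
    # How often we would expect the word in a 100,000-word text.
    print(word, length, expected_count(per_million, 100_000))
```

At 1735 per million, give should turn up about 173.5 times in a 100,000-word text.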

Screen Shot 2018-01-26 at 16.40.33

With that list in my greedy little fingers, I’ll go through it and pull out some of the ones that are not pronounced like five.  That gives us this:

  • receive
  • executive
  • alternative
  • objective
  • representative
  • conservative
  • effective
  • initiative
  • positive
  • relative
  • olive

…and there’s a little attempt to help with the already-almost-perfect English spoken by so many of my French colleagues.  Got a funny story related to mispronunciation?  Tell us about it in the comments…

The last duel in France: traces in syntax

The last duel in France leads to a discussion of syntactic theory, ’cause that’s how I roll.

Wanna watch the last duel in France?  Here you go.  Scroll down past the video for an excerpt from an article on the topic from Le Monde and the definitions of some of the French vocabulary therein.

The article in Le Monde: click here.  Some relevant vocabulary:

retrousser [+ sleeves or pant legs] : to roll up.  Elle avait un de mes pyjamas dont elle avait retroussé les manches.  (Camus, L’étranger)

l’hôtel particulier : like a château, but it’s in a city, versus being in the country, and it could just as well be owned by a bourgeois as an aristocrat–I think it’s actually more likely to have been owned by a bourgeois, at least in Paris.  Don’t quote me on this.

Dans un jardin ombragé par des arbustes bienveillants, enveloppé d’une douceur printanière, chemise blanche, col ouvert, manches retroussées, deux hommes, épée à la main, se jugent, se jaugent, puis, sur un signe de l’arbitre, croisent le fer. Quatre minutes plus tard, le combat cesse un des deux duellistes ayant été touché par deux fois au bras. Cette scène n’est extraite d’aucun roman ou film de cape et d’épée. Elle eut lieu il y a exactement cinquante ans, le 21 avril 1967, dans le parc d’un hôtel particulier de Neuilly-sur-Seine.

English notes

wanna: the written form of the contraction of want + to.  One of the interesting things about this contraction is that it is only possible in specific syntactic contexts, and is absolutely impossible in others.  This lets you distinguish between the following.  Suppose that the following situations exist:

  1. There is going to be a contest.  Whoever wins the contest will be awarded a horse.  There are a number of horses available, and the winner of the contest will be able to choose the horse that they will receive.
  2. There is going to be a horse race.  One of the horses will win the race.

In situation number 1, if you want to ask someone which of the horses they would choose were they to win the contest, you could ask the question in either of two ways.  The second one is more casual, but they are both completely acceptable from a linguistic point of view:

Which horse do you want to win?

Which horse do you wanna win?

In situation number 2, if you think that someone has a preference regarding the winner of the race, and you want to ask them which of the participating horses they hope will emerge the winner of the race, you only have one option:

Which horse do you want to win?

Google the quoted phrase “which horse do you wanna win” and you will get 5 results, all of them in Japanese.  WTF, you’re wondering…

Screen Shot 2018-03-07 at 10.21.41

What you’re seeing in the Google results is sentences that illustrate interesting syntactic phenomena.  Most of the literature on syntax is written about English syntax (blame Chomsky), mostly by (notoriously monolingual) anglophones, and the classic examples in the field are hence mostly in English.  (Actually, the only classic non-English examples that I can think of are in Swiss German–more on that another time, perhaps.)  The which horse do you want to/wanna win sentences are used in classic transformational-generative grammar to argue for the existence of something called a trace.  This is held to be something that is present in the structure of the sentence, but that is not observable–the claim is that you can’t “see” it, but it’s there.  What is that “it”?  The idea is that underlying those two sentences are two “deeper” forms:

  1. For situation 1 (there’s a contest, and the winner gets a horse): Which horse do you want to win [the horse]?
  2. For situation 2 (there’s a horse race, and one of the horses will win): Which horse do you want [the horse] to win?

(Linguists in the audience: yes, I am simplifying this for didactic purposes–no hate mail, please.)  In both cases, the bracketed [the horse] goes away; in the second case, the “trace” that is left behind blocks the contraction of want + to to wanna.
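The blocking effect can be mimicked in a few lines of code.  This is a toy sketch of the simplified story told above, not an implementation of any real syntactic theory: the "&lt;trace&gt;" token, the token lists, and the function surface_forms are all invented for the illustration.

```python
TRACE = "<trace>"  # invented marker for the unpronounced element


def surface_forms(deep: list[str]) -> list[str]:
    """Return the pronounceable versions of a 'deep' token sequence."""
    # The uncontracted form: just drop the trace.
    forms = [" ".join(tok for tok in deep if tok != TRACE)]

    # Contract only where 'want' and 'to' are adjacent BEFORE the trace
    # is dropped; an intervening trace blocks the contraction.
    contracted, i, changed = [], 0, False
    while i < len(deep):
        if deep[i:i + 2] == ["want", "to"]:
            contracted.append("wanna")
            i += 2
            changed = True
        else:
            contracted.append(deep[i])
            i += 1
    if changed:
        forms.append(" ".join(tok for tok in contracted if tok != TRACE))
    return forms


# Situation 1: you win a horse (the trace follows "win").
contest = ["which", "horse", "do", "you", "want", "to", "win", TRACE]
# Situation 2: a horse wins the race (the trace sits between "want" and "to").
race = ["which", "horse", "do", "you", "want", TRACE, "to", "win"]

print(surface_forms(contest))  # both "want to" and "wanna" versions
print(surface_forms(race))     # only the "want to" version
```

The two inputs come out sounding identical in their uncontracted form; only the position of the invisible trace decides whether wanna is available.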

Screen Shot 2018-03-07 at 10.46.55


Now, I know what you’re thinking: It’s obsessing about things like this that keeps Zipf from ever getting a second date.  …and you’re right, I imagine.


Matching game IV: Zipf’s Law in French

Zipf’s Law is why if someone is looking for a web page and types “dogs in marseilles” into the query box, your search engine should pay no attention to the word “in,” some attention to “dogs,” and quite a bit of attention to “marseilles.” 

Zipf’s Law describes the frequencies of words: there is a very, very small number of words that occur very, very often, and a very, very large number of words that occur very, very rarely–but, they do occur.  This blog is focused on one of the consequences of Zipf’s Law: it means that if you are seriously studying a second language, you are going to run into words that you don’t know every day for the rest of your life.
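Counting is all it takes to see the shape that Zipf’s Law describes.  Here is a minimal sketch using Python’s collections.Counter; the one-sentence “corpus” is made up for the illustration and is far too small to show a real Zipfian curve, but the pattern of one very frequent word and several words that occur exactly once is already visible.

```python
from collections import Counter

# A made-up toy text; any real corpus would be vastly larger.
text = "the dog saw the cat and the cat saw the bird"

counts = Counter(text.split())

# Words ranked from most to least frequent.
ranked = counts.most_common()

# Words that occur exactly once.
once_only = sorted(word for word, n in counts.items() if n == 1)

print(ranked[0])   # ('the', 4)
print(once_only)   # ['and', 'bird', 'dog']
```

In real text the once-only pile is where you meet the words you don’t know yet.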

You know how the matching game works: we have words in English, words in French, and we match them.  Today’s words (and a tiny bit of grammar) are taken from the discussion of Zipf’s Law in the book Recherche d’information: Applications, modèles et algorithmes, by Massih-Reza Amini and Éric Gaussier, second edition.  Recherche d’information is information retrieval, the task of finding documents in response to an information need: what Google does for you every day.

One of the great embarrassments of linguistics is the fact that information retrieval is mostly about language, in the sense that mostly what you’re looking for is web pages with stuff written for them and you use words to find them–and yet, most of the work of information retrieval is done without actually doing anything that looks very much like doing anything with language.  At its heart, the technology of information retrieval is almost entirely done with counting and very simple arithmetic–nothing linguistic there.  You could think of that very simple arithmetic as taking advantage of Zipf’s Law–the very simple arithmetic is used to figure out things like the fact that if someone is looking for a web page and types dogs in marseilles into the query box, your search engine should pay no attention to the word in, some attention to dogs, and quite a bit of attention to marseilles when it is making the decision about which web pages to put at the top of the search results.

Scroll down to find today’s vocabulary items, and click on the pictures of the relevant pages from Amini and Gaussier’s book if you’d like to see those words in context.  As for me: a second cup of coffee, go over these flashcards, and then off to the lab.  Today’s goal: explain why researchers calculated the ratio of vocabulary size to length of conversation of a bunch of soldiers–after chasing them through the woods, catching them, depriving them of food and sleep, and then interrogating them.
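To make the “counting and very simple arithmetic” for dogs in marseilles concrete, here is a minimal sketch of one standard weighting, inverse document frequency (IDF).  The post doesn’t say which formula a given search engine uses, so treat IDF as my stand-in; the four-document collection and the function name idf are invented for the illustration.

```python
import math

# An invented toy collection of "web pages".
docs = [
    "dogs in paris",
    "cats in marseilles",
    "dogs in the park",
    "restaurants in lyon",
]


def idf(term: str, documents: list[str]) -> float:
    """Inverse document frequency: log(N / number of docs containing term)."""
    df = sum(term in doc.split() for doc in documents)
    if df == 0:
        raise ValueError(f"{term!r} does not occur in the collection")
    return math.log(len(documents) / df)


for term in ["in", "dogs", "marseilles"]:
    print(term, round(idf(term, docs), 2))
```

In this toy collection, in appears in every document and gets a weight of zero, dogs appears in half of them and gets a modest weight, and marseilles appears in only one and gets the highest weight.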


I included La fréquence du second mot because I’ve been trying to understand when to use second and when to use deuxième.  If I understand the Académie’s Dire/Ne pas dire page correctly, the Academy would prefer that this be deuxième, but not even the Académie thinks that it’s mandatory to make the distinction:

On peut, par souci de précision et d’élégance, réserver l’emploi de second aux énoncés où l’on ne considère que deux éléments, et n’employer deuxième que lorsque l’énumération va au-delà de deux. Cette distinction n’est pas obligatoire.

On veillera toutefois à employer l’adjectif second, plus ancien que deuxième, dans un certain nombre de locutions et d’expressions où il doit être préféré : seconde main, seconde nature, etc., et dans des emplois substantivés : le second du navire.

As the web site puts it: C’est pour cela qu’on parle de la Seconde Guerre mondiale parce qu’on espère qu’il n’y en aura pas de troisième !

The last words of Biloxi

For poignancy, it’s hard to beat the International Journal of American Linguistics.

No one actually knows how many languages there are in the world.  Linguists generally estimate a number in the range of 5,000 to 10,000.  What people generally do agree about is this: by the end of this century, half of them will be gone.  Extinct.  No longer spoken.

When you teach linguistics, you often find yourself saying things like this:

All human languages have the property of ambiguity.

No language has a voiced stop without having the corresponding unvoiced stop.

Well: we actually know almost nothing, relatively speaking, about most languages.  Whether there are 5,000 languages today, or 10,000, most of them are what linguists call “undescribed:” that is, we know that they exist, but not much else about them.  “We” in the sense of linguists–that is, people who study language as a system.  (Obviously, the speakers know something about them.)

Consequently, I always find myself needing to give a disclaimer: of the 1,000 or so languages about which we know something, out of the 5,000-10,000 languages in the world, all of them have the property of ambiguity… Clunky, but more plausible than all or none.  As a scientist, I don’t really like “universal quantifiers,” anyway–always, never, all, none… They just aren’t true that often.  No language has a voiced stop without having the corresponding unvoiced stop: a well-known fact, which turns out not to be true: Kukú (a language of the Eastern Nilotic family, with 30,000 or so speakers, mostly in the town of Kajo-Kaji (several other possible spellings) in South Sudan) has a voiced palatal stop, but no voiceless one.  Ambiguity, though: yep, as far as I know, every human language is ambiguous.

Franz Boas posing for a museum exhibit: “Hamats’a coming out of secret room.” 1895 or earlier. A quote from the Wikipedia article about him: “In his 1963 book, Race: The History of an Idea in America, Thomas Gossett wrote that ‘It is possible that Boas did more to combat race prejudice than any other person in history.’” Picture source: By Anonymous [Public domain], via Wikimedia Commons

For poignancy, it’s hard to beat articles like this one from IJAL, the International Journal of American Linguistics.  No, “American linguistics” does not mean linguistics done by Americans, or done in America: it means the study of the indigenous languages of the Americas.  (The Americas is explained in the English notes below.)  Wikipedia tells me that it was established in 1917 by the anthropologist Franz Boas (a fascinating figure–check him out).

The article was written by Mary Haas, one of the most prolific producers of linguistics PhDs ever (including Marc Okrand, who would go on to create the Klingon language for the Star Trek movies–one of my colleagues used to use it as a source for exam questions).  Haas spent a decade researching some of the indigenous languages of the southeast United States (and an island off the coast of British Columbia), and then was recruited by the War Department to develop a methodology for teaching Thai.  She did so; my anthropological linguistics professor told me that after the war, her approach was abandoned on the theory that she had been teaching her American students to speak a tone language and that for an American to learn to speak a tone language is impossible–clearly Haas was right, and it is not impossible for an American to speak a tone language.  My professor chalked this folly up to the sexism of the time, and she was probably right.  Haas worked with the last living speakers of a number of languages; this paper describes her work on one of them.  Read it and weep.

English notes

The Americas: North, Central, and South America.  Some examples from the Sketch Engine web site:

  • From the local ports it was shipped to Liverpool and thence into larger vessels overseas, including West Africa where it became a key component in the triangular trade involving slaves to the Americas.
  • Her work explores contemporary cultural production in the Americas to analyze how artists and activists use a variety of media, the Internet, Closed Circuit TV, the street, and theatre, to challenge traditional notions of politics in relation to location, simulation, and embodiment.
  • The members represent not only the various disciplines (such as history, anthropology, archaeology, sociology and law) but also the various regions of the world (Africa, the Americas, the Caribbean, Europe, the Indian Ocean, the Arab states and Asia).

How I used it in the post: No, “American linguistics” does not mean linguistics done by Americans, or done in America: it means the study of the indigenous languages of the Americas. 

poignancy: from the adjective poignant, meaning

(1) painfully affecting the feelings: piercing
(2) deeply affecting: touching
  • The Schubert in particular was very affecting: in the second movement, the poignancy of an old man now 87 playing the searing music of a young man facing early death was almost too much to take.
  • The threshold between life and death imparts poignancy to the utterances of the dying.
  • If we don’t survive, we can imagine the same faint chance the Voyagers have of being detected and studied by some other intelligence, a thought that adds an almost unbearable poignancy to some of these images.
  • But a second viewing reveals that what he has witnessed is his own funeral, the final scene of the film, adding an unbearable poignancy to a very potent image of tragic inevitability.
  • It was meant to be a romantic comedy, and it definitely has those elements, but it ended up having a bittersweet poignancy as well, as Paisley deals with the death, bequests and scandals of her great-aunt.

How I used it in the post: “For poignancy, it’s hard to beat articles like this one from IJAL, the International Journal of American Linguistics.”  

to chalk (something) up to (something): “To link something that has happened to a particular reason or circumstance.”  (Source: The Free Dictionary.  You’ll find a number of related, but different, meanings there.)  Examples:










Itchy Feet on How Many Kisses

One for a small child, or in Brittany.

The Itchy Feet comic tells stories of language, culture, and travel from around the world.  I usually can’t say much about the accuracy of the culture and travel stuff, but the language stuff is right on—the author has clearly had some education in linguistics.

For more about la bise:

Want to participate in a survey on how many kisses one gives to faire la bise (give the French cheek-kiss) in different parts of France, or just look up the survey results for your part of that beautiful country?  Go to  Click on a region to give your data, or mouse-over to see its survey results thus far.

Incidentally: my family is originally from Brittany, but we’ve been Parisian for a long time; we do the usual Île-de-France two.



Ambiguity II: Trump, cognitive issues, and no heart

Out of the 256 possible interpretations of this headline, only TWO seem to be the most obvious ones. Why?

We’ve recently been talking about ambiguity.  Ambiguity, from a linguist’s perspective, is the situation of having more than a single possible meaning, and as we’ve seen, there are MANY ways to be ambiguous.

Here’s a nice example.  It comes from the renowned linguist and cognitive psychologist Steven Pinker.  Like many linguists, he collects ambiguous headlines.  They are not at all difficult to find, but some are cooler than others.  Here’s his current favorite:


What can one say about this? From a linguistic perspective, what are the possible interpretations?

  • There is some doctor who does or does not have things going on with respect to his or her heart–we’ll get into what those things could be momentarily.
  • There is some doctor who said something about someone who does or does not have things going on with respect to his or her heart.

Now, the second interpretation is the intended one, so let’s go with it for the rest of the discussion.  (We’ll talk later about what happens if we don’t.)  What’s the issue with the rest?

  • One interpretation is that the comma indicates what’s called coordination, or conjunction: it corresponds to or, and the intended meaning is that the person who’s being talked about does not have heart or cognitive issues.  (That’s not the full story here–more below.)
  • Another interpretation is that the comma indicates a new clause.  In this case, it would correspond to the doctor saying that the person under discussion does not have a heart, and also has cognitive issues.

How many possible meanings so far?  We’ve listed four, but it’s a big underestimate.

Why does Pinker like this one so much?  Because that last interpretation says this: Trump has no heart, and he also has cognitive issues.  That jibes pretty well with what I would say, personally, and apparently Pinker, too–so, yeah, I would love to see that in a newspaper (assuming that his cognitive issues didn’t lead to him nuking somebody in a petulant frenzy before he could be (legally) put out of office).

Now: we’re not done yet.  Here are some remaining issues:

  • What is the scope of issues?  Are we talking about heart issues and/or cognitive issues, or are we talking about cognitive issues, and some unspecified thing about the heart?
  • What is the scope of no?  Are we talking about no heart and no cognitive issues, or are we asserting something about cognitive issues, plus something about there being no heart involved in some way?
  • What does heart mean?  Are we talking about an anatomical organ, or are we talking metaphorically, where heart can mean something like inherent kindness?  Or maybe we’re talking metaphorically, but where the metaphorical meaning of heart is something like courage?  (See this video for the meaning of “heart,” “heart checks,” and “showing heart” in prison.) Does it mean a seasonal check on the core of timber?  (Seriously–check it out on
  • What does issue mean?  We actually had a blog post that was primarily on that question, in the context of analyzing Henry Reed’s poem Returning of Issue.

So: how many interpretations of that headline are there?  A low estimate would be two for each of the questions that we thought about above, so each one of those points doubles the number of possible interpretations.  That’s 2 to the 8th power: 256 possible interpretations.  You found another point of ambiguity?  You just doubled the number of interpretations again, to 512.  (Go ahead–find another one, and tell us about it in the Comments section.)
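The doubling arithmetic is easy to check by brute force.  This is a minimal sketch that treats each point of ambiguity as an independent two-way choice, which is the low estimate used above; the function name count_readings is made up for the illustration.

```python
from itertools import product


def count_readings(n_points: int, choices_per_point: int = 2) -> int:
    """Enumerate every combination of choices and count them."""
    combos = product(range(choices_per_point), repeat=n_points)
    return sum(1 for _ in combos)


print(count_readings(8))  # 256
print(count_readings(9))  # 512
```

Enumerating the combinations rather than just computing 2 ** 8 makes the point that each of the 256 is a distinct reading that a program would, in principle, have to consider.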

Here’s a question for you: of the 256 possible interpretations, just two of them seem to be the most obvious ones:

  1. The one where the doctor is talking about someone else, where issues modifies (technically, “has scope over”) both heart and cognitive, the meaning of heart is the anatomical organ, and no modifies both heart issues and cognitive issues.
  2. The one where Trump is unkind and has cognitive issues.


…and here’s an observation for you: my profession is about getting computers to differentiate between the possible interpretations in biomedical journal articles and in health records, finding the one intended interpretation out of all of the possible ones.  I don’t expect to have it solved any time soon.  🙂