Regression models in French: Part I

One of the hot topics in linguistics right now is mixed effects models.  A mixed effect model is a kind of regression analysis.  Regression analysis is a way of building a statistical model of a phenomenon.  There are all kinds of things that you might want to build a statistical model of in linguistics, including phonetic relationships, sociolinguistics, syntax, and doubtless many others.  I’m going to use this post to put up some links to things that you might find useful in learning about mixed models, and of course we’ll come across some French vocabulary on the way.  (A note on the vocabulary in this post: it is mostly not found in dictionaries.  I induced it from examples on, an excellent source for finding examples of French technical vocabulary in use.)

The absolute best material for learning about mixed effects models so far is this tutorial by Bodo Winter.  If you’re not familiar with simple linear regression (i.e. with fixed effects only), you might want to check out this tutorial of his first.  Besides being really clear, Bodo’s tutorial is especially suitable for linguists, because it works through an extended example on F0 (fundamental frequency–roughly, the pitch of your voice) variation in situations of different politeness levels.

A regression line predicting female first formant frequencies from male formant frequencies, for speakers of several languages. Data from Johnson (2011).

Let’s build up to the vocabulary of mixed effects models.  First, some basic vocabulary for talking about regression modelling.  Bear in mind that regression modelling–well, simple linear regression modelling–is about finding a formula that can predict the value for something on the basis of the value of something else.  The figure to the left plots F1 (first formant frequencies–part of what makes a vowel sound like what it sounds like) for female speakers of several languages over the F1 for male speakers of the same language.  (The data comes from the web site accompanying Keith Johnson’s book Quantitative methods in linguistics.)  The line on the plot reflects a formula that will let you predict the F1 of a female speaker if you know the F1 of a male speaker.  Not surprisingly, the female frequencies are always higher–one of the determinants of overall patterns of F1 is that all other things being equal, the shorter your vocal tract is, the higher your F1 will be, and all other things being equal, women have shorter vocal tracts than men, on average.  What the line says is that you can get pretty close to an accurate prediction of the female F1 if you multiply the male F1 by 1.29.  (Yes, we’re glossing over the y intercept.)  OK, now on to that basic vocabulary:

  • le modèle: model.
  • le modèle de régression: regression model.
  • la régression linéaire: linear regression.
  • la régression logistique: logistic regression.
  • la régression linéaire simple: simple linear regression.
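
If it helps to see simple linear regression concretely, here is a minimal sketch in plain Python.  The function name fit_line and the F1 numbers are my own inventions for illustration; they are not Johnson's data, so the slope will not match the 1.29 from the figure.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented F1 values in Hz, for illustration only (NOT Johnson's data).
male_f1 = [300.0, 400.0, 500.0, 600.0]
female_f1 = [390.0, 510.0, 650.0, 770.0]

slope, intercept = fit_line(male_f1, female_f1)

# Predict a female F1 from a (hypothetical) male F1 of 450 Hz.
predicted_female_f1 = slope * 450.0 + intercept
print(slope, intercept, predicted_female_f1)
```

With these made-up numbers the slope comes out at about 1.28, and the intercept (the part we glossed over above) is small.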

That got us through simple linear regression modelling.  Recall that in simple linear regression, you’re predicting a value for something on the basis of the value of something else.  But, most things don’t have simple one-to-one relationships.  Rather, it’s often the case that you need to predict one thing on the basis of multiple other things.  For example, suppose that you want to know what affects how long it takes a speaker of a language to respond to the question of whether or not a given sentence is grammatical (i.e., could be said in that language.  Colorless green ideas sleep furiously doesn’t mean anything, but you could say it in English.  On the other hand, green sleep colorless ideas furiously is something that you couldn’t say in English).  You might have to include multiple things in the model–how long the sentence is, how frequent the words in the sentence are, how long the words are, etc.  In this case–predicting one thing (response time) from multiple things (sentence length, word frequency, word length)–you need something called multiple linear regression.  This brings up more vocabulary:

  • la régression multiple: multiple regression.
  • la régression linéaire multiple: multiple linear regression.
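
Here is a rough sketch of multiple linear regression in plain Python, predicting a response time from two things at once.  Everything in it is an assumption for the sake of illustration: the helper names solve and fit_multiple, the choice of sentence length and word frequency as predictors, and the numbers themselves, which I made up so that the answer is easy to check.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(rows, ys):
    """Least squares via the normal equations (X'X) beta = X'y."""
    xs = [[1.0] + list(row) for row in rows]  # leading 1.0 is the intercept column
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


# Invented data: (sentence length in words, word frequency score) per sentence.
predictors = [(5.0, 2.0), (8.0, 1.0), (10.0, 4.0), (12.0, 3.0), (7.0, 5.0)]
# Invented response times in ms, built as 200 + 30 * length - 5 * frequency.
response_ms = [340.0, 435.0, 480.0, 545.0, 385.0]

intercept, per_word, per_freq = fit_multiple(predictors, response_ms)
print(intercept, per_word, per_freq)
```

Because the invented response times follow an exact formula, the fit recovers it: longer sentences slow people down, more frequent words speed them up.  Real data would be noisy, and you would use a statistics package rather than hand-rolled algebra.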
The relationship between age and the percentage of correctly formed past tense verbs. From

So far, we know how to talk about linear regression.  What both kinds of linear regression have in common is that (a) we’re predicting a value from something else–from one value in the case of simple linear regression, or from multiple values in the case of multiple linear regression–and (b) we can describe the relationship between the value that we’re trying to predict and the value(s) that we’re trying to predict it from on the basis of a (straight) line.  Some relationships can’t be described by a straight line, though.  A classic example in linguistics is the U-shaped curve in language acquisition by children.  This describes a common phenomenon relating age to the percentage of correct productions of some linguistic target–say, irregular plurals, or the past tenses of verbs.  Initially, the child has a high percentage of correct productions.  Then, the child goes through a stage where the percentage of correct productions drops.  (As the figure suggests, this is thought to be because the child has made a transition from “memorizing” the regular and irregular forms to developing a hypothesis about a rule for forming plurals, or past tenses, or whatever.)  Finally, the child’s percentage of production of the correct forms climbs again.  Now we can’t describe the relationship between what we’re trying to predict (the percentage of correct productions) and what we’re trying to predict it from (the child’s age) with a straight line.  However, there is another kind of regression that we can use.  It is called non-linear regression:

  • la régression non linéaire: non-linear regression.
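
To make the U-shaped curve concrete, here is a small sketch that fits a parabola instead of a straight line.  Two caveats: the ages and percentages are invented, and a quadratic is only one of many curves you could try.  (Statisticians would point out that a quadratic fit is technically still "linear" in its coefficients; I am using it here only as the simplest curve that can bend.)  The names solve and fit_quadratic are mine.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting keeps the division numerically safe.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1 * x + c2 * x**2."""
    feats = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(f[i] * f[j] for f in feats) for j in range(3)] for i in range(3)]
    xty = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(3)]
    return solve(xtx, xty)


# Invented data: age in years vs. percent of correct past tense forms.
ages = [2.0, 3.0, 4.0, 5.0, 6.0]
percent_correct = [90.0, 60.0, 50.0, 60.0, 90.0]

c0, c1, c2 = fit_quadratic(ages, percent_correct)

# The bottom of the U is where the parabola's slope is zero.
age_at_lowest_point = -c1 / (2.0 * c2)
print(c0, c1, c2, age_at_lowest_point)
```

A positive coefficient on the squared term is what makes the curve open upward into a U; with these made-up numbers the dip bottoms out at age 4.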

We’ve now talked about three kinds of regression modelling.  They all have in common the fact that they are used to predict the value for something from the value(s) for something else.  If we’re trying to predict one value from one other value, that’s simple linear regression (la régression linéaire simple).  If we’re trying to predict one value from multiple other values, that’s multiple linear regression (la régression linéaire multiple).  And, if the relationship between what we’re trying to predict and what we’re trying to predict it from can’t be described by a straight line, then we have non-linear regression (la régression non linéaire).  (Before you ask: yes, there is such a thing as non-linear multiple regression, but I don’t know how to say it in French.  Heck, I’m not even sure how to say it in English–non-linear multiple regression?  Multiple non-linear regression?  It’s pretty rare.)  There’s one more kind of regression modelling that we need to talk about before we can move on to mixed effects regression modelling: logistic regression.

Logistic regression is used to predict the probability of something from something else.  Up ’til now, we’ve been predicting a value; now we’re predicting a probability.  What is the probability that a vowel will be unvoiced (whispered)?  What is the probability that I will pronounce -ing, versus -in’?  These are questions for logistic regression.  I’ll leave out the details, but we need to know the vocabulary:

  • la régression logistique: logistic regression.
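
For the curious, here is a bare-bones sketch of logistic regression, fitted by plain gradient descent.  It is an illustration under assumptions, not how a statistics package would do it: the function names, the learning rate, the number of steps, and the data (a made-up "casualness" score predicting whether a speaker says -in’ rather than -ing) are all invented.

```python
import math


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(y = 1) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Invented data: a casualness score, and 1 if the speaker said -in', 0 if -ing.
casualness = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
said_in = [0, 0, 0, 1, 0, 1, 1, 1, 1]

w, b = fit_logistic(casualness, said_in)

p_formal = sigmoid(w * -2.0 + b)  # probability of -in' in a formal setting
p_casual = sigmoid(w * 2.0 + b)   # probability of -in' in a casual setting
print(w, b, p_formal, p_casual)
```

The point to notice is that the output is always a probability: low for the formal end of the invented scale, high for the casual end, and never below 0 or above 1.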

OK, we can talk about a variety of types of regression modelling in French now.  But, to talk about mixed effects regression modelling, we also need to be able to talk about effects.  This post is already super-long, so let’s save that for next time.  In the meantime, here’s a shout-out to Bodo Winter, regression-modelling explainer extraordinaire.

Holden Caulfield had it easy

I often wake up in the wee hours of the morning.  I hate lying in bed stewing, so when I woke up at 2 AM the other night, I took a couple of books and went to the hotel lobby to read.  About 4 AM, I’m minding my own business, nose buried in a book about the early 21st century in China, when this kid walks up to me out of nowhere, says “Trouble sleeping?” (in English), and takes a seat.  We spent the next three hours talking about his life as a Chinese teenager, his dream of studying in America, iPhone apps that you use if you live in a country where the government blocks Facebook and Twitter, whether or not there are really criminals filling the streets of New York (depends on the neighborhood), the size of American refrigerators (about 4 times the size of my refrigerator in Paris), whether or not it’s true that in America, you could get married in a park if you felt like it (you most certainly could), and the like.

It would be tempting to turn this into an essay about the universality of the teenaged experience, but I’m not sure that that would be quite right.  In fact, his experience in China sounded different from any American teenager’s experience I’ve ever heard of.  For example: the kid in question (he will go nameless, for reasons that will become clear) wanted to know if I would let my son have a girlfriend.  Wouldn’t his parents let him have a girlfriend, I asked?  Turns out he tried to have a girlfriend last year.  His teacher saw them holding hands and called both of their sets of parents in to the school.  The two kids stood there for three hours while their parents and teachers lectured them on the importance of focusing on school, while tears rolled down the girl’s face.

He went to an elite high school.  The students get three days off a year.  Not three months–not three weeks–three days.  The competition to get into the school is quite intense.  He graduated from there, but didn’t do as well as he had hoped on his TOEFL and ACT.  When his test scores came in, his father told him that he was ashamed–he is not what his father was looking for in a son.

The kid is really, really determined to study in America.  While all of his friends from the elite three-days-off-a-year high school have gone off to Chinese colleges, he is taking a gap year to try to improve his test scores–in fact, he is in the hotel because he has come to Beijing to take an intensive TOEFL prep course.  We sat and went through the iPhone app that he uses to memorize English words–he has an amazing vocabulary.  At some point in all of this, I said, “When you come to America, you’re going to see that…”  He interrupted me frantically.  “You can’t say that–no one must hear you say that.”  Then I realized how quietly he’d been talking…

If you’re French, and didn’t major in American literature, and therefore didn’t get the cultural reference in the title of this post: Holden Caulfield is the protagonist of Catcher in the rye, the prototypical novel of teenage angst in the US.  Here are some vocabulary items from the French Wikipédia article on Catcher in the rye:

Il constitue l’une des œuvres les plus célèbres du XXe siècle et un classique de la littérature, à ce titre enseigné dans les écoles aux États-Unis et au Canada, bien qu’il ait été critiqué en raison de certains des thèmes abordés (prostitution, décrochage scolaire, obsession de la sexualité) et du niveau de langue (langage familier et souvent injurieux).

  • à ce titre: in this capacity, as such, in this respect
  • bien que + subjunctive: we already knew that bien que means “although” or “even though,” but this illustrates something I wasn’t aware of: that bien que has to be followed by the subjunctive.  I must have gotten this wrong a thousand times this summer…
  • décrochage scolaire:  dropping out of school.  Décrochage itself has many meanings related to unhooking, uncoupling, or disengagement.
  • injurieux: abusive, insulting; if talking about a reputation: injurious.

The adventures of Rabbi Jacob, or French movies subtitled in French

The web site.

French movies and TV shows with subtitles are great tools for practicing your listening skills.  However, it’s difficult to find French films with French subtitles in the US.  In France, all you have to do is go to Netflix, select a French movie, and turn the subtitles on–voilà, French subtitles.  In the US, it’s not so easy–you can find French films on Netflix, but they’re subtitled in English.  So, I was very happy to discover the web site.  This site offers French films subtitled in French, free of charge.  The selection appears somewhat random.  I chose Les aventures de Rabbi Jacob (“The adventures of Rabbi Jacob”) tonight; here are some random words from the film.  I doubt that I would have picked them up without the French subtitles–thanks!

  • (une) usine: factory; “tight ship.”  The plot line crucially involves a factory full of big vats of chewing gum–I think you can see where this will end up.
  • le traître: traitor.  I thought it was worth including this one because of the similarity to the word traiteur, a word that occurs all over Paris and always confuses Americans.  It means not “traitor,” but “caterer,” or more generally, someone who sells pre-prepared foods.  In Paris, a traiteur almost always sells Chinese food, and traiteurs are all over Paris–everywhere.
  • dépanneur/dépanneuse: repairman/woman.

Frankly, I was too stunned by the brilliance of this movie–an avowed treasure of French culture, starring the comic genius Louis de Funès (you pronounce the final s of proper nouns ending with ès, by the way, or at least in general)–to be able to tear myself away from it enough to take notes, so let’s leave it at three words for today!  Just remember that somewhere in the world there is a web site where you can find free French movies–with French subtitles.

Excuse me, you can speak English question?

The entrance to a siheyuan residence in a hutong.  Photo from Wikipedia.

Life takes us to unexpected places sometimes, and at the moment, I am in Beijing (Pékin, in French).  I have a lot of conversations with people that go something like the following–if it’s underlined, it’s in Chinese:

Me: Excuse me, you can speak English question?

Other person: No. (Or, sometimes, in English: No.)

Me: (smile, walk away)

Beijing has a practically infinite number of huge buildings, but I have the good luck to be staying in a small hotel in a hutong.  A hutong is a small alley between siheyuan, or residences built around courtyards; it is also the word for a neighborhood made up of such alleys.  Beijing has had hutongs for maybe 700 years, and they are a traditional symbol of the city, although in recent decades, many of them have been demolished to make way for the huge buildings that compose much of the city today.

Lanluoguxiang, the hutong that I walk through on the way to Banchan, which is my hutong.

Walking through the hutong from the metro station to my hotel, I passed by any number of signs marking public toilets.  “How nice,” I thought to myself, having recently come from Paris, where a free public toilet is a rare treasure.  My roommate explained the reason for the large number of free facilities to me: the old residences don’t have indoor plumbing.  The atmosphere is definitely interesting–on any given evening, I might walk past people cooking in the alley, or drying their laundry, or just hanging around, shirts off and smoking cigarettes.  I haven’t quite worked up the nerve to try the public bathrooms yet…

Here are a couple of sentences from the French Wikipédia article about hutongs:

Un hutong (en chinois :  ; en pinyin : hútong) est un ensemble constitué de passages étroits et de ruelles, principalement à Pékin en Chine.

  • passage: passage, pathway.  There are other meanings, but that’s the relevant one here.
  • étroit: narrow, close; strict
  • la ruelle: little street; back alley; way, lane.  It also seems to mean the space between a bed and the wall, but I might be reading that wrong!
  • Pékin: Beijing.  Some of you might be old enough to remember that the English word used to be Peking (still seen in “Peking Duck.”)

Le nom Hutong est un mot mongol, qui signifie le puits (худаг, khudag).

  • le puits: well.

Some vocabulary from a different France

In America, we basically have two stock stereotypes of the French.

Stereotype #1: the French person who always wears black, hangs out in cafes and art cinemas, and spends their evenings at parties where people drink red wine and debate the relative merits of Bernard-Henri Lévy and Sartre. These are the people who, as Edmund White put it in his memoir of his years in Paris, Inside a Pearl, “keep up with the latest books and read the classics and [know] everything about serious music and the history of the cinema….”

Stereotype #2: the jolly baker, cafe owner, or farmer; spends their off hours gardening, drinking red wine, and possibly playing the accordion.

Clichy-sous-Bois, an infamous banlieue to the east of Paris. Note the “fuck the police” graffiti in English, to the right of the picture.

I’m sure that they both exist, although personally, I don’t know anyone of either sort.  There’s definitely another sort of French person, though.  We Americans are barely aware of them–we see them once in a while in news stories about riots, but that’s about it.  These are the people who live in the banlieues défavorisées (note: in English, banlieue always refers to these low-income, undesirable suburbs, but in French, the term is neutral and can refer to a nice area or a bad one–to specify one of the bad ones, say banlieue défavorisée), perhaps in HLM (Habitation à Loyer Modéré, “rent-controlled housing”).

Apartment complex in the Clichy-sous-Bois banlieue, from an article about urban renewal projects on

Millions of people live in the banlieues–there are around 12,000,000 people in the greater Paris metropolitan area, and about 80% of them live in the banlieues.  Add in the residents of the banlieues around big cities like Lyons and Marseilles, and the number really climbs.  This quote from a Wikipedia article will give you the general idea of where the banlieues défavorisées fit into France: Ever since the French Commune government of 1871, they were and are still often ostracized, considered by other residents as places that are “lawless” or “outside the law”, “outside the Republic”, as opposed to “deep France”, or “authentic France” associated with the provinces.  (Here’s the source that the Wikipedia article cites: Anne-Marie Thiesse (1997) Ils apprenaient la France, l’exaltation des régions dans le discours patriotique, MSH.)

I’m not totally clear on the history of the banlieues.  Wikipedia relates them to the urban growth policies of the Third Republic (see above).   Add to that the observation that hundreds of thousands of people were displaced during the massive renovation of Paris in the second half of the 19th century; there’s some disagreement about how many of these people might have been able to find new housing within the city, but it’s likely that many of them relocated to the banlieues, and particularly the poorer residents–I’ve read that rents in Paris tripled or something during this period.  In the 20th century, after the war, France allowed in huge numbers of immigrants to supply labor during a period of strong growth, and their descendants are heavily represented in the banlieues défavorisées today.

A scene from “Un Français,” a movie about a French neo-Nazi skinhead that is in the theaters right now.

Yesterday I went to the movies to see Un Français, a sort of “American History X” movie about a French neo-Nazi skinhead.  Today, I’ve been watching La Haine (hate, hatred, aversion), a film about three street kids whose friend has been injured in a riot.  These are movies about the people of the banlieues défavorisées, and the vocabulary that you come across in these movies is sometimes very different from what we learn in school. A number of these examples from the movies are in a form of slang called Verlan, but more about that another time. (I should point out that I had to get these translations from the web site and in some cases translate their monolingual definitions from French into English, because these words are mostly not in regular dictionaries.)

  • le keuf: policeman.
  • keufé: under police surveillance, I think.
  • le keum: guy.
  • le flingue: firearm.
  • foutre la merde: mess around.  (Literally: “to fuck the shit,” I think.)
  • le fric: money, “dough.”
  • le pote: buddy.
  • buter: a number of standard meanings, but in slang, it is “to kill, to bump off.”
  • se casser: a number of standard meanings, but in slang, it is “to leave,” as in casse-toi: “get out of here.”
  • relou: “heavy,” as in behavior.
  • le bédo: a joint. (Translation by phildange.)
  • enculé: As a noun, it’s something like “motherfucker,” but it’s much more interesting as an adjective, where it works something like “fucking” in English, but with the construction enculé de (examples from; you’ll have to look it up yourself, because I am editing out some incredibly foul instances!):
    • L’enculé d’Arthur Sellers a écrit les 156 épisodes “That fucking Arthur Sellers wrote 156 episodes”
    • Je vais me le faire cet enculé de bâtard “I’m going to get that fucking bastard”
    • Un putain d’enculé de kamikaze du Hamas l’a fait sauter dans une pizzeria “Fucking Hamas suicide bomber piece of shit blew him up in a pizza parlour”

An addition to the post from a commenter (lightly edited):

These slang words follow a pattern.  It’s a bit more complicated than it looks–here’s how it goes.

Keuf, for instance, is the verlan of “flic,” shortened.
Ke-fli (the “e” after “K” is added, as always, to create a whole syllable from “K”).  Then ke-fli is shortened, a trend followed by several different French slangs, to make “ke-f.”  The spelling “keu” is there to be sure people won’t pronounce it as kéf, as they would otherwise.
Keum is the same: mec, ke-m, keum.
Meuf for femme, too: me-fa, me-f, meuf.  (Of course, verlan is based on phonetics, not on spelling: as you can see in “relou” for lourd, the silent “d” doesn’t appear in verlan.)
My favourite French slang is le louchebem, originally a coded language for the butchers of les Halles.  Louchebem means boucher, butcher, in louchebem, and gives an example of how it works.
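
The commenter's recipe for verlan is mechanical enough to sketch as a toy program.  This is only a rough illustration of the steps described above, not a real model of verlan, which has plenty of irregularities.  The function name verlan, the idea of passing in two hand-split phonetic syllables, and the rule "respell e as eu" are all my own simplifying assumptions.

```python
def verlan(first, second, shorten=False):
    """Toy verlan: swap two rough phonetic syllables of a word.

    first, second: hand-split syllables, with an "e" already added after a
    lone final consonant (so "flic" is passed in as "fli" + "ke").
    shorten: if True, clip what was the first syllable down to its first
    consonant, then respell "e" as "eu" so it is not read as "é".
    """
    if not shorten:
        return second + first
    clipped = second + first[0]
    return clipped.replace("e", "eu", 1)


# The examples from the comment, with syllable splits chosen by hand.
print(verlan("fli", "ke", shorten=True))  # flic
print(verlan("me", "ke", shorten=True))   # mec
print(verlan("fa", "me", shorten=True))   # femme
print(verlan("lou", "re"))                # lourd (silent "d" ignored)
```

Run on those four inputs, the sketch reproduces keuf, keum, meuf, and relou, which is all it is designed to do.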

Letting, letting go, and dropping things in French


There are a number of verbs related to letting go of things in French, and I am always confused about when to use which one.  It gets additionally complicated when talking about dropping things, in which case you have to differentiate between dropping something intentionally and dropping something accidentally.  I’m going to try to figure it out in this post.  I’ll start with the verb lâcher.  (All of the photos in this post are subtitled frames from the movie La Haine, “The Hatred,” about young men in the banlieues défavorisées around Paris, for no particular reason other than that I happened to watch it yesterday.)

My Collins electronic dictionary has a number of translations for the verb lâcher. The first one applies when the object is something that is released, but doesn’t necessarily move:

“Go on, let go of me.”
lâcher: to let go of. The dictionary gives the example Il n’a pas lâché ma main. “He didn’t let go of my hand.”

Next, there is a meaning involving dropping where the focus is on the movement of the object:

“But every time that he holds out his [hand] to me, he drops his pants.”
lâcher: to drop. The dictionary gives the example Il a été tellement surpris qu’il a lâché son verre. “He was so surprised that he dropped his glass.”


“I can’t let you go in.”
Next, the verb laisser.  It has a number of meanings related to concepts that we would translate with the word leave in English.  Perhaps I find it confusing because one of the meanings is to “let” someone do something, and also, we “let go” in English, with a variety of meanings. Here are some translations and examples from the Collins French-English dictionary:

    • laisser qqch quelque part: to leave something somewhere.  J’ai laissé mon parapluie à la maison.  “I’ve left my umbrella at home.”
    • laisser qqn quelque part:
      • to leave someone somewhere.  J’ai laissé les enfants à la garderie.  “I left the children at the nursery.”
      • to drop someone off somewhere.  Laisse-moi ici, j’en ai pour cinq minutes jusqu’à la gare.  “You can drop me off here, it’ll only take me five minutes to get to the station.”
    • laisser [+ parent survivant]: to leave.  Il laisse une femme et deux enfants.  “He leaves a wife and two children.”
    • laisser (ne pas tout prendre): To leave, in the sense of not taking everything.  Laisse du rôti, tu n’es pas tout seul!  “Leave some meat, you are not the only one here!”
    • laisser qqn faire qqch: to let someone do something.  Laisse-le parler!  “Let him speak!” Il les a laissé torturer sans intervenir.  “He stood back and let them be tortured.”

It gets a bit more complicated when you’re talking about dropping things.  To translate the English verb to drop (something), we have a number of possibilities.  We saw one above, for dropping a glass.  There are a couple of forms that are made in combination with the verb tomber, meaning to fall. 

      “Let it drop. He thinks too much, that stupid bastard.”

    • laisser tomber: to drop intentionally.  The website gives the example Elle laissa tomber le ballon pour courir dans ses bras, which I believe means “She dropped the ball in order to run into his arms.”

    • faire tomber: to drop accidentally. gives the example Il a fait tomber ses clés sur le trottoir, which it translates as “he dropped his keys on the pavement.”

How to sound French in the summer of 2015

It is possible to do quite well in your French class in the United States without actually sounding very French.  In particular, there are a number of discourse connectives that are quite common in my workplace, but that we are not taught in school.  I spent a lot of this summer trying to get a handle on these expressions, all of which I found to be quite common.

En fait: One that threw me for quite a while was en fait.  There are a number of reasons for this, one of which is that it is not pronounced the way that you would expect.  Specifically, the t at the end is pronounced, as if it were spelt en faite.  (See here for a discussion about this divergence between spelling and pronunciation, specifically with respect to this expression.)  My Collins French-English dictionary translates it as “in fact, actually,” and gives the example En fait je n’ai pas beaucoup de temps “Actually, I haven’t got much time.”  As in the example, it seems to mostly show up at the beginning of sentences.  I have been working hard on using this one, and I estimate that I am now starting 35% of my declarative sentences this way–probably a bit much, but I need the practice.

En effet: This one has multiple uses, which adds to the difficulty of figuring it out.  My dictionary translates it as “indeed,” and gives these examples:

  • C’est plutôt risqué.  — En effet!  “That’s rather risky.  —  It is indeed!”
  • Je ne me sens pas très bien.  — En effet, tu as l’air pâle.  “I don’t feel very well.  — Yes, you do look pale.”
  • On peut en effet se demander si…  “We may indeed ask ourselves if…”
  • Il est assez arrogant, en effet.  “He is rather arrogant, you’re right.”

Ben oui: I’m still not clear on this one.  In fact, I don’t even know how to spell it.  I first started hearing it at a conference this summer, and suddenly it seemed to be everywhere.  It comes from bien oui; it can be used as a hesitancy marker, but I think also as an indicator of confidence in your assertion.  More in the future, if I ever get it straight.

Quand même: This is the grand-daddy of all of the French expressions that I don’t understand.  I found a video about it here, but still don’t begin to understand it, as it has at least four uses.  My office mate Brigitte uses it all the time, leading to lots of puzzlement on my part–something to work on.

So: master these four expressions, and you will sound totally French–at least to me, and at least for this summer!