In a previous post, we talked about the vocabulary of a number of different kinds of regression models. Today, let’s move on to mixed effects models. You can find an excellent discussion of mixed effects models here on Bodo Winter’s web site. There is also some coverage of mixed effects models in Harald Baayen’s book Analyzing linguistic data: A practical introduction to statistics using R. Stefan Gries’s book Statistics for linguistics with R: A practical introduction has a good critique of current applications of mixed effects models in linguistics.
In statistics, an “effect” is anything that might affect the values in a sample. If we are building a statistical model of crop yields, an example of an effect might be the weather. If we are building a statistical model of voice onset times (the time gap between when a consonant is released from the mouth and when the vocal cords start vibrating for a following vowel), an example of an effect might be whether or not the associated syllable is stressed.
We can talk about two kinds of effects: fixed effects, and random effects.
As explained by Bodo Winter, fixed effects are things with a systematic and predictable influence on your results. They exhaust the possible levels, even if “only” defined operationally. As Stefan Gries puts it, fixed effects cover all possible levels (values that a variable could take) in the population. In contrast, random effects (as explained by Bodo Winter) are generally something that can be expected to have a non-systematic, idiosyncratic, unpredictable, or “random” influence on your data. In linguistic experiments, that is often “subject” and “item.” As Stefan Gries puts it, random effects sample the population, rather than exhausting it. To see if you have this down, figure out if the following described fixed effects, or random effects:
- The Kukú language has voiced, voiceless, and implosive consonants. I have some of each in my experiment.
- The Kukú language has bilabial, alveolar, palatal, velar, labiovelar, and glottal stops. I have some of each in my experiment.
- For the purposes of my experiment, I am defining politeness as having two levels: casual, and formal. I have some of each in my experiment.
- There are almost 2,000 verbs in the CRAFT corpus. My student sampled 30 for a study of sentence plausibility.
- There are a number of labs in the Pharmacology department. My student recruited volunteers from one of them.
- There is an infinite number of sentences in Mousey Banana (a former co-worker’s favorite language name; more properly, it would be Musey Banana or Massa Banana; they are spoken in Chad and Cameroon), as there are in any language. Suppose that I took a sample of them for a study of intonation.
(Yes, these come straight from my statistics lecture notes.) The first three of those are all fixed effects. The last three are all random effects.
The beauty of mixed effects regression models is that they let you take random effects into account in building the model. With a standard regression model, random effects introduce variability that your model will not be able to account for–going back to the voice onset example (the time lag between a consonant and vocal cord vibration for the vowel), the voice onset time will be affected systematically by things like whether or not the syllable is stressed and where in the mouth the consonant is formed, but it will also be affected randomly by things like which speakers I happen to have chosen–even with all other things being equal, my voice onset times will not be exactly the same as yours. A mixed effects model lets you take these random effects into account.
With that background, we are ready for the French vocabulary that we need for talking about mixed effects models:
- un effet: effect.
- un effet aléatoire: random effect (the most common way of saying it).
- un effet de/du hasard: another way of saying random effect (less common).
- un effet fixe: fixed effect.
- les effets mixtes: mixed effects.
Now comes the hard question: how do you link together modèle and effets mixtes? Here’s the most common way of doing it:
- le modèle à effets mixtes: mixed effects model.
Here are some examples from the linguee.fr website:
- Nous proposons un modèle de régression spatial dans un cadre
général de modèles à effets mixtes pour résoudre le problème de l’estimation pour petits domaines. “A spatial regression model in a general mixed effects model framework has been proposed for the small area estimation problem.” (from statcan.gc.ca)
Ces analyses reposent sur des modèles physiologiques plus ou moins simplifiés et nécessitent des outils statistiques pluscomplexes comme la modélisation non-linéaire à effets mixtes. “These analyses rely on more or less simple physiological models and require more complex statistical tools such as non-linear mixed effects modelling.” (from digiteo.fr)
Développement de méthodes d’estimation des paramètres des modèles non linéaires à effets mixtes par maximisation des vraisemblances approchées. “Development of parameters estimation methods in the nonlinear mixed effects models with maximisation of approximated likelihood.” (from lemenuel.com))
Having beaten mixed effects models to death, I should point out that although they are very hot in American linguistics right now, they are not used as commonly in France at this time. I brought up a question about mixed effects models after a talk in France once, and was embarrassed when the speaker asked me to switch to English. It turned out afterwards that my French was OK–the speaker just wasn’t familiar with mixed effects models, and didn’t recognize the technical terms.