Estimate your vocabulary size

When you figure out how to draw a representative sample of language, notify the linguists, because we sure as hell haven’t figured it out…

Want to see an application of Zipf’s Law?  Go to this web site, where you can get an estimate of your vocabulary size in any of 21 different languages.  I don’t know the details of how they come up with these estimates, but as an indicator of their accuracy (or lack thereof), I can tell you that my percentile placement on their English-language test was the same as my percentile placement on the GRE (the exam that you take if you want to go to graduate school in the United States).  It estimated my French vocabulary size at just over 7,600, which seems reasonable–I would guess that I’ve been learning about 3,000 words a year for almost three years, of which I probably forget about a third due to not running into them a second time (Zipf’s Law: 50% of the language that you will run into today consists of words that occur only very rarely–but that do, indeed, occur), which would work out to just under 6,000; add in another 500 for the one semester of French that I took in college (“university” to you French-speakers in the audience) and you get just under 6,500, which is within about 15% of their estimate.  (According to the web site, this lands me in the top 44% or so–of whom?  No clue.)
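For the skeptical, here is the back-of-the-envelope arithmetic from the paragraph above as a small Python sketch. Every number comes from my own guesses in this post (with "almost three years" rounded up to three); none of it reflects how the web site actually computes anything. With these rounded inputs, the gap between my guess and theirs comes out nearer 15% than 10%.

```python
# Back-of-the-envelope check of my own vocabulary guess.
# All inputs are the rough guesses from the post, not measured values.
words_per_year = 3_000       # words learned per year (a guess)
years = 3                    # "almost three years", rounded up
retained_fraction = 2 / 3    # I forget "about a third"
college_words = 500          # one semester of college French (a guess)
site_estimate = 7_600        # "just over 7,600", per the web site

retained = words_per_year * years * retained_fraction
my_estimate = retained + college_words
gap = (site_estimate - my_estimate) / site_estimate

print(f"retained from self-study: {retained:,.0f}")
print(f"my estimate:              {my_estimate:,.0f}")
print(f"gap relative to the site: {gap:.1%}")
```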

How would you use Zipf’s Law to do this kind of estimate?  Remember what the curve described by Zipf’s Law looks like:
Zipf’s Law: a small number of words occur very frequently, while the vast majority of words occur very rarely–but, they do occur. Credit: @ASvanevik.
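If you want to see that shape for yourself, here is a minimal sketch in Python. The toy sentence and the function names (rank_frequency, zipf_prediction) are mine, invented for illustration; the idealized form of Zipf's Law used here, frequency roughly proportional to 1/rank, is the textbook version, and real texts only approximate it.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]

def zipf_prediction(top_count, rank, exponent=1.0):
    """Idealized Zipf curve: the count expected at a given rank."""
    return top_count / rank ** exponent

# A toy "corpus"; any list of word tokens would do.
toy_text = "the cat saw the dog and the dog saw the bird near the tree".split()
table = rank_frequency(toy_text)

top_count = table[0][2]
for rank, word, count in table:
    print(f"{rank:>2} {word:<5} observed={count} zipf={zipf_prediction(top_count, rank):.2f}")

# Words that occur exactly once: rare, but they do occur.
singletons = [word for _, word, count in table if count == 1]
print(f"{len(singletons)} of {len(table)} distinct words occur only once")
```

Even in a fourteen-word toy text, one word hogs the top of the curve and most of the distinct words show up exactly once.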

One way to use this to estimate a vocabulary size would be to figure out how far to the right (towards 100) a person can “go,” so to speak.  If someone can’t reliably understand words beyond a rank of, say, 20, there is a massive number of words that they don’t know.  On the other hand, if someone can reliably understand words in the 90-100 range, their vocabulary is enormous.  How do you turn enormous into an actual number?  I have no clue how they do that–quantifying vocabulary size is hugely difficult, and as far as I know, it’s not possible to do it precisely for anyone, even for very, very young children. The SWAG approach would be to figure out the rank at which you stop recognizing words reliably, and then count the number of words up to that rank (that is, all of the more frequent ones). The Devil would, of course, be in the details–what texts would you use to determine your curve? Load those texts heavily with scientific journal articles about linguistics and someone like me would probably do pretty well–load them heavily with scholarly analyses of metaphors for love in Finnish epic sagas and I would probably do pretty poorly. Use a representative sample, you say? When you figure out how to draw a representative sample of language, notify the linguists, because we sure as hell haven’t figured it out…
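Here is what that SWAG might look like as a Python sketch. To be clear: this is my guess at one way to do it, not a description of what the web site does. The function name estimate_vocabulary, the band size, and the 50% threshold are all hypothetical choices, and the ranks are assumed to come from some frequency-ranked word list that you would have to build from texts of your choosing (see above regarding the Devil and the details).

```python
def estimate_vocabulary(responses, band_size=1000, threshold=0.5):
    """Crude vocabulary estimate from a yes/no word-recognition test.

    responses: iterable of (rank, known) pairs, where rank is the word's
    position in a frequency-ranked list (1 = most frequent) and known is
    True if the test-taker recognized the word.
    Returns the upper edge of the last rank band, counting from the most
    frequent words, in which recognition stayed at or above the threshold.
    """
    bands = {}
    for rank, known in responses:
        band = (rank - 1) // band_size
        hits, total = bands.get(band, (0, 0))
        bands[band] = (hits + int(known), total + 1)

    cutoff = 0
    for band in sorted(bands):
        hits, total = bands[band]
        if hits / total < threshold:
            break  # recognition is no longer reliable from here on
        cutoff = (band + 1) * band_size
    return cutoff

# Made-up test results: (rank of the word, did I recognize it?)
responses = [
    (120, True), (850, True),
    (1500, True), (1900, False),
    (2400, True), (2700, True),
    (3100, False), (3600, False),
    (4200, False),
]
print(estimate_vocabulary(responses))
```

Note how much rides on the arbitrary choices: change the band size or the threshold and the "estimate" moves by thousands of words.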


Want to know some of the many technical details that make quantifying vocabulary size more or less impossible, even in principle?  See pages 22-28 of my colleague Elisabetta Jezek’s book The lexicon: An introduction.

English notes 

hugely: an adverb meaning “very.”  Is it English?  It first appeared in the language in the 12th century (along with archangel, asleep, dittany, lion, whoredom, and welkin–how cool is Merriam-Webster’s “Time Traveler” feature, and WTF is dittany??).  Have you ever come across it before?  Quite likely not–here are the relative frequencies of hugely and very:

Screen shot from the Google Ngram Viewer.

…but, it’s hard to argue that it’s not part of the language.

Want to see some cool shit?  Click on the version of the graph that you see below.  Do I REALLY think that this is cool?  Yes.  Is that fact related to the shockingly large number of times that I’ve been divorced?  I would imagine so.


Palimpsest upon palimpsest

Dear Dr. Zipf,

Good day to you.  My name is [name removed to protect the guilty] and I am a Ph.D. in [field removed to protect the guilty] at [a hospital which shall remain nameless].  I need to learn how to use natural language processing to process the electronic medical record and provide data that can be used for analysis.  As you are an expert in this field I thought I would email you and ask for your assistance.  Are there any books or training courses out there that can help me learn biomedical natural language processing in a few weeks.   Any help you can provide will be greatly appreciated.  Please let me know.

Warmest Regards,

[Name removed to protect the guilty]

In a few weeks… In a few weeks…

Dear Dr. X,

Biomedical natural language processing is super-simple, and I would be surprised if you couldn’t learn it in a few weeks.  You might find this book helpful:

Cohen, Kevin Bretonnel, and Dina Demner-Fushman. Biomedical natural language processing. Vol. 11. John Benjamins Publishing Company, 2014.

Please let me know if I can be of any further assistance.

Warmest Regards,

Beauregard Zipf, PhD

Dear Dr. X,

The doctoral students in our graduate program typically spend five years learning biomedical natural language processing.  Personally, I’ve spent my entire career learning biomedical natural language processing, beginning with spending a number of years as a medic in the military, where I learned the “biomedical” part. I mostly did physiological monitoring–hemodynamics, electrophysiology, stuff like that.  I later got a bachelor’s degree in linguistics (double major in English, actually), as well as a master’s degree in linguistics, and a PhD in linguistics, which is how I picked up the “language” part.  Along the way I learned to program–the hard way, which is to say by making more mistakes than you could possibly imagine, from the painful to the just plain embarrassing.  (That’s the “processing.”) Since then, I’ve spent years trying to figure this stuff out, and I still wouldn’t say that I know very much about it.  But, hey, you’ve got a PhD in [redacted], so, yeah–you should be able to pick this up in a few weeks.  You might find this book helpful:

Cohen, Kevin Bretonnel, and Dina Demner-Fushman. Biomedical natural language processing. Vol. 11. John Benjamins Publishing Company, 2014.

Warmest Regards,

Beauregard Zipf, Registered Cardiovascular Technologist, Advanced Cardiac Life Support instructor, EMT, PhD

Dear Dr. X,

  • Are there any books or training courses out there that can help me learn biomedical natural language processing

What an interesting question–thank you for bringing it up.  When I Googled the words biomedical natural language processing, the first hit I got was this:


Cohen, Kevin Bretonnel, and Dina Demner-Fushman. Biomedical natural language processing. Vol. 11. John Benjamins Publishing Company, 2014.


Looks like it might be relevant?


Best wishes,


Hi, Dr. X,

It’s nice to hear from you.  You might find this book helpful:


Cohen, Kevin Bretonnel, and Dina Demner-Fushman. Biomedical natural language processing. Vol. 11. John Benjamins Publishing Company, 2014.


Please let me know if I can be of any further assistance.
Best wishes,



English notes
palimpsest:  “writing material (such as a parchment or tablet) used one or more times after earlier writing has been erased” (Merriam-Webster).  Back in the day, writing was mostly done on parchment, and parchment was expensive, so in the monasteries that preserved much of the ancient writing that we have today, it wasn’t uncommon to scrape the ink off of parchment if you didn’t really care about what was written on it, and write something on it that you did care about.  If you’re lucky, though, today we can recover an earlier text from the impressions that it left behind on the parchment, and there are some texts that are only known from a palimpsest.  Wikipedia lists most of Cicero’s De republica, as well as the oldest Koranic variant in existence.

Speak to us of drinking, not of marriage

The feeling was like what gay friends have described to me when they first learned that they weren’t the only guys in the world who wanted to have sex with other men.

A Basque joke about the alleged difficulty of the Basque language: The Devil wanted to tempt the Basques to sin, so he decided to learn to speak Basque.  He quit after seven years, only having learned the word “no.”  The Devil did better learning Basque than I’ve done learning French, because I still don’t know how to say “no” in French.  My stumbling block: the second clause in a contrast.  My father speaks Portuguese, but I don’t.  We have pinot noirs in Oregon, but not Brouillies.

Jean Girodet’s magisterial Pièges et difficultés de la langue française to the rescue.  According to Girodet, the issue comes up in what he calls ellipticals.  In this situation, he says that literary language tends to prefer non, while the spoken language tends to prefer pas: 

Dans les tours elliptiques, la langue littéraire préfère en général non, la langue familière pas.

He gives these examples:

Non | Pas
Veut-on réformer la société ou non | Qu’il travaille ou pas, je m’en moque !
Il néglige son travail, moi non. | Elle aime le ski, moi pas.
Cette parole est d’un marchand et non d’un prince. | J’irai en voiture, pas à pied.
Il habite une villa, non loin de Cimiez. | Il tient un café, pas loin d’ici.
Il veut créer un art tout nouveau, pourquoi non ? | Partir tout de suite ? Pourquoi pas, après tout.

OK, good so far: you can use either, with non sounding more literary, and pas sounding more casual.  But, why do you occasionally run into both of them together??  Here’s a clear elliptical in Girodet’s sense of the word: the refrain of the song Parlez-nous à boire, “Speak to us of drinking (not of marriage).”  There are many recordings of it available (sometimes with minor differences in the lyrics), but my favorite du moment is this one from the film Southern Comfort.  Lyrics follow, from

Oh parlez-nous à boire, non pas du marriage
Toujours en regrettant, nos jolies temps passé

Si que tu te maries avec une jolie fille,
T’es dans les grands dangers, ça va te la voler.

Si que tu te maries avec une vilaine fille,
T’es dans les grands dangers, faudra tu fais ta vie avec.

Oh parlez-nous à boire, non pas du marriage
Toujours en regrettant, nos jolies temps passé

Si que tu te maries avec une fille bien pauvre,
T’es dans les grands dangers, faudra travailler tout la vie.

Si que tu te maries avec une fille qu’a de quoi,
T’es dans les grands dangers, tu vas attraper des grandes reproches.
Fameux, toi grand vaurien, qu’a tout gaspillé mon bien
Oh parlez-nous à boire, non pas du marriage.


Native speakers, can you help this poor, lost anglophone?  (Note: I’m guessing that jolies temps passé should be jolis temps passés, but what do I know?)

My source for the Basque joke: I don’t remember, but it’s probably one of Mario Pei’s many books.  Pei was a linguist who wrote tons of popular-press books about language between the 1930s and the 1970s or so.  Running across one of them in a used bookstore was the first time I ever heard of “linguistics.”  After a lifetime of mostly keeping quiet about my unending obsessions with language, the feeling was like what gay friends have described to me when they first learned that they weren’t the only guys in the world who wanted to have sex with other men.

Just in case you were wondering why your rabbit looks like it does

Black lab, yellow lab, chocolate lab, meth lab.

When I’m in the US, I live in the Wild West, and that means rabbits.  Where there are rabbits, there are probably man-eating rabbits, and I hate them.  So, the chart explaining rabbit coat coloration that you see above intrigued me–to survive the man-eating rabbits, you must be able to spot them, and you can’t always rely on seeing their long, sinister ears protruding from the grass, so you need to know their coat colors.  But, how do those particular genes explain the devilishly sly diversity of color and pattern that you see in the illustration?

For context, let me give you the rundown (as I understand it–bear in mind that I’m a linguist, not a geneticist) on Labrador retrievers:

  • Labs come in three colors: black, “chocolate,” and yellow.
  • Which color they are is determined by two genes.
  • One gene determines whether your hair is black or “chocolate.”
  • The other gene determines whether or not your hair has any pigment (think of pigment as the molecule that actually has the color) at all.
  • If you have the form of the gene (the “allele”) that allows your hair to have a color, then you will be either black or “chocolate” (assuming that you are a Labrador retriever).
  • If you have the form of the gene (the “allele”) that keeps your hair from having any pigment at all, then regardless of which form of the black-versus-chocolate gene you have, you will be yellow–yellow being what a Labrador retriever hair looks like if it doesn’t have any pigment deposited therein.
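For the programmers in the audience, the bullets above can be sketched as a tiny Python function. The allele letters (B/b for black versus chocolate, E/e for pigment versus no pigment) and the assumption that the black allele and the pigment-allowing allele are dominant follow the usual textbook presentation rather than anything I've said above, and the function name is made up. Remember: linguist, not geneticist.

```python
from itertools import product

def lab_coat_color(color_gene, pigment_gene):
    """Coat color of a Labrador from two genes, two alleles each.

    color_gene:   'BB', 'Bb', 'bB' or 'bb' (B = black, b = chocolate;
                  B is assumed dominant)
    pigment_gene: 'EE', 'Ee', 'eE' or 'ee' (E = pigment gets deposited;
                  E is assumed dominant)
    """
    if set(pigment_gene) == {"e"}:
        # No pigment deposited: the other gene no longer matters.
        return "yellow"
    return "black" if "B" in color_gene else "chocolate"

# Every combination of the two genes, and the coat that results.
for color_gene, pigment_gene in product(["BB", "Bb", "bb"], ["EE", "Ee", "ee"]):
    print(color_gene, pigment_gene, lab_coat_color(color_gene, pigment_gene))
```

Nine genotype combinations, three coats: the pigment gene can simply override whatever the color gene says.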

My point being: you don’t actually need to have a large amount of genetic variability to get a large amount of “phenotypic” variability (in this case, variability in appearance)–actually, very few things are affected by a single gene.  Rather, most traits are affected by a combination of a number of different genes.

OK, so: how do those rabbits come about?  They differ not just in their colors, but in the pattern of those colors.  Here’s a reasonable guess.

The odd data point in that graphic is the Himalayan.  Everybody else is monochrome, but the Himalayan has a color difference between his (I’m pretty sure that rabbits are generically male, probably due to the known viciousness of the man-eating variety–le lapin anthropophage in French, el conejo antropófago in Spanish, Lepus anthropophagos in Latin, I think, but I couldn’t swear to it) extremities and his…well, everything else.

A Siamese cat with a baby. Note that the cat is not eating the baby—as far as I know, there is no such thing as a man-eating Siamese cat. Picture source:

You’ve seen that pattern before–in Siamese cats, for instance.  My understanding is that the distribution–lighter towards the center, darker at the extremities–is related to reduced blood flow in said extremities.  The reduced blood flow gives you a reduced temperature, and that has some effect or another on the deposition of pigment.  (As I said, don’t quote me on this–I’m a linguist, not a Siamese cat expert.)  Looking at the rabbit that way, you wonder: OK, dark on the extremities and light on the rest, but which dark?  Which light?  Why doesn’t the rabbit have the same colors as a Siamese cat, for instance?  (Think of the evolutionary advantage for a rabbit who looked like a cat–it would be soooo much easier to get humans to take you in, in which case if you were the man-eating variety of rabbit, you could just gobble those overly-trusting humans right down.)

I went digging around for evidence for this explanation for the coloration patterns in Siamese cats.  I found a few papers on a group of related temperature-sensitive tyrosinase mutations that are associated with eye color differences in a range of Siamese cats and Himalayan mice and a rare mink discovered on a ranch in Nova Scotia–and with albinism in humans. (As an albino, your likelihood of going blind due to a lack of protective pigment in the iris and the retina is high–and that’s why we spend your tax dollars on studies of Himalayan mice.)  I found a paper on a temperature-sensitive tyrosinase mutation in a human with the following: white hair in the warmer areas (scalp and axilla) and progressively darker hair in the cooler areas (extremities) of her body. I haven’t tracked it down to the fur color question in Siamese cats, though.  Still think I just make this shit up?  Here’s the paper on the mink found on the ranch in Nova Scotia.  I mean, yeah, I make up the zombies and the man-eating rabbits–but, the rest of the stuff is “for reals,” as the kids say.


Look to the left, look to the right: if the colors in the figure are true to life, the Himalayan rabbit extremities are the color of the rabbit to the left, while the center is the color of the rabbit to the right.  (I am cursed to always remember a scene from an autobiography that I read when I was a kid.  The author has been arrested by the NKVD and finds himself in their notorious Lubyanka prison.  Whenever a prisoner is taken from one room to another, the machine-gun-toting guards intone step to the left, step to the right: attempt to escape.  The NKVD were murderous fuckers, and the threat was entirely believable.  Hence: look to the left, look to the right.)  Likely cause of the pattern of the Himalayan: temperature-dependent pigment deposition gradient of whatever pigment the chinchilla and albino rabbits have or do not have.

Yes, I have been known to spend my Saturday mornings looking for scientific literature on the topic of pigmentation deposition in Siamese cats when I could have been taking a walk in the beautiful fall weather.  This is probably related to why I get divorced so often.  French notes below–no English notes today.

French notes

le dépôt: deposition, in the sense of deposition of a substance.  This seems to be what would be used to talk about pigment deposition.  For example:  La synthèse et le dépôt de mélanines continuent jusqu’à ce que la structure interne ne soit plus visible, on parle alors de mélanosome de stade IV.

le gisement: deposit, in the sense of a deposit of minerals, of archeological finds, and the like.  I haven’t been able to find any examples of it being used in a medical or biological context to refer to deposition of pigments in the skin.

The same thing that we saw in Labrador retrievers: one gene for color, one gene for pigment deposition, and you get three kinds of coats. Faute d’orthographe: dépot should be dépôt.  Source: Bernadette Féry,
With the correct spelling dépôt: Deposition of exogenous or endogenous iron. Picture source: author unknown.
Picture source: Marc Durand,


Source: Alain Muret.


Matching Game III: Zombies and visa renewals

Today’s depressing vocabulary items are brought to you by Olivier Peru and Sophian Cholet’s magisterial bande dessinée Zombies. The non-depressing vocabulary item is a prerequisite for getting my French visa renewed. Don’t think that ANY of these vocabulary items are non-depressing? First World Problems, baby, First World Problems… (First World Problem explained in the English notes below.)


English notes

First World Problem: Something that could only count as a problem if the rest of your life is better than that of most people on the planet.  Examples of First World Problems that I’ve had recently:

  • When my long-awaited new iPhone finally showed up, it was the wrong color.
  • The Singaporean noodles in the United lounge came in really small containers–like, two mouthfuls.
  • I didn’t get surclassé (upgraded) on a cross-country flight.

The better I get at distinguishing my First World Problems from real problems, the happier I get, and I’m already the happiest person you know…

Emporter versus emmener

Two ways to say “to take” in French.


I am of the “write about what you DON’T know” philosophy, and I sure as hell don’t know how to speak French.  So: today, here are two words that native speakers of English (say, me) tend to have trouble with in French: emporter  and emmener.   They both can be translated as to take, but they get used in different contexts.

First, I recommend that you check out this video on the topic from the Learn French with Pascal YouTube series.  Pascal’s explanations are always clear, he always has good examples, and he will give you native speaker pronunciations.  For example, emmener can be pronounced with or without the medial e, and he demonstrates both of them.  Scroll down after you’ve watched the video, and I’ll give you a bunch of examples from the Sketch Engine web site.


Pascal’s take on these two verbs is that you use them as follows:

  • emmener in a situation where the thing being taken can move on its own.  He lists people and animals as the two kinds of things with which you would use emmener.
  • emporter when the thing that is being moved cannot move on its own–for example, a package.

Let’s see how this holds up in practice.  As we’ll see, it seems to be the case that these are more like heuristics than absolute rules; more probabilistic than deterministic.  In other words: the observations hold true more often than not, but there can be some variability.  To find these examples, I went to the Sketch Engine web site.  It allows you to search multiple corpora (singular corpus)–that is, collections of language that have been analyzed in some way.  I used two of them: the DGT French corpus, which is intended to support translations and therefore gives us English equivalents, and the frTenTen corpus, which contains 9.9 billion words scraped from the Web.  When I got my results back, I randomized their order so that I wouldn’t be biased towards any particular sets of documents.
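In case the randomizing step sounds mysterious, here is roughly what it amounts to, as a Python sketch. The function name shuffled and the sample lines are mine; this is not Sketch Engine's API, just what you might do with results once you have them in a list.

```python
import random

def shuffled(results, seed=None):
    """Return a new list with the results in random order.

    A fixed seed makes the order reproducible; the input is left untouched.
    """
    rng = random.Random(seed)
    copy = list(results)
    rng.shuffle(copy)
    return copy

# Stand-ins for concordance lines, grouped by source document.
hits = ["doc A, line 1", "doc A, line 2", "doc A, line 3", "doc B, line 1"]
for line in shuffled(hits, seed=2017)[:3]:
    print(line)
```

Search results tend to come back grouped by document, so reading only the first screenful would tell you more about a handful of documents than about the language.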

  • Objet: exemption de l’exigence d’emporter un document de transport et une déclaration du transporteur pour certaines quantités de marchandises dangereuses définies sous (n1).
    • Subject: Exemption from the requirement to carry a transport document and a shippers’ declaration for certain quantities of dangerous goods as defined in (n1).
    • Comment: these are documents, therefore not capable of moving themselves, therefore emporter.
  • Les voyageurs ne peuvent emporter dans leur bagage à main que des marchandises dangereuses destinées à leur usage personnel ou professionnel. 
    • Only dangerous goods for personal or own professional use are permitted to be carried in hand luggage.
    • Comment: we’re talking about dangerous goods of some sort, and apparently those dangerous goods do not include, say, tigers (which are capable of movement on their own), so: emporter.
  • Et au lieu d’emporter la pizza, j’ai eu envie de manger sur place, pour changer un peu…
    • Comment: it’s a pizza that’s being (or not) transported, therefore emporter.
  • Où est-ce que je nous ai emmenés 
    • Comment: the object pronoun is “us,” therefore the transportees are animate (alive), therefore they are capable of moving themselves, and therefore the verb is emmener.  
  • Indique-moi juste le chemin de ta villa, je t’y emmène.
    • Comment: the thing being taken somewhere can show something, so it is animate and sentient, so it can move on its own, so the verb is emmener.
  • La vie de Caroline est monotone, et sans surprise : chaque matin son père l’emmène à l’école, et le soir une étudiante pas très sympa vient la chercher.
    • Comment: Caroline is human, so she can move on her own, so the verb is emmener.
  • Sécuriser les appâts afin qu’ils ne puissent pas être emmenés par les rongeurs.
    • Secure bait blocks so that they cannot be dragged away by rodents.
    • Comment: I have no clue why this is emmener.  By Pascal’s rule, since the things being moved–les appâts–are not capable of moving themselves, this should be emporter.
  • Le véhicule est alors emmené au moteur jusqu’à l’enceinte de mesure, en utilisant au minimum la pédale d’accélérateur.
    • The vehicle is then driven to the measuring chamber with a minimum use of the accelerator pedal.
    • Comment: maybe this is emmener because a vehicle is capable of moving under its own power (so to speak)?
  • Dans les 5 minutes qui suivent l’achèvement de l’opération de préconditionnement décrite au paragraphe 5.2.1., le capot-moteur est fermé et le véhicule est emmené hors du banc à rouleaux et est parqué dans la zone d’imprégnation.
    • Within five minutes of completing the preconditioning operation specified in paragraph 5.2.1. above the engine bonnet shall be completely closed and the vehicle driven off the chassis dynamometer and parked in the soak area.
    • Comment: another example of emmener with a vehicle.
Perhaps “emporter” despite being animate because he’s being carried, rather than moving under his own steam? Source:
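To put a number on "more often than not," here is Pascal's rule as a Python sketch, run over the nine corpus examples above. The function name is hypothetical, and the True/False judgments about what can move on its own are mine. In particular, counting a vehicle as self-moving is a charitable guess, as noted in the comments above.

```python
def predicted_verb(can_move_on_its_own):
    """Pascal's heuristic: emmener for things that can move themselves,
    emporter for things that have to be carried."""
    return "emmener" if can_move_on_its_own else "emporter"

# (thing being taken, can it move on its own?, verb found in the corpus)
examples = [
    ("un document de transport", False, "emporter"),
    ("des marchandises dangereuses", False, "emporter"),
    ("la pizza", False, "emporter"),
    ("nous", True, "emmener"),
    ("te", True, "emmener"),
    ("Caroline", True, "emmener"),
    ("les appâts", False, "emmener"),
    ("le véhicule (au moteur)", True, "emmener"),   # judgment call
    ("le véhicule (hors du banc)", True, "emmener"),  # judgment call
]

exceptions = [
    thing
    for thing, can_move, observed in examples
    if predicted_verb(can_move) != observed
]
print(f"{len(examples) - len(exceptions)} of {len(examples)} examples fit the rule")
print("exceptions:", exceptions)
```

Code the vehicles as unable to move themselves and two more examples break the rule, which is the point: a heuristic, not a law.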

There are other verbs that refer to taking stuff places–apporter, amener, ramener–but this is about all my little head can handle for one day.  Native speakers: have at it in the Comments section, please!

French spelling errors I

If you’re a computational linguist, the sentence that your boss never wants to hear from you is this: we need to spend six months writing a program to fix the spelling errors in this @#$@#$% data. 

If you’re a computational linguist, the sentence that your boss never wants to hear from you is this: we need to spend six months writing a program to fix the spelling errors in this @#$@#$% data.  And yet: spelling errors or similar sources of unexpected inputs are a problem with every domain of computational science that I’m aware of.  Even super-highly-edited text has some residue of spelling errors and other problems.  For example, back in the days when there were still phone books, even they had a non-zero rate of spelling errors.  Not a high rate–but, not zero, either.

You don’t really believe that even when people are paying really, really, really close attention to how they write, they still screw up?  Read on.  I’ll come right out and admit that I’m not sure what the first word is in the picture below, but that’s not even what I’m talking about…


Matching Game II: Marseille and a bakery

Today’s vocabulary items are brought to you by the Netflix series Marseille (Gérard Depardieu is the coke-snorting mayor of the notorious southern port town–shenanigans ensue) and by the bakery where I usually stop for a coffee, a viennoiserie, and a cigarette before tackling the hill that I have to walk up to get to the lab.  I worry that many or most of the words that I learn from Marseille are words that I probably shouldn’t be using in public, but what’s a monolingual American to do?  The bakery vocabulary brings out some subtleties of rye bread that I never would have imagined.  You’ll notice a couple blanks, as there are a couple of words or expressions that I wasn’t sure how to translate–native speakers, can you help the rest of us out?

…and, yes: I mixed them up!

The two complaints of Americans in France: Part I

One of the things with the biggest effect on what language people will speak to you in Paris comes from the fact that if you’re a tourist, you’re mostly interacting with people in some sort of customer service role. 


Today is Wednesday, and Wednesday is market day in my neighborhood, and I need a liter of milk. Normally I would pop into the supermarket across the street for that kind of thing, but if you want good milk–and if you want to support the little things that make life here what it is–you get your milk from a cheesemonger.  (Cheesemonger explained in the English notes below.)  The Wednesday market has plenty of cheesemongers, so under the metro tracks I went (I’m right by the elevated portion of the #6 line), and a cheesemonger I found.  Bingo: lots of bottles of milk.  I got in line.

The two most common complaints that I hear from Americans who have visited Paris:

  1. Nobody there speaks English!
  2. I tried and tried to speak French with them, but everybody just answered me in English…

Contradictory, right?  How can they both be impressions that are shared by so many people?  Seriously: I can’t tell you how many times I’ve heard both of these complaints.  Actually, they both reflect the same truth: that what determines the language that people will use with you here is super-complicated.  Briefly: you have to think about which language will be used in the context of every single interaction that you have.  That interaction takes place with specific people trying to do specific things under a specific amount of pressure.  Those people come into those interactions with specific amounts of background in the two languages, and with specific amounts of tolerance for embarrassment.  One of the implications of this complicated interaction is that the same person may use a different language with you in different contexts; different people may use different languages with you in the same context.  This is so complicated that it will take multiple posts to explain–hence, the title of this post: The two complaints of Americans in France: Part I.

One of the things with the biggest effect on what language people will speak to you here comes from the fact that if you’re a tourist, you’re mostly interacting with people in some sort of customer service role.  The hotel desk clerk, the counter girl at the Monoprix (they’re almost all girls), and most of all, the waiter–these are people who have to deal with a lot of people, and deal with them quickly.  In a situation like this, people will use whatever language they think will be most efficient for interacting with you.  Your efforts to speak French are actually very much appreciated, but if that counter person or hotel clerk thinks that they’ll be able to take care of your needs and move on to taking care of the next person’s needs most quickly in English, then that’s what they’ll speak with you–if they can.  Not everyone here is functional in English (and why would we be??), but if they can, and if they’re in a hurry, they’ll speak English with you if your French isn’t up to a super-efficient interaction.

The lady in line in front of me chez the cheesemonger started asking questions–in English.  It was clearly her native language; it was clear that she was struggling to frame her questions simply and clearly–and slowly; and it was clear that the cheesemonger was not getting it, and was not happy.  A deep breath, eyebrows down, and a worried look on his face.  No problem–I speak English natively and I am passionné du fromage (crazy about cheese), so I jumped into the conversation.  The relative strengths of some bleus were discussed; the significance of Mont d’Or in the cycle of the French year was summarized–the cheesemonger was happy to talk about his wares, as long as he could do it in a language that was shared across both sides of the counter.  Euros were handed over, cheese was handed over in return, and the nice tourists went away, tickled with both the experience and the anticipation of some good cheese-eating.

I asked for, and received, my liter of milk.  On an impulse, I picked up a small St-Félicien. The cheesemonger handed me my bag–and a small, wrapped package.  A little something to thank you for the translation, he said.  Would he have been happy to speak English with these folks, if he could?  More than happy.  Was he worried that these non-French-speaking tourists were going to throw his entire waiting line into disarray?  Absolutely.  Did it all turn out fine, with no hurt feelings on anyone’s part?  Clearly.  A tiny little moment in the cheesemonger’s day, the tourists’ day, and my day–and yet, pretty illustrative of the complexities of the question of who will speak what language to you, under what circumstances.  That waiter who impatiently responds to your carefully-rehearsed-but-nonetheless-halting French in English?  If it weren’t the lunch rush, he might very well be up for having a long conversation with you about the rognons de veau à la sauce moutarde–in your halting French.  But, in the context of a busy lunch hour, he’s going to go with whichever language works out most efficiently for getting your order taken and moving on to the next table.

The small, wrapped package contained a cheese.  Just a little guy–I’ve included my sunglasses and key in the photo to provide some scale.  But, based on what I had ordered, this was a perfect choice–similar to the kind of cheese that he knows I like, ’cause I just bought some (a Saint-Félicien); but, different, in the subtle kinds of ways that lovers of French cheese savor (it’s probably a Saint-Marcellin or a Pélardon (I’ll know when I eat it)).  Scroll down for the English notes.  Sorry, no French notes today–gotta jump on the train to get my convention d’accueil so that I can RENEW MY VISA!  🙂


English notes

cheesemonger, fishmonger, hate-monger, war-monger: English has a number of words that end with -monger.  The basic meaning of this affix is that it is someone who sells something specific.  So, a fishmonger sells fish (there are a few of them in the market under the metro tracks; I understand that if they lop the head off of your fish for you, you’re supposed to tip them a euro), while a cheesemonger sells cheese.

You also see this affix in words referring to people who try to spread something amongst people.  A war-monger is a proponent of war; a hate-monger tries to get people to hate other people.  Scroll down to see examples of all of these in use; be aware that the spelling of these words can be variable with respect to whether or not they’re written as one word, and if they are written as one word, variable as to whether or not it’s hyphenated.
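That spelling variability is exactly the kind of thing that makes a computational linguist's life difficult, so here is a small Python sketch of one way to collapse the variants into a single form. The function name normalize_monger and the choice of the solid, lower-case spelling as the canonical one are mine, and the pattern only covers the three variants just mentioned (solid, hyphenated, and two words).

```python
import re

# Some stem, an optional hyphen or space, then "monger".
MONGER_PATTERN = re.compile(r"(\w+?)[- ]?monger", re.IGNORECASE)

def normalize_monger(word):
    """Map 'war-monger', 'war monger' and 'warmonger' to 'warmonger'.

    Returns None if the input is not a -monger word at all.
    """
    match = MONGER_PATTERN.fullmatch(word)
    if match is None:
        return None
    return f"{match.group(1).lower()}monger"

for variant in ["cheesemonger", "war-monger", "Fish monger", "hate-monger", "cheese"]:
    print(repr(variant), "->", normalize_monger(variant))
```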

The worst kind of war-monger, for my money–a guy who won’t fight, and whose kids won’t fight, either. (For context: I spent nine and a half years in the US Navy.) Source:
Fisherman buying fish on the way home...!