Metro sight of the day


Just another beautiful spring day in Paris.  Metro sight of the day: a one-eyed min-pin (miniature Doberman pinscher) being carried by a young guy–in one hand.  In the other: a 6-pack of beer–with one missing.

le Pinscher nain: miniature Doberman pinscher.  How you pronounce pinscher in French: I haven’t a clew.  (I know how to spell, at least in English–that’s British.)

Global warming: At least I’m messing up a better class of verbs

Pride comes before a fall, and sometimes the fall is worse than others.

Most mornings, I sit with my first cup of coffee and a stack of index cards and look up all of the words that I ran into the day before and didn’t know.  My 15 minutes or so of vocabulary every morning is a given–I typically learn about 10 new words a day, which means that despite having grammar that makes my French tutor shudder and an accent like fingernails on a blackboard, I know three ways to say “unremittingly.”

Everything else–conjugation, grammar, pronunciation–I rotate between.  Which is to say: I try to make sure that every week I spend a day on some new verb form, a new tense I don’t know, the order of double pronominal preverbal objects (my current bugaboo–il me le rend? Il le me rend?  FUCK), or something of that ilk.  Hence, I know lots of obscure things to say–but, I don’t necessarily know how to say them, if that makes any sense.

The other morning my plane landed in Paris after a long weekend in the US.  (A work thing, and then I surprised my father for his birthday.  We made fried matzah with schmaltz, which is to say: rendered chicken fat.)  On your first day in Europe, the challenge is to stay awake–fall asleep when you get off the plane and you’ll find yourself stuck in a décalage horaire-induced sleep-cycle disturbance that you won’t work your way out of for a week.  Sundays and Wednesdays it’s easy–there’s a market under the Metro tracks down the block, and getting out in the fresh air and sunshine is a good way to keep yourself moving and conscious.

On market days, I actually start not at the market, but at the fromagerie at the Dupleix metro station.  (Right outside the station was the spot where you were most likely to get taken to face the firing squad, at least as recently as 1871, the last date of which I’m sure.)  Although as an American, I had no clue about this ’til I got here, it turns out that cheeses have seasons; the first thing that I do when I get to Laurent Dubois is check the ardoise in the window to see what’s just come in.

This week: 3 “rare” cheeses.  Bleu du Nil, an obscure tomme, and something even more obscure that had already sold out.  Now, you’ll hear numbers about how many cheeses France has, but in truth, no one really knows how many cheeses France has.  Like the apocryphal Eskimo words for snow (that’s bullshit, by the way), some say 200, some say 300, some say 350…  In truth, there’s no way to know, because it’s not clear how to define “a cheese.”  In the limiting case, since every farmwife who still makes her own cheese is making a cheese unlike any other, the cheeses of France are essentially uncountable. (That’s not to say that there’s an infinite number–uncountable and infinite are different things.  I remember well being baffled by the idea of being countably infinite versus uncountably infinite as a graduate student.  As my wife of the moment said to me: Kevin, if you can’t wrap your head around this, you just can’t take any more math classes.  I thought that that was adorable, since I haven’t taken a math course since the obligatory algebra and trig course in college, and in fact am completely innumerate.)

But, back to the fromagerie.  My copy of Marie-Anne Cantin’s Guide de l’amateur de fromages (“Cheese-lover’s guide”) lists somewhere around 200 or so French cheeses, but it doesn’t list any of the cheeses that had come in this week, so I asked the adorable pixie-cut saleslady to tell me about them.  It developed that the name of one of them comes from the valley where the cows from whose milk it is made graze.  Except…she didn’t use the word graze, and I didn’t catch the word that she did use.  No problem–I recently learnt the verb to graze.  “Where they paissent?” …I asked, using the verb paître–a favorite of mine, because I love circumflex accents.  Seulement voilà, the only thing is: I’d never had the opportunity to use this delightful lexical item before, and I screwed it up.  I should have said paissent–but, my mind wandered off into the delights of that circumflex, and instead I said paîtent.  Which sounds like pètent…  Which means that I had just asked the nice lady if she were referring to where the cows fart.  Damn it.  Pride before a fall, and all that.  She had the good grace not to laugh.  At least, I think she didn’t–I was too embarrassed to look at anything but the floor.

In the English notes, we talk about the little-known English subjunctive.  The French notes are, of course, devoted to the verb paître.  The bleu du Nil comes from exactly one farm, in Brittany–see the picture above.  It’s delicious–as creamy as butter, with little bits of fenugreek.


English notes

Anglophones complain constantly about the French subjunctive.  Even French teachers get into it, commiserating with us about its chiant existence and teaching us ways to avoid it.  In reality, this most charming of the conjugations of the French language is not one that is completely foreign to us.  Although it’s not widespread, my dialect still has a subjunctive.  It’s easiest to see in the case of the verb to be.  Here’s how it showed up in this post:

I had just asked the nice lady if she were referring to where the cows fart.  

The subjunctive here is were.  You would expect was:

I had just asked the nice lady if she was referring to where the cows fart.

…and indeed, (a) you most certainly could say that, and (b) I would guess that most Americans would say that.  (I hate to guess, but I don’t have any statistics on this–sorry.)  You can find some exercises on the use of the subjunctive in English here, if you’d like to pursue this.  Be aware that there are some differences between American and British English in the use of the subjunctive–the Wikipedia page on the English subjunctive goes into them at some length.

French notes

Paître is the kind of delightfully irregular verb that I just adore.  Native speakers don’t seem to agree on whether paître and repaître (either, both, or neither of them) can be used for humans, or just for cows and the like; whether either, both, or neither of them can be transitive only, intransitive only, or both; or in which tenses the i gets its little chapeau chinois.  (From what I can tell, the Academy’s decision on this has not always been gracefully accepted.)  My Bescherelle maintains that (a) it doesn’t have any of the compound tenses, and (b) le participe passé pu, invariable, n’est utilisé qu’en termes de fauconnerie…. and if you can find a verb that’s cooler than that, I will buy you a beer–and if you’re a woman, I’ll marry you.

Three ways to say unremittingly: 

  • sans trêve
  • sans répit
  • sans cesse

Bon ménage

It’s amazing how many Republican politicians have gone down in flames over the years because they talked a lot of shit about immigration and then turned out to have an illegal housecleaner.  Some examples:

  • Meg Whitman, Republican candidate for governor of California, 2010
  • Andy Puzder, Trump pick for Labor Secretary, 2017
  • Tom Tancredo, long-time Republican congressman from Colorado and one of the worst hypocrites when it comes to pushing anti-immigrant policies and then hiring immigrants.  He bragged about turning in a high school student when an article about the student receiving an honors scholarship mentioned that he was in the US illegally–and then got busted hiring illegal immigrants to work on his mansion.

I’ll point out here that two of Bill Clinton’s nominees for Attorney General (the highest law-enforcement office in the United States) went down over illegal nannies–and I’ll also point out that unlike the Republicans, they were not hiring illegal immigrants while hypocritically talking trash about hiring illegal immigrants.  Of course, most past misdeeds seem less relevant under the Trump administration, which seems positively gleeful about being a bunch of crooks, bigots, and–I suspect we’ll soon know clearly–traitors.

On that note, here’s a nice post from the France Says blog on the subject of French vocabulary related to people who clean things.  Enjoy–and if you’re going to hire illegal aliens to work for you, have the grace not to build your career on talking about how bad they are!

Source: Bon ménage

Just because you’re a poet doesn’t mean you can’t kick ass

Operation Iraqi Freedom II
An LVTP7 amphibious assault vehicle. Picture source: USMC.

One day some decades ago, the amphibious assault vehicle in which I was riding around Camp Pendleton, California while we practiced assaulting hills and the like made an unplanned stop.  I reached into one of those voluminous pockets that military uniforms tend to be covered with and pulled out a book to read while the platoon leader tried to figure out where the fuck we were.  Whatcha reading, Doc?, some big, bulky Marine or another asked me.  (I was a medic in the Navy.  The US Marines don’t have their own medical personnel–they’re all provided by the Navy.  This came as a surprise to lots of young men who volunteered to join the Navy during Vietnam thinking that there was no better way to avoid finding yourself in a rice paddy with leeches on your scrotum and somebody shooting at you than working in a naval hospital–and then found themselves in a rice paddy with leeches on their scrota and somebody shooting at them.  Technically, the term for a Navy medic is hospital corpsman, but by long tradition, the Marines call us “Doc.”  But, back to Camp Pendleton…)

The social animal, I said.  Social psychology.  (You might think that I wouldn’t remember what I was reading in the early 1980s–but, the paperback fit perfectly in my left thigh pocket.  My right thigh pocket was for a bag of licorice.  You never know when you will/won’t get to eat, and licorice doesn’t leave your hands covered with melted chocolate.)  Social psychology…hm… I like to read about history, myself, said the big, bulky Marine.  The Wars of the Roses–that was some crazy shit…  The second lieutenant gave the staff sergeant an embarrassed smile and folded up his map; the big, bulky Marine and I climbed back into our hatches; and we all went back to assaulting whatever we were practicing assaulting–the Wars of the Roses would wait.  In the military, every branch has their stereotypical insults for the other branches, and everyone’s insult for the Marines is that they’re stupid, but I’ll tell you this: I know exactly two guys who dropped out of high school, joined the service, and then got a doctorate, and the one who isn’t me is a Marine.  (I don’t say “was” a Marine, because once a Marine, always a Marine, and they are, indeed, bad motherfuckers.  “Bad motherfucker” explained in the English notes below.)

You tend to think of poets as ethereal, wispy types who are super-sensitive and probably wouldn’t be the person you would want to cover your back if you got into a fight in a metro station.  However, if you’ve been paying attention to the stuff that we’ve been reading for National Poetry Month, you’re already aware that there are plenty of counter-examples to that.  Case in point: Guillaume Apollinaire.  He may or may not have been sensitive, but he was definitely a serious scrapper.  He tried to join the army when the First World War came to France in August 1914, but was turned away due to not being a French citizen.  No problem–he left Paris and headed south-east to Nice and tried again, this time successfully.  He was initially assigned to an artillery unit, but this wasn’t hard-core enough for him, so he got himself transferred to a decimated infantry unit, picking up a promotion to second lieutenant in the process.  (That’s a very low rank for an officer, but for an enlisted man to get promoted to it is a pretty big deal.)

Calligramme. Public domain.

Apollinaire was one of the greats of French poetry; if you’ve only heard of one French poem, it was probably his Le pont Mirabeau.  One of his innovations was his role in the development of what’s known as “concrete poetry.”  It is “concrete” in the sense that not just its linguistic elements but also its typographic shape are essential to the poem.  The one to the left is my favorite of his works in this genre.  In the form of the Eiffel Tower, the words translate something like this:

Hello, world of which I am the eloquent tongue.  Oh Paris, may your tongue stick out, and stick out always, at the Germans.  

Guillaume Apollinaire. Public domain.

Now, being poetry, it is, of course, a bit more complicated than that.  What I’ve given here as “may your tongue stick out” comes from a volume of translations of Apollinaire by Anne Greet and S.I. Lockerbie that I like.  “To stick one’s tongue out” is a plausible translation of tirer here, but it’s not necessarily the most obvious one.  Certainly it fits with the facts that (a) Apollinaire refers to la langue éloquente, “the eloquent tongue,” and (b) the Eiffel Tower does have a tongue-like shape.  But, given that this was written by a guy who was putting his life on the line in the trenches at the time, I tend to think that he was playing on another meaning of the verb tirer: to fire a weapon.  For a poet in an infantry unit, the metaphor of the mouth as a weapon (que sa bouche…tire et tirera toujours aux Allemands) is certainly an apt one.

A cardiac catheterization lab. Picture source: Children’s Hospital of Philadelphia.

The Navy eventually sent me to school, and I finished my time in the service in a cardiac catheterization lab, which over the course of some rather bizarre decades led to me being a faculty member at a medical school, where I specialize in biomedical language.  Apollinaire caught a shell fragment in the temple (when a bombardment started while he was reading a literary magazine, they say); although he survived trepanning, he never fully recovered, and in his weakened condition, died in the flu epidemic of 1918.  Whenever I visit the Panthéon, I take a moment to slip away from my friends and find his name on the (long) list of writers who gave their lives for France–and to pay my respects.


English notes

bad motherfucker: One of the cute things about American English is that bad–and similar words, depending on the region of the country that you’re in–can have positive connotations.  (Connotation is the cultural meaning of a word, as opposed to its denotation, which you could think of as its “dictionary meaning.”  Connotation and culture both start with a C; denotation and dictionary both start with a D.  That’s how I remember them, at any rate.)

So: a bad motherfucker is someone who is really tough, with some implication that this toughness involves fighting.  You would want to be called a bad motherfucker.  When I was a kid, it was common to use bad to mean something like cool, impressive–our favorite bands were “bad,” a nice leather jacket was “bad,” etc.

The spelling of “smart” as “smaht” here is important, in that it’s meant to reflect the stereotypical regional pronunciation of the Northeast US, where this use of “wicked” comes from.

Less common, but incontestably much cooler, is the use of wicked to mean “very” in front of an adjective, especially one with a positive meaning.  I believe it’s a Northeast thing, although I’ve seen it as far west as Oregon.  Scroll down for lots of examples.


Danse macabre: the illustrated version

The Larousse version of Baudelaire’s “Les fleurs du mal.” Picture source: me.

Being old, bald, and fat, I don’t get a lot of admiring glances when I ride the train to work in the mornings.  I do, however, get a lot of funny looks when I pull out a book to read.  The reason: I’m fond of reading French literature, but I tend to read it in the sorts of annotated versions of a work that you would read if you were a middle-school student in France (collégien in French, I think–roughly 7th and 8th grades in the American system).  For me, they’re perfect–they have definitions in simple French of the kinds of words that the editors think will be difficult for a French child, which as a non-native speaker, I have trouble with myself.  (Think back to the footnoted versions of Shakespeare that you read in high school and college.)  If this kind of thing interests you, you can find them used by the score (see this post for an explanation of what by the score means) in boxes in front of the Boulinier bookstore on boulevard Saint Michel in the Quartier Latin.  They’re so cheap–typically one euro–that there’s no reason not to buy multiple versions of a play that you’re planning to see.  (17th-century French theater is actually probably more intelligible than Shakespeare is in English, although as is the case with Shakespeare, it’s a good idea to read a play before you go see it.)  I find it interesting to see the contrast between the sorts of things that one would (not) dare to teach middle-school students in the US and the sorts of things that one can teach middle-school children in France–definitely edgier in France.

In honor of National Poetry Month, here’s some Baudelaire, from Les fleurs du mal.  Baudelaire popularized poetry about cities, as opposed to nature, glorified ad nauseam by Romanticism.  In his delightful book The flâneur, Edmund White describes him as “the great apostle of dandyism,” which explains a lot about the picture of him that you see below.  Odd 6-degrees-of-separation stuff: he went to high school across the street from the university where my grandfather would later study.

Danse macabre

Charles Baudelaire

A Ernest Christophe

Picture source: https://goo.gl/BVRvlb

Fière, autant qu’un vivant, de sa noble stature,
Avec son gros bouquet, son mouchoir et ses gants,
Elle a la nonchalance et la désinvolture
D’une coquette maigre aux airs extravagants.

Note the inversion that moves un soulier pomponné, joli comme une fleur to the end of the sentence, indicated only by the relative marker que rather than qui.

s’écrouler: to fall, e.g. le mur s’est écroulé, s’écrouler sur le canapé.

Vit-on jamais au bal une taille plus mince ?
Sa robe exagérée, en sa royale ampleur,
S’écroule abondamment sur un pied sec que pince
Un soulier pomponné, joli comme une fleur.

la ruche: a strip of pleated cloth (see picture above)

lascif: sensual, lascivious

lazzi: jibes, ribbing

appas: “charms”

La ruche qui se joue au bord des clavicules,
Comme un ruisseau lascif qui se frotte au rocher,
Défend pudiquement des lazzi ridicules
Les funèbres appas qu’elle tient à cacher.

frêle: fragile, frail

attifé: dressed, not necessarily well

Ses yeux profonds sont faits de vide et de ténèbres,
Et son crâne, de fleurs artistement coiffé,
Oscille mollement sur ses frêles vertèbres.
Ô charme d’un néant follement attifé.

Note ivre here and enivré later.

armature: framework; also the underwiring of a bra, although I don’t know whether or not that sense was current in Baudelaire’s time

Aucuns t’appelleront une caricature,
Qui ne comprennent pas, amants ivres de chair,
L’élégance sans nom de l’humaine armature.
Tu réponds, grand squelette, à mon goût le plus cher !

éperonner: to spur, to spur on; also to ram

encor: an old literary spelling of “encore”

Viens-tu troubler, avec ta puissante grimace,
La fête de la Vie ? ou quelque vieux désir,
Éperonnant encor ta vivante carcasse,
Te pousse-t-il, crédule, au sabbat du Plaisir ?

Au chant des violons, aux flammes des bougies,
Espères-tu chasser ton cauchemar moqueur,
Et viens-tu demander au torrent des orgies
De rafraîchir l’enfer allumé dans ton coeur ?

aspic: asp

errer: to wander, roam, rove

Inépuisable puits de sottise et de fautes !
De l’antique douleur éternel alambic !
A travers le treillis recourbé de tes côtes
Je vois, errant encor, l’insatiable aspic.

Love the ne expletif after craindre!

Pour dire vrai, je crains que ta coquetterie
Ne trouve pas un prix digne de ses efforts ;
Qui, de ces coeurs mortels, entend la raillerie ?
Les charmes de l’horreur n’enivrent que les forts !

gouffre: gulf, chasm, abyss

Le gouffre de tes yeux, plein d’horribles pensées,
Exhale le vertige, et les danseurs prudents
Ne contempleront pas sans d’amères nausées
Le sourire éternel de tes trente-deux dents.

Pourtant, qui n’a serré dans ses bras un squelette,
Et qui ne s’est nourri des choses du tombeau ?
Qu’importe le parfum, l’habit ou la toilette ?
Qui fait le dégoûté montre qu’il se croit beau.

bayadère: sacred dancer from India

gouge: old word for a prostitute

offusqué: offended

musqué: musky

Bayadère sans nez, irrésistible gouge,
Dis donc à ces danseurs qui font les offusqués :
” Fiers mignons, malgré l’art des poudres et du rouge,
Vous sentez tous la mort ! Ô squelettes musqués,

Antinoüs: according to the footnotes in my middle-school-student version, jeune esclave d’une beauté parfaite, qui était le favori de l’empereur Hadrien

flétri: faded (beauty), withered, wilted (like the roses sitting on my table–I really need to toss them)

dandy: in Baudelaire, this is a compliment, as you might guess from the painting of him at left

glabre: clean-shaven, smooth-skinned (WordReference.com)

lovelace: séducteur pervers et cynique, according to the footnotes in my middle school version of the poem

chenu: white-haired from age

le branle: a kind of dance.  (If you are French: you can just imagine what happens when you try looking for videos of this on YouTube)

Antinoüs flétris, dandys, à face glabre,
Cadavres vernissés, lovelaces chenus,
Le branle universel de la danse macabre
Vous entraîne en des lieux qui ne sont pas connus !

se pâmer: to faint; to swoon, either literally or in a state of strong emotion, whether good (with synonyms délirer, exulter, se griser, s’émerveiller, s’enthousiasmer, s’exalter, s’extasier) or bad (elle s’est pâmée de douleur).

béant: gaping, wide open, cavernous

le tromblon: blunderbuss

Des quais froids de la Seine aux bords brûlants du Gange,
Le troupeau mortel saute et se pâme, sans voir
Dans un trou du plafond la trompette de l’Ange
Sinistrement béante ainsi qu’un tromblon noir.

la contorsion: contortion, but also “a face” in the sense of “to make a face”

En tout climat, sous tout soleil, la Mort t’admire
En tes contorsions, risible Humanité,
Et souvent, comme toi, se parfumant de myrrhe,
Mêle son ironie à ton insanité ! “

What it means to be an iguana: the Jaccard index

In the end, what does it really mean to be an iguana, and how could you tell?

The big thing in language these days is distance-based representations of semantics.  The idea is that the meaning of a word can be discussed in terms of its closeness to, or distance from, other words.

How the hell would you measure that?  Current approaches to distance-based semantics are based on something called the distributional hypothesis: the idea that a word’s meaning is, in essence, the set of words that it occurs with.  (With which it occurs, if you prefer.)  When you have sets, you can calculate the distance between (or closeness between–it doesn’t matter what you call it) those sets.  I’ll give you an example of this in which we’ll use a distance metric (metric, in this case, means a number that measures something) called the Jaccard index.  It’s based on counting the number of things that two sets have in common and then adjusting it with respect to the total number of things in the sets.

Let’s walk through the intuitions behind the Jaccard index.  The first intuition: the more things that you share with another set, the more similar to that set you are.  Let’s think about two sets of words:

Set 1 fur eat pet play ball
Set 2 fur eat pet sleep mouse

What do those two sets share?

  1. fur
  2. eat
  3. pet

That’s three things.  Now let’s look at Set 1 again, versus a third set:

Set 1 fur eat pet play ball
Set 3 scales eat sun sleep climb

How many things do they share?

  1. eat

Based just on the counts of things that these three sets have in common, you might say that Set 1 and Set 2 are the most similar to each other, since they have the most things in common.
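
If you'd like to see that counting step spelled out, here is a minimal sketch using Python's built-in sets. The variable names are mine, made up for illustration.

```python
# The three example sets from above (variable names are illustrative).
set_1 = {"fur", "eat", "pet", "play", "ball"}
set_2 = {"fur", "eat", "pet", "sleep", "mouse"}
set_3 = {"scales", "eat", "sun", "sleep", "climb"}

# The & operator gives the intersection: the items found in both sets.
shared_1_2 = set_1 & set_2
shared_1_3 = set_1 & set_3

print(sorted(shared_1_2), len(shared_1_2))
print(sorted(shared_1_3), len(shared_1_3))
```

Sets ignore order and duplicates, which is exactly what we want when all we care about is whether a word shows up or not.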

Now, it’s a bit more complicated than this.  Think about these two pairs of sets, and tell me which you think is closer: Set 3/Set 4, or Set 1/Set 2?  Here’s Set 1/Set 2 again:

Set 1 fur eat pet play ball
Set 2 fur eat pet sleep mouse

…and here’s Set 3/Set 4:

Set 3 scales eat sun sleep climb
Set 4 scales eat sun sleep climb strike hiss bird molt brumate

To brumate: similar to hibernating, but the state of dormancy is not as deep.

Set 1/Set 2 share 3 things.  Set 3/Set 4 share even more–5 things.  But, how much more similar does that make them?  I’m going to suggest that it’s not as much as you might think.  The reason that I’m saying this is that the fact that Set 3 and Set 4 share as much as they do has to take into account the fact that Set 4 has more things in it than any of the other sets have.

Picture source: https://goo.gl/zjuxaC

How can we take this difference in the set sizes into account?  We’ll do something called “normalizing” the count of the things that they share: we’ll make it relative to the sizes of the sets that we’re comparing.  How we’ll calculate the sizes of the sets: we’ll count up the total number of words that you would get if you added both sets of words together, and only counted each unique word one time.  We’ll go back to Sets 1 and 2:

Set 1 fur eat pet play ball
Set 2 fur eat pet sleep mouse

What are the unique words in the combination of both sets?

  1. fur
  2. eat
  3. pet
  4. play
  5. sleep 
  6. ball
  7. mouse

There are 10 total words in the two sets, but if you only count each word once–each unique word, that is to say–you have 7.  Now let’s look at 3 and 4, this time counting the unique words that are found in the combination of the two sets:

Set 3 scales eat sun sleep climb
Set 4 scales eat sun sleep climb strike hiss bird molt brumate
  1. scales
  2. eat
  3. sun
  4. sleep
  5. climb
  6. strike
  7. hiss
  8. bird
  9. molt
  10. brumate

To normalize the number of things that two sets of things have in common by the total number of types of things in the set, we divide the number of things that they have in common by the total number of things.  So, for Set 1 and Set 2:

3 things in common / 7 types of things = 0.43

For Set 3 and Set 4:

5 things in common / 10 types of things = 0.50

…and those are the Jaccard indexes for Set 1 and Set 2, and for Set 3 and Set 4.
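
The whole recipe (shared things divided by unique things) fits in a few lines of Python. This is only a sketch: the function name `jaccard` is my own, and returning 1.0 for two empty sets is a convention that I'm assuming here, not something that the definition settles.

```python
def jaccard(a, b):
    """Number of shared items divided by number of unique items overall."""
    a, b = set(a), set(b)
    union = a | b                      # every unique item, counted once
    if not union:
        # Two empty sets: calling them identical is an assumed convention.
        return 1.0
    return len(a & b) / len(union)     # shared items / unique items


set_1 = {"fur", "eat", "pet", "play", "ball"}
set_2 = {"fur", "eat", "pet", "sleep", "mouse"}
set_3 = {"scales", "eat", "sun", "sleep", "climb"}
set_4 = {"scales", "eat", "sun", "sleep", "climb",
         "strike", "hiss", "bird", "molt", "brumate"}

print(f"Set 1 / Set 2: {jaccard(set_1, set_2):.2f}")   # 3 shared / 7 unique
print(f"Set 3 / Set 4: {jaccard(set_3, set_4):.2f}")   # 5 shared / 10 unique
```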

Let me give you one more pair: Set 1 and Set 1.  If you calculate the similarity between a set and itself, you get a value of 1.0.  What you should take from that is that the range of values for the Jaccard index is from 0.0 to 1.0.  Knowing that, you have a point of reference: if the Jaccard index is close to 1.0, then the two things are very similar (because identical things give you a Jaccard index of 1.0).  On the other hand, if two things are very different, then you’ll have a Jaccard index that’s close to zero.  This might seem obvious, but imagine if there were no upper limit on how big the Jaccard index could get.  What would 20 mean?  What would 4,808 mean?  Who the hell knows?  Metrics in the range of 0.0 to 1.0 are the ginchiest.

So, now that we have a way to quantify the similarity (or difference) between two sets based on the quantity of things that they share, normalized by the total quantity of things: suppose that those things are the other words that some word occurs with.  If you replace the names of the sets like this:

  1. Set 1: dog
  2. Set 2: cat
  3. Set 3: iguana
  4. Set 4: snake
We’ve looked at one measure of similarity/difference–the Jaccard index–but, there are others. Here’s a small sample. Picture source: http://images.slideplayer.com/24/6982424/slides/slide_54.jpg

…then you could imagine the words in those sets being the words that dog, cat, iguana, and snake occur with.  When we calculate our numbers, we end up with dog being more like cat than it is like iguana or snake.  In contrast, our numbers are consistent with the idea that iguanas are more like snakes than they are like dogs or cats…and that’s one way that you can think about quantifying the similarities between the meanings of words.
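
To check that claim for every pair, and not just the two pairs that we worked by hand, here is a small sketch that scores all six combinations. The `contexts` dictionary is just the four toy sets from this post with the animal names attached; nothing here comes from real text.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared items divided by unique items (both sets assumed non-empty)."""
    return len(a & b) / len(a | b)


# The four toy sets from above, relabeled with the words they stand for.
contexts = {
    "dog": {"fur", "eat", "pet", "play", "ball"},
    "cat": {"fur", "eat", "pet", "sleep", "mouse"},
    "iguana": {"scales", "eat", "sun", "sleep", "climb"},
    "snake": {"scales", "eat", "sun", "sleep", "climb",
              "strike", "hiss", "bird", "molt", "brumate"},
}

# Score every pair of words once.
scores = {
    (w1, w2): jaccard(contexts[w1], contexts[w2])
    for w1, w2 in combinations(contexts, 2)
}

# Print the pairs from most similar to least similar.
for (w1, w2), score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{w1:<8}{w2:<8}{score:.2f}")
```

With these toy sets, iguana/snake and dog/cat come out on top, and dog/snake lands at the bottom.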

John Rupert Firth. Picture source: Public domain, photographer unknown.

These particular kinds of distributional representations of word meanings go back to the 1950s and the work of John Firth, who famously (OK: famously among linguists) said that “you shall know a word by the company it keeps.”  Distributional representations suddenly became popular in the language processing world (surprising, to some extent, because the language processing world is populated much more by computer scientists than by linguists) a few years back, for two reasons:

  1. Thanks to the Internet, we now have access to quantities of textual data that are big enough to be able to calculate reliable quantities–you need a lot of data to actually make this kind of approach work.
  2. People have recently had some success with figuring out ways to do calculations of these numbers in ways that are efficient enough to be able to handle those enormous quantities of data without bringing every supercomputer in the world to its knees.  If you tried to do something like this naively, you would be calculating the similarity between every word and every other word; no one actually knows how many words there are in (to take one example) English, but you’re probably talking about a table with 10,000,000,000 cells in it.  (If you could do one calculation a minute, it works out to a bit over 19,000 years.)  A few years ago people came up with a couple of ways of reducing that number drastically, and that makes it practical to do the calculations and to store their results.  Now my laptop can crunch the numbers for a few million words or so worth of text overnight.
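
The back-of-the-envelope arithmetic for that naive table is easy to check. The 100,000-word vocabulary below is an assumption, picked only because its square is the 10,000,000,000-cell table mentioned above; as I said, nobody knows the real number. At one calculation per second the table would take about three centuries to fill, and at one per minute about nineteen millennia.

```python
vocabulary_size = 100_000          # assumed, for illustration only
cells = vocabulary_size ** 2       # every word compared with every word

minutes_per_year = 365.25 * 24 * 60
seconds_per_year = minutes_per_year * 60

years_at_one_per_second = cells / seconds_per_year
years_at_one_per_minute = cells / minutes_per_year

print(f"{cells:,} cells")
print(f"one calculation per second: about {years_at_one_per_second:,.0f} years")
print(f"one calculation per minute: about {years_at_one_per_minute:,.0f} years")
```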

We’ve talked about calculating the Jaccard index today (shared things divided by total things), and calculating it on the basis of words.  That’s a very straightforward way of doing this–the Jaccard index is the simplest distance metric that I know of, and words are the easiest things to count.  However: words are actually much more difficult to count in real life–or even to define–than they seem to be in the examples that we looked at, and there are lots of other things that one could count that might work out better.  There are also different ways to define what counts as “occurring together.”  To give you some examples of the kinds of questions that you need to think about in doing this kind of thing:

  1. Words: What is a unique word?  Do you want to count Dog and dog as the same word, even though one starts with an upper-case letter, and the other starts with a lower-case letter?  Do you want to count reproduisisse, reproduisît, reproduisissions, reproduisissiez, and reproduisissent (the forms of the imparfait du subjonctif of the French verb reproduire, “to reproduce”) as the same word?  How about pet peeve and bête noire–do those count as one word, or two?  Do you want to count bete noire as the same word as bete noir, bête noire, and bête noir?  (More generally: do you count an incorrectly spelt word with its correctly-spelled equivalent?  If so: how the hell do you spell-correct the Internet?)
  2. Things to count: Do you want to count $1, $2.25, 50%, and 75% as 4 different things?  Maybe you want to consider them all as numbers, in which case there is just 1 “thing?”  Maybe you want to count $1 and $2.25 as prices, and 50% and 75% as percentages, in which case there are 2 things?
  3. What “occurring together” means: Is it occurring in the same sentence?  The same newspaper article?  The same book?  Maybe it means occurring within two words to the right or within two words to the left–i.e., occurring within the four surrounding words?
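
To make those three choices concrete, here is a minimal Python sketch.  Nothing in it comes from a real library beyond the standard re module: the function names (normalize, tokenize, contexts, jaccard), the placeholder tokens <PRICE> and <PERCENT>, the crude regular-expression tokenizer, and the toy text are all assumptions that I made up for illustration.  It treats “occurring together” as “within two tokens to the left or right,” and it happily lets that window run across sentence boundaries, which is itself one of the decisions you would have to make.

```python
import re


def normalize(token, fold_case=True, bucket_numbers=True):
    """Apply two of the choices above: number classes and case folding."""
    if bucket_numbers:
        if re.fullmatch(r"\$\d+(\.\d+)?", token):
            return "<PRICE>"      # hypothetical placeholder for any price
        if re.fullmatch(r"\d+(\.\d+)?%", token):
            return "<PERCENT>"    # hypothetical placeholder for any percentage
    return token.lower() if fold_case else token


def tokenize(text, **options):
    """Crude tokenizer: prices, percentages, plain numbers, or runs of letters."""
    raw = re.findall(r"\$?\d+(?:\.\d+)?%?|[^\W\d_]+", text)
    return [normalize(tok, **options) for tok in raw]


def contexts(tokens, target, window=2):
    """Set of tokens found within `window` positions of any `target` token."""
    found = set()
    for i, tok in enumerate(tokens):
        if tok == target:
            found.update(tokens[max(0, i - window):i])      # left neighbours
            found.update(tokens[i + 1:i + 1 + window])      # right neighbours
    return found


def jaccard(a, b):
    """Shared things divided by total things; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy text, invented for this example.
toy = tokenize("The dog chased the ball. The cat chased the mouse.")
print(jaccard(contexts(toy, "dog"), contexts(toy, "cat")))

# How many "things" are there?  It depends on the choices you make.
print(len(set(tokenize("Dog dog $1 $2.25 50% 75%"))))
print(len(set(tokenize("Dog dog $1 $2.25 50% 75%",
                       fold_case=False, bucket_numbers=False))))
```

Set fold_case or bucket_numbers to False, or change window, and the sets being compared change, and the score changes with them.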

…and that’s the kind of thing that will keep graduate students busy for the next 5 years or so, unless something else becomes au courant in the meantime (au courant discussed below in the French and English notes), in which case all of the grad students who were betting their careers on the latest cool thing will be spending some time engaging in some serious nombrilisme and then either starting all over again or quitting grad school and going into building better search engines for Twitter or something.  Welcome to my world.

For more information on distance-based semantics and its alternatives, see Elisabetta Jezek’s The lexicon: an introduction.

I got into this 2400-word little essay in the course of trying to come up with a way to respond to a series of comments on my last post in which we got into a discussion of whether or not the English word bete noire means the same as the English word pet peeve (see how I snuck an assumption in there about how many “words” are in bete noire and pet peeve?).  Obviously I went down a bit of a rabbit-hole here.  More on the bete noire/pet peeve thing some other time, if Trump doesn’t nuke some country because its president said something mean about him (remember how he was saying that Hillary wasn’t “tough enough?”) and bring the world as we know it to an end, along with all of the electricity.  A quick discussion of some relevant French and English words follows.


French and English notes

au courant: This expression exists in both English and French, but with different uses in the two languages.  In French, it means something like “up to date,” and is used to describe people.  In English, it can be used in the same way, but is also (and I think more commonly, although I don’t have the data to demonstrate this, one way or the other) used to describe things, in which case it means something like “in fashion.”  Additionally, in English, this is a very high-register word–you wouldn’t use it with just anybody.  Here are some French examples from the frTenTen12 corpus, a collection of 9.9 billion words of French scraped off of the Internet that I searched via the Sketch Engine web site:

  • Nous tiendrons nos lecteurs au courant de cette tentative…
  • …que Dieu t’entende pas petite Marie Bon courage et tiens nous au courant
  • Vous êtes au courant de ces dangers, vous devez donc protéger votre PC contre toutes intrusions.
  • …mea culpa, je n’étais pas au courant
  • …ni les Etats-Unis ni l’URSS n’ont été au courant de cet événement…
  • Peut-être que le jeune mutant était au courant , aussi elle décida de l’attendre devant la porte.

I like the second-to-last one, because it describes two countries there, rather than the two people that you would expect.

To find examples of au courant in English, I went to the enTenTen13 corpus, a collection of 19.7 billion words of English-language text, which, again, I accessed through Sketch Engine.  Here is some of what I found:

  • …a library of au courant phraseology and jabber…
  • Where once the adage “Things go better with bacon” was au courant ,”Things go better with cheese” is timeless.
  • That isn’t to say that paisley prints are reserved solely for custom-fitted, au courant French fashion houses; just the opposite.
  • Pappardelle is the au courant cut of pasta right now…
  • It’s all very au courant , yet it’s not at all.
  • Being au courant can be its own sort of stultifying endgame.

Comparing the experience of putting these two lists together, I can tell you that I had to hunt to find examples of au courant in French where it wasn’t modifying a human, and I had to hunt to find examples of au courant in English where it was modifying a human (my last example here is the closest that I came).  Here’s how it was used in the post: That’s the kind of thing that will keep graduate students busy for the next 5 years or so, unless something else becomes au courant in the meantime, in which case all of the grad students who were betting their careers on the latest cool thing will be spending some time engaging in some serious nombrilisme and then either starting all over again or quitting grad school and going into building better search engines for Twitter or something.  

The zombie apocalypse and education in the computational sciences

How to respect both logical positivism and the zombie apocalypse while educating computer scientists.

zombilingo.org, a web site that supports research on what linguists call the “heads” of groupes nominaux (“noun phrases,” in English).

In my professional life, one of my pet peeves is scientific discussions that involve the verb to believe.  For example:

  • …we believe that [joint circumscription] will be important in some AI applications.  (John McCarthy, Circumscription–A form of non-monotonic reasoning, publication date unclear) 
  • We believe ontologies are key requirements for building context-aware systems… (H. Chen, T. Finin, and A. Joshi, An ontology for context-aware pervasive computing environments, 2003)
  • We believe enzyme-loaded erythrocytes may have therapeutic possibilities for several diseases.  (Ihler et al. 1973, Enzyme loading of erythrocytes, which I should note has been cited over 300 times nonetheless)

I have actually been–on multiple occasions–cautioned against using formulations like Je pense que… (“I think that…”) in some professional situations in France, as it’s considered a sign of having a position that you’re not actually confident that you can defend.  (Native speakers, can you comment on this?)

I’m not shy about bringing up my problems with the verb to believe in any discussion in which I find myself that claims to be scientific, be those lab meetings or reviews of papers/grants/whatever.  I would not label myself as a logical positivist, but I try to always keep in mind the potential logical positivist position–it’s not a bad foundation for a philosophy of science.  (See, I didn’t say I think that it’s not a bad foundation for a philosophy of science–I flat-out asserted it.  In academic writing, I would follow that assertion with a few credible citations.)


In light of that tendency of mine towards the empirical and the epistemological, students are often surprised to learn of my concerns regarding the upcoming zombie apocalypse.  Clearly, zombies are something about which I have no empirical data, and one would have to classify the upcoming zombie apocalypse as something about which I have beliefs, but not knowledge, and therefore outside of the realm of something that I would talk about in my professional life.  So, yes: students are surprised when I bring it up.  (As far as I can tell, my French colleagues just think I’m crazy, or chalk it up to some quirk of the Anglo-Saxon psyche, or something.  I actually have no clue what my American colleagues think.)

Here’s the thing: the zombie apocalypse is an engaging point of entry into the problem of making robust systems.  In the context of computer programming, you could think of “robustness” as the ability of a program to deal with the unexpected–making speech recognition systems that will work in a crowded restaurant (impossible 20 years ago, not unusual today), or building sentence analyzers that won’t reformat your hard drive if someone passes them a sentence in Uzbek.  In particular, the upcoming zombie apocalypse is an engaging entry point to the problem of how to think about making robust systems.  The issue is that a major contributor to robustness is planning for unanticipated inputs (I had English in mind when designing my sentence analyzer, and then someone gave it a sentence in Uzbek) or operating conditions (I never thought about someone trying to use my speech recognition system with a lot of noise in the background).  Seulement voilà–the thing is, it’s the nature of unanticipatedness that we have trouble coming up with the unanticipated.  Even more fundamentally a problem: we often have trouble getting into the mindset of taking seriously the very idea that unanticipated inputs or operating conditions are even plausible.  In fact, they are; but, how to get students to think about something that is, a priori, difficult to conceptualize?  Posing the question as how will your approach work when the zombie apocalypse comes? typically leads to a laugh–and seems to give one a way to think seriously about what kinds of things might happen that you haven’t actually thought about yet.  To think seriously about things that it’s difficult to think about by means of thinking non-seriously about things that don’t exist, you might say.  You might say that–if you haven’t really thought about the upcoming zombie apocalypse.
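
For the programmers in the audience, here is a rough Python sketch of what planning for unanticipated inputs can look like.  The analyze function is hypothetical, as is its rule of thumb (accept only text whose letters are all in the Latin script, and report everything else instead of crashing); the Cyrillic string is just a stand-in for the sentence that nobody expected.  Note that a check like this only catches the surprises that I managed to think of, which is rather the point of the paragraph above.

```python
import unicodedata


def analyze(sentence):
    """Hypothetical analyzer that was designed with Latin-script text in mind.

    Returns a (status, payload) pair rather than raising on odd input.
    """
    # Unanticipated type: someone passes None, bytes, a number...
    if not isinstance(sentence, str):
        return ("rejected", "expected text, got " + type(sentence).__name__)

    words = sentence.split()
    # Unanticipated emptiness: nothing but whitespace.
    if not words:
        return ("rejected", "empty input")

    # Unanticipated script: any letter whose Unicode name lacks "LATIN".
    for ch in sentence:
        if ch.isalpha() and "LATIN" not in unicodedata.name(ch, ""):
            return ("unsupported", "found a non-Latin letter: " + ch)

    # The "real" analysis would go here; this sketch just returns the words.
    return ("ok", words)


for candidate in ["The dog barks", "   ", None, "Салом дунё"]:
    print(repr(candidate), "->", analyze(candidate))
```

The design choice is to hand back a status that the caller can act on, rather than letting an unexpected input take down whatever the analyzer is part of.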


English notes

pet peeve: something that annoys a specific person a lot.  To call something a pet peeve, it should be rather specific to that person, especially with respect to the extent that it bothers them or with respect to the extent that they are sensitive to it.  For example, traffic jams wouldn’t really be anybody’s pet peeve–everybody is annoyed by traffic jams.  However, traffic jams caused by trash trucks doing their collections during rush hour could be someone’s pet peeve–say, if they happen to actually notice them more than most people would, in a situation where most people don’t particularly care whether or not a traffic jam was caused by a trash truck doing its collections during rush hour–they are equally annoyed by all traffic jams.  How it was used in the post: In my professional life, one of my pet peeves is scientific discussions that involve the verb “to believe.”

French notes

la robustesse: robustness.  You can use this in a lot more ways in French than in English.  For example:

  • Hardiness would probably be the English-language equivalent here, where we’re talking about plants and their illnesses: Différentes maladies peuvent entraîner un flétrissement des tubercules qui se traduit, à son tour, par une perte de robustesse des plants.  (Source: Sketch Engine web site)
  • Toughness would probably be the equivalent here, where what’s being discussed is fabric: Ce tissu se distingue par sa robustesse, sa longévité et son confort.  (Source: click here)

How to not get a second date with a non-linguist

I always love being in France–but, sometimes, I REALLY love being in France.

The first thing you learn in American linguistics graduate schools is that you can make sure that there will never be a second date by commenting on some aspect of your companion’s speech.  Although America has no official language and nothing remotely like the notion of an Académie-Française-sanctioned standard form of the language, we are nonetheless super-sensitive about having the way that we speak brought up in a conversation.  Comment on the way that your date speaks, and it’s all over.  In France, the situation is very different–anyone will talk about how anybody else speaks, anywhere, any time.  I love that.


Sunday is market day in my little neighborhood in Paris.  Vendors set up their booths under the metro tracks down the block (I live by one of the few “aérienne” (elevated) lines).  Most things are pretty local–in France, meat and produce are usually sold with their area of origin marked, and the majority of foodstuffs for sale at the market come from no further away than Spain.  (For my geography-challenged American concitoyens: that’s right next door.)

I have my little routine.  The first place where I stop is the aligot booth, because if they were to sell out of that potato-butter-and-cheese equivalent of crack cocaine before I got there, my week would be ruined.  On the last leg of my trek, I stop by the choucroute stand.

Choucroute is an Alsatian specialty consisting of sauerkraut, an occasional carrot or potato, and any of a wide variety of smoked and/or cured meats.  Which raises a question: which meat do you want?  My habitual choice: all of them.

Picture source: http://www.jds.fr https://goo.gl/2aCI2i

When I got to the front of the line for the choucroute, the elderly gentleman next to me was having a detailed discussion with one of the ladies working the booth about the ham on offer, and exactly how close to the bone it had been sliced.  The lady had set the pig leg on the counter, and was indicating various and sundry parts of the unfortunate animal’s anatomy with her knife.  (How close to the bone you’ve been sliced turns out to have implications for how deeply the meat has been cooked, and therefore both the smell (apparently worse the closer you get to the bone) and the taste (apparently better the closer you get to the bone).)  I looked at the variety of meats resting atop the bed of fermented cabbage and decided, as I usually do, that I wanted a bit of everything.  (If it’s in italics, it happened in French.)  May I have a mix of meats, please?  A huge smile from the vendor: oh, what a beautiful French word!  Did you hear what he just said?  …she asked the gentleman examining the ham.  He grunted and went back to discussing bone-closeness.  Shit, I thought to myself–what did I just say??  

Where are you from?  America, really?  Seriously, did you hear?  He said “déclinaison de viandes.”  This time the elderly gentleman didn’t even bother to grunt–nothing was going to distract him from his deepening relationship with that ham.  What should I have said? …I asked.  A “mélange,” I think…or an “assortiment.”  But, don’t change–that’s delightful.  


Lest you think that I’m bragging: this wasn’t the last time that I amused the nice choucroute lady yesterday morning.  In particular, when she asked me if I wanted some alaille, I was baffled.  She was happy to explain to me that this was saucisson à l’ail–garlic sausage.  D’oh!  On the down side, I still sound like a complete idiot when I try to speak French.  On the plus side, I gave the nice choucroute lady a few good laughs, and that has to count as A Good Thing.  I always love being in France–but, sometimes, I LOVE BEING IN FRANCE.  Seriously.


English notes

atop: a preposition meaning on top of.  This is a word that you might use in writing, but would rarely, if ever, use in the spoken language.  How it was used in the post: I looked at the variety of meats resting atop the bed of fermented cabbage and decided, as I usually do, that I wanted a bit of everything.

trek: a long journey, usually done specifically by walking, and usually difficult.  How it was used in the post: On the last leg of my trek, I stop by the choucroute stand.  In this case it conveys the idea that my journey through the market is long, and that I’m walking, but in this context, it’s not meant to suggest difficulty.

French notes

la déclinaison: according to WordReference, a range or variation; I saw it used in this way on the ardoise (“slate”–the little blackboard, often an actual piece of slate, on which the specials of the day are posted in restaurants) of the cafe downstairs from my apartment, advertising a déclinaison de tomates–an assortment of tomatoes.  Also according to WordReference, a declension, in the sense of a set of related words (sausage/sausages/sausage’s).  I have loved this word from the moment that I learnt it–apparently the choucroute lady thinks it’s pretty cool, too.

Questions with only one right answer

You’re in country X.  Let’s say that the local language is called Xish.  Here are the only correct answers to the following questions:

Q: So, what do you think about Xish?  A: It’s beautiful.
Q: Xish is really easy to speak, isn’t it? A: No.
Q: Do you think that Xish is hard?  A: Yes.
Q: What’s more difficult–English, or Xish?  A: Xish.
Comment: You speak Xish wonderfully!  Response: Oh, no, I speak Xish terribly.

In some technical sense, your answer to all of these will have been false, except for the one about speaking Xish poorly.  “Difficulty” is not a meaningful word when applied to languages.  Neither is “beauty” in a technical sense, although I won’t belabor that one.

It occurred to me as I wrote this that the picture that I’ve painted here could be interpreted as suggesting that people who speak any language other than the one that you speak are easily fooled. In fact, that’s not the case at all.  This is about shared human culture–as far as I know, most people in most places love to talk about their language with foreigners, and how hard that language is will pretty much always be a good conversational tack to take.  (Obviously, I haven’t been everywhere or talked to everyone, but I’ve probably done this little exercise in somewhere in the neighborhood of 20 countries by now.)  In fact, in a lot of places, the Your Xish is great! thing is a sophisticated opportunity to let you show your grasp of the culture (or not)–in many cultures, accepting a compliment is quite gauche, and the only proper response is self-deprecation.  Respond with “oh, thank you–I’ve really been working on it!”…and you’ve just shown yourself to still be clueless.  Respond appropriately and you’ve just shown your grasp of, and respect for, the culture.

Ironically, I can’t quite figure out whether or not that’s the case in France–in general, this is not a country where self-deprecation is valued.  It’s a real problem for Americans, since self-deprecation is more or less our default attitude any time that we meet someone new, and often for much, much longer than that.  You could think of this whole isn’t-my-language-hard thing as an instance of not “exoticizing the Other,” as we academics like to say, but rather, of exoticizing oneself–of supporting a sort of exceptionalism for one’s own language, in the sense that we talk about “American exceptionalism” (the idea that America is just plain better than the rest of the world and has something to offer it–I certainly agree with the second part of that) and “French exceptionalism” (the idea that France is just plain better than the rest of the world and has something to offer it–I certainly agree with the second part of that, too).


English notes

Picture source: http://www.azquotes.com/quote/635046

gauche: lacking social experience or grace; also :  not tactful :  crude  (taken directly from Merriam-Webster).  I think that the best French equivalent might be maladroit, but couldn’t swear to it.  How it was used in the post: In many cultures, accepting a compliment is quite gauche, and the only proper response is self-deprecation.  Some examples from the Open American National Corpus, a collection of 15 million words of American English, collected and annotated by my colleague Nancy Ide, that you can download to do with as you please.  I used the Sketch Engine web site to search it.

  • You are correct that you cannot come right out and say, “It is gauche to come over and serenade me with your potato chips , so please go away.”
  • Gauche, gauche, gauche, and tacky.  (I love this one even more than the previous one.) 
  • Your take on his behavior was correct: It was gauche. Prudie does have one slight bit of curiosity about the faux pas.

to take a tack: to go in a particular direction, metaphorically speaking.  It comes from nautical language, where the verb to tack means to change the direction of a ship by turning the bow into the wind.  Confusingly, it can mean something like tactic, but it is not related to that word at all.  How it was used in the post: Most people in most places love to talk about their language with foreigners, and how hard that language is will pretty much always be a good conversational tack to take.  Some examples from the Open American National Corpus (see above):

  • Britain’s Independent took a similar tack, observing, “The situation is far from precisely parallel, but it is still a chastening thought that the Kosovo Liberation Army is, under conditions of vastly greater duress, handing in its guns at a rather faster rate than the Provisional IRA seems able to arrange  “
  • But rather than pursue that obscure tack any further (place names such as Washington are surely both proper nouns and eponyms) , let us see if the proper categories of words really end there as grammar books tend to suggest .  (Different verb–pursue, rather than take–but, same meaning)
  • Having apparently grown tired of obsessing over just how skeletal the Ally McBeal Über-waif has become, the tabs take a different tack: They bare their fangs and become positively McCarthyesque in their zeal to rat out celebs who’ve become the least bit unsvelte.
  • I think it’s one of the tacks Gerald Posner took in his book JFK book, Case Closed.

French notes

gauche: according to WordReference.com, this adjective can mean awkward, clumsy, or gauche, but with this sense (meaning) it is soutenu.  

le langage soutenu or le registre soutenu: according to the French-language Wikipedia, this is especially a written form of the language, used in official letters and literary texts.
