Fish and semantic theory

Don’t you hate it when you just want to order dinner, and your host starts ranting about lexical semantics?


At this point in my French adventures, I’m reasonably capable of helping an American visitor navigate a French menu.  Quail, lamb, artichokes–I’ve got those under control.  The subtleties of the cheese plate–you’d be amazed.  Tajines versus couscous–I’m getting there.  Sausages–you want dry?  Spicy?  Halal?  Pork guts?

There’s still one part of the menu where I’m pretty hopeless, though.  It’s the fish section.  Here’s the thing: I don’t even know what the fish are in English.  Is flounder the flat one, or is that halibut?  Is haddock a dark fish, or a light fish?  I have no clue.  I know that tuna are really, really big and anchovies aren’t, but if you put me face to face with a monkfish and a sea bream, I’d be pretty lost.

This is curious.  Here’s the thing about lexical semantics–the meanings of words: theories of lexical semantics all assume that what we know about the meaning of a word is sufficient (a) to identify what it refers to, and (b) to distinguish what it refers to from what it doesn’t refer to.  Linguistics as we know it today begins with Saussure’s observation that all meaning boils down to what you are not: I believe the quote is Dans la langue, il n’y a que des différences.  (“In language, there is nothing but differences.”)  We’ve looked at a number of ways of representing word meanings in the past, including ontologies, prototypes, and necessary and sufficient conditions.  The ability of a definition to distinguish between things is crucial to all of them.  The currently most popular approach to representing the meanings of words in the field of natural language processing, known as word embeddings, is based entirely on a spatial metaphor–similarities in meaning are closeness in space, differences in meaning are distances.  Nothing is defined in terms of any of its characteristics–it’s all about which other words a word is more, or less, different from.  All of these approaches to semantics require not just that you be able to define what a thing is, but also differentiate it from the things that it isn’t.
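If you like to see things in code: here is a minimal sketch of the spatial metaphor behind word embeddings.  The vectors are toy numbers that I made up for illustration (real embeddings are learned from large amounts of text and have many more dimensions), and the names TOY_VECTORS, cosine_similarity, and nearest are my own, not from any library.  Closeness is measured with cosine similarity, which is one common choice, not the only one.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# Real embeddings are learned from text, not written by hand.
TOY_VECTORS = {
    "tuna":     [0.90, 0.10, 0.05],
    "monkfish": [0.85, 0.15, 0.05],
    "bream":    [0.86, 0.14, 0.06],
    "birch":    [0.10, 0.90, 0.05],
    "alder":    [0.12, 0.88, 0.06],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors=TOY_VECTORS):
    """Return every other word, sorted from most to least similar to `word`."""
    others = [w for w in vectors if w != word]
    return sorted(
        others,
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
        reverse=True,
    )


if __name__ == "__main__":
    print(nearest("monkfish"))
```

With these made-up numbers, nearest("monkfish") lists the other fish ahead of the trees.  Notice that nothing in the representation says what a monkfish is; all that it gives you is what a monkfish is closer to and farther from.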

So, how do you fit fish into this?  They’re just one example of a phenomenon that is not uncommon.  It doesn’t have a name, that I know of, but I’ll bet that you can give your own examples of it.  Here are some from my own experience:

A            B        C
opaline      birch    ochre
amethyst     alder    taupe
garnet       poplar   cerise
tourmaline   yew      ecru

I know what everything in each of columns A, B, and C has in common.  Column A is precious stones.  Column B is trees.  Column C is colors.  But, which stones, trees, colors?  I haven’t a clue.  I know how they’re similar, very broadly–they’re all stones/trees/colors.  But, I couldn’t tell which was which if my life depended on it.

So: how do you explain this phenomenon where I know what kinds of things they are, but I don’t know specifically what they are?  I know a similarity–it’s the difference that I don’t know.  How you fit that into a theory of meaning that relies crucially on being able to differentiate between things, I have no clue.  And, it’s not like this is rare, either: I’ll bet that if pushed, you could come up with a list of things that you know are geographic features, without actually being able to define them (hill, hillock, dale, dell, valley, hollow, swale); same for furniture (settee, chaise lounge, tuffet, ottoman); fish, perhaps?  A request: if I’m the only freak in the world who knows categories of words that he can’t tell apart, feel free to leave me in peace.  But, if there are categories of words like that that you don’t know how to define the members of, either, could you let me know that I’m not alone in my weirdness by giving me examples?  And, if you’re a linguist, and you know the name of this phenomenon: please enlighten me…

  • le thon: tuna
  • la truite: trout
  • la daurade: sea bream

The ideal and the desired, French and American versions

Talking about the ideal and the desired in French and in English: ways to say “should,” with some comics.

Why Paul Ryan should vote for Hillary Clinton.  —headline, The Fiscal Times

Speaker Paul Ryan should disavow Donald Trump.  —headline, Milwaukee Journal-Sentinel

Paul Ryan says Donald Trump should release tax returns —headline, Wall Street Journal

Should I have a cookie?
It’s amazing (and more than a little depressing) to me that such enormous holes persist in my French, even after 2 3/4 years of studying really, really hard!  I just realized that I don’t know how to express the difference between must and should.  Obligation–must–that, I can express.  It’s the verb devoir in the present tense.  Ideal actions, desired actions–that’s a bit more complicated, both in French and in English.  (See the English notes at the bottom for the English issues.)

See here for more information about parallel corpora like OPUS 2.

To express the idea of should, we still use devoir, but we need its conditional tenses.  For the English present tense, e.g. I should, we use the French present conditional of devoir.  You can read about how to do so here on the Lawless French web site; I’ll give you some examples from the Sketch Engine web site.  I used Sketch Engine to search the OPUS 2 corpus, a collection of billions of words of text in 40 different languages, drawn from sources as diverse as movie subtitles and the proceedings of the European Parliament, and lined up with each other wherever possible.  We’re talking 1.1 billion words of English, 600,000 words of Afrikaans, 46 million words of Albanian, 300 million words of Arabic, etc.  French?  Almost 766 million.

Don’t you think that before shooting a spy, we should make him talk?

Vous ne pensez pas qu’avant d’abattre un espion, on devrait le faire parler?

Now that we have finished the script, we should save it to disk.

Maintenant que nous avons fini le script, nous devrions l’enregistrer sur le disque.

Well, then, I think we should go out on a Sunday night.

On devrait sortir le dimanche soir alors.

What we should do is have dinner sometime.

On devrait dîner ensemble un soir.

To talk about something that you should have done in the past, you need the past conditional of devoir.  Here’s the Lawless French page with an example–there’s a more detailed lesson hidden somewhere on the Lawless French Kwiziq site, but I have no clue how to tell you how to find it.  Again, I’ll give you some examples from the OPUS 2 corpus, retrieved via the Sketch Engine web site:

I knew we should have stayed on this case.

Je savais qu’on aurait dû rester sur cette affaire.

Maybe we should have bought some rice in town.

On aurait peut-être dû acheter du riz en ville.

According to all you told us, and to all calculations … we should have located the mine two days ago.

D’après ce que vous nous avez dit et nos calculs, nous aurions dû trouver la mine il y a deux jours.

Wonder if we should have told the exec about that package … … Mike used to keep under his sack.

Je me demande si on aurait dû parler du paquet … – que Mike gardait sous sa couchette.

As a result, we have not been able to make as much progress as we should have.

En conséquence, nous n’avons pas pu réaliser tous les progrès que nous aurions dû accomplir.

It’s hard to believe that the 2020 Republican primaries won’t see Paul Ryan pitted against Ted Cruz.  Cruz will still be as scary then as he is now, I imagine–personally, I find him even more frightening than Trump, and I find Trump pretty damn frightening.  Paul Ryan will continue to bear the burden of his failure (so far) to denounce Trumpism, which probably won’t hurt him much in the Republican primaries, but I hope will keep him from winning the general election.

English notes

If you’re French: I probably don’t have to tell you that should in English is at least as bizarre as it is in French.  There’s a good web page on it here, from the Cambridge Dictionary.  Note that the page describes the British uses of the word, which are different from the American ones in some respects.  For example, the conditional form should you, as in should you want some coffee…, is not used in America–we would say if you want some coffee…  The UK also has a formal/neutral alternation between should and would that we don’t have in the US.  For example, the Brits have neutral I would love to come and formal I should love to come, but in the US, only I would love to come will work.  Finally, oughtn’t instead of shouldn’t is more formal in British English, but it’s dialectal and possibly stigmatized in the US.

What soldiers carry in their pockets

Some questions are deceptively simple, masking quite a bit of complexity and some non-obvious answers.

What could possibly be more French than wearing a scarf with your cammies? I don’t think that this is in any sense official, though–I found the picture on the web site of a French army surplus store. Certainly I’m not aware of any military that allows its members to lounge about with their hands in their uniform pockets.

Some questions are deceptively simple, masking quite a bit of complexity and some non-obvious answers.  I ran across this one on Quora, a web site where you can post a question, and if you’re lucky, random strangers will answer it:

What do soldiers keep in all those pockets they have?

As you can tell from the responses: (a) there’s no one answer, and (b) people really do devote a lot of thought to this.  Reading through them, you’ll notice some basic themes: (1) the necessities of life, like eating and pooping; (2) the necessities of staying alive, like first aid kits; and (3) the necessities of keeping your mind/spirit alive at the same time as the rest of yourself.  Focussed around those basic themes, you’ll see a lot of variety and creativity–the Turkish soldier who carries sanitary pads to staunch bleeding and catch sweat, as well as women’s nylon stockings (I’m surprised more people didn’t mention those–they’re popular in areas where there are heavy mosquitoes); writing materials; be sure to read the one about the Marine and his stuffed frog…

My own answer reflects those themes pretty closely: What I kept in those pockets

Parts of the cammie blouse in French.

So, when you read my posts about the incredible number of needless gun deaths in the US and think about what a naive lefty liberal I must be: yes, I am a lefty liberal, and, yes, I can fire a weapon just fine, and, yes: I am a pretty good shot, actually.

  • le treillis: [treji] fatigues, battledress, “cammies.”

Flirty repartee, French-style: technical terms for the mandible

You don’t think you can have flirty conversations about the mandible? You haven’t been to France.

La partie inférieure de la mâchoire, ou la mandibule: the mandible.

Even more than soccer, flirting is often said to be France’s national sport.  I like it–it’s fun, and (so far) a harmless way to practice my French.  So, I was chatting with a girl over a cup of coffee in a café by the Museum of Comparative Anatomy and Paleontology one day (no, I’m not making that museum up–along with the Pergamon in Berlin and the British Museum in London, it’s one of my favorites in the entire world, and I can’t even imagine how many hours I’ve spent there) when she said something that made my ears perk up: she mentioned la partie inférieure de la mâchoire.  The “lower part of the jaw:” what you and I probably know as the mandible, and most French speakers as la mandibule.

On the surface, the jaw doesn’t look that interesting.  One bone, with a pretty simple hinge on each side.  It’s got a lot of subtleties, though, and if you look at how jaws vary across species, a jaw can tell you a lot about an animal.


Let’s look at the “gross anatomy” first–the basic parts.  In many species, the mandible has two main parts: the body, and the ramus.  The body is the part that’s parallel to the ground, and the ramus is the part that goes up vertically.  We’ll go through the French vocabulary now, rather than at the end of the post, because it’s essential to understanding my flirty ways.

  • la mâchoire: the jaw; jawbone.
  • la partie inférieure de la mâchoire: the lower part of the jaw. The official term for the mandible.
  • la partie supérieure de la mâchoire: the maxilla–again, the “official” term.
  • la mandibule: mandible.
  • le maxillaire: maxilla.
  • l’articulation temporomandibulaire: temporo-mandibular joint.
  • la branche de la mandibule: the ramus of the mandible.
  • le corps de la mandibule: the body of the mandible.

Why it made my ears perk up when she said la partie inférieure de la mâchoire: as far as I know, that’s the technical term for the mandible, versus la mandibule, which I believe to be more general language.  How often do you meet someone whose idea of witty repartee includes throwing around technical terms for bones?  Of course, then a bum walked by.  Blah blah blah blah, you gotta blah blah for me?

What’s a blah-blah? …I asked my terminologically gifted new friend.  He wants a cigarette, she replied.  And what does blah blah blah blah mean? 

“Kevin Costner”–he called you Kevin Costner.

How the fuck does he know my name’s Kevin?

He doesn’t–you’re an American, and you’re bald, so…


English notes:

to make someone’s ears perk up: to make someone suddenly interested in what you’re saying.  How it was used in the post: She said something that made my ears perk up: she mentioned la partie inférieure de la mâchoire. 

repartee: (Yes, this is an English word.)  Witty conversation that goes back and forth.  From Merriam-Webster: conversation in which clever statements and replies are made quickly.  How it was used in the post: How often do you meet someone whose idea of witty repartee includes throwing around technical terms for bones?


Weapons Of Math Destruction: Cathy O’Neil on how people go wrong with Big Data

The Hype Cycle, beloved by technoskeptics such as myself. Big Data is somewhere around the “Peak of Inflated Expectations” point–maybe just starting down towards the trough.
It’s tough to read technology news these days without hearing about the wonders of Big Data and how it’s going to revolutionize our world.  Apparently it will soon predict epidemics, prevent terrorist attacks, and boost farm production.

In truth, though: it’s not so clear that it’s a great thing.  One of the problems with Big Data is a special case of a general problem in the ethics of technology: the kinds of things that can go wrong when the public perception of how well/poorly technology performs doesn’t match well with the truth. In particular: when the public thinks that technology performs way better than it does.

You will occasionally hear people talking about how algorithms are going to take our jobs, bring about the zombie apocalypse prematurely, etc. More commonly, technology gee-whizzers will tell you the opposite: that algorithms will remove bias and introduce complete objectivity to sentencing guidelines, for instance.  In fact, an algorithm is nothing more (or less) than a defined set of procedures.  In the case of an algorithm for computing, it’s typically a set of calculations. An algorithm can’t be biased. It can’t be unbiased, either. The data, though: that can be biased. An example from the interview: train an algorithm to evaluate resumes from applicants for jobs at an engineering firm. You could imagine training it with the resumes of everyone who has ever been hired in the past, and the following piece of information for each person: whether or not they were a successful employee. If the engineering firm is a typical one, those previous hires are mostly going to have been males. Now the program learns the characteristics of a successful hire, and among other things, the program will conclude that a successful hire is going to be a male, since that’s all that it’s ever seen. Is the algorithm biased?  No. Is the person who programmed it biased?  No. What’s biased?  The data. Not biased in the way that a person is biased–rather, biased in the statistical sense: not every member of the population had an equal likelihood of being included in the training set.
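To make the resume example concrete, here is a deliberately naive sketch.  Everything in it is an assumption on my part: the hiring histories are invented numbers, and the "model" is nothing but counting, which is far simpler than anything that a real screening system would use.  The point is only that the same procedure gives different answers when you feed it different data.

```python
from collections import Counter

# Invented hiring history: (gender, was_successful).
# Skewed on purpose: this imaginary firm has mostly hired men.
SKEWED_HISTORY = [("M", True)] * 40 + [("M", False)] * 10 + [("F", True)] * 2

# The same kind of record, but from an imaginary firm that hired evenly.
BALANCED_HISTORY = [("M", True)] * 20 + [("F", True)] * 20


def train(history):
    """Learn the share of each gender among past successful hires."""
    successes = Counter(gender for gender, ok in history if ok)
    total = sum(successes.values())
    return {gender: count / total for gender, count in successes.items()}


def score(model, gender):
    """Score an applicant by how much they look like past successes."""
    return model.get(gender, 0.0)


if __name__ == "__main__":
    skewed = train(SKEWED_HISTORY)
    balanced = train(BALANCED_HISTORY)
    print(score(skewed, "M"), score(skewed, "F"))
    print(score(balanced, "M"), score(balanced, "F"))
```

The train and score functions never change between the two runs.  With the skewed history, a man scores far higher than a woman; with the balanced history, the scores are equal.  The procedure is the same; the data is what differs.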

Where people get seduced by things with the Big Data label on them is by the bigness. Most people know that the bigger your data set is, the more reliable the statistical model that comes out of it will be. A lot of people look at Big Data and think: there’s a LOT of data, so it’s GOT to be good. That’s where the trouble comes from.

I like this interview because it’s neither a gee-whiz-this-technology-is-so-great story, nor an ignorant oh-my-God-the-data-miners-are-going-to-kill-us story. The interviewee, Cathy O’Neil, knows what she’s talking about, and she explains it well.  The unbiased sentencing program?  It didn’t work out so great–see a very detailed story about it here.

Link to the interview with Cathy O’Neil:

French notes:

  • le big data: Big Data.
  • les mégadonnées: Big Data.
  • les données massives: Big Data.

English notes:

  • to sentence to (a punishment): to assign a punishment or penalty to someone.  Examples: A 46-year-old man threw feces in a Clark County, Ohio, courtroom Wednesday after learning he was being sentenced to 40 years in prison for armed robbery.  (Story here.)  Alan Turing, the pioneering computer scientist and cryptanalyst who cracked the Nazis’ Enigma code, was sentenced to chemical castration as a punishment for his homosexuality.
  • sentencing guidelines: instructions for how to determine the length of the jail or prison sentence of someone who has been convicted of a crime.  How it was used in the post: More commonly, technology gee-whizzers will tell you the opposite: that they will remove bias and introduce complete objectivity to sentencing guidelines, for instance. 

Zipf’s Law needs help, and by “help” I do not mean “money”


Help!  I need advice on memorizing conjugations.  I don’t remember how the hell I did it in school, and I’ve got 30 days left to prepare for the DELF/DALF exams.  I have no clue about how to handle the fact that I don’t know which conjugations I don’t know.  I’m pretty sure that there are some tenses that I’m weaker on than others, some verb classes that I’m weaker on than others, and some irregulars that I’ve never even heard about…  The things that I know I’m weaker on are already on my to-do list for the month leading up to the exams, but I don’t know how to figure out what I don’t know.  How do you do it?  (I am not a big fan of ending blog posts that way, but: how DO you do it??)

English notes:

to be weak on [a subject]: to not have sufficient knowledge of some subject.  See the definition below from Macmillan.  (There’s also a use that means something like not taking a strong or effective stance against something, and you see that in the news all the time right now–candidates accuse each other of being weak on crime, weak on ISIS, weak on Russia, etc.  That’s a different sense, though.)  How it appeared in the post: I’m pretty sure that there are some tenses that I’m weaker on than others, some verb classes that I’m weaker on than others, and some irregulars that I’ve never even heard about…

Picture source: screen shot of


Sorry for the gratuitous Wikipedia-bashing:

Picture source: screen shot of

From various and sundry tweets:

Thing about Trumpettes – a bit weak on math and logic.

Obviously,whoever started this is a bit weak on spelling – The anal retentive police say this s/b !!

Exactly, is the Dark Ages today. Strong on mythology, weak on science & lots of smiting.

So: if you’re weak on crime, you are not taking an effective stance against it.  If you are weak on the subjunctive, you don’t know enough about it.

Men, chocolate, and coffee: compositionality and the mapping problem

Linguists are sometimes accused of spending their time navel-gazing over sentences that are not realistic. The truth is that you don’t have to look any further than your daily life for real linguistic puzzles.

Sign on the wall of a Village Inn. Picture source: me.

Linguists and philosophers are sometimes accused of spending their time navel-gazing over sentences that are not realistic.  However, the truth is that you don’t have to look any further than your daily life for real puzzles, and sometimes for real challenges to linguistic theory.

Right at this moment, I’m sitting in a Village Inn.  If you’re French: that’s a restaurant chain that’s known for being somewhat déclassé (déclassé and other obscure English expressions explained below in the English and French notes), and for having great pancakes.  I’m somewhat déclassé, and it’s Saturday morning, so I’m sitting here treating myself to pancakes.  (Village Inn is not so redneck as to not have wifi.)  On the wall opposite me is the poster that you see at the beginning of this post.  It says:

Men, chocolate and coffee are all better rich.

Now: that is a joke.  It plays on multiple meanings of the word rich.  Something like this:

  • rich man: A man with a lot of money.
  • rich chocolate:  Containing a large amount of choice ingredients, such as butter, sugar, or eggs, and therefore unusually heavy or sweet: a rich dessert
  • rich coffee: Strong in aroma or flavor: a rich coffee (from

A reasonable native speaker could disagree with me over whether or not rich has different meanings in rich chocolate and rich coffee, but the essential fact about the example remains: rich has more than one sense in this sentence.

Who cares?  It’s like this.  One of the fundamental assumptions in the vast majority of approaches to understanding semantics (in the sense of the meaning of language) is something called compositionality.  Compositionality is the process of meaning being produced by something that you could think of as similar to addition (technically, it’s a more general “function,” but “addition” will work for our purposes–linguists, no hate mail, please): the idea is that the meaning of Khani stole the butter is the adding together of the meanings of Khani, steal, butter, and the meaning of being in the subject position versus the object position of an active, transitive sentence.

That’s compositionality.  Another bit of background that we need: the mapping problem.  The mapping problem is the question of how the semantics of a sentence–its meaning–is related to the syntax of the sentence–the structure of the phrases of which the sentence is made up.  There are all sorts of problems here.  To give you one example: take a situation where my dog stole some butter.  The semantics are: there’s a dog, it’s my dog, there’s some butter, and the butter was taken, by the dog, without permission.  (You can’t believe how horrible the poo that I had to pick up over the course of the next 24 hours was.)  The syntax, though: there are multiple possibilities.  My dog stole the butter.  The butter was stolen by my dog.  The meaning is the same–how do you account for multiple syntactic structures being usable for communicating that meaning?  I’m giving you a very simple example of a very complex and nuanced topic–again, no hate mail from linguists, please.
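For the programmers: a toy sketch of what a mapping from syntax to semantics has to accomplish for the butter example.  The Event type, the role names agent and theme, and the interpret function are all my own inventions for illustration, and the sketch assumes that the sentence has already been broken up into subject, verb, and object or by-phrase.  Doing that breaking-up is, of course, a large part of the real problem.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """A bare-bones meaning: who did what to what."""
    predicate: str
    agent: str   # the one doing the stealing
    theme: str   # the thing that gets stolen


def interpret(subject, verb, other, voice="active"):
    """Map a pre-parsed clause to an Event.

    `other` is the direct object in an active clause and the
    by-phrase in a passive clause.
    """
    if voice == "active":
        return Event(predicate=verb, agent=subject, theme=other)
    if voice == "passive":
        return Event(predicate=verb, agent=other, theme=subject)
    raise ValueError(f"unknown voice: {voice}")


if __name__ == "__main__":
    active = interpret("my dog", "steal", "the butter")
    passive = interpret("the butter", "steal", "my dog", voice="passive")
    print(active == passive)
```

Two different syntactic structures go in, and one meaning comes out: the subject position contributes a different role depending on the structure of the sentence.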

So: we have the mapping problem.  Your answer to it is probably going to involve compositionality.  Imagine this sentence:

Men are better rich, kind, and patient.

How do we map the semantics to the syntax via composition?  Let’s see:

  1. Take the significance of the subject position and the adjective relative to that verb in a declarative sentence…
  2. …add the meaning of to be, and
  3. …add the meaning of men…
  4. …add the meaning of rich…
  5. …add the meaning of kind…
  6. …and add the meaning of patient.

No probs–sentence structure meaning + word meanings = the meaning of the assertion.  Now let’s go back to the sign on the wall:

Men, chocolate and coffee are all better rich.

How do we map the semantics to the syntax via composition?  Let’s see:

  1. Take the meaning of to be and the significance of the subject position and the adjective relative to that verb in a declarative sentence…
  2. …add the meanings of men, chocolate, and coffee
  3. …and add the meaning of rich.

Ooooh–what the hell??  We have the one word rich, but we have three meanings.  We’ve been mapping one word to one meaning–how the hell can we get three meanings out of one word?  This works as a joke, versus just a simple statement, precisely–and only–because you can have that single word rich contributing three different meanings to the “utterance,” as we linguists say (énoncé in French).  Myself, though: I can’t for the life of me see how to reconcile it with linguistic theory.  That’s not a problem–it’s a good thing.  Personally, I am pretty happy with the notion that science gets pushed forward by finding problems with theories, not by showing how they work.  Something fun to think about while I listen to the hum of Berber, Spanish, and some very stigmatized dialects of English around me as I eat my redneck, Saturday-morning pancakes…
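If it helps to see the two compositions above side by side, here is a rough sketch.  It assumes a lexicon in which each word has exactly one meaning, which is precisely the assumption that the sign breaks; the meaning labels (WEALTHY, HEAVY_AND_SWEET, and so on) and the function names are invented for illustration, and the word and is glossed over here just as it is in the post.

```python
# Hypothetical lexicon: one word, one meaning.
LEXICON = {
    "men": "MEN",
    "chocolate": "CHOCOLATE",
    "coffee": "COFFEE",
    "rich": "WEALTHY",
    "kind": "KIND",
    "patient": "PATIENT",
}

# What the joke needs: the sense of "rich" depends on the noun.
# These sense labels are invented for illustration.
RICH_BY_NOUN = {
    "men": "WEALTHY",
    "chocolate": "HEAVY_AND_SWEET",
    "coffee": "STRONG_IN_FLAVOR",
}


def compose(subjects, adjectives, lexicon=LEXICON):
    """Naive composition: each adjective contributes its single meaning."""
    return [(lexicon[s], lexicon[a]) for s in subjects for a in adjectives]


def intended_reading(subjects, senses=RICH_BY_NOUN, lexicon=LEXICON):
    """The reading a speaker gets: one token of "rich", one sense per noun."""
    return [(lexicon[s], senses[s]) for s in subjects]


if __name__ == "__main__":
    # "Men are better rich, kind, and patient": no problem.
    print(compose(["men"], ["rich", "kind", "patient"]))
    # "Men, chocolate and coffee are all better rich": wealthy chocolate?
    print(compose(["men", "chocolate", "coffee"], ["rich"]))
    print(intended_reading(["men", "chocolate", "coffee"]))
```

The naive version hands back wealthy chocolate and wealthy coffee.  Getting the intended reading required a second table that looks at the noun before deciding what rich means, and that is exactly the part that simple addition of word meanings does not give you.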

Native speakers of French: I’d love a similar example in the language of Molière–do you have one for me?

Postscript: the sentence that is the topic of this post contains the word and.  The word and is (believe it or not) actually one of the toughest problems in computational linguistics, and I have glossed over it in this discussion deliberately, despite the fact that it is crucial to the nature of the problem.  Another time, perhaps.  English and French notes below.

English notes:

  • déclassé: having inferior social status.  It can also have a similar meaning to the French meaning–“fallen or lowered in class, rank, or social position,” per Merriam-Webster–but, I don’t believe I’ve ever heard it used in that sense.  How it appears in the post: That’s a restaurant chain that’s known for being somewhat déclassé, and for having great pancakes.
  • redneck: from Merriam-Webster: a white person who lives in a small town or in the country especially in the southern U.S., who typically has a working-class job, and who is seen by others as being uneducated and having opinions and attitudes that are offensive.  It can also be an adjective, which is how I used it in the post.  How it appears in the post: Village Inn is not so redneck as to not have wifi.  Note: this can be a very offensive term if you are not yourself a redneck, and if you are not a native speaker, I recommend that you never use it.

French notes:

  • déclassé: downgraded, relegated, demoted.
  • le déclassé: dropout (societally, not from school)
  • gaulois: redneck, among other things.  See above for the definition of redneck; I don’t actually know whether or not the French word is offensive.