Zipf’s Law and burning at the stake

Cathars being burnt alive at Montségur. Picture source: http://vivre-au-moyen-age.over-blog.com/article-13095618.html

It’s the end of the peak publishing period in France–between late August and early November–and the beginning of the season of literary prizes.  Mathias Enard took the Goncourt prize yesterday for his novel Boussole (“compass; barometer (figurative), indicator”–definitions from WordReference.com), and this morning all of the guests on my news program were writers.  One of them read an essay on the subject of death as a revolutionary act (my favorite French news show is even more cerebral–like, by a long shot–than National Public Radio, the most intellectual of American news shows).  Along the way, he talked about the shift from cremation in the classical world to burial in the Christian world, pointing out that in the Christian world, only witches and heretics die at the stake.  (If you’re reading this and are not a native speaker of English: the word “stake” typically means a piece of wood that has been sharpened at one end–what you kill a vampire with, right?  If the stake is stuck in the ground and someone is tied to it and burnt alive, it’s called “THE stake.”)

It was a perfect Zipf’s Law moment.  The word for “the stake” is le bûcher.  Is that an obscure word, in the sense of being one that most people wouldn’t know?  Not at all.  Is it a common word?  Not at all.  Why the hell would anyone know this word in a foreign language, though?  Last summer or so, I chanced upon my father reading Le bûcher de Montségur, the classic book on the siege of the Cathars at the Montségur fortress in southern France.  (Yes, there is a classic book on the siege of the Cathars at the Montségur fortress, in English, as well as in French.)  The Cathars were considered heretics, and when they surrendered to royalist troops, about 220 of them were burnt to death in a massive bonfire.  I didn’t know what the word bûcher in the title of the book meant, so I looked it up, then duly memorized it, despite my confidence that I would never see it again.  Over a year later, it’s literary award season, and I run into it again, on the morning news…  Zipf’s Law at its best.

So, in the spirit of the literary prize season, let’s look at some meanings of bûcher, both as a noun and as a verb.  Definitions from WordReference.com:

  • le bûcher (tas de bois où on brûle les morts): funeral pyre.  This is presumably the sense in Le bûcher de Montségur.
  • le bûcher (tas de bois où on exécutait les coupables): stake.
  • le bûcher (abri pour bois): woodshed.
  • le bûcheron: lumberjack, woodcutter.  (Like that derived noun?)
  • bûcher: to work your butt off.  (In Quebec, it can also mean “to log, fell trees; to chop wood.”)

If you’re reading this and you’re French: please note that to us Americans, the words bûcher (stake, funeral pyre) and boucher (butcher) sound the same, so please have mercy on us if we say one when you’re sure we mean the other.  Feel free to correct our pronunciation, though–it’s good for us.

Why Paris might have more in common with Manhattan than it does with Clichy-sous-Bois

A cover of France-Amérique, a magazine for French lovers of America. The title of the cover story is “The American heart: an investigation of philanthropy.” Picture source: france-amerique.com

Lately I’ve been obsessed with the general incorrectness of the common American belief that the French look down on everything about us.  A couple days ago, I talked about the popularity of English words in France.  The topic of French attitudes about America came up again this morning.  Listening to the news on the way to work, I heard a long segment on the banlieues défavorisées of France–the poor suburbs where much of the French underclass lives.  2015 is the 10th anniversary of the 2005 riots, which were very much a feature of the banlieues (unlike, say, the student riots of 1968, which were very much an urban phenomenon).  There was a guest who had been invited to talk about his theories of the geographic aspects of the banlieues.  His take on it is that part of what makes life in the banlieues what it is is that they have very little in the way of public transportation–as Wikipedia puts it in its article on the infamous Clichy-sous-Bois banlieue, where the 2005 riots started, “Clichy-sous-Bois is not served by any motorway or major road and no railway and therefore remains one of the most isolated of the inner suburbs of Paris.”  So, you have this paradox that Clichy-sous-Bois is maybe 10 miles from Paris, but has less in common with life in Paris–culturally, politically, economically–than Manhattan does.  Here, the guest saw the situation in America much more positively–his take was that in America, the low-income areas are mostly urban, not suburban; they have the same public transportation as the rest of the city does, and therefore the residents of an American ghetto have the same access to universities, museums, etc., as more well-off residents of the city do.

Obviously, it’s more complicated than that–we have seriously low-income and culturally disconnected little towns scattered throughout Appalachia and elsewhere, and if you live in a poor urban area of an American city, you probably have other obstacles to your access to universities, museums, etc., besides transportation.  But the guest was right in that the geographic facts that keep the residents of the banlieues isolated from the rest of French life don’t generally have the same ill effects in an American urban ghetto.

Public transportation (transport en commun) is an important aspect of life in France–let’s look at a little bit of vocabulary from the French Wikipedia page on the subject (translations from WordReference.com):

Le transport en commun, ou transport collectif, consiste à transporter plusieurs personnes ensemble sur un même trajet. Il est généralement accessible en contrepartie d’un titre de transport comme un billet, ticket ou une carte.

  • le trajet: journey; plane flight; car or bus ride
  • en contrepartie de: in return for, in exchange for
  • le titre de transport: ticket

What’s the difference between billet and ticket?  No clue.  Perhaps someone can tell me in the Comments section?

Resources for learning French: One Thing In A French Day

From the home page of the “One thing in a French day” web site, a good resource for intermediate students of French. Picture source: screenshot of http://onethinginafrenchday.podbean.com/

The world is full of books, YouTube videos, etc. for people who speak English and want to start learning French.  It’s harder to find materials that are suited for somewhat advanced students of the language.  It turns out, however, that there are some very good resources out there, if you can find them.

One that I think is good for people at about the intermediate level is the podcast One thing in a French day.  The podcast consists of a short, read essay (i.e., it was written and then read out loud, versus being spontaneous speech) about some thing or another in the podcast creator’s day.  Perhaps Laetitia buys a new printer.  Maybe she meets a friend at a patisserie for coffee and a pastry.  Maybe one of her daughters loses her Pass Navigo.  Whatever it is, Laetitia tells you about it, and Zipf’s Law strikes–you are quite likely to learn some new words in every podcast.

The podcasts are free, and a transcription of the beginning of the podcast is available on the web site, also gratis.  New ones come out 2-3 times a week.  For 3 euros a month, Laetitia will email you a full transcript of every podcast, along with some grammatical or vocabulary points of interest, and will answer questions.

As I said, Laetitia reads the essays, and her pronunciation is quite clear.  This makes One thing in a French day quite good for intermediate students, but possibly not as challenging as it could be for advanced students.  (Laetitia does point out that Il est vrai que c’est un texte lu, par contre je ne fais pas d’effort particulier pour ralentir le débit. Le rythme est mon rythme naturel.  “It’s true that it’s a read text.  On the other hand, I don’t make any particular effort to reduce the speed.  The rhythm is my natural rhythm.”) Still, no matter what level you are at, you will learn stuff from the  podcasts, and it’s nice to keep up with what’s going on with Laetitia and her family, as well as to get a little glimpse into the life of a normal French family.  To give you the flavor of the podcast, here are some words that I learnt from the most recent one (definitions from WordReference.com):

  • entrecoupée par: broken up by.
  • ensoleillé: sunny, bathed in sunlight.
  • la voie verte: I had to write to Laetitia herself for this one.  Here’s her answer: Pour répondre à votre question : La voie verte est une ancienne voie de chemin de fer entre Chalon-sur-Saône et Mâcon qui a été goudronnée et qui est maintenant réservée aux piétons, aux vélos, aux fauteuils roulants ou aux rollers. Il y a plus de quarante kilomètres de promenade.
  • prendre le pli: to get used to something, to get into the groove of something.  Prendre le pli de faire qqch: to get into the habit of doing something.
  • perché: perched, sitting on.

Nous avons fait d’autres belles visites pendant cette semaine en Bourgogne, entrecoupée par deux voyages à Lyon et de longues promenades ensoleillées sur la voie verte. Lisa est une bonne marcheuse, une fois qu’elle a pris le pli. Son record sur la voie verte : sept kilomètres.

Nous avons visité le site médiéval de Brancion. Un petit village perché sur une colline.

That’s 5 words in the first 5 sentences–as I said, you will learn stuff from One thing in a French day! 

The term for term is terme

The technical terminology of kitchen sinks. Picture source: http://www.homeblog.link/tag/kitchen-sink-components

In the United States, many people have the conception that France is somehow opposed to the English language.  This couldn’t be further from the truth.  Sprinkling your French with English is considered cool and au courant; so many French singers now record in English that it’s increasingly difficult for French radio stations to find French-language music to play; and you see advertisements on TV in English in France more than you would believe.  (One morning in Paris this summer, I had the news on the TV while I was eating breakfast.  As usual, I was struggling pretty hard to understand anything.  Suddenly, I was understanding everything, and the past year and a half of intensive French study had clearly paid off, and I was finally, finally, getting it.  Then I realized: I was hearing an advertisement, and it was in English.  Sigh!)

As you might suspect, the area where the greatest incursion of English into French happens is in technical terminology.  The leaders in creating French-language equivalents for English technical terms are actually not the French, but the Canadian folks at the Office Québécois de la langue française.  They maintain the Grand Dictionnaire Terminologique web site.  This is an on-line dictionary that lets you search for technical terms in a specific domain, or in all domains simultaneously.  Jean-Benoît Nadeau and Julie Barlow say in their book The story of French that the French Academy’s web site gets two million hits a year, while the Grand Dictionnaire Terminologique gets fifty million hits a year.  Quebec’s work in keeping French terminology up-to-date and a viable alternative to English terminology has been adopted as an approach by countries all over the world.

  • le terme: term, word; also term, date, or limit.
  • la terminologie: terminology, in the sense of specialized vocabulary.
  • le vocabulaire: vocabulary.
  • le lexique: lexicon, vocabulary; glossary; small pocket bilingual dictionary or phrase book.  I think it’s also the set of words in a text, but I can’t prove that right at this moment.

(Yes, the title of this blog post is an Ursula K. Le Guin reference: The word for world is forest.)

A certain convocation of politic worms are e’en at him

The French Ministry of Foreign Affairs, usually known by its nickname, the Quai d’Orsay. Picture source: this blog, which has a nice post about the building. http://davidplusworld.com/french-ministry-foreign-affairs/

It’s the political season in many parts of the world.  Lots of Europe is having elections, and the presidential campaign is well along in the United States.  Last night there was a debate amongst the contenders for the nomination for Republican presidential candidate.  Jeb Bush attacked his erstwhile protege, and now opponent, Marco Rubio over his attendance record in the Senate, saying “The Senate, what is it like a French work week? You get, like, three days where you have to show up?”  Hearing this, I was struck by the difference between “in theory” and “in practice” that is ever-present in France.  In theory, France has a 35-hour work-week (versus 40 hours in the US).  In practice, only about 50% of the French working population qualifies for the restriction.  The lab where I work when I’m in France qualifies for the 35-hour week in theory, but in practice, they have a 37.5-hour work-week, the idea being that they get more holidays than most people, so it balances out.  37.5 hours in theory, mind you–in practice, I frequently get responses to emails at midnight.

With politics being a hot topic in the French news right now, we need some new vocabulary if we’re going to be able to listen to the news on the way to work in the morning.  A word that’s been coming up quite a bit lately is politique.  It has a number of meanings:

  • politique: as an adjective, it means “political.”
  • la politique: politics, but also policy, which is actually the sense that I’ve been hearing the most on the news.  Politique extérieure: foreign policy.
  • le politique: politician.

You probably noticed that this is one of those nouns that has different meanings depending on whether it’s masculine or feminine.  Masculine: a politician.  Feminine: politics or policy.

Putting together this blog post on the word politique, I was reminded of the “politic worms” of Hamlet:

Not where he eats, but where he is eaten. A certain convocation of politic worms are e’en at him. Your worm is your only emperor for diet. We fat all creatures else to fat us, and we fat ourselves for maggots. Your fat king and your lean beggar is but variable service—two dishes, but to one table. That’s the end.

(William Shakespeare, Hamlet: Act 4, Scene 3.)

The Shakespeare Navigators web site translates politic here as “crafty, prying.”  I don’t know whether or not that pejorative meaning is intended–the Oxford English Dictionary says that during the same period, it could also mean “prudent, shrewd, sagacious.”  Given Hamlet’s overall affect in that scene, I guess that the pejorative interpretation is probably justified.  In any case: a great image to keep in mind as you listen to the political news these days.

Starting your day with Zipf’s Law

Picture source: http://www.memecenter.com/fun/1898983/cat-alarm-clock

News stories are one of the great ways to start your day with an encounter with Zipf’s Law–by virtue of being the “new”s, they bring new words into your life, and by virtue of things usually staying in the news for a few days, you’ll get to review them in the days to come.  I’ve found a great French news podcast, and I like to listen to it on the way to work every morning.  (Sorry–I can’t find a web page, but you can see their Twitter feed here.  Try searching iTunes for France Culture Matin.)  My command of French being as weak as it is, I run into Zipf’s Law in the first sentence every morning.  The announcer always opens the broadcast with Bonjour, bon réveil à tous.  What the heck is réveil?  Turns out that it has multiple meanings.

  • le réveil (fin du sommeil): waking, waking up, awakening.
  • le réveil (horloge qui sonne): alarm clock.

Yesterday I talked about the cute video about the guy and his cat.  In the video, the guy says that one good thing about having a cat is that it can be a réveil–alarm clock–for you.  (He also says that the bad thing is that the time is completely random.)

There’s a related word:

  • le réveillon: Christmas Eve or New Year’s Eve dinner.

One evening I had a glass of wine with the beautiful Françoise after work.  At the end of the evening, she either told me that she was going to visit her mother in Brittany for Christmas Eve, or that we should get together for Christmas Eve.  My French is so bad that I couldn’t tell, and no matter how many times I ask her to repeat herself, she never seems to believe that I don’t understand half of what I hear.  I didn’t know what was happening on Christmas until I got a text from her on Christmas Eve saying that the only lift that she could find to the réveillon that we were apparently going to in the ‘burbs was on a motorcycle.  Zipf’s Law!

Now, where there’s a noun, you might suspect that there’s a verb, and sure enough, we have one:

  • réveillonner: to celebrate Christmas Eve or New Year’s Eve.  A delightful verb if I’ve ever heard one.

Where does all of this come from?  I would guess from this verb:

  • veiller: to stay awake, or to keep a vigil over someone, to sit by their bedside.

Bon réveil, and may the odds of Zipf’s Law be ever in your favor!

Oblique strikes

Map of the Schengen Area. Countries in blue are already members, and countries in orange will be joining. Photo source: “Map of the Schengen Area” by Rob984 – Derived from File:Schengen Area.svg. Licensed under CC BY-SA 4.0 via Commons – https://commons.wikimedia.org/wiki/File:Map_of_the_Schengen_Area.svg#/media/File:Map_of_the_Schengen_Area.svg.

Listening to the news this morning, I heard an interesting new term.  Part of what was interesting was that the broadcaster felt it necessary to explain what the term meant.  The term was la frappe oblique, or “oblique strike.”  If I understood the story correctly (never a given), there is a European commission meeting on the subject of what to do about Islamic State (usually referred to as Daesh in French, the same as in Arabic) plans for “oblique strikes.”  As the broadcaster explained, an “oblique strike” is carried out by having a citizen of one European country carry out a terrorist attack in another European country.  The idea is that if you have, say, a French citizen who is associated with a terrorist group, that person might be under investigation by the French police, but they won’t be under surveillance by, say, the German or Spanish police.  Within the Schengen Area (the territories of the 26 European countries that don’t have any restrictions on travel between them), that French citizen could travel to any other country–say, Germany–at which point they drop off of the French police’s radar, and are much freer to carry out an attack.

It’s so nice to have terms explained on French radio.  Even in your own native language (that’s English for me, not French), Zipf’s Law strikes on occasion.  As a side note, the ability of speakers of a language to explain words to one another is theoretically interesting to some extent, as on a very strong version of structuralism, it shouldn’t be possible for them to do that.  Clearly we can.  That doesn’t negate more reasonable versions of structuralism, though–it’s a useful way of thinking about language.

Oh, my

Photo source: http://www.keepcalm-o-matic.co.uk/p/keep-calm-and-love-phonetics/.

In English, the spelling of a word doesn’t tell you how to pronounce it—it just gives you some clues about how to pronounce it. Through, though, tough, and plough are famous examples.  French is the same. But, even more so, it’s the case that in French, knowing how to pronounce a word only gives you the slightest clue how to spell it. In a previous post, we looked at several ways to spell words that sound like mur. Here are nine different words that all sound identical in French. Specifically, they all sound like the English word oh:

  • o: this is the letter of the alphabet.
  • ô: this is the poetic “oh”–“Oh, bird of my soul, fly away now, For I possess a hundred fortified towers.”  (Rumi)
  • au: in theory, this means “to the,” but you see it in lots of other uses, like things that would be compound nouns in English— for example, pain au chocolat, a delicious chocolate-filled square croissant.
  • aux: “to the” again, but this time plural.
  • eau: water.
  • eaux: waters.
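  • haut: high (masculine singular)
  • hauts: high (masculine plural)
  • os: bones (the plural; in the singular, the final s of os is pronounced, so it doesn’t sound like oh)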

Of course, that doesn’t mean that the same letters or letter combinations always sound the same.  My favorite is notre and nôtre. Despite the fact that the words o and ô are pronounced the same (see above for their meanings), notre and nôtre, which mean almost the same thing (roughly “our” and “ours”), are pronounced quite differently.

Incidentally: the technical term for words that sound the same as other words is homophones.  You see them in lots of languages.  They may or may not also be homographs—words that are spelt the same.  We talked about the ubiquity of ambiguity in human languages in a previous post–homophones are a source of ambiguity in spoken language, and homographs are a source of ambiguity in written language.

Can you add any more words to my list of French words that are pronounced o?  If so, how about putting them in the Comments section?

ALICE in Zipf’s Law Land

Randomly Googling Zipf’s Law, I came across this web page that talks about one aspect of the significance of Zipf’s Law for natural language processing–that is, getting computers to deal with human language.

The page is on the web site for A.L.I.C.E., a computer program that uses frequently-occurring patterns to give the appearance of understanding, and replying to, things that are “said” to it in English.  The page points out that for A.L.I.C.E., there’s an advantage that comes from Zipf’s Law: it means that a relatively small number of patterns encoded into A.L.I.C.E. allow it to process a very large percentage of the things that people say to it.  Here are the most common things that people “say” to A.L.I.C.E.:

531 WHAT IS YOUR NAME
352 WHAT IS MY NAME
171 WHAT IS UP
137 WHAT IS YOUR FAVORITE COLOR
126 WHAT IS THE MEANING OF LIFE
122 WHAT IS THAT
102 WHAT IS YOUR FAVORITE MOVIE
92 WHAT IS IT
75 WHAT IS A BOTMASTER
70 WHAT IS YOUR IQ
59 WHAT IS REDUCTIONISM

(I don’t know what the total count is–it would be nice if the web page gave percentages.)  What is “What is reductionism” doing there?  I’m guessing that it’s because A.L.I.C.E. is presented as an artificial intelligence application, and reductionism is a theoretical topic in artificial intelligence.  (Here’s Neil Rowe’s take on reductionism: “Perhaps the key issue in artificial intelligence is reductionism, the degree to which a program fails to reflect the full complexity of human beings. Reductionism includes how often program behavior duplicates human behavior and how much it differs when it does differ. Reductionism is partly a moral issue because it requires moral judgments. Reductionism is also a social issue because it relates to automation.”)  Apparently a lot of geeks like to talk to A.L.I.C.E.–either that, or there are hella people in the world that are interested in reductionism.

Of course, the flip side of Zipf’s Law for natural language processing is that an enormous number of the inputs to your program will only occur very infrequently, and it’s going to be very difficult to cope with all of those.  Zipf’s Law cuts both ways.
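For the curious, here is a rough sketch in Python of how you might look at those numbers.  It only uses the eleven counts listed above, so the “total” is just the sum of those eleven lines, not the real total of everything that people say to A.L.I.C.E. (which I don’t have).  The function names are my own inventions, and the comparison assumes the simplest textbook form of Zipf’s Law, where the count at rank r is about the top count divided by r.

```python
# Counts copied from the list of "WHAT IS ..." inputs above, most frequent first.
# Assumption: these eleven lines are all the data we have; the true total is unknown.
counts = [531, 352, 171, 137, 126, 122, 102, 92, 75, 70, 59]


def cumulative_share(values):
    """Return the running fraction of the listed total covered by the top k items."""
    total = sum(values)
    running = 0
    shares = []
    for value in values:
        running += value
        shares.append(running / total)
    return shares


def zipf_prediction(top_count, n):
    """Idealized Zipf curve: the count at rank r is roughly top_count / r."""
    return [top_count / rank for rank in range(1, n + 1)]


shares = cumulative_share(counts)
predicted = zipf_prediction(counts[0], len(counts))

for rank, (observed, expected, share) in enumerate(zip(counts, predicted, shares), start=1):
    print(f"rank {rank:2d}  observed {observed:4d}  zipf-ish {expected:6.1f}  cumulative {share:.0%}")
```

Among just these eleven inputs, the top three already account for more than half of the listed total, which is the “small number of patterns, large percentage of inputs” effect in miniature.  What the sketch can’t show is the long tail of rare inputs, since those aren’t in the list.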

Here are some words that I didn’t know on the French Wikipedia page about artificial intelligence:

    • se vouloir: to claim to be.  L’intelligence artificielle est le nom donné à l’intelligence des machines et des logiciels. Elle se veut discipline scientifique recherchant des méthodes de création ou de simulation de l’intelligence.  “Artificial intelligence is the name given to the intelligence of machines and computer programs.  It claims to be a scientific discipline researching methods of creation or simulation of intelligence.”
    • abréger: to shorten, abbreviate, abridge, summarize; to make (something) fly by. Le terme « intelligence artificielle », créé par John McCarthy, est souvent abrégé par le sigle « I.A. » (ou « A.I. » en anglais, pour Artificial Intelligence). “The term ‘artificial intelligence,’ created by John McCarthy, is often abbreviated by the acronym ‘I.A.’ (or ‘A.I.’ in English, for Artificial Intelligence).”

I love a good monosyllable II: voile

In a previous post, I explained why I love learning new English monosyllables.  Today I ran across the English word voile.  Wiktionary defines voile as a light, translucent cotton fabric used for making curtains and dresses.  This particular word is a nice Zipf’s Law phenomenon both because I’m 53 and I just learnt it today, and because its etymology is French.

In French, voile actually has a number of meanings, depending on whether it’s masculine or feminine.  None of them are quite the same as the meaning in English:

  • le voile: this is “veil,” and now also “headscarf” or “hijab.”  You will see this meaning in France quite a bit these days, because of la loi sur le voile intégral.  This is the informal name for a law which, among other things, forbids wearing the full-face veil in public.  It is quite controversial.  (I should note that I see women in full-face veils in Paris routinely, and I have never seen the law enforced.)  WordReference.com also gives a meaning related to what sounds like a sort of skin forming on top of a liquid, and the buckling of a wheel.  I don’t think I’ve ever run into either of those, but Zipf’s Law being what it is, I’ll probably see them both tomorrow…
  • la voile: this is a “sail,” and also “sailing.”

Life being weird, I learnt the English word voile in a post about renovating a house in France–you can check it out here.
