What Japan knows that the Western world doesn’t

It’s 2 AM and it’s been less than a week since the last time that I crossed an ocean, so I’m jet-lagged and sitting on the back porch having a cigarette.  There’s a full moon.  It’s warning me.

I’ve written about man-eating rabbits before.  Le lapin anthropophage in French, el conejo antropófago in Spanish–we’re not talking about the adorable little Monty Python killer rabbit here.  We’re talking about rabbits with sharp, bloody fangs who eat people.  Yeah, I know–you’ve never seen a man-eating rabbit.  How do I know?  Because you’re still alive–if you ever see a man-eating rabbit, that’s the last thing you see.  You know how I know that the adorable little Monty Python killer rabbit is not a real, actual man-eating rabbit?  Because he let some of the humans get away.

Look up at the night sky and what do you see?  If you were born and raised in the Pacific Northwest, like I was, you see the Man in the Moon.  The Pacific Northwest, Russia, Germany, France–it’s all of a piece.  It’s an optical delusion that pervades Western artistic expression.  From music…

And the cat’s in the cradle and the silver spoon

Little Boy Blue and the Man in the Moon

When you comin’ home, Dad?  I don’t know when,

But we’ll get together then, yeah–you know we’ll have a good time then.

Harry Chapin

…to the graphic arts…

Source: https://www.gettyimages.com/detail/news-photo/vintage-illustration-of-a-winking-man-in-the-moon-rising-in-news-photo/530194185

…to the cinema…

Source: https://www.signature-reads.com/2016/01/how-jules-verne-inspired-a-generation-of-rocket-scientists/

…the Western world looks at the moon and sees…the face of a man.  In fact, the idea of the Man in the Moon is so embedded in Western culture that the moon is portrayed with a man’s face even when the moon is not full:

Source: https://www.pinterest.com/pin/406449935092586325/?lp=true
Source: https://www.allmusic.com/album/man-in-the-moon-mw0000586669

It’s 2 AM and it’s been less than a week since the last time that I crossed an ocean, so I’m jet-lagged and sitting on the back porch having a cigarette.  There’s a full moon.  I look around the yard: any telltale long ears sticking up out of the grass?  Is there the gleam of moonlight off of pointy white little fangs?

Japan: I think of it as the France of Asia.  France: I think of it as the Japan of Europe.  Culturally, the two countries share a lot: an obsession with aesthetics, with presentation, with formality and with formalism; and food…  Generalizing about individuals is usually a losing proposition, but cultures–yeah, you can generalize about cultures.  (That’s sorta their point, right?)  And if there is one thing that Japanese culture and French culture share, it is an obsession with food.

The great Japanese food movie: Tampopo.  If you’ve only watched one Japanese movie, it’s probably the one.  Cinephiles have Ran, of course, and The Seven Samurai; teenagers have (or once had) The Ring; but poll your friends and you’ll find that if they’ve only seen one Japanese movie, it was Tampopo.  It’s about food, and sex, and food, and heroism, and food, and love, and food, and America (yes, in a minute you’re going to see a Japanese truck-driving Western movie hero), and food, and work, and… well, food.  Everyone has seen Tampopo, or should, and if you’ve seen it–when you’ve seen it–you’ll have a favorite scene.

For many people, that scene is this comedic skit.  YouTube auto-completes it as tampopo choking scene: 

The glutinous substance upon which the old man chokes is mochi.  You make it by pounding glutinous rice.  I especially love it when used to make o-manju–when in Japan, I conduct research on how long you can live on a diet consisting solely of its daifuku form, and coffee.

It’s 2 AM and it’s been less than a week since the last time that I crossed an ocean, so I’m jet-lagged and sitting on the back porch having a cigarette.  There’s a full moon.  I look at the rabbit in it.

In Western culture, you look at the moon and you see the Man in the Moon.  In Japan and China, you look at the moon and you see what’s really there.  In China, I understand that he’s pounding herbs.  In Japan: he’s pounding mochi.  As I said: mochi starts with sticky rice.  You pound it.  It becomes mochi.  Somebody makes o-manju out of it, and I eat it.  Artisanal manju in a fancy department store, 7-11 daifuku at midnight–it’s all good.  Great, even.  (Yes, there are 7-11s in Japan.  If it weren’t for 7-11 and their competitor Lawson’s, I would not survive.)

Yeah, I know that you’ve never seen a man-eating rabbit.  I mean, neither have I–if I had, I wouldn’t be alive and sitting on the back porch smoking a cigarette, would I?  But, the man-eating rabbits leave traces, like everything else.  Crossing the boulevard St-Michel the other day, I saw this sticker:

Here’s the thing: no, you’ve never seen a man-eating rabbit.  But, like everything else, they leave traces, and sometimes they do so deliberately.  Propaganda.  Man-eating rabbit propaganda.  Why suspect us, indeed.  (Fun, and little-known, fact: inter-annotator agreement, a fundamental measure of corpus linguistics, has its roots in the pre-WWII study of propaganda.  Check out Krippendorff’s book Content Analysis: An introduction to its methodology for the story.)  Why English-language man-eating rabbit propaganda in Paris?  I don’t know–possibly because the man-eating rabbits know that the boulevard St-Michel is infested with non-French-speaking tourists.  Possibly man-eating rabbits just have so much disdain for les valeurs républicaines that they can’t be bothered to pick up a fucking dictionary.  Who can understand the thought processes of a man-eating rabbit (beyond their obvious nastiness)?

What Japan knows that the Western world doesn’t: it’s not a man in the moon–it’s a rabbit.  It’s a rabbit that is pounding mochi.  

Source: Matthew Meyer, http://yokai.com/gyokuto/.

Japan has given the world some wonderful things.  Judo, the “gentle way,” with its philosophy of mutual benefit between humans as the best way forward.  (Yes, it is the opposite of stupid “America First” isolationism.)  Ramen, the noodle dish that surpasses any other known human food for its comfortingness, yumminess, and general ability to make the world feel like a good place.  (Yes: aligot is a strong competitor.)  And something that people are, in general, less aware of: the rabbit in the moon, that constant reminder that we must always, always, always be vigilant.  Vigilant of the man-eating rabbits.  Vigilant of zombies.  Vigilant of petulant man-babies who would sacrifice America on the altar of their own narcissism, pathetically weak ego, and financial profit.

It’s 2 AM and it’s been less than a week since the last time that I crossed an ocean, so I’m jet-lagged and sitting on the back porch having a cigarette.  There’s a full moon.  It’s warning me.  It’s warning you.  It’s warning all of us.

English notes: bare relative clauses

It is an unfortunate fact that in English, relative clauses can–and it sounds perfectly natural–appear without their relativizer.  What that means: a week since the last time that I crossed an ocean can also be said a week since the last time I crossed an ocean.  In the second case, the last time I crossed an ocean is known as a “bare” (basically, “unclothed”) relative clause.

Leaving out the relativizer (in this case, that) is totally natural in the spoken language, and as far as I know, it’s totally fine in the written language, too.  Not all thats can be omitted.  I think it has something to do with whether the relative is “restrictive” or “non-restrictive;” unfortunately, despite having a doctoral degree in linguistics, I’ve never quite grasped the difference between them, so I can’t say anything more on the subject.  Try this web page, and explain it to me if you can.

Bare relative clauses are totally English, but I suspect that they must be very difficult for non-native speakers who don’t yet have an excellent command of the language, so I try to avoid them in this blog.  Here I’ll give you bare and non-bare examples of relative clauses from this blog post, just to familiarize you with the issue:

Bare: It’s 2 AM and it’s been less than a week since the last time I crossed an ocean, so I’m jet-lagged and sitting on the back porch having a cigarette.  There’s a full moon.  It’s warning me.

Not bare (there must be a term for that): It’s 2 AM and it’s been less than a week since the last time that I crossed an ocean, so I’m jet-lagged and sitting on the back porch having a cigarette.  There’s a full moon.  It’s warning me.

Bare: You know how I know the adorable little Monty Python killer rabbit is not a real, actual man-eating rabbit?  Because he let some of the humans get away.

Not bare: You know how I know that the adorable little Monty Python killer rabbit is not a real, actual man-eating rabbit?  Because he let some of the humans get away.  (Note that this one is almost certainly not a restrictive/non-restrictive issue–the relative clause is not modifying a nominal group.)

Bare: And if there is one thing Japanese culture and French culture share, it is an obsession with food.

Not bare: And if there is one thing that Japanese culture and French culture share, it is an obsession with food.






Châteaux forts: How do French children learn vocabulary?

How do you learn vocabulary in a language with gender if the gender is not marked?

The Christmas holidays took me to the Loire Valley.  That’s an area that’s famous for chateaus (châteaux, n.m.pl), and that meant new vocabulary–Zipf’s Law and all that…

…which brings me to a mystery: how are French kids supposed to learn new words correctly when the graphics, diagrams, and the like from which they learn them don’t include the genders of the words?  In this post I’ve included four pictures showing terminology related to châteaux forts–what we call “castles” in English.  Notice that in only one of the four is the gender of the words marked, and even in that diagram the gender is marked only inconsistently–gender is given here by the form of the definite article, and for terms that are given in the plural (les douves, the moats; les créneaux, crenellations; and les remparts, ramparts), you can’t tell the gender from the definite article.

Source: http://www.ikonet.com/fr/ledictionnairevisuel/arts-et-architecture/architecture/chateau-fort.php


Château fort de Coucy
Source: http://rozsavolgyi.free.fr/cours/Premiere%20partie/Annexes/05-02-03.htm


Source: https://www.mireille33.fr/articles.php?lng=fr&pg=1215 (great page, BTW)





This is not just an idiosyncrasy of medieval vocabulary for castles–it’s a very general phenomenon in French-language educational materials.  For example, here’s a diagram of a representative insect from Le grand livre marabout de la nature, edited by Fanny Delahaye:


…a representative bird from the 2004 version of Le petit Larousse compact:

…and one from the 3rd edition of Pierre Kamina’s Petit atlas d’anatomie:

…a non-representative sample chosen by scanning my bookshelf for educational materials with diagrams in them.

How about it, native speakers?  (Phil d’Ange, I’m lookin’ at you…)  How does a French student learn vocabulary without having the gender of the terms listed on diagrams that are intended to teach them?  Concretely: you’re a kid.  You’ve got a diagram like the ones shown on this page, and you need to learn the terms thereon.  How do you do so, given that the gender is not labelled?

English vocabulary

Idiosyncrasy: From Merriam-Webster: a peculiarity of constitution or temperament; an individualizing characteristic or quality.  First known use: 1604.  Other words first observed in that year: appreciation, black eye, blotch, and chinchilla. https://www.merriam-webster.com/dictionary/idiosyncrasy


Ducklings and goslings and inklings, oh my

That moment when the elves take your baby and leave one of theirs in its place.

I dragged myself out of bed at 8:30 AM today.  Under normal circumstances, if I’m still in bed at 5:45 AM, it means that I had a rough night–I am most definitely both a morning person, and an early riser.  Seulement voilà (“the thing is”):

  1. At this time of year, it doesn’t get light outside in Paris until about 8:30 in the morning.
  2. At 2 AM I got obsessed with the need to learn all of the words for baby animals in French.

Morphemes are the things that words are made of.  For example, the plural cats has two morphemes: cat, and the s that carries the meaning of plurality.  (This happens to be the example from which my child learned what a morpheme is–as a young child, and as we did the dishes together.  Must suck to be a linguist’s kid…)

English has an odd little morpheme, -ling, that refers to things that are small.  Like the s of cats, it is what is called a bound morpheme, meaning that it cannot be a word on its own–it has to be attached to something else.  (Contrast that with the cat in catnap (a short, light nap), catnip (a plant–it’s basically pot for cats), and cathouse (a brothel–archaic)).  Here are a couple of examples:

  • duckling: a baby duck.
  • inkling: a small hint, or a small piece of knowledge.  (I’ll give some examples of its use later.)
Source: http://www.vivre-en-irlande.fr/culture-irlandaise/changeling-fee-legende.  See the site for helpful information about how to recognize a foundling, return a foundling, etc.

The -ling morpheme is also not productive: that means that you can’t really use it freely to make “new words.”  For example, it’s not clear that anyone would know what you meant if you casually threw the words waterling (parallel to inkling) or penling (parallel to duckling) into a conversation.  (Contrast that with -gate, which over the course of my lifetime has become applicable to practically anything, with the meaning of “a scandal related to:” Bridgegate, Pizzagate, etc.)  Because it’s not productive, one could list all of the words in English in which it occurs.  Limited only by my memory, of course.  My best shot at doing so:

  1. duckling: baby duck
  2. gosling: baby goose
  3. foundling: a child who has been found after having been abandoned
  4. changeling: when the elves take away your baby and leave one of their own in its place
  5. inkling: a small hint, idea, trace, piece of knowledge, clue

In the Foundling Hospital grounds, London, c1901 (1901)
The London Foundling Hospital in 1901, from an article about a 1911 foundling lottery in Paris at http://time.com/4433717/paris-baby-raffle-history/.

Now, I know what you’re thinking: Zipf, you’re a drooling idiot.  There are lots of words in English that end with -ling: for example, DROOLING.  Feeling, wheeling and dealing (French: mic-mac or micmac), healing… 

Well… I may be an idiot, but I’m not a drooling one.  Here’s the thing: a morpheme is defined by its sound (or spelling)–in our case, ling–and by its meaning.  Drooling and gosling (baby goose) contain the same sounds/letters, but not the same meaning of smallness, so it’s not the case that they share the same morpheme.  -ling is a pretty textbook (French: typique) example of a non-productive morpheme.
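Here’s a rough way to see the difference, sketched in Python.  The word list and its True/False tags are my own hand-made assumptions for illustration (there is no real morphological analyzer here): matching on spelling alone sweeps in drooling, while the morpheme test also asks about meaning.

```python
# Hypothetical mini-lexicon: each word is tagged BY HAND with whether its
# final "ling" carries the "small one" meaning.  The tags are assumptions
# made for illustration, not the output of any real analyzer.
LEXICON = {
    "duckling": True,
    "gosling": True,
    "foundling": True,
    "changeling": True,
    "inkling": True,
    "drooling": False,  # drool + -ing
    "feeling": False,   # feel + -ing
    "healing": False,   # heal + -ing
}


def ends_in_ling(word):
    """Spelling-only test: does the word end in the letters 'ling'?"""
    return word.endswith("ling")


def has_ling_morpheme(word):
    """Spelling AND meaning: the letters match and the lexicon says
    that they carry the meaning of smallness."""
    return ends_in_ling(word) and LEXICON.get(word, False)


spelling_matches = sorted(w for w in LEXICON if ends_in_ling(w))
morpheme_matches = sorted(w for w in LEXICON if has_ling_morpheme(w))

print("Same letters: ", spelling_matches)
print("Same morpheme:", morpheme_matches)
```

All eight words pass the spelling test; only five of them pass the morpheme test.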

So, yeah: I don’t sleep much, and I’m trying to learn to speak French, so at 2 AM I got obsessed with learning the names of baby animals in French.  This web page got me started, and then I started searching WordReference.com for weird English-language baby animal names (say, gosling), and here you see the results.  (Yes, some occur more than once.) At 2 AM, I only knew chiot (puppy), chaton (kitten), and veau (calf)–how about you?  And, native speakers (Phil d’Ange, I’m lookin’ at you)–can you add some more?

Adult animals:

Juvenile animals:

English-language example sentences


  • The Steel Riders Saga is a sci-fi/fantasy novel about Free Wheeler, a foundling discovered by the legendary Steve Thompson during a deep terrain ATV ride. Thompson leads an ATV pack known as the “Steel Riders.” In their fantastical journeys Free Wheeler finds true love and home.  (Twitter, @quantum_tide)
  • Meanwhile, in Australia, there’s a National . I have never heard of anything so glorious! (Nobody in my family cares about gravy as much as I do. I… might be a foundling?)  (Twitter,
  • Can I just say…Baby Faced Finster. A foundling!! You Naughty Baby!! Hahaha! 😂❤️  (Twitter, @TheSuperAmanda)


  • I’ve mentioned this numerous times on the podcast but… I have an inkling that Nintendo will use Smash DLC to promote upcoming (inc third-party) Switch releases.  (Twitter, @pixelpar)
  • My new resolution is to not read the thread of comments of tweets where I know or have an inkling that it’s not going to be a good thing.  (Twitter, @valparkie)
  • You are a gem of a friend and you don’t have an inkling of how much i appreciate your ignorance of my vices.  (Twitter, @Shakti_Shetty)
  • I don’t have an inkling of what the future holds but I’m excited  (Twitter, @JaredTench)
  • Roommate, Camden *going to Waffle House in Dunn*: “If I get the smallest inkling of a crack-whore, I’m leaving!”  (Twitter, @dr_pattyguin)


What computational linguists actually do all day: The lexical frequency version

In practice, we spend most of our time trying to figure out where we went wrong in writing some computer program or another. 

Tell someone that you’re a computational linguist, and the next thing out of their mouth is likely to be either:

  1. How many languages do you speak?, or…
  2. What’s that?

In theory, computational linguists spend their time thinking about fun questions like:

  1. Is natural language Turing-complete?
  2. The relationship, if any, between what we know about words (say, the word dog can be a noun or a verb, and it occurs more often with the words bark and leash than with the word meow) and what we know about the world (say, a dog is a canine, and might like to chase balls, and will eat cat shit if not instructed otherwise).
  3. How Zipf’s Law, which describes the fact that a small number of words are extremely common, while a large number of words are extremely rare, but do occur, might or might not be related to the mathematical phenomenon of the fractal.

In practice, we spend most of our time trying to figure out where we went wrong in writing some computer program or another.  (OK: that, and writing grant proposals.)  Think that being a computational linguist sounds glamorous?  Here’s how I spent my morning.

All I gotta do: go through a bunch of documents and count how often each word in that bunch of documents occurs.  Easy-peasy–barely hard enough for a homework in Computational Linguistics 101.
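For the record, here is roughly what the task looks like, as a minimal Python sketch.  It is not the script that I was fighting with that morning; the function name count_words, the .txt filter, and the crude definition of a word as a run of letters, digits, or underscores are all assumptions of mine.

```python
import os
import re
from collections import Counter


def count_words(directory):
    """Count how often each lowercased word occurs in the .txt files
    found directly inside `directory` (an illustrative sketch)."""
    if not os.path.isdir(directory):
        # Say WHICH directory could not be opened.
        raise FileNotFoundError(f"Couldn't open directory: {directory}")

    frequencies = Counter()
    for file_name in sorted(os.listdir(directory)):
        # os.listdir returns bare file names, so the directory name has
        # to be joined back on before the file can be opened.
        path = os.path.join(directory, file_name)
        if not (os.path.isfile(path) and file_name.endswith(".txt")):
            continue
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                # Crude tokenization: "don't" comes out as "don" and "t".
                frequencies.update(re.findall(r"\w+", line.lower()))
    return frequencies
```

Given a (hypothetical) directory named corpus, count_words("corpus").most_common(10) would list the ten most frequent words along with their counts.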

Seulement voilà…

Screen Shot 2018-12-03 at 12.54.58

Easy enough to fix–I just failed to give the complete name of the program, and…. marde.

Screen Shot 2018-12-03 at 12.57.21

OK, easy enough to fix–I had written

Screen Shot 2018-12-03 at 12.58.58

…when I shoulda written

Screen Shot 2018-12-03 at 12.58.44

Shoulda: the typical spoken form of should have. 

(Note the square bracket near the end of the middle line–I had left it out.)  Great–avançons, alors.  But, no, fuckashitpiss:

Screen Shot 2018-12-03 at 13.03.54

Easy enough to fix–turns out I wrote this:


Screen Shot 2018-12-03 at 13.05.19

…when I shoulda written this:

Screen Shot 2018-12-03 at 13.06.35

(Note the dollar sign before the rightmost instance of words now.)  And so, on we go, but…

Screen Shot 2018-12-03 at 13.07.57

…and it’s easy enough to fix–I had written this:

Screen Shot 2018-12-03 at 13.08.57

…when I shoulda written this:

Screen Shot 2018-12-03 at 13.10.10.png

(Note the double quote before $frequencies{$words[$i]}\n”;) …and now I’m wondering:

  1. These errors were all on one single line–what other horrors have I hidden in this code, and will they be as easy to find as those were?
  2. What the hell was I thinking when I wrote that line?  Was I thinking about the upcoming dissertation defense at 2 PM?  Was I thinking about Trump giving my country to China?  Was I thinking about tomorrow’s colonoscopy? Who the hell knows, really–whatever it was, it apparently wasn’t this line of code…

Mais retournons… Ah marde, but at least this one will be easy to fix…

Screen Shot 2018-12-03 at 13.14.14

…except that I verify the existence of the directory, and then get this:

Screen Shot 2018-12-03 at 13.16.03

…which is the exact same error that I got before.  So, I go back and look at my code, where I see this, and remember that my error message is supposed to print out the name of the directory that it couldn’t open, but it did no such thing:

Screen Shot 2018-12-03 at 13.22.00

…which is ’cause I never gave the program the name of the input directory.  So I take care of that, and also tell my program to print out the name of the directory that it couldn’t open if, in fact, it can’t open a directory–as we saw above, I had planned to do this, but of course left out that little detail:

Screen Shot 2018-12-03 at 13.25.31

…and now I experience a tiny little bit of success, because my program does not crash.  Seulement voilà, it doesn’t actually produce any output:

Screen Shot 2018-12-03 at 13.27.18

Note the lack of a bunch of lexical frequencies… So, I go back to my script, and I start looking around in the region of the program where I meant for the output to happen.  I don’t see anything obvious in that area, so: I go further up in the code, and start doing what I need to do to convince myself that the earlier parts of the program are working the way that I intended them to.  This means printing out the results at intermediate steps of the processing. The resulting code (leaving out a bunch of details) looks like this:

Screen Shot 2018-12-03 at 13.32.22

…which does nothing different than it was doing before, so I know that I need to go even further up in the program and, again, print stuff out as I go, resulting in this:

Screen Shot 2018-12-03 at 13.35.57

…which, when I run the script, produces this:

Screen Shot 2018-12-03 at 13.37.25

…which suggests to me that the directory exists, and that I’m opening it correctly, but that I am either (a) reading its contents incorrectly, or (b) making a mistake when I make a decision about whether or not to open each file.  A quick Google search finds the problem for me–I had written this:

Screen Shot 2018-12-03 at 13.41.16

…when I shoulda written this:

Screen Shot 2018-12-03 at 13.41.33

(Note that the text at the left end of the line was open, and now is opendir.)

Progress!  Now I get some output, but note the last line–I’m just getting a bunch of file names, and no word frequencies.  I can see the problem right away, though–I have the directory name right, and I have the file name right, but I need to combine them in order to be able to open the file.  Doing so gives me this code:

Screen Shot 2018-12-03 at 13.46.55

…which results in my script running successfully for a while, but then crashing, and I know exactly what causes said crash…

Screen Shot 2018-12-03 at 13.48.10.png

…and I know that it’s a bear to fix, and I’ve been working on this fucking task that’s barely difficult enough to make a good homework assignment for two hours, and now it’s time to go to the aforementioned dissertation defense, and… Soupire…

Meme source: https://imgur.com/gallery/fzbkRI8



Gratuitous picture of me and my cat

In which I can’t even get beyond the Introduction.

Your lexicon–the words that you know, and what you know about them–is unlike every other part of your knowledge of your native language in that it continues to grow over the course of your entire life.  By the time you’re a young child you know pretty much everything that you’re going to know about your language’s phonetics, phonology, morphology, and syntax.  Your lexicon, though–that continues to grow throughout your life.

Now imagine someone who tries to learn a second language as an adult.  Like everyone else who speaks that language, you’re going to be learning new words until you die.  But, that’s going to be a lot more obvious to you than it is to people who speak it natively, because unlike them, you didn’t spend your entire youth learning the vocabulary of that language–start studying a language in your 50s, and you are literally 50 years behind a native speaker when it comes to learning the lexicon of the language in question.

If you’ve been reading this blog for a while, you know that you don’t have to work very hard to find words that you don’t know: Zipf’s Law, which describes the fact that a small number of words of a language are very, very common, while the rest occur only very rarely–but do occur–ensures that you will be running across new words just going about your daily life.
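To put rough numbers on that, here’s a back-of-the-envelope Python sketch.  It assumes the textbook idealization of Zipf’s Law, in which a word’s frequency is proportional to 1 divided by its rank, and a made-up vocabulary of 10,000 words; real corpora only approximate this.

```python
def zipf_shares(vocabulary_size, exponent=1.0):
    """Share of all tokens taken by each word, most frequent first,
    under an idealized Zipf distribution (frequency ~ 1 / rank**exponent)."""
    weights = [1.0 / rank ** exponent for rank in range(1, vocabulary_size + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


# Assumed, made-up vocabulary size; real vocabularies are much larger.
shares = zipf_shares(10_000)

top_100 = sum(shares[:100])        # the 100 most common words
rarest_half = sum(shares[5_000:])  # the 5,000 rarest words

print(f"100 most common words: {top_100:.1%} of running text")
print(f"5,000 rarest words:    {rarest_half:.1%} of running text")
```

Under these assumptions, the 100 most common words cover about half of running text, while the 5,000 rarest words together make up less than a tenth of it.  Each of them turns up only once in a long while, but they do turn up.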

Living in France, I have no difficulty whatsoever running into 10 words that I don’t know every single day.  Ads on the metro, the services written on a window installer’s truck, the name of a street that I walk by on the way to the lab–that’s all it takes.  Living in the US, it’s a bit harder, but it’s totally doable–listening to the radio, watching something on YouTube, or listening to a book on tape will do it.  10 words a day, every day (except the month of December, which I spend reviewing the words that I learned from January to November), and mine de rien, you have a vocabulary of thousands of words.

And yet: as Zipf’s Law would suggest, I still have no problem whatsoever finding 10 new words a day to learn.  Case in point: today I wanted to figure out what the symbol ≠ means in the grammar book that I’m working through at the moment (Grammaire progressive du français : niveau perfectionnement, B2 – C2, by Maïa Grégoire and Alina Kostucki).  So, I went to the “front matter” of the book–the table of contents and stuff like that.  This involved reading the Introduction, where I ran across the following:

WordReference.com found me most of the relevant definitions, and yet: dictionaries being the beautiful but imperfect things that they are (like, say, my cat), it did let me down for a couple words: relever, and mécanisation. To wit:

….même avec un vocabulaire riche et une bonne connaissance de la grammaire, les résultats atteints sont souvent entravés par la persistance de fautes qui ont traversé les différents niveaux d’apprentissage. Bon nombre de ces difficultés tiennent à des interférences avec la langue d’origine et aucune grammaire ” générale ” ne peut prétendre en rendre compte.  D’autres, en revanche, relèvent de particularités de la langue française, mal perçues par les étudiants, et que nous tentons d’exposer de la façon la plus claire possible.

My best guess for an English-language equivalent of relever de would be “to arise from.”  Here are some examples of to arise from from Sketch Engine, purveyor of fine linguistic corpora and the tools for searching them:

  • The lectures focus on topics arising from research in science and technology.
  • The investigation arose from a referral from both Houses of the NSW Parliament.  (Arise is an irregular verb, with the past tense form arose.)
  • He blames Jews for the ills arising from the industrial revolution, e.g., class divisions and hatred.
  • Leukaemias are devastating diseases of the haemopoietic system that arise from aberrant stem or progenitor cells.  (Leukaemia and haemopoietic are the British English spellings of leukemia and hemopoietic.)

But: looking at WordReference, I don’t see to arise from as a possible translation of relever de, or vice versa.  Phil d’Ange?

The other problem word: la mécanisation.  The only translation of this word in Word Reference is…”mechanization”!  What that means: I can only guess (see above about how your lexicon grows over the course of your entire life), and none of my guesses would make sense in this context.  Mechanized infantry is infantry equipped with armored vehicles to move itself around, and mechanized artillery is artillery equipped with its own transport system, but oral mechanization, as in the sample from my book?  I haven’t the faintest clew.  (That’s “clue,” for us Americans–something about the faintest clew just demands that you spell it like a Brit.)

À la partie théorique, située sur la page de gauche, correspond, sur la page de droite, une présentation en contexte (parfois illustrée) des points de grammaire, et une série d’exercices de réemploi : exercices à trous, transformations, mécanisation orale, écrit.



Native speakers: can you show an anglophone some love?  (To show someone some love means to help them, to do something nice for them, to give them something.  Super-slangy.)


Finally, here is a gratuitous picture of a fat old bald guy and his cat Keiko.  As you can tell from the amount of light in the dwelling, the photo was taken in America, not in wintertime Paris.  The teddy bear on the floor is the property of my cat, and I suggest that you not touch it.

Conflict of interest statement: I have no conflicts of interest to declare.  I pay for a subscription to Sketch Engine, I bought the book, and Word Reference is free to one and all.

Becoming a computational linguist without double-majoring in linguistics and computer science

You’re an undergraduate, and you want to become a computational linguist? Here’s how to do it.

People who want to become computational linguists usually get a PhD in the subject.  Every once in a while, though, you run into someone who wants to study computational linguistics as an undergraduate.  In the United States, that means a student in what we call “college” and the rest of you call “university” (or, if you’re French, la fac’).  Undergraduate students in the US have one, and sometimes two, “majors”–the topic in which they will do the most coursework, and whose name will appear on their official paperwork when they graduate.  To “double-major” is to have two majors, rather than the usual one.  It’s not super-unusual to do this–I had a double major, in English and linguistics–but, it’s helpful to do a double major only if really necessary, as it’s a hell of a lot of work. 

If you’re getting a bachelor’s degree and want to be a computational linguist, a double major in computer science and linguistics is probably overkill.  (Overkill discussed in the English notes below.)  The most efficient way to become a computational linguist would be to get a degree in linguistics in a department that has computational linguists on the faculty, such as the University of Colorado at Boulder, or Ohio State University. If you want to try to become a computational linguist in a university that doesn’t have computational linguists in any department: first of all, your major should probably be linguistics, not computer science—computational linguists are a kind of linguist, right? (They are—I’m a computational linguist, and I’m a linguist.) You’ll want to do some coursework in the computer science department, but I wouldn’t actually recommend even a minor in computer science—that will probably require you to take some courses that won’t be the most useful ones for you, while taking up time that you could have been using to take courses that would be useful for you.

What should those courses be?  As many as possible from this list:

  • Corpus linguistics (usually offered in the linguistics department, but if your university doesn’t have such a course in the linguistics department, look for courses in the social science, communications, or media departments, possibly with names like “content analysis”)
  • Statistics (best in a linguistics or speech & hearing department–the traditional psychology department or agriculture school courses will kill you)
  • Machine learning (usually offered in a computer science department)
  • Natural language processing (presumably not what you meant by “computational linguistics,” or you would have said so)
  • Automatic speech recognition, if and only if you seriously think that you want to work in this area (often offered in the electrical engineering department)
  • Speech synthesis, if and only if you seriously think that you want to work in this area (again, often offered in the electrical engineering department)

Notice what’s not on this list: programming courses.  Take those if you know that you need them, but if you don’t know that you need them, then don’t take them.  Notice that I also haven’t said anything about linguistics courses: we’re assuming here that linguistics is your major, and you’re going to get a solid and well-rounded background in that field.

Picture source: Mariana Romanyshyn, Grammarly, Inc. https://www.slideshare.net/MarianaRomanyshyn/nlp-a-peek-into-a-day-of-a-computational-linguist-71510838

English notes:

overkill: doing way too much.

How I used it in the post: If you’re getting a bachelor’s degree and want to be a computational linguist, a double major in computer science and linguistics is probably overkill. 


American English reading practice: John McCain, Trump, and torture

I’m a US military veteran, and proud of it. If anyone hates torture more than a military person, I don’t know who it is.

John McCain was shot down and held prisoner for 5 and a half years by the North Vietnamese. He never recovered physically from the frequent and lengthy torture sessions that he underwent. The son of an admiral, he was offered early release, but refused to be set free until all of his fellow prisoners were. Meanwhile, Trump avoided the draft, later bragged about it repeatedly in public, and attacked McCain repeatedly as a candidate and as president. Asshole.

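Afin de travailler votre amerloque, voilà un reportage sur la torture, John McCain, et Trump.  On débute avec du vocabulaire, et puis je vous invite à suivre le lien vers l’article dans son intégralité.  (In English: to help you work on your American English, here is a news report on torture, John McCain, and Trump.  We start with some vocabulary, and then I invite you to follow the link to the full article.)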

For more on a proud US military veteran’s opposition to Trump’s immoral ideas about torture, see this post.  Do you have corrections for my crappy French?  The Comments section awaits you.

Speaking out on torture and a Trump nominee, ailing McCain roils Washington

to speak out: to say something by way of a public statement, typically criticizing something.  Note that the preposition here is on, but it could also be about, and possibly others.

ailing: sick.  If English had the concept of langage soutenu, this would be soutenu, like many of the words in this article.

to roil: to stir up, to disturb, to put in a state of disorder (see Merriam-Webster, sense 2)

Sen. John McCain is 2,200 miles from Washington and hasn’t been on Capitol Hill in five months, but he showed this week that he remains a potent force in national politics and a polarizing figure within the Republican Party.

potent: powerful

polarizing: “to break up into opposing factions or groupings: a campaign that polarized the electorate” (Merriam-Webster, sense 3). Today’s Republican Party can generally be divided into people who like McCain, a war hero and basically OK guy right up to his recent death, versus immoral shitbags who cravenly support Trump no matter how low he stoops into the mud.  Thus: he’s a polarizing figure within the party.

But his declaration Wednesday in opposition to Gina Haspel, President Trump’s nominee for CIA director, has uniquely roiled the political scene. The denunciation has prompted reactions from fellow senators and a former vice president, as well as intemperate remarks from some Republicans aligned with Trump, including a White House aide.

to prompt: “to serve as the inciting cause of: evidence prompting an investigation” (Merriam-Webster, sense 3).

intemperate: not temperate, where “temperate” means “keeping or held within limits: not extreme or excessive: mild” and “marked by an absence or avoidance of extravagance, violence, or extreme partisanship” (Merriam-Webster, senses 2a and 2d).

It has revived the fierce debate over torture and its effectiveness in extracting information in the years since the Sept. 11 terrorist attacks – from a man who speaks from experience. McCain was held for 5½ years in a North Vietnamese prison, often deprived of sleep, food and medical care, after a jet he piloted was shot down over Hanoi.

No need for translation here, but for context, it’s worth knowing that McCain was a war hero and a staunch supporter of the US military–and hugely, vocally opposed to torture.  In contrast, Trump the draft-dodger (réfractaire, I think) has long advocated it.  Asshole.

Click here for the complete article in the Washington Post.