No, the French do not hate Americans

It’s the weekend of the celebration of the liberation of Paris from the Nazis.  I step out on my balcony for a cigarette, and I see a parade of old World War II military vehicles roll down l’Avenue de la Motte-Picquet.  When the American vehicles come, the onlookers cheer and clap.  The French vehicles go by unapplauded.

It’s August in Paris, when there is dancing on the banks of the Seine.  I walk up to a woman and ask her to dance.  She walks into my arms and asks Where are you from?  Later, I ask her how she knew so immediately that I wasn’t French–in France, asking a French person where they’re from is rude, although it’s (mostly) fine for non-French.  (More on this below, in the French notes.)  You hesitated a bit before a word, she said.  Then she thought for a moment more: …and you walked up to me with this directness and openness that I admire in Americans.  

It’s my first time in France, and I don’t speak French. Someone is telling me where to find a specific hotel in Normandy, and says–in English, obviously–That’s where you saved our fucking asses–twice.

No, French people do not hate Americans.

L’Avenue du Président-Kennedy, seen from the Bir-Hakeim Bridge in Paris. Source: Mbzt, CC BY-SA 3.0.
L’Avenue Franklin-D.-Roosevelt, Paris. Source: Mbzt, CC BY-SA 3.0.

French notes

In France, you do not ask a French person where they’re from (vous venez d’où ?).  It’s rude, because the implication is that you don’t really belong in France.  Rather, you ask What region are you from–vous venez de quelle région ?  Point of pride: when I first started spending time in France as a francophone, people would ask me So, you’re an American?  Then, they progressed to Where are you from?, or occasionally So, you’re British/Belgian/German/Swiss?  Now, after 5 years of constant and intensive study of the langue de Molière, I very, very occasionally get what region are you from?  Always warms my heart.

Languages that give you a sore throat

I notice that my mouth hurts.  Then I realize why: I’m thinking in French.

I’m walking down the street, and in one hand I have a shopping bag containing books that I just paid a week’s-worth of grocery money for.  In the other hand: a shopping bag containing the most disgusting canned food available, ’cause… see the preceding sentence about books that I just paid a week’s-worth of grocery money for.  I realize that my mouth hurts.  Then I realize why: as I walk down the street, I’m thinking in French–but, I haven’t spoken it much lately.

It’s no secret that speaking a language that you don’t typically speak can make your mouth hurt.  I speak Spanish for exactly one week a year, and it always makes my cheeks sore: the kinematics of Spanish are quite different from English and French (my languages of daily life), and the difference is enough to wear out my muscles.  If I haven’t spoken French much for a week or two, my lips get tired: the French u (International Phonetic Alphabet [y]) requires more rounding than any sound in English or Spanish.  But, Kaqchikel: Kaqchikel is giving me a sore throat.

I spend one week a year volunteering with a group called Surgicorps in Guatemala, a country the size of Tennessee–with 23-25 different languages.  70% of the population is “indígena,” which in these parts (see the English notes below for what in these parts means) means Mayan Indian.  There are 20-22 different Mayan languages spoken in Guatemala, plus Spanish and two other non-Mayan Indian languages.  Kaqchikel is one of the four mayoritarias, or “big” Mayan languages, being spoken by around half a million people; in preparation for my week of volunteer work I just spent several hours a day for the preceding two weeks studying it in a local language school.

Part of what makes Kaqchikel sound the way that it does is its ejective consonants.  Those are the “popping” sounds that you hear in the following YouTube video.  Why they “pop:” because of the way that you make the air come out of your mouth when you make them.  Most sounds of language are made with what is called a pulmonic egressive airstream mechanism.  “Airstream mechanism” refers to the way that you make the air flow to make the sound.  Egressive means that when you make the sound, the air flows outward; and pulmonic means that the flow of air is initiated in the lungs.

Ejective consonants are produced by what is known as a glottalic airstream mechanism.  That means that the airflow is powered by closing the vocal folds (vocal cords in non-technical English).  In the case of a glottalic egressive consonant, you put your tongue wherever it goes to make the sound in question, you close your vocal folds, and then you lift your glottis upwards.  This increases the air pressure in the oral cavity, and when you open your mouth to release the sound, that elevated air pressure gives the consonant the characteristic ejective “pop.”

So… why the sore throat?  From clamping my vocal folds shut all day while I (try to) speak Kaqchikel.  Mind you, I already (a) smoke way too much, and (b) spend a lot of my waking hours speaking French, so my voice is already so low that making myself heard by an American without shouting is sometimes difficult.

One week a year I head south to Guatemala, where I do English/Spanish interpretation for Surgicorps, a wonderful group of surgeons, nurses, anesthesiologists, technicians, and therapists who provide free specialty surgical services to people who would not otherwise have access to them.  We buy our own plane tickets and pay for our own hotel rooms.  A donation from you to Surgicorps goes to taking care of our patients, and even a little bit helps—$250 pays all of the surgical expenses for one patient, $25 pays for a pack of instruments, and $10 buys all of the pain-killers that we hand out in a week.  If you enjoy my posts from Guatemala, please consider a donation, large or small–just click here.

English notes

in these parts: in this geographical area.  I’m just going to give you one example, in the hopes that you will take the time to watch the very powerful video embedded in the tweet.

How I used it in the post: I spend one week a year volunteering with a group called Surgicorps in Guatemala, a country the size of Tennessee–with 23-25 different languages.  70% of the population is “indígena,” which in these parts means Mayan Indian.  “In these parts” refers back to “Guatemala.”

Jokes that can’t be translated

Sucking the joy out of language for 30 years

Things written in [square brackets] are in the International Phonetic Alphabet.

French: A farmer in Picardy takes his pig to the vet.  The vet says to him: c’est tatoué?  The farmer says: ben sûr c’est à mwé!

English: What’s black and white and [rɛd] all over?  A newspaper.

American Spanish: How is a cat like a priest? Ambos [kasan].  

The French joke relies on a regional dialect where oi is at least sometimes pronounced wé rather than wa.  The vet asks the farmer is it tattooed? in standard French, but the farmer understands it in the regional dialect as is it yours?, and answers of course it’s mine!

The English joke relies on the homophony between the color red and the past tense of the verb to read.  This riddle puzzled the shit out of me when I was a small child, which in retrospect I should have realized meant that I was never going to be a very good linguist.

The Spanish joke relies on the American Spanish non-distinction between the pronunciation of z and s.  (“American Spanish” means Spanish as spoken in the Americas, i.e. South, Central, and North America.)  A cat caza (hunts), while a priest casa (marries).  They’re written differently, and in Spain (and maybe some upper-class American dialects, but I can’t swear to it) are pronounced differently, but they’re pronounced the same in the Americas.

Sucking the joy out of language since 1989,

Beauregard Zipf

English notes

vet: This word can mean two things in American English:

  • veterinarian, as in the joke.  Examples:
    • took my dog to the vet just to find out he’s sick af (af = “as fuck,” an adverb meaning “a lot”)
    • My dog hates going to the vet.
    • Ask a cat vet online now
  • veteran, or former member of the military.  Examples:
    • She and other vets said there’s frustration that the President is quick to claim credit for successes and happy to bask in the reflection of the military’s luster but doesn’t follow through on tough issues. 
    • Vets groups decry hatred, racism in wake of Charlottesville violence (Source: headline here.  Charlottesville is a city in Virginia where the president of the United States of America defended a white supremacist rally at which an anti-racism protester was killed.)
    • The veteran’s voice is crucial to changing the hate rhetoric directed at Muslims. “When I served in the United States Marine Corps, I took an oath to the Constitution of the United States. There is a First Amendment, which respects religious tolerance and freedom of speech,” stated John Amidon, Vietnam vet and member of Veterans For Peace.

Billet-doux: love letter

This is a love letter.  It’s not to my grandmother, although it could be.  My favorite memories of her: sitting together on her front porch in the morning, sharing a cup of coffee and a cigarette, talking about nothing–or just not talking at all.

Prévert in Paris, 1946. Photographer: unknown. Cat: unknown.

This is a love letter.  It’s not to Jacques Prévert, although it could be.  I’m usually up at daybreak, and sometimes as the sun peeks over the horizon I’ll go outside to have a smoke and read his Encore une fois sur le fleuve.  I’ve read some of his poems so often that they form a sort of soundtrack in my head as I walk the streets.  In his photographs, he looks like the uncle you always wanted–a face that you can tell is just barely hiding a smile, a cigarette in his hand–or just hanging from his lips.

This is a love letter.  It’s not to my grandmother, although it could be.  When she died, I found her long white evening gloves and her cigarette holder.

This is a love letter.  It’s not to my grandfather, but it could be.  One of my mother’s friends told me this about him: his apartment was nothing but books and cigarette smoke.  

This is a love letter to cigarettes.  Yeah, I know: they’re gonna kill me.  Hell–if I didn’t smoke, I might live two years longer!  Two years against some connection, any connection, with the French grandfather who had my mother when he was as old as I am now (very), and died before I was born.  Two years against Jacques Prévert in my head when I walk the streets in Paris, or anywhere in the world, really.  Two years against that memory of my grandmother, the warm Florida mornings, the ashtray that my father made for her in summer camp.  Seems like I come out ahead on this one.

The picture at the top of this page is not my grandmother, but the American actress Carole Landis, photographed in 1946 for a Kislav glove ad.  Photographer: unknown.

English notes:

To walk the streets: be careful with this one.  It can mean walking nowhere in particular–not flâner, as it connotes a certain intensity and solitariness that is lacking in flâner.  It can also mean living by prostitution–compare the noun streetwalker, a prostitute qui fait le trottoir.  Yet another meaning: to be free after a time in prison.

  • How I used it in the post: I’ve read some of his poems so often that they form a sort of soundtrack in my head as I walk the streets.
  • With the “out of prison” meaning: Many are outraged that the convicted killer will be walking the streets after spending just two years in prison. (Source: the Farlex Free Dictionary.)
  • With the “prostitution” meaning, in a slightly different construction: 52 and still working the streets.

French notes:

le billet-doux: an old term for a love letter.  I understand that you can use it for comic effect.  But, compared to la lettre d’amour, I like the sound of billet-doux much more.  Doux: it just sounds…right.  (Phil dAnge, can you comment?)

What computational linguists actually do all day: The recursion edition

I know, I know: computational linguistics sounds like the world’s most glamorous profession, right?  You imagine a bunch of geeks in hip glasses sitting around talking about Sanskrit is-aorist verbs, playing a little foosball after a free sushi lunch in the Google cafeteria, and then writing code to translate Jacques Prévert into idiomatic American English with a little stock ticker in the upper-right corner of their screen so that they can watch the value of their vested options go up, and up, and up, and…

In reality, I’m sitting in the international student dormitory of a well-known East Coast American university.  Yesterday was a good day, because the shitwad in room D2 left his dirty dishes in the sink for the full 48 hours that let me feel fine about throwing the reeking things in the trash can.

But, then I realized something: I can only get easy copyright releases for the book I’m writing for papers published in 2016 or later.  That means that I need to do a serious analysis of what I’m citing in the book, which means…writing code (the computer language that makes up a program) to go through a bunch of citations to figure out what year they were published, in which conference or journal, etc., etc., etc.

That means that I write stuff that looks like this:

open (IN, "/Users/transfer/Dropbox/Scripts-new/bioNLP.bib") || die "Couldn't open input file...\n";

…and then spend a lot of time looking at the error message “Couldn’t open input file”, ’cause I was missing the slash at the beginning of this:


…which I was happy to figure out, but didn’t really find all that interesting.

Then I spent a lot of time writing things like the following:

    if ($line =~ /title.*=\{(.*)\},$/) {
        $DEBUG && print "TITLE: $1\n";
        $entry{"TITLE"} = $1;
    }

…which wasn’t particularly difficult, but caused a little pinprick in my soul, ’cause I knew as I was writing it that it would mess up any time that I had a title with a curly-brace in it ({}), and practicing your profession shittily never feels good.  For reasons that we need not go into, having curly-braces in the title of a work happens a hell of a lot more often than you might think, and fixing that little flaw would require writing something called a recursive function, which really shouldn’t be that complicated for a computational linguist (recursion is one of the fundamental properties of language (the picture at the top of this page is a humorous illustration of recursion (which is probably oxymoronic (and as you might have guessed, these embedded parentheticals are themselves an example of recursion (as is the second sentence of this post (an example, that is–not necessarily a humorous one (unlike the cartoon))))))), and yet still, is more than my little brain de pois chiche (garbanzo bean) can handle on a Sunday morning.
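For the curious, here is a minimal sketch of what such a recursive function might look like.  It is not the code from my script: the names parse_braced and extract_title are made up for illustration, and it assumes that the whole title sits on one line, as in the regular expression above.  The idea is that every time the function meets an opening curly-brace, it calls itself to read the nested group, and every time it meets a closing one, it returns.

```perl
use strict;
use warnings;

# Hypothetical helper. $text is the whole string; $pos is the index just
# after an opening "{". Returns the text up to the matching "}" and the
# index just after that "}".
sub parse_braced {
    my ($text, $pos) = @_;
    my $content = "";
    while ($pos < length $text) {
        my $char = substr($text, $pos, 1);
        if ($char eq "{") {
            # Nested group: recurse, and keep its braces in the result.
            my ($inner, $next) = parse_braced($text, $pos + 1);
            $content .= "{" . $inner . "}";
            $pos = $next;
        }
        elsif ($char eq "}") {
            # This brace closes the group that we were called to read.
            return ($content, $pos + 1);
        }
        else {
            $content .= $char;
            $pos++;
        }
    }
    die "Unbalanced braces in: $text\n";
}

# Hypothetical wrapper: pull the title out of one line of a BibTeX entry.
# Returns undef if the line has no title field.
sub extract_title {
    my ($line) = @_;
    # The /g modifier makes pos($line) point just after the opening brace.
    return unless $line =~ /\btitle\s*=\s*\{/gi;
    my ($title) = parse_braced($line, pos($line));
    return $title;
}

# Prints: A {B}ayesian look at {{NLP}} tools
print extract_title('title = {A {B}ayesian look at {{NLP}} tools},'), "\n";
```

The \b in the pattern is there so that a booktitle field is not mistaken for a title; whether that matters depends on what is in your bibliography.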

Then, in order to be able to see any actual output, I had to write code like the following:

        my $output = "";
        for my $field (@fields) {
            #print "$entry{$field}\t";
            $field .= $entry{$field} . "\t";
        }
        $field =~ s/\t$//;
        print "$field\n";

…which was neither particularly challenging nor particularly interesting, but caused my program to crash quite rudely, ’cause for reasons that we need not go into, I should have written

        my $output = "";
        for my $field (@fields) {
            #print "$entry{$field}\t";
            # Append to $output, not to the loop variable $field.
            $output .= $entry{$field} . "\t";
        }
        # Strip the trailing tab before printing.
        $output =~ s/\t$//;
        print "$output\n";

That gave me the first thought I’d had all morning that was actually interesting, as I contemplated how hard I’m pretty sure that it would have been–how impossible I at least hope it would be, for the moment at any rate–for a computer to find and fix that particular bug.

Another half hour or so of work, and now I can actually see what I wanted to know, which is the venues where the works that I cite were published.  This was useful, in that I noticed that one that should be heavily represented in my bibliography in fact barely figures there at all.  But, what it meant was that I needed to Google hither and yon to find out how to search Google Scholar (we’re just getting more and more meta here all the time) by name of conference.  Not particularly challenging; but, not particularly interesting, either.

This is a whiny post, right?  Totally tongue in cheek, though.  Actually, I have the incredible good luck to love what I do, and the book in question really is a labor of…a labor of love.


English notes

Something in this post that is perfectly fine English but that I probably would not have written if I didn’t spend a lot of time writing (poorly) in French these days:

I noticed that a publication venue that should be heavily represented in my bibliography in fact barely figures there at all.

An educated speaker of the langue de Molière will be aware that figurer sur une liste is perfectly natural (as far as I know) French.  What I wrote is perfectly fine English, but I would suspect that it doesn’t occur very often, even in written academic or official English.  Why did it pop out of my mouth (well…fingers) today?  French-language interference, which is funny, ’cause in language teaching we often talk about first-language interference (carrying over aspects of the grammar of your native language, such that they fuck up your mastery of a foreign or second language), but I can’t recall ever running into the concept of second-language interference, and French is most definitely a second language for me, not my first.  Go figure…

go figure is an expression that expresses surprise about something that you’ve just been talking about, or an assertion that you are about to make.  How I used it in the post:

I can’t recall ever running into the concept of second-language interference, and French is most definitely a second language for me, not my first.  Go figure…



Why doing the laundry makes me happy

Doing the laundry will make you happy if you spend sufficient time contemplating the zombie apocalypse.

What will suck about the zombie apocalypse is….well, everything, really. For example: when the zombie apocalypse comes, most people will be completely filthy most of the time. For a while, you’ll at least be able to scavenge clean clothes–you won’t have many opportunities to bathe, but let’s face it: Old Navy will not be the first store to be looted. Eventually the clean clothes will all be gone. Eventually the day will come when you’ll strip a coat off of a reeking zombie whose head you’ve just smashed like a watermelon and be happy that you have something to keep yourself warm.

Today I woke up at 5:30–late for me–and headed down to the basement laundry room. Then I went to work–in clean underwear, clean jeans, and a clean t-shirt from the 2007 Association for Computational Linguistics meeting in Prague. (I learned to say gde je stan’ce metra–where is the subway station–which was undeniably useful. I also learned to ask questions about the National Theater, which amused the taxi drivers but did not accomplish much else.)

When you compare it with how bad life is going to suck during the zombie apocalypse, doing the laundry was actually pretty fun. Going to work in clean clothes was a pleasure, as it is every day, and it always will be if you spend sufficient time contemplating the zombie apocalypse.  There’s a reason I’m the happiest person you know. Hell, I’m the happiest person you don’t know.  Think about it.

English notes

In American English, “like a watermelon” is a common simile for describing actions of crushing, smashing, and the like.  Some examples:





How I used it in the post: The day will come when you’ll strip a coat off of a reeking zombie whose head you’ve just smashed like a watermelon and be happy that you have something to keep yourself warm.

Language geekery: similes versus analogy

Simile and analogy are similar (is that a pun? if so, it’s not a very sophisticated one), but they’re not quite the same.  Analogy starts with focusing on similarity between unlike items, and then typically is followed by pointing out the differences between them.  In contrast, simile does not require any actual similarity between the unlike items, and does not include pointing out the differences.

Thus, the heuristic Detached roles is like a Hearst & Schütze super-category, but not constructed on a statistical metric, rather on underlying semantic components. (Source: Litkowski, Kenneth C. “Desiderata for tagging with WordNet synsets or MCCA categories.” Tagging Text with Lexical Semantics: Why, What, and How? (1997).)

A recursive transition network (RTN) is like a finite-state automaton, but its input symbols may be RTNs or terminal symbols. (Source: Goldberg, Jeff, and László Kálmán. “The first BUG report.” In COLING 1992 Volume 3: The 15th International Conference on Computational Linguistics, vol. 3. 1992.)

Therefore, a conversation is like a construction made of LEGO TM blocks, where you can put a block of a certain type at a few places only.  (Source: Rousseau, Daniel, Guy Lapalme, and Bernard Moulin. “A Model of Speech Act Planner Adapted to Multiagent Universes.” Intentionality and Structure in Discourse Relations (1993).) Note that a native speaker probably would have put this somewhat differently.  Where the authors say where, a native speaker might have said where you can only put a block of a specific type at a few places, or more likely, except that you can put a block of a specific type only in specific places.

Given all of that: is this an analogy, or a simile? The day will come when you’ll strip a coat off of a reeking zombie whose head you’ve just smashed like a watermelon and be happy that you have something to keep yourself warm.  Scroll down past the gratuitous Lisa Leblanc video for the answer.

I sometimes use this blog to try out materials for something that I will be publishing.  This brief description of how to use analogy is intended for a book about writing about data science.  I would love to know what parts of it are not clear.  (My grandmother will tell me how great it is, so no need for you to bother with that.)

Answer: it’s a simile.  Note that we’re not asserting any difference between the way that you’re going to smash the zombie’s head and the way that you would smash a watermelon: a reeking zombie whose head you’ve just smashed like a watermelon.  Note also that we are not then contrasting the way that you’re going to smash the zombie’s head and the way that you would smash a watermelon.  Simile, not analogy.




Yes, please–do volunteer to be a reviewer

Yes, you CAN volunteer to be a peer reviewer!

Get any two researchers together in a bar at the end of a day at any randomly chosen conference.  They will get around to complaining about the difficulty of getting grant funding these days, service responsibilities in their institution, and how grad students don’t want to work as hard as we did back in the day.  But, before that, they will complain about the real pain point of academic work: reviewing.  (See the English notes below for an explanation of the expression “pain point.”)

“Peer review” is the process by which academic writing is considered for publication.  The mechanics of it are this:

  1. An author submits an article to a journal or conference.
  2. An “associate editor” at the journal or an “area chair” at the conference finds reviewers who are willing to read and comment on the paper–your “peers.”
  3. The reviewers read the paper, write up detailed comments on it, and make a suggestion regarding acceptance.
  4. The associate editor or area chair makes a decision about the paper.

That decision in step 4 can take a number of forms, including outright acceptance (rare), rejection (not rare), and giving the author the option of making changes in response to the reviewers’ comments and resubmitting the paper, in which case steps 3 and 4 repeat.  (They can repeat multiple times, too.)

At step 2, the associate editor or area chair needs to find three reviewers in the typical case–rarely fewer, and sometimes more.  (I once submitted a paper to a journal for which I am the deputy editor-in-chief, and the editor who handled it had it reviewed by SIX reviewers–the most I have ever seen.  To avoid the appearance of a conflict of interest, that made sense.)

Three reviewers per submission, and the big conferences in my area (computational linguistics) typically get between 1,000 and 2,500 submissions–that’s 3,000 to 7,500 reviews per conference.  There are several big conferences in my area–assume five per year, and that’s 15,000 to 37,500 reviews that need to get written per year.  And that’s just the conferences–journal publications are appearing faster than ever before in history, which is in itself not a surprise–most things are happening faster than ever before in history–but, the publication rate has been growing exponentially, and if you’ve been reading about Zipf’s Law for a while, you know that that’s fast.   Journal submissions take quite a bit more time to review than conference papers, too–a conference paper in my field is typically limited to 8 pages, but most journals in my field no longer have page limits at all.
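If you want to check my back-of-the-envelope arithmetic, or plug in the numbers for your own field, a few lines of Perl will do it.  This is only a sketch: the function name review_load is made up, and the figures of three reviewers per paper and five conferences per year are the assumptions stated above, not measured values.

```perl
use strict;
use warnings;

# Hypothetical helper: given the number of submissions to one conference,
# the number of reviewers per paper, and the number of conferences per
# year, return (reviews per conference, reviews per year).
sub review_load {
    my ($papers, $reviewers_per_paper, $conferences_per_year) = @_;
    my $per_conference = $papers * $reviewers_per_paper;
    my $per_year       = $per_conference * $conferences_per_year;
    return ($per_conference, $per_year);
}

# Low and high estimates of submissions to a big conference.
for my $papers (1_000, 2_500) {
    my ($per_conference, $per_year) = review_load($papers, 3, 5);
    print "$papers submissions: $per_conference reviews per conference, ",
          "$per_year reviews per year\n";
}
```

With these inputs, the low estimate comes out at 3,000 reviews per conference and 15,000 per year, and the high estimate at 7,500 and 37,500.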

Just for grins, here are the page counts on my 5 most recent journal articles: 15, 8, 14, 24, and 12.  The 8-pager was in a journal with a page limit–of 7 pages!  We paid an extra-page fee.

Who writes those peer reviews?  Well…your peers.  You write your share of those 15,000 to 37,500 reviews, and the authors of those 5,000 to 12,500 papers write reviews of your papers, and… Well, it’s a huge workload.  How huge, exactly?  It’s hard to say what an average would be, but I have reviewed a couple hundred papers over the course of the past couple of years.  Is that typical?  Probably.  And the conference papers come in bursts–conferences are deadline-driven, so all of the 1,000 to 2,500 submissions to an individual conference are being done at once.  A reviewer for a conference in my field is typically assigned 5 papers.  Of course, there is a limited set of time slots when conferences can happen–they mostly take place during breaks in the academic year, so either during the summer, or around the end-of-year holidays.  That means that their submission deadlines tend to cluster together, so you are probably reviewing for multiple conferences in the same time period.  How many?  I’ve written 14 in the past two weeks.  I may actually have spent more time reviewing other people’s papers than working on my current grant proposal–and it’s the grant proposals that bring in my salary.  Could I say no to review requests?  Of course.  But, it would not be fair to do so–while I’m reviewing those papers, someone else is reviewing mine.

….All of this en préambule to the answer to a question that I don’t get asked often enough: can you volunteer to be a reviewer?  The answer: yes.  Here’s a good example of a request that I got recently:

Dear Dr. Zipf:

I am a Ph.D. student at university name removed, majoring in computer science, under the supervision of advisor name removed. My main research fields are bioinformatics, deep learning, machine learning and  artificial intelligence.
I have done some researches in bimolecular function prediction, Nanopore sequencing, fluorescence microscope super resolution, MD simulation, sequence analysis, graph embedding and catastrophic forgetting, which were published in journals, such as PNAS, NAR and Bioinformatics, and conferences, such as ISMB, ECCB and AAAI. Attached please find my complete CV about my background.

I am very interested in serving the community and acting as a reviewer for the manuscripts which are related to my background. I know you are serving as an associate editor for a number of journals, such as BMC Bioinformatics. If you encounter some manuscripts which are highly related to my background, feel free to refer me as a reviewer.

Thank you very much for your consideration! Have a nice day!

My response:

Hi, name removed,

Thank you for writing–it is always nice to see a volunteer for reviewing!  However, I only handle articles on natural language processing, which seems outside of your areas of expertise.  I would recommend that you send your CV, and a similar email, to associate editors who specialize in your areas.  Your advisor could suggest some, and you could also look at the editorial board of relevant journals, especially ones in which you have published.
Thank you again for volunteering, and keep looking for opportunities–I am pretty sure that you will find them!
Best wishes,
Beauregard Zipf

Response to THAT:

Dear Dr. Zipf:

OK! Thank you very much for the clarification and the instruction! Have a nice day!

Notice what you do not see in this exchange: what people are afraid of, which is a response saying something along the lines of “who the hell do you think you are to dare to propose yourself as a reviewer?”  Of the 200 emails that I probably plowed through that day, this offer might have been the only message that actually brought me a little joy–even though I couldn’t use this particular reviewer, I’m certain that someone else will.  Yes: you can volunteer to be a peer reviewer!

French notes

en préambule (à): as a preamble, en guise d’introduction.
la relecture par les pairs: peer review.  Also given as l’évaluation par les pairs and l’inter-évaluation, but I’ve never actually heard that last one.  Native speakers??
Want to read a French-language blog post about peer review in computational linguistics?  Here’s one by my colleagues Karën Fort and Aurélie Névéol.

English notes

pain point: a marketing term referring to the problem that a salesman is going to try to solve for you by selling you his product.  How I used it in the post: Before that, they will complain about the real pain point of academic work: reviewing.