Tea for tu and tu for tea

This past week I attended a French conference.  Almost all of the talks were in French, which means that I spent most of the week hurriedly scribbling words in my notebook to look up later. From a linguistic perspective, the most surprising thing to me was that during the questions-and-answers after a talk, people in the audience addressed the speaker with the informal pronoun tu, and the speaker addressed people in the audience with the informal pronoun tu, as well.  Like many languages, French has formal and informal forms of the word you.  Tu is the informal form of the word, and vous is the formal form of the word.   (Some languages also have plural formal and informal forms of the word you.  However, in French, both of those are vous (the same as the singular formal form)).  You will hear lots of allegedly simple explanations of when to use each one of these, but in practice, it is far more complicated, and I often goof.  Part of the reason that I love my French tutor back home is that the first two words that she taught me were the verbs tutoyer, meaning “to address someone as tu,” and vouvoyer, meaning “to address someone as vous;” these have turned out to be really useful, because often the first thing that someone whom I know professionally says to me is “we can tutoyer.”

I asked some native speakers why vous isn’t used in this (pretty) formal situation, i.e. a presentation at an academic conference.  Answers that I got were that everybody at the conference knows everyone else, and it’s weird to call someone that you know vous, and that besides, it’s not that large of a conference, actually.

Other than that particular linguistic observation, I mostly marveled at my own ineptitude.  Why is it that I can read books about lexical semantics in French, but the only spoken things that I understood one particular day were J’ai trop mangé (“I ate too much,” said by someone a couple of seats over from me after lunch), and on t’entend pas (“we can’t hear you”–said by an attendee to a soft-spoken speaker–note the informal t’ (a shorter form that occurs before a vowel), rather than the formal on vous entend pas)?  If it weren’t for the PowerPoint slides, I would be lost all day…

In light of the embarrassment of riches as related to words that I didn’t know over the course of the past week, I’m just going to focus on verbs today:

  • repérer: lots of meanings related to noticing, spotting, or finding things.  In my field, it shows up in the related nominal form repérage d’entités nommées,  which we express as “named entity recognition” in English.
  • nettoyer: to clean or, figuratively, to clean out.  You might need to nettoyer the data from a web page.
  • constater: to note, notice, or observe; to record or certify.  Nous avons évalué l’ensemble de ces résultats et nous constatons une amélioration sur l’acquisition de paraphrases sous-phrastiques.  “We have evaluated the set of these results, and we note an improvement in the acquisition of sub-phrasal paraphrases.”  Bouamor, Max, and Vilnat, Combinaison d’informations pour l’alignement monolingue. 
  • attraper: various meanings having to do with grabbing hold of or catching something.  This came up in the context of a discussion of a French convention that I’m not sure I understand, by which students can choose to skip an exam and take a make-up.  Apparently this happens often at the university level, if I understood correctly.  I also heard réattrapage in the same context.
  • prendre en compte: to take into account, but also to take on board.  Elle permet aussi de prendre en compte les positions relatives du nom et de l’adjectif (postposition ou antéposition) dans le calcul du sens.  “It also allows taking into account the relative positions of the noun and the adjective (postposition or anteposition) in the calculation of the meaning.” Venant (2007).  Utiliser des classes de sélection distributionnelle pour désambiguïser les adjectifs.
  • engendrer: to cause, produce, or create; to lead to or bring about; to engender.  There are also some meanings related to procreation.  STAG a été utilisé avec succès dans une grammaire anglaise qui permet d’engendrer simultanément les analyses syntaxique et sémantique d’une phrase… “STAG has been used successfully in an English grammar, which permits producing syntactic and semantic analyses of a sentence simultaneously.”  Danlos, STAG: un formalisme pour le discours basé sur les TAG synchrones.  (Don’t quote my translation of the relative clause in this one–I’m not sure that I got it right.)
  • disposer de: to have available, or to have at your disposal; to manage, run, or order.  (There are other meanings if you don’t have the preposition, as well as reflexives.)  Ce dont on a le plus besoin en TAL, c’est de disposer de lexiques à large couverture… “What we need the most in [natural language processing] is to have available large-coverage lexicons…”  Maurel and Tran, Prolexbase: Un lexique syntaxique et sémantique de noms propres.
  • se démarquer: to distinguish yourself, differentiate yourself, distance yourself, stand out from; in sports, to free yourself or to get free.  (Very different non-reflexive senses, as well.)  Un accent peut être stigmatisé, dévalorisé et générateur de ségrégation, ou au contraire revendiqué pour affirmer son identité, sa loyauté, son intégration à une communauté et se démarquer d’un autre groupe… “An accent can be stigmatized, deprecated, and a factor of segregation, and quite the opposite, claimed [remember that we saw this word used in a news story that talked about ISIS claiming credit for a terrorist attack] to affirm one’s identity, one’s loyalty, one’s integration into a community, and to differentiate oneself from some other group…” de Mareüil, Vieru-Dimulescu, and Adda-Decker, Accents étrangers et régionaux en français.
  • remarquer: to note or see; to notice; also to relabel or remark.  A l’aide de cette courbe, nous pouvons remarquer que globalement, au dessus de 2000 mots, l’information mutuelle des mots se “stabilise”… “With the aid of this curve, we can see that globally, above 2,000 words, the mutual information of the words ‘stabilizes’…” Brun, Smaili, and Haton, WSIM: une méthode de détection de thème fondée sur la similarité entre mots.  A few related expressions:
    • faire remarquer: to make your point
    • faire remarquer qqch à qqn: to point something out to someone
    • se faire remarquer: to stand out, to get yourself noticed
  • déclencher: to trigger, to cause, or to set something off.  In the conference, it was used in the sense of “triggering” the execution of a rule.  During the time of the conference, a couple of guys attacked an emergency vehicle in southern France, and the verb déclencher was used to describe the action of the police initiating a wide search for the miscreants.  Here’s an example from Twitter, just because I’m getting tired of typing citations from journal articles.  It means “Dylann Roof has admitted that he wanted to set off a race war:”
  • Screenshot 2015-06-28 19.45.42
  • rapprocher: lots of meanings having to do with bringing something closer to you.  (In the reflexive, it’s approaching something.)  …a été effectuée en s’efforçant de rapprocher les jeux d’étiquettes de ces deux corpus…  “…has been carried out while endeavoring to bring together the tag sets of these two corpora…”  Falaise, Intégration du corpus des actes de TALN à la plateforme ScienQuest.
  • ajouter: to add or (in computing) append. …il est utile d’ajouter à l’annotation…  “…it is useful to add to the annotation…”  Bonfante, Guillaume, Morey, and Perrier, Enrichissement de structures en dépendances par réécriture de graphes.

That’s an awful lot of words!  And, that’s just some of the verbs–imagine how many nouns there were, too…  If you don’t recognize the cultural reference in the title to this blog: it comes from an old musical number called Tea for Two.  You can hear Doris Day sing it while wearing a dress with sparkly hems here.  If memory serves, the lyrics include Tea for two, and two for tea, me for you, and you for me…

Charity and leather goods

In America, there is a huge genre of books about France and the French.  What American hasn’t at least heard of Mireille Guiliano’s book French women don’t get fat, and perhaps even read it?  (643 reviews on Amazon, average of four stars.) Many of them are pretty much just full of stereotypes, but some attempt real analysis.  One of the things that I’ve read in such a book is that the French are, in general, less charitable than Americans.  The explanation given is that in America, the assumption is that people in need will be taken care of by their community and religious organizations and the government will just take up the slack, while the assumption in France is that people will be taken care of by the government, while communities and religious organizations will just take up the slack.

This is belied somewhat by the fact that the streets of Paris are full of beggars, and there is money in their cups.  Yesterday I found some more evidence that the French are perhaps not so much less charitable than Americans as is thought to be the case.  The picture in this post was taken from the side of a bin for collecting things for the poor.  Of course we have such bins in America, too, but I think that what it says on the side of this bin is compelling.  Such bins in America typically have a sign saying something like “please do not put trash in this bin.”  The sign on the bin that I saw yesterday instructs you that the following can be placed in the bin:

  • “Clean and dry clothes and household linens in a closed sack”
  • “Shoes tied together by pair”
  • “Leather goods”

What I thought was so striking about this was the idea that people would donate leather goods–presumably jackets and the like.  I can’t imagine an American donating a leather jacket to charity–they’re far too expensive to give away.  Words that I learnt in the course of the day:

  • la maroquinerie: this can mean leather goods, and also a leather goods shop.
  • sortir en boîte: to go clubbing.  This has nothing to do with charity–I saw it in an advertisement in a newspaper that was sticking out of a trashcan.  Yes, linguists collect data constantly, even from newspapers sticking out of trashcans.  No, I did not take it out of the trashcan.

My apartment reeks of camembert and paint fumes

Camembert is sold in wooden boxes.  Here is one with a picture of a poilu, or French soldier from World War I, on the lid.

France has hundreds of cheeses.  You hear lots of exact numbers, but I suspect that no one really knows how many there are.  Camembert is perhaps the most French of the French cheeses–it is the Frenchman’s stereotype of a French cheese.  (If you’re French: Americans think that the stereotypical French cheese is a brie.  We can’t get camembert worth the name in America–raw-milk cheeses aged less than 60 days are illegal.  Yes, illegal.)

Every French cheese has a story.  The story of camembert is that it was created by one Marie Harel when a priest fleeing to England around 1790 gave her some suggestions based on how they made cheese back in his home in Brie.  (The Church was gone after with a vengeance after the French Revolution.  Over 200 priests were killed in the September Massacres in Paris in 1792.  I went to a beautiful Vivaldi concert nearby.)  According to Kathe Lison’s delightful The Whole Fromage: Adventures in the Delectable World of French Cheese, camembert makers distributed it for free to soldiers in the trenches during World War I, hoping to create loyalty, and it worked.

Part of camembert’s charm for Americans (when we can actually buy it, which is when we come to France) is that it smells like we think a French cheese ought to smell: pretty bad.  The hallways in the apartment that I’m renting were just painted, and the combination of the smell of the camembert sitting on my kitchen counter and the fresh paint is…intoxicating, and not in a good way.  Still, the camembert made for a great dinner tonight with the stereotypical baguette and red wine–shoot me, I’m a tourist.  Here are some words that are helpful for reading about camembert:

  • puisque: since, because, seeing as; just as, just like.
  • le convive: guest.

Devenu le symbole de la France avec la baguette de pain et le verre de vin rouge, il a une taille idéale pour un fromage, puisqu’on peut le manger en une seule fois à quatre ou cinq convives.  “Having become the symbol of France along with the baguette and the glass of red wine, it has the ideal size for a cheese, because one can eat it at one sitting with four or five guests.”

Hawaiian shirts turn out not to be the way to go in the Parisian workplace

In France, it’s important not to look like everyone else.  It’s also important to be in style.  This, obviously, creates a conflict. I was feeling whimsical when I packed, and decided to structure my summer wardrobe around my collection of Hawaiian shirts.  (I can only bring so many pieces of clothing for a six-week stay, so packing well is really an issue.)  Today I happily put on one of those shirts–a bright blue one that matches my eyes.  It didn’t go over well at the lab. My office mate Brigitte: “So, what’s up with your shirt?” Me: “I…um…likes Hawaii.” Brigitte: “That’s not a work shirt, that’s a vacation shirt!  So, you’re here on vacation?” Me: “I…um…works?” Brigitte: “You need to change your stock of shirts.” Brigitte is a scream.  Of course, Zipf’s Law struck in this conversation, as in any other:

  • renouveler: to renew, change, or (in the case of a contract) extend.  This is the verb that Brigitte used.
  • le stock: believe it or not, this is a French word, and it’s spelt stock, which is about as un-French of a spelling as you can imagine.  There are actually some related words:
    • le stockage: storage, store.
    • stocker: to store or hoard; to stock up on; to stock something (with an intent of selling it).

…or even read my own writing

One of the many things that is embarrassing in a foreign language: not being able to read your own writing.  I recently wrote a paper with a couple of my fellow computational linguistics folks, one of whom is French.  I wrote the first draft; she translated it into French and then added more material, made it into a better paper, etc.  It was discouraging when Zipf’s Law struck in a translation of my own writing, and I couldn’t read my own paper!  Happily, the Poisson distribution struck, too, and the great podcast Coffee Break French had a segment on one of the words that I didn’t know: voire.  This word translates as something like or even or and even.  Here’s an example from my paper:

Les chercheurs qui ont organisé la campagne ont également été touchés, voire bouleversés par leur contact avec ce corpus.  “The researchers who organized the project have also been affected, and even devastated, by their contact with this corpus.”  (A corpus is a collection of analyzed linguistic data.)

Or, le Web 2 a permis l’apparition de plate-formes de myriadisation du travail parcellisé (microworking crowd-sourcing), dont Amazon Mechanical Turk, qui proposent à des demandeurs (Requesters) d’accéder à une «foule» de travailleurs (Turkers), qui sont très peu, voire pas du tout, rémunérés.  “But, the Web 2.0 has allowed the appearance of microworking crowd-sourcing platforms, among them Amazon Mechanical Turk, which offers “Requesters” access to a “crowd” of workers (Turkers), who are paid very little, or even not at all.”  (Couillault, A., & Fort, K. (2013, July). Charte Éthique et Big Data: parce que mon corpus le vaut bien!. In Linguistique, Langues et Parole: Statuts, Usages et Mésusages (p. 4).)

Some Twitter examples:

Screenshot 2015-06-04 14.57.52

“I also commit myself to review a little bit every evening in order to pass excellently my bac (high school exit exam) on French, and even the science one.”

Screenshot 2015-06-04 14.58.47

“The only interesting courses are those on history and French–the English is at a kindergarten level, or even nonexistent.”

Screenshot 2015-06-04 15.00.25

“There is a chasm–or even two chasms–between the French YouTubers and the English-speaking YouTubers.”

Thanks to Coffee Break French for clearing this Zipf’s Law example up for me–if you’re interested in learning French at any level, from complete beginner to advanced, check out their podcast and web site.

Don’t be shy: Ask your Parisian taxi driver about Uber

I should start by saying that I have had some great experiences with taxi drivers in Paris.  The West African immigrant who got me from the airport to my apartment for 40 euros when it should have cost 50, the guy who plowed through downtown traffic like a crazy man to get me to the opera on time–I’ve never really felt like I was getting ripped off here.

But, who doesn’t hate taking a taxi in a strange city?  I always feel like I have to do something to demonstrate that I’m not some tourist to be driven in circles around the périphérique for two hours.  In Paris, the obvious way for an American to do that is by speaking French.  But, besides the fact that I don’t speak French well, there’s also the issue that unlike in the United States, where you might know your taxi driver’s children’s names and grade point averages by the time you get where you’re going, it’s culturally weird to have a conversation with someone you don’t know here.  So, how to establish your Parisian bona fides?  My latest hypothesis is that you do this by asking your taxi driver if they have Uber here yet.  It turns out that they do, and if your taxi driver is anything like mine was this morning as I made my way into town from the airport, he’ll have a lot to say about it.  I tried the Uber approach this morning.  It did start up a conversation, and when the traffic became completely impossible–the quarter finals of the French Open are today, and the King of Spain is driving through town for some reason–my taxi driver became a madman and got me where I was going faster than might have happened otherwise.  Words that I learnt in the course of the ride from the airport:

  • boucher: this is a noun, meaning “butcher,” but it’s also a verb, with meanings that have to do with blocking things.  So, it can mean “to cork” a bottle, and “to plug” or “to seal” a hole or a crack.  In the case of traffic, it is “to block.”  As my taxi driver said in frustration as he tried yet again to get off of the freeway: C’est bouché partout, partout, partout!  “It’s blocked everywhere, everywhere, everywhere!”
  • le débouché: an outlet, opening, or exit.  There’s also a verb déboucher that means things like “to unblock” and “to uncork.”  When we broke free of traffic thanks to the driver’s heroic exertions, I happily said débouché!  He responded glumly, pour le moment–“for the moment.”

First attempts might be tentative, but second attempts, less so

The paper that I’m going to give in France is about suicide notes.  The work that it describes is part of a project to try to train computers to predict which adolescents in the Emergency Room for a suicide attempt will make a second attempt.  This is important because second attempts are more likely to be fatal than first attempts.  (There’s a whole theory about why this is, related to the notion that self-injury is a learned behavior.  More on this in another post, perhaps.)

To prepare my talk, I need to know how to say “attempt” in French.  This is tough, because there are a number of false cognates and similar-sounding words that get involved.  The bottom line is that the word for “attempt” is la tentative (singular feminine noun).  It comes from the verb tenter, which means to try or to attempt.  (It has other meanings, too–to tempt, attract, encourage, or entice.)

Is there a way to say “tentative” in French?  Of course–see below for a bunch.  And, there are French words that look/sound like “attempt”–attente, and attentat.  Of course, they don’t mean anything like “attempt”–more false cognates.

  • la tentative: an attempt, a try.
    • la tentative de meurtre: murder attempt, attempt on someone’s life.
    • la tentative de suicide: suicide attempt, attempted suicide.
  • une attente: wait, waiting, waiting time; expectation.
  • un attentat: attack, bombing, assassination attempt; offense, outrage.
  • provisoire, expérimentale: tentative, in the sense of not committed.
  • timide, indécis, hésitant: tentative, in the sense of a thought, idea, or person.

Not science, but not claiming to be: My first reviews in French

I just got the reviews of my first conference paper submission in French.  (I wish I could say that I dared to write it in French, but no: one of my co-authors, a native speaker, translated it from English, with, of course, many additional contributions.)  The reviews illustrate a couple of interesting grammatical points, and of course, thanks to Zipf’s Law, they bring up some new vocabulary items.

  • prétendre: to claim.  L’article, qui n’est pas en soi une contribution scientifique (mais ne prétend pas l’être)… “The article, which is not itself a scientific contribution (but does not claim to be)…”
  • aborder: to tackle, as in a question or problem.  Cet article aborde la question de l’annotation… “This article tackles the question of annotation…”  Cette perspective pose des questions en adoptant la perspective—très rarement abordée–du type de corpus.  “This perspective asks questions by adopting the perspective–very rarely tackled–of the type of corpus.”
  • éprouvant(e): trying, as in having a trying day.  …les difficultés qu’ils peuvent rencontrer à annoter des données sensibles, éprouvants.   “…the difficulties that they can encounter when annotating sensitive, trying data.”  (The paper is about annotating suicide notes.)

The interesting grammatical item: the definite article in L’article, qui n’est pas en soi une contribution scientifique (mais ne prétend pas l’être)… “The article, which is not itself a scientific contribution (but does not claim to be)…” I’m not sure what that epenthetic article is called, but I’ve heard this type of construction before, most notably in an episode of Coffee Break French, Season 4, where it was talked about at some length.  There’s clearly no English equivalent, but it is required in French, as far as I know.

PS: Yes, the paper was accepted!

Mettre en examen: Zipf’s Law, the Poisson distribution, and the wiretapping of Sarkozy

Sarkozy's legal troubles.

If you’ve been following this blog for a while (or read the About page), you know that Zipf’s Law has an effect on vocabularies: every language has a very large number of words that occur only rarely.  The Poisson distribution describes distributions of rare events, and predicts that even rare events will sometimes occur in clusters.  No movie stars die for a year, and then three of them die in a month–that kind of thing.  If you think about the interaction between Zipf’s Law and the Poisson distribution, you have the fact that every day, a second language learner will run across words that they’ve never seen before–a consequence of Zipf’s Law–and you have the likelihood that they will sometimes occur in unexplained clusters–a consequence of the Poisson distribution.

This interaction was illustrated for me today by the expression mettre en examen.  After not having come across it in 16 months of intensive French study, I came across it just a couple of days ago in a book about English serial killers, and then this morning, it showed up on my phone as an alert about a news story about Sarkozy’s legal troubles.  Zipf’s Law + the Poisson distribution: you live into your 50s without ever seeing a word, and then you see it twice in a couple of days, in totally unconnected circumstances.
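For the numerically inclined, here is a minimal Python sketch of that interaction.  The numbers are invented for illustration and are not measured from any corpus: it assumes a hypothetical rare expression that turns up on average once per 16 months of reading, and it treats sightings as independent events following a Poisson distribution, which real text only approximates.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when lam events are expected."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def prob_at_least(k: int, lam: float) -> float:
    """Probability of k or more events when lam events are expected."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))


# Hypothetical rate (an assumption, not a measurement):
# one sighting of the expression per 16 months of reading.
SIGHTINGS_PER_MONTH = 1 / 16

# Expected number of sightings in a 16-month stretch and in a single month.
lam_16_months = SIGHTINGS_PER_MONTH * 16
lam_one_month = SIGHTINGS_PER_MONTH * 1

p_none_in_16_months = poisson_pmf(0, lam_16_months)
p_two_or_more_in_a_month = prob_at_least(2, lam_one_month)

print(f"No sightings in 16 months: {p_none_in_16_months:.3f}")
print(f"Two or more sightings in a given month: {p_two_or_more_in_a_month:.4f}")
```

Under these made-up numbers, going 16 months without meeting the expression is unremarkable (a bit over a one-in-three chance), while two sightings in any one particular month is rare.  Multiply that small chance by all the months you read and all the rare words in the language, though, and the occasional cluster stops being surprising.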

  • mettre en examen: WordReference.com defines it as “to investigate” or “to place under formal investigation.”  In the book that I’m reading, it was translated as “to suspect.”  I guess I probably trust WordReference.com more, but that is such data as I have.

What about that affaire des écoutes that the news alert mentions?  As you might suspect, the noun écoute is related to the verb écouter, “to listen to.”  It turns out that this noun has a number of meanings, one of which is “wiretapping.”  Former French head of state Nicolas Sarkozy’s calls to his lawyer were tapped during an investigation of suspected influence-peddling, and this has become known as the affaire des écoutes.  Here are some other meanings, from WordReference.com:

  • “oreille attentive”: listening.  Il est à l’écoute de ses clients.  “He is attentive to his clients, he is in tune with his clients.”
  • wire-tapping, phone-tapping: Le journaliste est sur écoute.  “The journalist’s phone is tapped.”  Note the preposition sur.
  • audience; (TV) viewing figures; (radio) listening figures.
  • There’s an additional meaning related to the nautical speed of a ship, I think, but I can’t quite figure it out.

Apparently I speak French like a Spanish cow

This is probably the best of the “Spanish cow” pictures. The cow is saying “au lait,” which means something like “with milk” in French, and, crucially, is pronounced the same as “olé!”

I’m not that comfortable in French, but I’m told that I speak it “well, for an American.”  It turns out that this means that I speak French about as well as a Spanish cow.  This is the expression for speaking French poorly: parler français comme une vache espagnole, or “to speak French like a Spanish cow.”  I can’t really think of a clever English-language equivalent.

It turns out that if you do a search on vache espagnole (Spanish cow) on Google Images, you find quite a bit of stuff.  I’ve posted some of the better pictures here.

This cow is asking, “is no one wondering how well French cows speak Spanish?”
This cow is advertising a web site, which she says will teach you “to speak Spanish better…than me.”