The Paris hustling ecosystem: the bad side

There are scammers all over the world, but there are some scams that are especially Parisian.

The good meaning of hustle.

The verb to hustle can have a couple different meanings in English, one of which is good, and one of which is bad.

  • The good meaning of hustle: behaving with what the Merriam-Webster dictionary calls “energetic activity.”  Someone who’s hustling in this sense is working hard; moving around a lot; expending a lot of effort, in a good way.  If you want to get into a good college, you’re going to have to hustle this year.  She really hustled, and she finished the program early.  Commonly said to athletes: Come on, get out there and show some hustle! 
  • The bad meaning of hustle: “to sell something to or obtain something from by energetic and especially underhanded activity…to lure less skillful players into competing against oneself at (a gambling game)” (Merriam-Webster dictionary again).  (“Underhanded” means through trickery or dishonesty.)  This is basically the same meaning as to con someone–to trick them out of money—and a hustle (it can be a noun, too) can also be known as a con, or a con game, or a confidence game (which is where the shorter name comes from).
    A pool hustler is more or less the archetype of the hustler. Pool hustlers are excellent pool players. They trick people into betting with them by pretending to not be very good, and then reveal their true skill after the bets are laid.

    You will find people running hustles (or cons) pretty much everywhere you go in the world, including places where there are no tourists–people try to hustle the locals, too. But, there are some hustles that are especially common in Paris, and some that I haven’t seen anywhere else.  Read on for descriptions of how they work.


The common Parisian hustles

There are some pretty common hustles in Paris, and you will probably see at least one of these if you go to any of the famous tourist sites (and you totally should–I firmly believe that everyone should do as many of the stereotypical Paris tourist things as they can, at least once).  Here are the things that you’re likely to see:

  • The ring hustle
  • The friendship bracelet
  • The shell game (or 3-card Monte)
  • The fake petition
  • The fake deaf/mute

What I find especially interesting about all of this is that there is a system in operation here–an ecosystem, if you will.  We saw in a previous post that there are specific kinds of beggars that do their thing in specific areas–the guys who make speeches on subways, the Roma ladies on the Champs Elysées, etc.  There’s a similar kind of system in effect with regard to hustles–different groups more or less own specific hustles, and specific hustles are associated with specific areas of Paris.

In addition, there are some common types of robbery: picking pockets, and snatch-and-runs. You can find countless web pages on the subject of how to avoid getting your pocket picked in Paris, and I won’t belabor the point. Of course, the vast majority of people will have no trouble with thieves at all (although I do have a friend who had his pocket picked twice during the same visit to our fair city–just rotten luck). The only thing that I would add to the bazillion web pages on not getting your pocket picked in Paris is this: don’t lay your cell phone on the table while you’re talking, or even while you’re reading emails or something–you should have it in your hands at all times, and if you’re standing in the middle of the sidewalk looking at it, you should have it tightly in your hands. Now that cell phones can be worth hundreds of dollars, picking them up off of a table on the patio outside of a cafe, or even snatching them out of someone’s hands, and running off is unfortunately a thing.

The ring hustle

British police officer with confiscated fake rings used in the ring scam.  They use identical rings in France.

The basic principle of this is that you and someone else find a gold ring at the same time, and they try to convince you that you should give them money in exchange for “their share” of the ring.  The ring is a piece of crap.  I once had the same guy try this one on me twice within twenty minutes on the same bridge.  He tried it as I was crossing the bridge in one direction, and then again as I crossed back the other way–I think he might not have been very focussed that day.  How exactly you both happen to discover this thing at the same time can vary, and how exactly the person tries to talk you out of your money can vary, but the basic principle is the same: ring, money.


This is pretty much a Roma thing, as far as I can tell.  In Paris, you should especially watch for this one on the bridges over the Seine–why, I have no clue.

The friendship bracelet

This lady made the mistake of being polite to the guy and not ignoring him and walking off–now she’s been snagged.

The basic principle of this is that you are offered a free friendship bracelet by a friendly guy.  In fact, you don’t even have to accept it–he’ll just grab your hand and start putting it on you, if you don’t avoid him well.  Once it’s on you, it’s no longer free, and he demands a lot of money for it.  Part of what makes this work is that the guy uses the bracelet as a handle to keep you physically under control–in the best (for him)/worst (for you) case, by using your finger to make the thing for you (see below).  This is almost entirely a West African thing, and the hotbed is the steps of the Sacré Coeur basilica.  Why?  I have no idea.

The shell game

Make no mistake: the people who are doing the things that I’m describing on this page are scumbags.  They steal–they just mostly don’t use violence to do it.  In the case of the shell game (and its card-based relative, known as 3-card Monte in English) though, I have to admit that I find it somewhat difficult to feel as much empathy for the victims as I usually do.  This is despite the fact that if you fall for this one, you are probably going to lose much, much more money to this con than you would to anything else on this page.  More on that in a minute.

Hieronymus Bosch’s painting “The Conjurer,” painted between 1475 and 1480. Notice that the guy on the left in white with a black top is stealing the purse of the guy who’s watching closely.

The basic idea here is that the guy running the con has three cups.  He’ll put something under one of them, move the three cups around, and then give free money to anyone who can guess which cup it’s under.  It’s easy–you see the guy just giving money away.  He gets you to put up some of your own money.  You do, and all of a sudden you guess wrong.  I watched a guy doing this a couple weeks ago–he was trying to get people to put up 100 euros.

The reason that I find it harder to empathize with people who get caught by this one than with people who fall for the other cons that I describe on this page is this: people have been pulling this shit for over 2,000 years.  The shell game existed in Ancient Greece.  It was already all over Europe in the Middle Ages.  How can people not have heard of this??  I have no clue.

This is mostly a Roma thing, although I saw what appeared to be a South Asian guy doing it once.  I’ve often seen it in Paris in the immediate surroundings of the Eiffel Tower–mostly on the Iena Bridge, and I don’t remember seeing it anywhere else.  I have to say that this is the rarest of the Paris hustles–it requires a fair amount of set-up, and a number of confederates (when I was watching the other night, there were four adult males involved, one of whom was running the game, one of whom was pretending to be a stranger playing it, and two of whom were hanging around discreetly nearby and watching–if you get pissed and try to take your money back from the guy, good luck duking it out with four adult males at the same time).  It’s also super, super illegal, so although the potential benefits to the crooks are large, the potential costs are, too.

The fake petition

A pretty young girl who looks like pretty much every pretty young girl I’ve ever seen doing this hustle in Paris.

The basic idea: a pretty girl asks you to sign a petition.  For no reason that I understand, it’s typically about better treatment for the deaf, and indeed, she pretends to be deaf.  Once you’ve signed, you’re pressured to donate some money for the cause.  She’s not deaf, nor are the other pretty girls who are with her with their own identical petitions, nor are the other pretty girls who you’ll see in other parts of Paris with their identical petitions on the same day.  In a variant of the usual approach, while you’re signing the petition, someone is picking your pocket.  This is mostly a Roma thing, and it’s common in front of Notre Dame and the surrounding areas, as well as the Hôtel de Ville.

The fake deaf-mute

This one happens on the local trains.  A guy gets on board and walks up and down the train leaving little printed notes on the empty seats, explaining that he is deaf/mute/whatever, and do you have a little spare change?  These guys are actually the least objectionable of all of the folks who I describe on this page–they don’t pester you.  I saw a variant of this in Slovenia last week–the guy went through restaurants, leaving his little cards (trilingual–Slovenian, Italian, and German) on the tables, with a couple little trinkets that you were invited to buy.

The free flower/rosemary/herb of some variety or another

This is a variety of the here’s-something-free-that-suddenly-isn’t-free-anymore scam.  I haven’t actually seen it in France, but I include it for completeness.  In the Spanish version, it’s a little old lady on the steps of a church.  If you don’t give her money, you are threatened with a Roma curse.  (I actually find this somewhat charming–who gets cursed anymore?)  I ran into a wonderful version involving an attractive woman in an extremely short dress in Turkey.  Wonderful mostly not in that there was an attractive woman involved, but in that I was able to participate in the ensuing mess with only as much knowledge of Turkish as you get from the Pimsleur course:

My little adventure with a “free flower” lady in Istanbul.  Click on the picture if you can’t see it clearly–it’ll get bigger.


There are indeed lots of guys wandering through the restaurants in tourist areas trying to sell you roses in Paris, but there’s no deception involved (at least, not that I’ve experienced, and I did double-check this with a local), and they’re typically not pushy (pushiness being an identifying feature of hustling in its bad sense–see above)–it’s not really a hustle (in the bad way), per se.  I would call it the good kind of hustle–see a later post on the subject.

Videos of these folks in action

Here are some videos of these folks in action.  I didn’t shoot these–more on why you shouldn’t try to, either, below.  This is all stuff that I found on YouTube.

First, some pretty good footage of the friendship bracelet thing, shot in Italy.  I haven’t seen the shoulder thing in France, but the principle is similar–the guy does whatever he can to establish a situation such that you are physically in possession of the bracelet.  Other interesting points: notice the repeated use of a question that the guys know you’ve been answering automatically several times a day, and that it feels rude not to respond to: where are you from?  It’s also a question that lets the guy quickly establish some sort of rapport with you.  Another cute thing about this: notice the guy who keeps saying waka waka?  That’s not a Sesame Street thing–it’s Cameroonian English (Cameroon is a country in West Africa with two official languages: French, and English.)  It’s an exhortation–literally, it means something like “walk while working.”  You can hear it in Shakira’s theme song for the 2010 soccer (football, sorry) World Cup.

There’s a lot of dead footage in the beginning of this next video, but right about at the middle there’s some great footage of an attempt to snatch someone’s bags as they’re boarding the subway.  It’s a good view of how proximity to the door of a metro car is used to snatch stuff.  Atypically, these young ladies were unsuccessful, but you get the picture of how it works.

Don’t try to film these guys in action

Don’t try to film any of this shit!  I think it’s great that people can get footage of this kind of shitty behavior and then post it on YouTube for the edification of the rest of us, but photographing or shooting video of a criminal in action is an excellent way to get punched a couple times and to have your expensive cell phone stolen.  Déconseillé, as we say in these parts.

Final words: don’t berate yourself, don’t be scared, don’t let it ruin your vacation, and don’t feel obliged to be polite to these folks

If you get snagged by the evil kind of hustler, it’s really easy to berate yourself afterwards for being a fool, a sucker.  Don’t.  Unless you go for the shell game, you’re not–these people are pros, they make their living this way. This kind of incident can really sour you on wherever you happen to be, too, and really cast a cloud over your trip.  Don’t let that happen!  These people are the tiniest, tiniest, tiniest, tiniest fraction of the people you’ll meet, and they’re pretty unlikely to be Parisians, or even French.  Plus, unless you fall for the shell game thing, these guys don’t actually take that much money off of you, and there are far, far more expensive hustles being worked in China and Turkey right now.  It’s also worth pointing out that there is very little violent crime in this country.  In America, you can get shot to death in a road rage incident pretty much any day of your life–it’s just a fact of life in our gun-cursed country.  In France, you might get robbed, but the chances of your being physically attacked if you’re not visibly Jewish are very, very low (and even if you are visibly Jewish, your chances of being physically attacked are still pretty low).  So, use some common sense, be aware that all you have to do is ignore these people, or in the case of a friendship bracelet guy handing you something, feel free to drop it on the ground and walk off without a word.  The truth is, these people are trying to rip you off, and you do not owe them one single tiny bit of the typical American friendly politeness to strangers.  You should also realize that there are plenty of people out there on the streets of Paris trying to make a living via the good meaning of “hustle”–just getting out there and working long hours in all kinds of weather, perhaps not totally within the law, but not hurting anyone, either.  We’ll talk about those in another post.

  • un tour de passe-passe: one French expression for the shell game–can native speakers help me with others?
  • l’arnaque (n.f.): rip-off, swindle, fraud, con.
  • arnaquer qqn: to rip off, swindle, or con someone.
  • c’est de l’arnaque: that’s highway robbery!
  • se faire arnaquer: to get ripped off, to be had.

Paris’s begging ecosystem

There are entire genres of begging in Paris, some unique to this city.


One evening I was on the RER (a regional train) on the way home from work when a woman of indeterminate age got on.  She was eating a Toblerone.  Excuse me, ladies and gentlemen, she said loudly.  (If it’s in italics, it happened in French.)  Could you give me some change, perhaps a euro?  She pulled out another Toblerone and examined it closely, turning it from side to side.  Sometimes I lure a man into a parking lot, and I bite him.  She put it slowly into her mouth.  Sometimes in Cameroon, I would eat a man.  Another Toblerone, which she chewed on meditatively.

By this point, I was seriously questioning my ability to understand spoken French.  I looked at my French coworker who happened to be sharing the train with me.  Did she just say…  Yep, he answered.  Parisians most definitely do not speak to strangers on trains, but this time a young woman sitting next to him joined in: “She says she eats men.”  (It’s pretty easy to tell that I’m not French, and she spoke English.)  The lady examined another Toblerone before putting it in her mouth.  I’m hungry.  If you have some money, some spare change… 

This was a very strange little speech to hear, and the whole box-of-Toblerone thing added a certain hallucinatory element to the experience.  But, in a Parisian context, it made a certain amount of sense.  Visitors to Paris usually notice pretty quickly that there are a lot of beggars here.  We talked in a previous post about why there are so many beggars here, and there are perfectly good reasons for it.  Although there are a lot of folks who are out there asking for money in this town, they actually fall into a finite number of classes, at least one of which is specific to Paris, and the cannibalistic Toblerone eater was an instance of one of them.  Here in France we love to classify things, so let’s run through the categories.  Beyond the intrinsic interest of the facts that there are categories at all and the nature of the categories themselves, it’s interesting to think about how the various and sundry categories manage to live together in an ecosystem of sorts–different kinds of beggars fill different niches in the city.

Métro: You will occasionally see someone–usually a man–get onto a métro car or a regional train and ask for money.  There’s a set ritual for this.  Basically, the guy makes a speech.  It tends to follow a specific pattern.

  1. Apology: Ladies and gentlemen, I’m sorry to disturb you during your trip.
  2. Statement of problems to be solved: I am homeless/jobless/I have four children and a sick wife and need a hotel room/money for food/diapers.
  3. Request: If you have some spare coins/restaurant tickets/a euro or two…

and then they walk through the car with a paper cup or with their hand out.  These guys don’t necessarily make much in a single car, but they typically do make something–more if they’re old, less if they’re young and look like they could be working for a living like the rest of us.  Then it’s off of that car and on to the next one.  In the light of the existence of this genre of begging, the Toblerone lady makes a certain amount of sense, and you have to give her credit for originality (or for insanity–I’m actually betting on the latter).

Roma woman begging on the Champs Elysées.

Eastern European Roma women on the Champs Elysées: There’s a genre of begging which until recently I’d only ever seen in Eastern European countries.  The way it works is that the beggar kneels on the bare sidewalk with his head on the concrete and his cupped hands held out to receive alms.  It looks really, really painful.  For the past couple years, I’ve seen Roma women doing this on the Champs Elysées.  Only Roma women so far, and only on the Champs Elysées so far.  Why them, and why there?  I have no idea.  Clearly, they’re Eastern European, but there are lots of Eastern Europeans in Paris, and I’ve yet to see any others begging like this.  Occasionally the police will come by and roust them.  They pick up their water bottles (this is, after all, 2016) and move on, then return later.

Disabled: One day this past winter I was on the metro on the way to work.  I was bundled up like everyone else in Paris, as it was cold–hat, leather jacket, neck warmer (I still haven’t been here long enough to wear a scarf), gloves.  Into the car climbed a guy in short-shorts.  His legs were these skinny, twisted things–maybe as big around as my forearm, and oddly bent.  He didn’t say a word to anyone–just struggled down the aisle with his hand out.  For a year or so, there was a guy sitting on the ground outside my metro station all day–no feet.  There’s a kid (I say “kid”–I would guess that he’s in his twenties) who has a spot outside the grocery store.  He sits there, silent, his head hanging, with a paper cup in front of him.  I’m pretty sure that he’s schizophrenic.

With kids: An Eastern European friend taught me that there’s a special place in hell for people who abuse their kids by using them for begging when they should be in school.  As far as I can tell, it’s mostly a Roma thing in Paris.  You park your family on the sidewalk under a blanket, children prominently displayed, and hold your hand out to passersby.  You occasionally also see Roma women with a baby panhandling–be especially careful, as some of them do a trick such that they only appear to be holding a baby, as it’s actually supported by a sling.  That’s the hand that picks your pocket.  (Let me point out that the vast majority of these ladies are just begging–but, the pocket-picking thing does happen, too.)

Parisian beggar with dogs.

With animals to pet:  You’ll see a lot of people with an animal or two on their lap.  Drop some money in their cup and give doggie/kittie/bunny a scratch, if you feel like it.  Most weeks petting beggars’ dogs and cats is my only physical contact with another living being, so a lot of my change goes into these folks’ cups.  One of my favorite guys is usually in the Latin Quarter on weekend nights.  He has these two little spaniel mixes, and it’s clear that he adores them and they adore him.  The last time I saw him, I leaned over to drop a coin in his cup and pet the dogs.  It’s Orthodox Easter tomorrow, you know, he said.  (If it’s in italics, it happened in French.)  Really?, I asked.  Yeah, Easter–Orthodox Easter.  Cabbage, I said.  Have a good night.  (My French continues to suck.)  I still haven’t figured out why we had that particular conversation, other than the possibility that the next day might actually have been Orthodox Easter.  Lately I’ve been noticing shiftless young people with ill-kempt animals trying to do the pet-my-animal thing.  Their animals look like shit–not loved or cared for at all.  You can tell the difference, I think.  Note: be sure that the animal is there to be petted before you try to pet it!  This sounds obvious, and I guess that it would be to any non-stupid person.  However: I bent over to pet a kid’s pit-bull-looking dog one day without checking him out first, and he snapped at me.  I had no clue whatsoever that I was capable of jumping that far that fast–backwards, no less.  Obviously, if this dog had felt like ripping my arm off, he could have–he just gave me a little warning.  Learn from my stupidity.

Finally, there are plenty of run-of-the-mill beggars.  If they’re young, people mostly walk right by them, because there are plenty of frail old run-of-the-mill beggars that probably need your money even more.

Now, I’m not talking here about people who hustle–“hustle” in the good sense, or “hustle” in the bad sense.  With the exception of the people with animals, the people that I’m describing here are straight-up beggars.  Street musicians, mimes, comedians, dancers–that’s a whole nother genre.  Pick-pockets, 3-card monte, the ring scam, the bracelet scam–that’s yet another genre, and they each have their niches in the hustling ecosystem of Paris.

English notes

Short-shorts: very, very short pants.  Line from an advertisement for Nair, a leg-hair remover: Who wears short-shorts?  Nair wears short-shorts.  How it was used in the post: One day this past winter I was on the metro on the way to work.  I was bundled up like everyone else in Paris, as it was cold–hat, leather jacket, neck warmer (I still haven’t been here long enough to wear a scarf), gloves.  Into the car climbed a guy in short-shorts.  

bunny: an informal/children’s word for rabbit.  On my first visit to Belgium, I knew just barely enough French to order a meal in a restaurant.  Seeing a meat on the menu whose name I didn’t recognize, and being an adventurous eater, I ordered it.  It being pre-Internet, I had to ask a coworker the next day what I had had for dinner.  His response (in English): You ‘ave eaten, ‘ow you say… Bugs Bunny.  How it was used in the post: You’ll see a lot of people with an animal or two on their lap.  Drop some money in their cup and give doggie/kittie/bunny a scratch, if you feel like it.  

French notes

Cameroun: Cameroon.  Pronunciation: the e is silent, so [kamrun].

Roma: there are many ways to say “gypsy” in French.  In part, I know this because my favorite neighborhood bum gave me a lecture on the topic one day, with statistics.  I have very little clue as to the current social acceptability of any of them; as far as I know, Roma or Rom is OK (just as it is in the US, where the word gypsy is definitely not OK in all circles), but I’m pretty sure that all of the others have varying levels of pejorativeness.  How it was used in the post: For the past couple years, I’ve seen Roma women doing this on the Champs Elysées.  Only Roma women so far, and only on the Champs Elysées so far.

Who has a sagittal crest?

Before you hit your dog, remember that he can bite your hand hard enough to break it–but, he chooses not to.


Meme: “What if I never find out who’s a good boy?”

In America, we do love our dogs.  A culturally common way for us to show our dogs affection is this: we pet them, while saying Who’s a good boy?  (or Who’s a good girl?, depending on gender).  In my family, we do it a little differently: we pet the dog while saying Who’s got a sagittal crest?  Dogs don’t look at you with any more or less puzzlement regardless of which one you pick, so: feel free to go crazy with this one.


Badger skull. The arrow is pointing at the sagittal crest.

What’s a sagittal crest?  The next time you run into a dog, run your hand along the center of the top of his skull.  That ridge that you feel is his sagittal crest.  Sagittal means along a plane that runs from the front to the back of the body.  A sagittal crest runs along that plane.  This sense of crest means something sticking out of the top of the head–think the plume on top of a knight’s helmet.  Many animals have a sagittal crest, but not us modern humans.  You see them in species that have really strong jaw muscles.  A sagittal crest serves as one of the points of the attachment of the temporalis muscle, which is one of the main muscles used for chewing.  If you have a sagittal crest, you can have a bigger temporalis muscle, which means that you can bite/chew harder.

Gorilla skull.

If you look at relatively close relatives to humans, you see sagittal crests on some of them.  To the left, you see a gorilla.  You wouldn’t want to get bitten by this guy.  (Note that some gorilla species, especially their males, have really enormous sagittal crests–this is actually a pretty modest one, for a gorilla.)





Excellent replica of a Pan troglodytes (common chimpanzee) skull.

Here’s (an excellent replica of) a Pan troglodytes (common chimpanzee) skull.  This guy (I think it was a guy) had more of a sagittal crest than you (you don’t have any), but he didn’t have much, compared to that gorilla.  Other chimps vary.  Monkey species vary pretty widely regarding the presence or absence of a sagittal crest.







Paranthropus aethiopicus, one of the “robust” australopithecines. This specimen is known as “The Black Skull.”

Some early hominids had sagittal crests, but they disappeared pretty early in the course of our evolution.  Here is a picture of the “Black Skull,” about 2.5 million years old.  It’s from Paranthropus aethiopicus (sometimes classified as Australopithecus aethiopicus), one of the “robust” australopithecines.  By the time Homo erectus comes along (starting about 1.9 million years ago and lasting until about 70,000 years ago), the sagittal crest is gone.  Picture below.

So: feel free to express your affection for your dog any way you want–you can’t possibly be any geekier than my son and me.  Scroll down past the picture for French vocabulary.

Homo habilis skull, dated at 1.9 million years ago.


Relevant French vocabulary (see the Comments section for more):

  • la crête sagittale: sagittal crest
  • le muscle masticatoire: chewing muscle (note: the “c” in muscle is pronounced in French)
  • le muscle temporal: temporalis muscle
  • la morsure (action de mordre): bite (noun)
  • la morsure (marque de dents): teeth marks

Data mining, text mining, natural language processing, and computational linguistics: some definitions

Parsing, data mining, and encryption are not going to get you. That pistol in your nightstand might, though.

Every once in a while an innocuous technical term suddenly enters public discourse with a bizarrely negative connotation.  I first noticed the phenomenon some years ago, when I saw a Republican politician accusing Hillary Clinton of “parsing.”  From the disgust with which he said it, he clearly seemed to feel that parsing was morally equivalent to puppy-drowning.  It seemed quite odd to me, since I’d only ever heard the word “parse” used to refer to the computer analysis of sentence structures.  The most recent word to suddenly find itself stigmatized by Republicans (yes, it does somehow always seem to be Republican politicians who are involved in this particular kind of linguistic bullshittery) is “encryption.”  Apparently encryption is now right up there with dirty bombs in terms of things that terrorists are about to use to kill us all.  (“All” might be an exaggeration.  I find it interesting that the United States had 33,169 firearm deaths in 2013–roughly 11 times as many deaths as on 9/11–and yet, Republicans seem to think that it’s important that we make firearms as widely available as possible.  I guess they just don’t like people very much.)  As a moderately technical person, I find this odd, since I’ve always thought of encryption as that nifty mathematical technique (I was about to say “algorithm,” but I think the Republicans are down on that one now, too) that keeps you from intercepting my text messages, me from reading your Ashley Madison profile, and so on.

In between the Republican outrage over parsing and the current panic over encryption, we had the sudden appearance in the public consciousness of data mining.  As far as I knew up to that point, data mining was a bunch of statistical techniques for finding relationships between things.  Suddenly it was showing up in scary news stories–Google the phrase “data mining is evil” (you have to put the quotes around it to search for the phrase, as opposed to the individual words) and you will get 1,400 hits as of the time of writing (May 2016).

Besides being bemused by this intrusion of American know-nothingness into public discourse, I have a personal stake in the issue, because people often refer to what I do for a living as text data mining.  This is a misnomer–by its nature, data mining is not something that you can do with texts.  Bear with me and I’ll explain why, and then we’ll look at some French vocabulary for talking about all of this.

Data mining is a set of mathematical and computational techniques. Somehow it became a threatening expression a couple years ago, leading to crap like this.

Data mining is basically about databases.  In a database, the statistical techniques of data mining can help you do things like discover that Republicans with HBO subscriptions are more likely to consider voting for Romney in a primary than Republicans who don’t have HBO subscriptions.  (Real one, if I remember the facts correctly.)  You can do that because you have a table in the database that tells who’s a Republican, a table that tells who has HBO subscriptions, and a table that tells you which members of a random sample told the interviewer that they would/wouldn’t consider voting for Romney in a primary.  Data mining is the science/art of figuring out what things are related (HBO subscription/willingness to vote for Romney) and what things aren’t related (making one up here: having bought an Escalade and being willing/unwilling to vote for Romney in a primary)–this among probably thousands and thousands of variables.  Doing data mining research requires things like knowing particular kinds of math, understanding how to sample a population, getting computers to do complicated calculations in a way that is time-efficient—stuff like that.
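To make the HBO/Romney idea a bit more concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the numbers are made up, the three tables described above are imagined as already joined into one list of rows, and the function names (contingency_table, consider_rate, chi_square) are mine.  It just counts the four possible combinations and computes a plain Pearson chi-square statistic, which is one of many ways to ask whether two yes/no variables are related.

```python
from collections import Counter

# Hypothetical survey rows: (has_hbo, would_consider_romney).
# These counts are invented for illustration, not real polling data.
respondents = (
    [(True, True)] * 30 + [(True, False)] * 20 +
    [(False, True)] * 20 + [(False, False)] * 30
)

def contingency_table(rows):
    """Count how many respondents fall into each (has_hbo, would_consider) cell."""
    return Counter(rows)

def consider_rate(table, has_hbo):
    """Share of one group (HBO or no HBO) that would consider the candidate."""
    yes = table[(has_hbo, True)]
    no = table[(has_hbo, False)]
    return yes / (yes + no)

def chi_square(table):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    total = sum(table.values())
    stat = 0.0
    for hbo in (True, False):
        for consider in (True, False):
            row_total = table[(hbo, True)] + table[(hbo, False)]
            col_total = table[(True, consider)] + table[(False, consider)]
            # Count we would expect in this cell if the two variables were unrelated.
            expected = row_total * col_total / total
            stat += (table[(hbo, consider)] - expected) ** 2 / expected
    return stat

table = contingency_table(respondents)
print(consider_rate(table, True), consider_rate(table, False))
print(chi_square(table))
```

The bigger the statistic, the less the table looks like what you would get from two unrelated variables.  The hard part of real data mining is doing this kind of thing across thousands of variables without fooling yourself, which this toy does not attempt.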

Text mining is a set of techniques for using computers to cope with the enormous amount of written information in the world today. Picture source:

With data mining, you have that database, and you know what everything is.  With “text mining,” or “text data mining,” as some people call it, you have texts, and you don’t know what anything is.  (By “you,” I mean a computer program.)  This is usually talked about as a difference between “structured” data (i.e., the database)–you know what everything “is”–what it “means”–in some sense, its semantics.  Whoops–that sentence got a little out of control.  “Unstructured” data: that’s typically how we would describe text.  With text, you know what nothing is–you don’t know what anything means–in a very literal sense, you don’t know its semantics.

“Text mining” could be thought of as turning unstructured data into structured data.  You’ve got a bunch of texts, and you want to use them to populate a database, perhaps.  Maybe you have 23 million journal articles in the National Library of Medicine, and you want to find every statement that those 23 million articles make about which genes are affected by which drugs.  Maybe you have a huge collection of French fairy tales, and you want (the computer) to find every time that a stepmother is mentioned and whether the portrayal of the stepmother is positive or negative.  You could think of both of those as turning unstructured data into structured data–you’re taking that unstructured data and using it to build a database about drugs and genes, or a database about stepmothers.  You can see now why we tend to prefer the term “text mining” to “text data mining”–to the extent that “data mining” is about structured data, it doesn’t really make sense to talk about “data mining” with respect to language.  Where the data mining person basically just needs to know math, the text mining person needs to know something about how people write about whatever it is that you’re interested in.  I do a bit of text mining.  People will have really specific requests–tell me whether or not the genes from some experiment show up in the cancer literature, say; tell me if this is a suicide note or not; read this doctor’s note and tell me if this kid is a candidate for epilepsy surgery; stuff like that.  It’s not really linguistics, but it pays the bills, and it suits my need to do something that might actually make the world a better place.
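To make the "unstructured to structured" idea concrete, here is a toy sketch of the stepmother example. It assumes English text, a hand-made list of positive and negative words, and a very crude sentence splitter. All of those are stand-ins I made up for illustration, and a real system would need far more linguistic knowledge than this.

```python
import re

# Hypothetical cue words; a real project would need a much richer resource.
POSITIVE = {"kind", "gentle", "loving"}
NEGATIVE = {"cruel", "wicked", "jealous"}


def extract_mentions(docs, target="stepmother"):
    """Turn free text into database-like rows: one row per sentence that mentions the target."""
    rows = []
    for doc_id, text in docs.items():
        # Crude sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            words = set(re.findall(r"[a-z']+", sentence.lower()))
            if target not in words:
                continue
            score = len(words & POSITIVE) - len(words & NEGATIVE)
            if score > 0:
                polarity = "positive"
            elif score < 0:
                polarity = "negative"
            else:
                polarity = "unknown"
            rows.append({"doc": doc_id, "sentence": sentence, "polarity": polarity})
    return rows


tales = {
    "tale1": "The wicked stepmother sent her away. The girl wept.",
    "tale2": "Her stepmother was kind and gentle.",
}
for row in extract_mentions(tales):
    print(row)
```

Each row that comes out is something you could load straight into a table, which is the whole point: the text went in with no labels on anything, and rows with named columns came out.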

A related field is natural language processing.  Natural language means human language, as opposed to computer languages.  Natural language processing is about building tools to handle specific linguistic tasks–parse a sentence, figure out parts of speech, stuff like that.  You might use a combination of different language processing programs to do a text mining task.  I find this more interesting, since the questions are less about some set of facts than they are about the language itself.  Where the data mining person needs to know math and the text mining person needs to know how people write about genes and drugs, or stepmothers, or whatever, the natural language processing person needs to know something about language itself–what kinds of structures sentences can have, how word frequencies are distributed, how to build linguistic resources for letting a computer process things that can’t be directly observed (e.g. semantics).  I do a lot of this kind of stuff.  Recently I’ve been working on coreference resolution–how to get a computer to recognize that Obama, President Obama, and Barack Obama are all referring to the same thing in the world, while Mrs. Obama and Michelle Obama are referring to something else in the world.  (Recognizing that those “things” in the world are people, as opposed to, say, locations, or the names of companies, is a whole different story.)

Yet another field is computational linguistics.  This is about using computational models to test theories about language.  This is my favorite, but it’s the hardest to pay the bills with.  I do some of this, too.  Nowadays a lot of my time goes into large-scale attempts to model the semantics of biomedical language.  I’m trying to investigate differences in the semantic primitives of biomedical language versus “general” English by building a large set of data-driven semantic representations of predicates found in journal articles; I’ll then compare that resource to a similar resource built for general English and look for things like whether or not the semantic primitives seem to come from the same set, whether or not given verbs have different representations in the two types of language, etc.  My hope is to get a sense of the range of types of semantic variability from this particular project.  You could imagine using computational linguistics work to build natural language processing tools, and then using those to carry out practical text mining tasks.  You could use the text “data” mining results to do actual data mining.

Mathematical representations of semantics can define how the gender binary gets manifested in English. This diagram transforms gendered word relationships into a map-like space. Pairs like girl/boy and aunt/uncle have the same “spatial” relationship. Picture source:

As you can tell from my examples, I’m very much in the world of biomedical language.  There’s also a lot that you can do in the humanities with this kind of stuff.  A hot topic in the future might be using mathematical representations of semantics to study things that are/are not thought of as binaries–gender, sexuality, race, political economy, whatever.  However, I would not claim to do ANY of that–I can just barely explain it.  For more on that kind of stuff, see this excellent post by Ben Schmidt.
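The "same spatial relationship" idea from the diagram above can be sketched with a few toy vectors. These two-dimensional vectors are hand-made for illustration. Real word representations are learned from large amounts of text and have hundreds of dimensions, so treat this as a cartoon of the idea and not as how anyone actually builds them.

```python
import math

# Hypothetical 2-D "word vectors", invented so that the second coordinate
# happens to carry the gender contrast.
TOY_VECTORS = {
    "boy": (1.0, 0.0),
    "girl": (1.0, 1.0),
    "uncle": (3.0, 0.25),
    "aunt": (3.0, 1.25),
}


def offset(word_a, word_b):
    """Vector pointing from word_b to word_a."""
    return tuple(a - b for a, b in zip(TOY_VECTORS[word_a], TOY_VECTORS[word_b]))


def cosine(u, v):
    """Cosine similarity: 1.0 means the two vectors point in the same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


girl_minus_boy = offset("girl", "boy")
aunt_minus_uncle = offset("aunt", "uncle")
print(cosine(girl_minus_boy, aunt_minus_uncle))
```

If the two offsets point in the same direction, the pairs have the same "spatial" relationship in the sense of the caption. An unrelated offset, like the one from boy to uncle, points somewhere else.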

In practice, even people in the field don’t always differentiate between these terms, or at least don’t draw sharp boundaries between them.  My business card says that I’m the director of a text mining group, but I identify most strongly as a computational linguist.  We figured that “text mining” makes more sense as a practical field of inquiry to have within a medical school (which is where I work), so that’s what we called the group when we formed it.  If you go to the annual conference of the Association for Computational Linguistics, you will see almost no computational linguistics, but rather a ton of natural language processing.  If you go to the annual Biomedical Natural Language Processing meeting, you’ll see a mix of text mining, natural language processing, and a bit of computational linguistics.  Sometimes the distinctions really matter, though.  This post started its life as a response to someone who asked me to be on a panel about data mining, to talk specifically about text data mining.  When I responded that I don’t do data mining, they asked what the difference is–this blog post started out as my response.

As far as I can tell, the relevant community in France doesn’t make these distinctions in any kind of rigid fashion, either, despite the much-vaunted French penchant for categorization (see Nadeau and Barlow’s excellent book for a discussion of where it comes from).  However, French does have technical vocabulary for all of these fields.  Here it is:

  • fouiller: to excavate; to rummage through, to search (see also here)
  • la fouille de données: data mining
  • la fouille de texte(s): text mining
  • le traitement automatique des langues naturelles: natural language processing
  • la linguistique informatique: computational linguistics

Why there are so many beggars in Paris

There are historical reasons for the large number of beggars in Paris.

Le mendiant et son enfant Yves, “The beggar and his son Yves,” dated to 1317. Picture source:

The typical stereotype of Paris is as a beautiful, majestically historical city that just oozes romance, and indeed, Paris is all that.  But, visitors are often surprised to find that it is also a city with a sometimes astounding number of beggars on the street. The reasons behind this are many, and varied, and, I think, interesting.

In the pre-modern period, the vast majority of the French (like the vast majority of everyone else in the world) were farmers.  Most children didn’t live to adulthood, and you needed a lot of hands to work the farm, so people had big families.

In the 1500s, the French death rate took a relatively sudden drop.  People were still having those big families, so there were a relatively large number of people making it to adulthood.  The inheritance laws of the time included primogeniture, i.e. inheritance of everything by the oldest son, so lots of those people wouldn’t have a farm of their own to work.  Options were limited, and if they couldn’t find other employment, a lot of people hit the road.  (There’s an excellent description of the mechanics of this phenomenon in Robert Darnton’s The Great Cat Massacre and other episodes in French cultural history.)

If you hit the road in France, you’re eventually going to end up in Paris, if for no other reason than that it’s the hub of the road system (and today, the rail system).  If you can’t find other employment, your options come down to begging or stealing, and most people aren’t thieves.  So: begging.

Begging actually has a very long and somewhat respectable history in Europe.  As Robert Cole puts it: “In the middle ages, ‘Christian charity’ perceived the poor as God’s special children and therefore deserving of alms.”  Begging can be a profession, really.  (Old Eastern European Jewish joke: beggar hits a guy up for money.  Guy gives him some helpful hints on improving his approach.  Beggar responds: YOU’RE telling ME how to beg?  This would make total sense in a French context: a métier (profession) is a métier, whether you’re a doctor, an engineer, or an elevator operator.)

If you’re gonna be a beggar, though, it helps to have a schtick.  Physical lack of ability to work was a good one, and Parisian beggars were known for faking such a disability, leading to their squatting areas being known as Cours des miracles (“Courts of miracles”) for their recovery at the end of the working day.  (There was one just to the north of what is now the Place des Vosges, I believe.)  By the 1500s, begging wasn’t viewed quite as kindly.  Robert Cole again:

In sixteenth-century Paris the poor were viewed as merely layabouts who preferred to live off public welfare.  Meanwhile bad harvests, plagues, inflation and religious war increased their number dramatically.  Public begging was outlawed in 1536, and in 1551 laws were enacted which limited eligibility for public assistance and forbad women to have their children in tow when selling candles outside churches.  To do so, went the rationale, evoked sympathy from prospective customer, which proved that such women were really only begging.  A traveller’s history of Paris.

So: there have been a lot of beggars in Paris for centuries.  In 2007, the European Union was enlarged to include a couple countries with large Roma populations.  There have always been Roma in France, but now a lot more came (the Roma rights group FNASAT says 12,000 currently, and that’s after 10,000 being expelled in 2009 and another 8,000 in 2011; other estimates range from 20,000 to 400,000), and they are a prominent part of the Parisian begging ecosystem.  (There is, indeed, a Parisian begging ecosystem, and there are actually a number of distinct genres of begging in Paris–a subject in and of itself.)

To be clear: if you don’t give charity, your life is pointless.  Let me point out that this is a teaching of at least Christianity, Judaism, Islam, and Hinduism, and–for my fellow secularists in France–Rousseau, the revolutionary Constituent Assembly, National Convention, and Directory, and modern French philosophers from Sartre to Alain Finkielkraut.  (All of those links are to citations on the subject, not to their biographies.)  The Buddhist view of charity is especially appealing to me, as a (really bad) student of judo:

Buddhism views charity as an act to reduce personal greed which is an unwholesome mental state which hinders spiritual progress.  What Buddhists believe, Venerable K. Sri Dhammananda Maha Thera.

Judo’s view of the best human relationships is mutual welfare–we’re taught that human interactions should be mutually beneficial.  So, if it’s the case that charity benefits both the giver and the receiver, then it’s very judo.  Seriously, give charity–if for no other reason than that you’ll feel better about humanity if you take part in it being more humane.

  • le mendiant: beggar.
  • le gueux/la gueuse: beggar (literary).  A number of other, more pejorative meanings–highwayman for men, whore for women, etc.  Probably obsolete, but keep it in mind for when you read Tartuffe.
  • le clochard: beggar; also bum.  (Slang.)
  • le/la clodo: beggar; also homeless person, tramp, hobo.

Some additions from native speaker Phildange:

  • le vagabond: wandering beggar, hobo.
  • le chemineau: same as above.
  • faire la manche: to beg.

Bilingual dictionaries: how to pick them, how to use them

I was in the Navy with an Armenian woman.  (No, you don’t have to be a citizen to serve in the American military, and that’s probably true in most countries.  In France, you can get citizenship by serving in the military–you are français par le sang versé, “French by spilt blood.”  This isn’t the case in the United States–you can apply for citizenship as a member of our military, but there actually isn’t any guarantee that you’ll get it.) We’ll call her Nairi (not her real name).  Like many members of the Armenian diaspora, Nairi was massively multilingual–she spoke Armenian, Arabic, and Spanish natively, and French and English as very strong second languages.  (I once saw her mother test her to make sure that she wasn’t forgetting any of them.)  One day Nairi came back from leave (what we call vacation in the military) with a seven-language dictionary.  I admired it, and she insisted that I take it.  I refused, she insisted, I refused, she insisted, I refused, she insisted, and finally, I took it.  What I didn’t realize was that in Armenian culture, if someone admires something of yours, you must insist that they take it.  Armenians know that they most certainly should not take it–I didn’t.  Now I do.  Stupid me–every time I see that dictionary on my bookshelf, I feel like a total jerk.

In a recent post, we talked about monolingual dictionaries–that is, dictionaries that list words in some language and give definitions of them in that same language.  Today, let’s talk about bilingual dictionaries–that is, dictionaries that list words in some language and give corresponding words in another language.  Of course, anything that we might say about bilingual dictionaries applies equally to dictionaries with even more languages, like the one that I stupidly took from poor Nairi.

I carefully said “corresponding” words just above–I carefully didn’t say “equivalent” or “the same” words.  This is because it’s often the case that there isn’t a single translation from one word in one language to one word in another language.  Even when there is one, it doesn’t necessarily “mean” the same thing, in some sense of the word “meaning.”  To give you an example from my college French 101 textbook: a fenêtre in French is a window in English–fine so far.  But, say window in English, and the referent is most likely a sash window, specifically–one that slides up and down.  Say fenêtre in French, and the referent is most likely a window that opens in the middle–horizontally.  (We would call this a French window in English.  See this post for a list of things that we call French something-or-other in English that aren’t called anything of the sort in French.)  And, as I said, there often isn’t just one.  A language that I worked on in grad school has the word invert.  But: invert what?  If you’re inverting a hollow object, that’s one verb–if you’re inverting a solid object, it’s another verb.  French has maybe two words for snow–la neige, and la poudreuse (powder snow).  Depending on how you count, English has 13 or 55 or 120 (scroll down past the Inuit words) or 182 words for snow.  So: not a 1-to-1 correspondence.

Having at least mentioned some of the theoretical issues, let’s look at the practical points of buying and using a bilingual dictionary.  In these days of Amazon, you can use reader reviews in a way that we never could before–it’s really a nice advantage over the old pre-Internet days.  However, there are also some specific things to look for.

  • Example sentences: you want a dictionary with example sentences, at least in the language that’s foreign to you.
  • Verb + preposition combinations: a good dictionary should tell you which prepositions, if any, go with which verbs.  You need to know, for instance, that in English you shoot at something, you lean toward (have a preference for) something, and you stop doing something, with no preposition.  Likewise, in French you need to know that you tirer sur or tirer contre (shoot “on” or shoot “against”) something, you pencher pour (lean “for”) something, and you arrêter de (stop “from”) doing something.
  • If you are working with language(s) that have gender, you want the gender to show up both in the Language1 -> Language2 section and in the Language2 -> Language1 section.  If you look up kitchen towel and find that the translation to French is torchon, you don’t want to then have to go to the French -> English section to see whether it’s le torchon (it is) or la torchon (it isn’t).
  • This might seem obvious, but make sure that the pronunciation is given for the words in any language whose pronunciation isn’t obvious from the spelling–and, yes, that includes both English and French.
  • This takes a while, but: when you find the word that you’re looking for in the other language, you might want to look it up in the other direction.  For example: suppose that you look up the English word towel in a crappy bilingual English/French dictionary.  In a crappy dictionary, you might find the following: serviette, torchon.  Both of those can, indeed, be used to translate towel from English to French–but, they’re not equivalent.  Serviette is for a bath or beach towel, while torchon is for a kitchen towel.  You want a dictionary that will distinguish between the various possible translations.  It’s often useful to look the French words up in turn (or the English words, if you’re going from French to English).  If you do that, you’ll find that a serviette can be a towel, but also a napkin, or a briefcase.  A torchon, you’ll find, can also be a messy document, or a rag.  It’s good to be on top of this kind of thing when you’re trying to choose between supposed synonyms.
  • Labelling of registers, or levels of appropriateness: you most definitely want a dictionary that includes slang, obscenities, informal words, etc., or you’re not going to get very far in real life.  However, you also want a dictionary that labels words that are non-standard–offensive words, etc.  This kind of thing can be really, really hard to catch when you’re learning a language from movies, your neighbors, etc.

The always-awesome Lawless French web site has a good page on the subject of how to use a bilingual dictionary, and it has much better examples than I do.  You can find it here.

So, what are some good bilingual English/French dictionaries?  Here are some options.

  • The best thing out there these days is almost certainly WordReference.  It has lots of language pairs, example sentences, colloquial expressions, pronunciations, male and female forms of adjectives, plurals, a verb conjugator, and a reverse look-up feature that does exactly what I suggest you do in the last bulleted item above.  The auto-complete feature in the search box saves me enormous amounts of time (and guessing about spellings).  There’s an excellent WordReference iPhone app.  Be aware, though, that the iPhone app will not generally let you look up obscenities–you have to go to the web site for that.
  • For the Kindle or for the Kindle app on your phone, the Collins English-French and French-English dictionaries are quite good.  They’re quite highly rated.  I have the Collins dictionaries on my phone, and use them whenever I don’t have Internet access and therefore can’t get to WordReference.  The Collins dictionaries also have an advantage over WordReference: they don’t give as many super-subtle translations.  The only bad thing about WordReference is that it can sometimes give an overwhelming number of other-language translations.  That’s great when you want it, but when you don’t, you might prefer the Collins dictionary.  As it happens, there is a Collins dictionary tab on the WordReference site, and it’s easy to click on that.
  • Linguee is fantastic for seeing things in context.  You will generally get lots of example sentences.  There’s an iPhone app for that, too.
  • is another good one for seeing things in context.  It sometimes has better coverage of colloquial, slang, and obscene language than Linguee does.  Again, there’s an iPhone app.

I found Nairi on Facebook recently.  I sent her a friend request–no response.  Is it because she doesn’t remember who the hell I am?  Is it because she hates me for taking her dictionary?  I have no idea.  Nairi, if you’re reading this: I’m sorry!

Refugees are dying and I can’t understand the word for “capsize”

Refugees and migrants are dying in shocking numbers in the Mediterranean. Here is some vocabulary that you’ll need to know to talk about the tragedy in French.

Map of the European migrant crisis as of 2015. Picture source:

One of the ways that the world is sucking right now is the migrant crisis in Europe.  As I write this (in April 2016), there are tens of thousands of refugees and migrants stranded in Greece.  Many of these people cross from Turkey to Greece by boat, and many go from North Africa to Italy by ship.  Tragically high numbers of these sink; in April of last year, five vessels sank, with a death toll of about 1,200 people.

The other day I was listening to the news on the radio.  It was yet another story about the refugee crisis.  The word aufrage kept coming up, but I couldn’t find it in my dictionary.  Un aufrage, I kept hearing.  Looking up similar stories on line solved the mystery: it was not un aufrage, but un naufrage–a capsizing or shipwreck.  I had “segmented” (as linguists say) the n of naufrage as part of a separate word, coming up with un aufrage. 

This isn’t an uncommon phenomenon.  One of the surprises for students in introductory linguistics classes is that in speech, there are no breaks between words–if I showed you a spectrogram (a sort of recording of a sound wave) of a sentence, you would see a continuous sound.  “Segmenting” that stream of speech into smaller units is something that humans do–it’s not something that’s there in the acoustics.
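Here is a small sketch of why segmentation is a real problem and not a trivial one: given a stream with no breaks and a list of known words, there can be more than one way to cut it up. I'm using spelling as a stand-in for sound, and the example (a newt versus an ewt, the latter being an older English form) is my own addition. The function name is hypothetical.

```python
def segmentations(stream, lexicon):
    """Return every way of cutting `stream` into a sequence of words from `lexicon`."""
    if not stream:
        return [[]]  # one way to segment nothing: no words at all
    results = []
    for end in range(1, len(stream) + 1):
        word = stream[:end]
        if word in lexicon:
            # Keep this word, then segment whatever is left over.
            for rest in segmentations(stream[end:], lexicon):
                results.append([word] + rest)
    return results


lexicon = {"a", "an", "ewt", "newt"}
for words in segmentations("anewt", lexicon):
    print(" ".join(words))
```

Both cuts are consistent with the stream, so the listener has to pick one using something other than the signal itself: what words they already know, what makes sense in context, and so on. If you don't know naufrage, un aufrage is a perfectly reasonable guess.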

Occasionally speakers of a language will, over time and as a community, “reanalyze” words in a way that changes the segmentation, and eventually the pronunciation.  The word uncle is a word that has undergone this process.  A variant of the word in English is nuncle.  Oxford describes it as archaic or dialectal, but it’s there.  You can see it in Shakespeare:

Can you make no use of nothing, nuncle?

–King Lear, Act 1, Scene 4

The word is thought to have come from a segmentation of phrases like mine uncle as my nuncle, thine uncle as thy nuncle, etc.

The same thing can happen in other languages, too–any time people speak, there’s an opportunity for segmentation errors.  Children who are learning their mother tongue often try out different segmentations.  For example: in a past post, we looked at some bear-related vocabulary in French and English.  Here are various and sundry relevant phrases:

  • un ours: a male bear.
  • une ourse: a female bear.
  • un ourson: a baby bear; a teddy bear.
  • un nounours: a teddy bear.

I once read a great blog post in which a French guy wrote about his toddler producing three different pronunciations of the word ours (male bear) in one day: ours, nours, and I believe lours (the last one would be a reanalysis of l’ours, “the bear”).  (Sorry I’m guessing about that last one–I can’t find the guy’s post.)

Linguistics geekery, which you should feel free to skip: one of my homeworks in Phonetics 101 was to look at spectrograms and find indications of syllabic association, which can correspond to word segmentation, on occasion.  It’s possible to do so–sometimes.  For nasals in French, as far as I know, it would be restricted to some variability in when a vowel is nasalized before a nasal consonant, versus when it’s produced as a sequence of an unnasalized vowel before a nasal consonant.  American English speakers, who have no contrast in nasalization versus lack of nasalization before a vowel, are unlikely to be able to perceive it, and I don’t know at what age a French kid would be likely to acquire it.

I have no clue how the current situation will or should be resolved.  Obviously, if your town is being destroyed by the Syrian government, or ISIS, or whatever other assholes are causing death and misery in the Middle East these days, it makes sense that you would take your family and go elsewhere, and it’s simple human decency to shelter people in that situation.  However, the situation is not clear in other ways–even the fact that the Wikipedia article on the subject is titled European migrant crisis and not European refugee crisis is a loaded choice, and one that has implications about how the people who are affected should be treated.  The situation continues to evolve, with European and world sympathies tilting now one way and now the other–in favor of sheltering the affected people after a tragedy like the widely-publicized drowning of a Syrian toddler, and in opposition to it after the despicable assaults on women by crowds of migrant men last New Year’s Eve in Germany.  Certainly the situation will have long-range effects on Europe.  I began this post by talking about one of the ways in which the world sucks right now–the existence of this crisis.  One of the ways in which the world doesn’t suck right now is that many people in many countries have been very active in welcoming refugees, providing real support services for them, and generally acting like decent human beings.  This will get worked out.


It’s raining, it’s pouring, the old man is snoring: how to talk about rain in English and French

How to talk about rain in English and French.

It’s raining, it’s pouring, the old man is snoring,

He went to bed and he bumped his head and he didn’t get up ’til the morning.

–Children’s song

Adam Gopnik once described Paris as “a scowling gray universe, relieved by pastry.”  The “gray” part comes from the observation that it’s very often cloudy here.  Actually, one of the things that I love about Paris is that it rains here.  In the US, I live in a very sunny, dry part of the country–300 days of sunshine a year.  However, I grew up in a very, very wet part of the country, and I miss that.  So, coming to Paris in March and seeing flowers bursting from wet earth on my walk to work through the forest is a real treat.

Being from a very wet place, I have a large vocabulary for talking about rain in English.  Here are some examples of relevant verbs.  These are all impersonal verbs, using what linguists call a pleonastic pronoun, i.e. it:

  • to rain: the default verb.
  • to pour: to rain hard–see the children’s song above.
  • to rain cats and dogs: to rain hard.
  • to rain/pour buckets: to rain hard.
  • to mist: to rain very lightly.
  • to drizzle: to rain, especially if it’s cold.  (I’ve seen a couple definitions of this as “to rain lightly.”)
  • to sprinkle: to rain, especially for a short period of time.
  • to storm: to rain very hard, often with thunder and lightning.

Here is some corresponding French vocabulary:

  • pleuvoir: to rain.  Il pleut: it’s raining.  (I always seem to confuse this with il pleure, “he’s crying.”)
  • Il pleut à verse: it’s pouring.  (Native speakers: can we do the liaison here?, i.e. il pleu tà verse?)
  • Il pleut des cordes: it’s raining cats and dogs, it’s pouring rain.
  • Il tombe des cordes: same thing.
  • Il bruine: it’s misting.
  • Il crachine: it’s sprinkling.
  • y avoir de l’orage: to storm.
  • faire de l’orage: to storm.

I’ve focussed entirely on verbs here.  For lots of nouns and adjectives related to rain in English, see this great post from the web site.



Parallel corpora, collocations, and crazy people on the Métro

In which an encounter with a crazy guy on the subway leads to a statistical analysis of French adverbs.

One evening I was riding the metro home when a guy got into the car with some used books to sell.  A man sitting across the aisle from me asked to see them.  He flipped through one of them, then took a pen out of his jacket pocket and began circling words–in this book that the other guy was trying to sell.  Are you going to buy that?, the would-be bookseller asked the guy with the pen.  They exchanged words–the bookseller was not happy about having his books marked up.  The bookseller said something that Mr. Pen apparently thought was obvious or stupid.  Il est fort, lui, he snorted–he’s a sharp one. 

The central meaning of fort/forte is “strong,” but it can also be used adverbially.  You hear it a lot that way, and I’ve been trying to figure out exactly when you can use it in that way–it’s often the case that there are word combinations that are possible in a language, but that don’t sound right.  Rather, there are particular words that are conventionally used in very specific combinations.  Violeta Seretan of the University of Geneva gives some examples of English words that are used to describe the magnitude of various nouns.  The semantics of each of these is the same, but the words that are typically used are quite different.  We talk about big problems, heavy rain…  How about injury?  (Answer below.)  It would certainly be possible to say large problem, but it’s nowhere near as likely, and it sounds odd to me as a native speaker.  I wanted to be able to demonstrate that this corresponds to some actual statistical tendency, not just my intuitions, so I searched the enTenTen corpus, a collection of almost 20 billion words of written English, looking for big problem and large problem.  Here are the frequencies:

  • big problem: occurs 6 times per million words.
  • large problem: occurs 0.5 times per million words.

Big problem occurs twelve times more often than large problem–the latter is possible, but it’s not really what you would expect to hear from a native speaker.  We call things like big problem “collocations”–combinations of words that occur statistically more often than you would expect by chance.
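"More often than you would expect by chance" can be made concrete. If two words were scattered through a corpus independently of each other, the number of times you'd expect to see them side by side is roughly the product of their frequencies divided by the size of the corpus. Pointwise mutual information (PMI) is one common way of comparing the observed count to that expectation. The counts below are invented for illustration (they are not the enTenTen numbers), and the function names are my own.

```python
import math


def expected_by_chance(freq_x, freq_y, corpus_size):
    """Co-occurrences you'd expect if the two words were independent of each other."""
    return freq_x * freq_y / corpus_size


def pmi(freq_xy, freq_x, freq_y, corpus_size):
    """Pointwise mutual information: log2 of observed over expected co-occurrences."""
    return math.log2(freq_xy / expected_by_chance(freq_x, freq_y, corpus_size))


# Hypothetical counts in a hypothetical one-million-word corpus.
corpus_size = 1_000_000
freq_big = 2_000
freq_problem = 500
freq_big_problem = 16

print(expected_by_chance(freq_big, freq_problem, corpus_size))
print(pmi(freq_big_problem, freq_big, freq_problem, corpus_size))
```

A PMI well above zero means the pair shows up together more than chance would predict, which is the statistical signature of a collocation. A PMI near zero means the two words are just bumping into each other at random.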

You can find collocation dictionaries for English, and they’re quite useful for second-language learners.  I don’t know of any for French, though, or at least not where to find them in the US, which is where I am at the moment.  (I’ve seen similar things in Canada.)  I additionally want to know how these adverbial uses of fort should be translated into English, so I need a way to figure this kind of thing out for myself.

First step: find a whole lot of French text in some easily searchable form.  I started with the French section of EUROPARL–a collection of documents from the European Parliament, translated to/from a wide variety of languages.  The French section of EUROPARL contains about 59 million words–so, a whole lot–and you can access it through the Sketch Engine web site–so, easily searchable.  A quick search showed me that fort is quite common in that data set:

Fort shows up 17,130 times in the French section of the EUROPARL corpus–257 times per million words.  That’s pretty frequent.

Once I know that, I know that there will be enough data to calculate the collocations–recall that this is a statistical thing, so you need plenty of data.  The Sketch Engine interface gives me a number of options for how to do the calculations (scroll down to get past the screen shot):

Screenshot 2016-04-10 13.26.44

…which I show you just so that you’ll see that there are a lot of approaches to doing this. I just went with the defaults.

The calculations yielded quite a few possibilities.  Here are some of them:

Screenshot 2016-04-10 13.30.59

If you’re a stickler for data, you might have noticed that the collocations are ordered by the log of the Dice coefficient, which you could think of as a measure of the statistical effect, I guess.  I am really looking for the most common collocations involving fort, though, so I’ll reorder by the cooccurrence count, i.e. the raw count of how often the collocations occurred:

Screenshot 2016-04-10 13.53.36

Crap–that basically tells me nothing.  Why not?  Zipf’s Law.  Remember that Zipf’s Law tells us not only that most words are pretty rare, but also that some words are really, really common, and in French, that certainly includes de (“of”), et (“and”), une (“a”), and the rest of what we’re seeing here.  (Moral of the story: don’t expect the most frequent things in a language to necessarily be the most revealing things in a language.)  If I scroll down a bit, though, I see bien on the list.  683 examples of this–a frequency of 10.25 per million words.  Bien is often an adjective, which would presumably make fort adverbial in these cases, so we’re on to something now.  Let’s check out some of those examples:

Screenshot 2016-04-10 13.58.14.png
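A side note on the two orderings used above, for anyone who wants to see why they disagree.  The score that Sketch Engine documents as logDice is 14 + log2(2·f_xy / (f_x + f_y)), where f_xy is how often the two words occur together and f_x and f_y are how often each occurs overall.  The sketch below is only an illustration: the counts for fort (17,130) and for fort with bien (683) come from this post, but the total count for bien and both counts for de are made-up stand-ins, labeled as such in the code.

```python
import math


def log_dice(f_xy, f_x, f_y):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


F_FORT = 17_130  # from the post: occurrences of "fort" in the corpus

# (collocate, co-occurrences with "fort", total occurrences of collocate)
candidates = [
    ("bien", 683, 100_000),    # 683 is from the post; 100_000 is HYPOTHETICAL
    ("de", 5_000, 3_000_000),  # both numbers are HYPOTHETICAL
]

# Ordering 1: raw co-occurrence count (favors very common words).
by_count = sorted(candidates, key=lambda c: c[1], reverse=True)

# Ordering 2: logDice (penalizes words that are common everywhere).
by_logdice = sorted(
    candidates, key=lambda c: log_dice(c[1], F_FORT, c[2]), reverse=True
)

print("by raw count:", [c[0] for c in by_count])
print("by logDice:  ", [c[0] for c in by_logdice])
```

With these stand-in numbers, a very common word like de wins on raw count simply because it is everywhere, while the score pushes it back down.  That is the Zipf’s Law problem in miniature.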

So, now I have some cases where it would make sense to use fort, but I want to know how they would correspond to English, too.  This requires that I have access to the corresponding English text.  No problem–recall that the EUROPARL corpus is multilingual.  In particular, it is what is known as a parallel corpus, which means that it contains the same contents in multiple languages, not just similar contents (although that kind of corpus can be useful, too).  I searched for the phrase fort bien.  Here’s an example of the output:

Screenshot 2016-04-10 14.12.24

So, now I have some French/English equivalents for fort bien:

  • Étant donné les prévisions de la politique structurelle – que je connais fort bien → With these forecasts of the structural policy – which I know very well
  • ce que Jean-Pierre Chevènement a fort bien nommé récemment… → referred to recently, and very aptly, by Jean-Pierre Chevènement
  • C’est pourquoi, comme l’a déjà fort bien expliqué M. Kalas → Hence, as Mr Karas has stated to his credit
  • je comprends fort bien la préoccupation… → I have a great deal of sympathy for the unease
  • Vous savez fort bien que… → You know very well that
  • non seulement parce que le président le connaît fort bien… → …not only because the President is very familiar with it…
  • Il est fort bien d’organiser des réunions, mais ce sont les résultats qui comptent. → Meetings are all very well, but it is the result that counts.
  • ils se tirent fort bien d’affaire. → …they are managing really rather well.
  • et je les comprends fort bien. → …which I fully understand.
  • Ils les connaissent fort bien et un par un. → They recognise each and every one of them very well.

I’m feeling good about how to use fort bien now, but I want to know about other ways that fort could be used with an adjective.  So, I’ll do another search of the parallel corpus (i.e. the matched French and English texts), but this time I’ll just search for fort, and I’ll specify that I want it to be an adverb.  Here are some of the results:

Screenshot 2016-04-10 13.39.56

Now I have some general examples of how to use fort:

  • Nous estimons fort positif que → We see it as a very positive sign that
  • Le rapporteur constate également fort justement que → The rapporteur has also quite rightly stated that
  • Ce que nous faisons maintenant est probablement fort important… → What is being done may well be very important
  • …l’Union européenne a fort justement octroyé → …the European Union was right to support…
  • nous entretenons des relations bilatérales fort satisfaisantes avec → …We have very satisfactory bilateral relations with
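The two parallel-corpus lookups above can be imitated in a few lines of Python, if you happen to have aligned French/English segments in hand.  This is just a sketch under my own assumptions: the function name is made up, the data are four pairs copied from the lists above, and it matches on the word form only.  It does not check part of speech the way the real search did when I asked for fort as an adverb.

```python
import re


def search_parallel(pairs, phrase):
    """Return the (French, English) pairs whose French side contains
    the phrase as whole words, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return [(fr, en) for fr, en in pairs if pattern.search(fr)]


# Aligned segments copied from the examples above.
aligned = [
    ("Vous savez fort bien que", "You know very well that"),
    ("et je les comprends fort bien.", "which I fully understand."),
    ("Nous estimons fort positif que", "We see it as a very positive sign that"),
    ("Le rapporteur constate également fort justement que",
     "The rapporteur has also quite rightly stated that"),
]

# Phrase search, then the more general single-word search.
for query in ("fort bien", "fort"):
    hits = search_parallel(aligned, query)
    print(f"{query}: {len(hits)} hit(s)")
    for fr, en in hits:
        print(f"  {fr}  =>  {en}")
```

The word boundaries in the pattern are there so that a search for fort doesn’t also pull in forte or fortement.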

I don’t know every adjective with which it would be OK to use fort, but I know one more than I did when I got out of bed this morning, and I’m cool with that–one less time when I’ll have to use très, which is all that they teach us in school.

A colleague had some observations on this:

On top of being used in collocations, it also marks a style / genre which is somewhat formal or elevated (“soutenu”). This might explain why it remains frequent mostly in collocations and is less frequent (or more marked) in freer combinations. This gives the expression a literary turn or a pretense to a higher register.  Both in speech and in writing, it is “soutenu.”

Another native speaker had this to say about it:

“Fort” is used as a synonym of “très”, before adjectives or adverbs. You can use it in about any case, it’s just more elegant than “très”, but not really literary.

The Mr. Pen guy on the subway turned out to be pretty crazy, as far as I could tell.  At one point he snapped at my adorable cousin, who happened to be visiting, and I told him to cut it out.  This was followed by an initially amusing conversation between him and me that at some point degenerated into a loud tirade on his part.  I kept telling him that my French wasn’t that good and I couldn’t understand him, but he just kept going and going.  Eventually French people around us began telling him to stop being an asshole and words to that effect, so I assume that it wasn’t very nice, but honestly, I couldn’t tell you.  At some point a large and very drunk French guy got on the subway car, and started seriously getting in Mr. Pen’s face; it was clear that this was going to turn violent.  Mr. Pen was a very diminutive Haitian man, and I wasn’t going to watch him get the shit beaten out of him no matter how bizarre he was being, so I got involved.  The train stopped, Mr. Pen jumped out, and Mr. Drunk Guy launched into an animated discussion with me about American heavy metal, punctuated by snatches of Metallica songs.  All in all, an unusual evening on the metro, but not an unpleasant one by any means: just part of life in The Big City, as we say in English.

Oh: it’s serious injury.



Ground game: broken arms and politics

In the US, politics and judo have some things in common. Here’s some English vocabulary for talking about them.

Ronda Rousey has one of the best ground games in the world. Here she arm-bars Miesha Tate. Go to Google Images to find pictures of what Tate’s arm looked like afterwards. Picture source:
France is the #2 judo country in the world, after Japan.  The population of France is about 66 million people, and about 550,000 of them do judo.  (For comparison: the population of the US is about 330 million people, and about 20,000 of them do judo.)  The first person I met in France was a diminutive, beautiful woman in her 50s or so who I ran into at a judo practice.  She’s nowhere near my size, but can arm-bar me every 7 minutes or so, on average.  She’s a great example of French judo: she beats me (over and over) not with strength, but with a subtle, contemplative approach to the sport that relies on imagination and on a deep understanding of how to move in three dimensions and apply basic principles of leverage and physics efficiently, and gently.  (Sorta like the famous French diplomacy, I guess.)  In judo, we would say that she has a great ground game: the ability to fight on the mat, off your feet, where we use not the throws of standing judo, but arm-bars, chokes, and pins.

The phrase ground game has been in the news quite a bit lately.  We often hear about what a great ground game Bernie Sanders has, or about how Trump keeps winning state primaries despite not having a good ground game.  In the context of politics, your ground game is how good your campaign is at the very local tasks that require actual personal involvement, particularly getting your supporters to the polls.  A good ground game requires two things.

  1. You have to know who your supporters are.
  2. You have to have engaged, committed volunteers everywhere.

Regarding the first: today, this is mostly a matter of data science.  Sasha Issenberg’s book The Victory Lab does a very good job of telling the story of the development of today’s personalized, data-driven politics.  Once, politicians and political parties put a lot of effort into trying to convince people to get behind their ideas.  Today, it’s generally thought that trying to change people’s minds is expensive and inefficient; on the other hand, getting the people who already support you to actually go to their polling place and vote is relatively inexpensive, and it’s quite effective.  In 2008, the Obama campaign was able to develop pretty good guesses about who was going to vote for their candidate (how they did it is really interesting, but somewhat sobering; see the above-mentioned book), and they focussed their get-out-the-vote effort on those people.

Regarding the second: this is the essence of the ground game.  Cruz’s win in the Iowa caucuses this nominating cycle was widely attributed to his strong ground game.  One of the many, many mysteries of the Republican race for the nomination has been that Trump has done quite well despite not having much of a ground game anywhere.