trec_eval: calculating scores for evaluation of information retrieval

How do we calculate the performance of Google, Bing, and other search engines? Here’s how to run the program that does it.

The official TREC graphic. Picture source:, but I suspect that he took it from somewhere else.

For some years, the US National Institute of Standards and Technology (NIST) has run a “shared task” in which groups can try out and compare the pluses and minuses of various approaches to information retrieval.  Information retrieval is the process of using a computer to find sources of information–that might be web pages if you’re Google, or books if you’re a library–and in a shared task, multiple groups agree on (a) a definition of the task, (b) a data set to use for the task, and (c) a scoring metric.  The shared task model helps to make research by different groups more comparable.  Otherwise, you often have different groups defining the “same” task somewhat differently, and/or evaluating their performance on different data sets, and/or reporting different metrics in their publications.  We deliberately don’t call shared tasks “competitions”–the hope is that the meetings where people discuss the results are occasions for sharing and learning (although, of course, the people with the highest scores brag like hell about it).  The National Institute of Standards and Technology’s shared task on information retrieval, known as TREC (Text REtrieval Conference), was one of the earliest shared tasks in language processing, and it’s probably the most famous of the ones that are still taking place today.

When you participate in TREC, there’s a computer program that is used to calculate scores for your performance on the various tasks.  It’s called trec_eval.  The documentation is sketchy, and bazillions of graduate students around the world have individually figured out how to use it.  In this post, I’m going to pull together some of the details about how to format your data correctly for the trec_eval program, and then how to run it.  I won’t go into all of the options, which I don’t claim to understand–my goal here is just to give you the information that you need in order to get up and running with it.

I’ve pulled this information together from a number of sources, and supplemented it with a bit of experimentation and some very patient emails from Ellen Voorhees, the TREC project manager.  If you find a particularly good web page on trec_eval, please tell us about it in the comments.

In what follows, I’ll discuss data format issues for the gold standard and for the system output.  (One of the dirty little secrets of the otherwise-glamorous world of natural language processing and computational linguistics is that we spend an inordinate amount of time getting data into the proper formats for doing the stuff that’s actually interesting to us.)  Then I’ll show you how to run the trec_eval program.  You’re on your own for interpreting the output, for the moment–there are just way too many metrics to fit into this blog post.

One global point about the qrels file and the system output file: as Ellen Voorhees points out, “For both qrels and results files, fields can be separated by either tabs or spaces. But all fields must be present in the correct order and all on the same line.”
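If you want to sanity-check a file before handing it to trec_eval, that rule is easy to sketch in Python. This is only my own illustration: the function name and the sample IDs are made up, and the field counts (four for qrels, six for results) are the ones described in the sections below.

```python
QRELS_FIELDS = 4    # query, iteration, document ID, relevance
RESULTS_FIELDS = 6  # query, Q0, document ID, rank, score, run tag


def split_fields(line, expected):
    """Split one qrels or results line and check that every field is present."""
    # str.split() with no argument splits on any run of tabs or spaces.
    fields = line.split()
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}: {line!r}")
    return fields


# Hypothetical lines: one tab-separated, one space-separated.
print(split_fields("301\t0\tDOC-001\t1", QRELS_FIELDS))
print(split_fields("301 Q0 DOC-001 1 12.5 myrun", RESULTS_FIELDS))
```

A line with a missing field raises a ValueError instead of silently slipping through.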

Formatting the gold-standard answers

You will typically be provided with the gold-standard answers for trec_eval, assuming that you’re participating in a shared task.  However, you may occasionally want to make your own.  For example, I sometimes use trec_eval to evaluate bioinformatics applications that return ranked lists of things other than documents.  In that case, I model the other things as documents–that is, I replace the document identifier that I describe below with, say, a pathway identifier, or a Gene Ontology concept identifier, or whatever.  In TREC parlance, the gold standard answers are called the qrels.  You only need to remember this if you want to be able to read the original documentation (or know what people are talking about when they talk about evaluating information retrieval with trec_eval).

There are four things that go into the gold standard, separated by tabs (spaces work too).  One of them is a constant–the number 0 in the second column, which is there for reasons that are now opaque.  Here are the four things:

  • query number: there are typically 50 or so queries in a good gold standard.  Each one is identified by a number.
  • 0: this is a constant.  It’s the number zero.  Historically, it’s the iteration number. More specifically, this page at NIST says that it is “the feedback iteration (almost always zero and not used)”.  Ellen Voorhees says that the ‘iteration number’ was intended to record what iteration within a feedback loop the results were retrieved in so that a feedback-aware evaluation methodology (such as frozen ranks) could be implemented.  But it soon became clear that trying to compare feedback algorithms across teams within TREC was a far thornier problem than simply recording an iteration number and the idea was abandoned. But by that time there were all sorts of scripts that were expecting four-column qrels files, so the four columns remain. Some of the later tracks have started to use this field. For example, web track diversity task qrels use this field to record the aspect number. But those tasks are not evaluated using trec_eval, and trec_eval does not use the field.
  • document ID: in the prototypical case, this will be the identification number of an actual document.  If you’re using trec_eval for something other than document retrieval per se, it can be any individual item that can be right or wrong–a sentence identifier, a gene identifier, whatever.
  • relevance: this is either a 1 (for a document that is relevant, i.e. a right answer) or 0 (for a document that is not relevant, i.e. a wrong answer).

Here’s a screen shot of the qrels (gold standard) document

A sample qrels document. Picture source: screen shot of a portion of
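If you are building your own gold standard, the four columns described above are simple to generate. Here is a minimal Python sketch; the function name, the query numbers, and the document IDs are all invented for illustration, and I am assuming tab-separated output.

```python
def qrels_line(query_id, doc_id, relevant):
    """Return one gold-standard line: query, constant 0, document ID, relevance."""
    return f"{query_id}\t0\t{doc_id}\t{1 if relevant else 0}"


# Hypothetical judgments: (query number, item ID, is it a right answer?)
judgments = [
    (1, "DOC-A", True),
    (1, "DOC-B", False),
    (2, "PATHWAY-7", True),  # the "document" can be any item that can be right or wrong
]

qrels_text = "\n".join(qrels_line(q, d, r) for q, d, r in judgments) + "\n"
print(qrels_text, end="")
```

Writing qrels_text to a file gives you something you can pass to trec_eval as the gold standard.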

Formatting your system’s output

There are six things that go into the file with your system’s output.  Two of them are constants, although I think that in theory it’s possible to put something else in these columns in order to support some kind of (possibly now defunct) analysis.  They get separated by tabs.  Here are the six things:

  • query number: the query numbers should match query numbers in the gold standard.  Wondering what happens if you have a query number in your output file that’s not in the gold standard?  Ellen Voorhees says that “Query ids that are in the results file but not in the qrels are ignored.  Query ids that are not in the results file but are in the qrels are also ignored by default. However, that is not how we evaluate in TREC itself because systems are supposed to process the entire test set. (You can game your scores by only responding to topics that you know you will do well on.)  To force trec_eval to compute averages over all topics included in the qrels, use the -c option–this uses ‘0’ as the score for topics missing in the results file for all measures.”
  • Q0: this is a constant.  It’s the letter Q followed by the number zero.
  • document ID: see above–prototypically, this would be a document ID, but you could have IDs for other kinds of things here, too.  Due to the way that TREC gold standards are built, I would be surprised if it’s a problem to have a document ID show up in here that’s not in the gold standard. On this subject, Ellen Voorhees says that “Document ids that are in the results file but not in the qrels file are assumed to be not relevant for most measures. There are some measures such as bpref that ignore unjudged (i.e., missing in qrels) documents. You can force trec_eval to ignore all unjudged documents using the -J option, but people using this option should have a clear idea of why they are doing so”…and you probably shouldn’t. I certainly don’t.
  • rank: This is a row number, and within a topic, it needs to be unique. trec_eval actually ignores this number. How does it break ties in the score, then? Ellen Voorhees explains: “…trec_eval does its own sort of the results by decreasing score (which can have ties). trec_eval does not break ties using the rank field, either: whatever order its internal sort puts those tied scores is the order used to evaluate the results. This is done on purpose, with the idea being that if the system really can’t distinguish among a set of documents, then it is fair to evaluate any ordering of those documents. A user who wants trec_eval to evaluate a specific ordering of the documents must ensure that the *SCORES* reflect the desired ordering.”
  • score: there needs to be some number in this spot.
  • Exp: on various and sundry web pages, you will see this described as the constant “Exp”. Ellen Voorhees again: The final field is the run tag (name). I don’t think trec_eval uses it at all (although it might pass it through to label the results), but TREC uses it heavily.  The final field is not used by trec_eval other than it expects some string to be there.  The reason the format contains a field trec_eval does not use is because TREC itself uses the field in lots of places.  The final field is the “run tag” or name of the run.  TREC uses the value of that field to name the files it keeps about a run, to identify the run in the appendix to the proceedings where the evaluation results are posted, to label graphs when run results are plotted, etc.  The run submission system enforces that the run tag is unique across runs (over all tracks and all participants) for that TREC.
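The difference that the -c option makes (see the query number item above) is easiest to see with a little arithmetic. The Python below is not trec_eval's code, just my sketch of the averaging as Ellen Voorhees describes it; the topic numbers and per-topic scores are invented.

```python
def mean_over_topics(per_topic_scores, qrels_topics, complete=False):
    """Average a per-topic measure, optionally counting missing topics as 0."""
    if complete:
        # Like -c: every topic in the qrels counts, answered or not.
        topics = list(qrels_topics)
    else:
        # Default: topics missing from the results file are ignored.
        topics = [t for t in qrels_topics if t in per_topic_scores]
    if not topics:
        return 0.0
    return sum(per_topic_scores.get(t, 0.0) for t in topics) / len(topics)


# Hypothetical run that only answered two of the four topics in the gold standard.
scores = {"301": 0.5, "302": 0.75}
qrels_topics = ["301", "302", "303", "304"]

print(mean_over_topics(scores, qrels_topics))                 # 0.625
print(mean_over_topics(scores, qrels_topics, complete=True))  # 0.3125
```

Skipping half the topics doubles the apparent average here, which is exactly the kind of score-gaming that -c is there to prevent.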

Here’s an example of a snippet from a system output file:

An extract from an example of a system output file. Picture source: me.
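Here is a minimal Python sketch of producing the six columns described above from a set of scored items. The function name, document IDs, scores, and the run tag "baseline" are all made up. I sort by decreasing score before numbering the rows so that the rank column agrees with the scores, since the scores are what trec_eval actually sorts on.

```python
def run_lines(query_id, scored_docs, run_tag):
    """Format one query's results as: query, Q0, document ID, rank, score, run tag."""
    # Highest score first; the rank is then just the row number within the query.
    ranked = sorted(scored_docs.items(), key=lambda item: item[1], reverse=True)
    return [
        f"{query_id}\tQ0\t{doc_id}\t{rank}\t{score}\t{run_tag}"
        for rank, (doc_id, score) in enumerate(ranked, start=1)
    ]


# Hypothetical scores for a single query.
lines = run_lines(1, {"DOC-A": 2.5, "DOC-B": 7.0, "DOC-C": 0.25}, "baseline")
for line in lines:
    print(line)
```

If two items have the same score, remember that trec_eval may evaluate them in either order, whatever the rank column says.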

Running trec_eval

There are a number of options that you can pass to trec_eval, but the basics of the usage are like this:

trec_eval gold_standard system_output

The official score reports that you get from TREC are produced like this:

trec_eval -q -c gold_standard system_output

That the usage should be like this is obvious from the documentation only if you know that qrels refers to the gold standard, and now you do!  So, if I have a file named pilot.qrels that contains my gold standard and a file named pilot.system.baseline that contains my system outputs, then I would type this:

./directory_containing_treceval_executable/trec_eval pilot.qrels pilot.system.baseline
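If you would rather launch trec_eval from a script, here is one way it might look in Python. It is a sketch under assumptions: the path ./trec_eval is a placeholder for wherever your executable lives, the helper names are mine, and the only options used are the -q and -c that appear in the official usage above.

```python
import subprocess


def trec_eval_command(executable, qrels_path, results_path, per_query=False, complete=False):
    """Build the argument list: options first, then gold standard, then system output."""
    command = [executable]
    if per_query:
        command.append("-q")
    if complete:
        command.append("-c")
    command += [qrels_path, results_path]
    return command


def run_trec_eval(command):
    """Run trec_eval and return what it prints; raises if it exits with an error."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout


# Placeholder path to the executable; the file names are the ones from the example above.
command = trec_eval_command("./trec_eval", "pilot.qrels", "pilot.system.baseline",
                            per_query=True, complete=True)
print(" ".join(command))
```

Calling run_trec_eval(command) then gives you the score report as a string, assuming the executable and both files exist.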

On to the options. This from Ellen Voorhees: By default, trec_eval reports only averages over the set of topics. With the -q option, it prints scores for each topic (the -q is short for ‘query’), and then prints the averages. The score reports TREC participants receive from NIST are produced using the -q (and -c) option.
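Once you have per-topic scores, you will probably want them in a data structure. The Python sketch below assumes that each line of output has three whitespace-separated fields (measure name, topic number or the word "all" for the averages, and value); that matches the output I have seen, but check it against your own version of trec_eval. The sample text is hand-typed for illustration, not real output.

```python
def parse_trec_eval_output(text):
    """Group values by topic: {topic: {measure: value}}; values are kept as strings."""
    per_topic = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that does not look like "measure topic value"
        measure, topic, value = fields
        per_topic.setdefault(topic, {})[measure] = value
    return per_topic


# Hand-typed, made-up sample in the assumed layout.
sample = "map\t301\t0.2500\nmap\t302\t0.5000\nmap\tall\t0.3750\n"
parsed = parse_trec_eval_output(sample)
print(parsed["all"]["map"])
```

I keep the values as strings because not every line of the report is necessarily numeric; convert with float() for the measures you care about.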

As I said above, I won’t attempt to explain the scores that you will get in this one little blog post.  See a good reference on information retrieval evaluation for that. You can find an explanation of the scores, and the motivation for each of them, in the appendix of every issue of the TREC proceedings, which you can find online for free.

What’s making us sound stupid today II

Objects and events. Picture source:, by Johannes Trame, Carsten Keßler, and Werner Kuhn.

Is an event a thing?  In traditional grammar, it is, at least on the level at which we’re taught traditional grammar in the Anglophone education system.  Events are nouns, and specifically common nouns, as far as I know.  So, we see a similarity between many dogs and many breakdowns, and a difference between many storms and a lot of juice.  Dogs and breakdowns are easily pluralizable and take many, while juice is not normally pluralizable (it can be, but with different meanings) and takes a lot of.

So: in English, events are things.  However, today I ran across some evidence that in French, they are not.  Here’s how it went, and how I sounded stupid.

I’d been trying to work out the details of some flights for the past couple days.  My host in France was the go-between between me and the person booking the travel.  Eventually the person booking the travel sent me some flights, and I wanted to write back to say that they were fine–“that works,” as you might say in English:

My email.

One of the things that I really, really appreciate about France is that many French people (as you will have read in innumerable books about France) are willing to point out your errors in French.  This is how we improve, and I love it!  Here’s what I got back:

(Part of) the response.

What’s going on here?  It’s as my interlocutor described it: marcher is something that can refer to a thing, but not to an event.  From a linguist’s perspective, this is fascinating, because it sheds some light on the status of a basic, very fundamental question in the semantics of a language: what are the kinds of distinctions that the language makes?  Or, from a more poetic standpoint: from the point of view of this language, how is the world constructed?  This is a question of ontology, the subject of this post from a couple days ago.  Questions about language can be framed as very concrete questions about statistics, and they can be framed as very abstract questions about philosophy, and both approaches have their uses.  Either way, the answer to the question should come from actual data.

Anyways: that’s how I sounded stupid today.  Or, at least, that’s one way that I sounded stupid!  Oh, and one more thing: the French word for “event” is one of the words affected by the big spelling reform coming up this fall.  It’s going from événement to évènement.  You know what this means: one more word that I’ve been pronouncing incorrectly for the past two years!

Update, March 26th, 2016

I showed this post to my interlocutor.  Here’s his response–an alternative analysis.

Screenshot 2016-02-26 15.00.37

American writers trying to explain themselves in French

Ta-Nehisi Coates. Picture source: By David Shankbone (Shankbone) [CC BY 3.0], via Wikimedia Commons.
Ta-Nehisi Coates is this super-cogent writer whose essays I love to read.  His second book, Between the world and me, won the 2015 National Book Award for Nonfiction, and he was recently awarded a MacArthur Genius Grant.  He took the MacArthur money and moved to Paris, as any reasonable person would.  Here is a wonderful video of him in the midst of trying to learn French.  I can completely relate to his pain.  As he puts it: he sounds like an intelligent guy in English, but in French…different story.  That’s totally the story of my life these days–I think I’m fairly articulate in English, but when I try to explain the simplest things in French, I sound like a bumbling idiot.  Oh, well–practice makes perfect.  I hope.

The video:

  • les réparations: reparations.

Testicles and the evolution of the intellectual

The unexpected connections between a Romani trailer park, Enlightenment intellectuals, and a police inspector.

Joseph d’Hémery, policeman, inspector of the book trade and therefore of authors from 1748-1753. Picture source: by Nicolas-François Regnault (* 1746; † 1810) [Public domain], via Wikimedia Commons.
I’m watching a French movie about a Rom guy who finds God.  In the part of the movie that I’m currently at, the plot involves a feud between an old man and a young guy.  The old man feels disrespected, and wants revenge.  This gets expressed linguistically in part by the way that various participants are referred to in the script.  Specifically, disrespect for a man is communicated by referring to him with some variant or another of the word boy.  In the little world in which I spent my teenaged years, this was a huge insult–far better to be called mother fucker than to be called boy.  The connotation is that you’re weak and insignificant.  In his essay on the development of the concept of the intellectual in Ancien régime France during the mid-1700s, Robert Darnton talks about how the policeman and inspector of the book trade Joseph d’Hémery referred in his files to writers without social distinction as boy, regardless of their age.  Gentlemen, in contrast, were referred to as men.  As Darnton puts it, “Boy” implied marginality and served to place the unplaceable, the shadowy forerunners of the modern intellectual, who showed up in the police files as gens sans état (people without an estate).  I was quite shocked when I found myself living in the southern US later in life and discovered that it’s quite common for older men there to address younger men as boy.  Here are some of the words that are used in this way in the film:

  • le gosse: kid.  (In Quebec: testicle.)
  • le gamin: kid, youngster.

Simultaneously, there’s a lot of talk in the film about testicles.  It’s not cross-linguistically uncommon for testicles to be a metaphor for courage, and this Slate article by Juliet Lapidos maintains that such is the case in French.  (I don’t know anyone in France well enough for them to use that kind of slang around me, so I can’t speak from experience, one way or the other, but I was able to validate this claim online.)  (Another aside: an old friend used to claim that the following typology exists: languages that use the word nuts to refer to testicles, and languages that use the word eggs to refer to testicles.)  Testicles are referred to in the film as follows:

  • les couilles: balls (testicles).  We saw this recently in the expression je m’en bats les couilles (I don’t give a shit).

The ostrich and the platypus

The representation of “cell wall” in the Gene Ontology. Picture source: screen shot from

One evening in December I sat in the living room of a friend half an hour south of Paris.  We sipped wine and talked about the recent kidnappings of hens from her hen house.  She knew what kind of animal was stealing them, but neither of us knew what the word for it was in English.  Words for animals are a great illustration of Zipf’s Law–you know so, so many of them, but the vast majority of those almost never get used.  We discussed this fact, and that discussion quickly led to the word ornithorynque: “platypus.”

Why the hell would a couple of computational linguists half an hour south of Paris need to talk about a platypus, or for that matter, an ostrich?  Me and my friend both work with things called ontologies.  You can think of an ontology as a set of things and a set of relationships between them, where the relationships are generally restricted to either “A is a B” or “X is part of Y.”  For example, the Gene Ontology contains the specifications that a cell wall is an external encapsulating structure, that an external encapsulating structure is a cell part, and that a cell part is part of a cell.  Armed with that information, a computer (or a person) can infer things, such as that a cell wall is part of a cell.  This might seem obvious to you, but it’s not obvious at all to a computer.  A computer can’t really understand language, and to a computer, cell wall and cell migration both look pretty similar—two nouns in a row, the first of which is cell—but, a cell wall is a part of a cell, and cell migration is not.  Ontologies are one way of encoding the kinds of information that we think humans use (and therefore computers presumably need) to understand language—for example, to be able to understand that if I say The children ate the cookies.  They were delicious, then they means the cookies, but if I say The children ate the cookies.  They were hungry, then they means the children.
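To make the cell wall inference concrete, here is a toy Python sketch. It is not how the Gene Ontology or any real reasoner is implemented; it is just the three statements from the paragraph above stored in two dictionaries, with the simplifying assumptions (mine) that each term has at most one parent per relationship and that there are no cycles.

```python
# "A is a B" statements from the example above.
IS_A = {
    "cell wall": "external encapsulating structure",
    "external encapsulating structure": "cell part",
}
# "X is part of Y" statements.
PART_OF = {"cell part": "cell"}


def is_a_chain(term):
    """Return the term followed by everything it is, transitively."""
    chain = [term]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def is_part_of(term, whole):
    """Infer part-of by following is-a links and then part-of links."""
    for kind in is_a_chain(term):
        container = PART_OF.get(kind)
        if container is None:
            continue
        if container == whole or is_part_of(container, whole):
            return True
    return False


print(is_part_of("cell wall", "cell"))       # True
print(is_part_of("cell migration", "cell"))  # False
```

Nobody ever told the program that a cell wall is part of a cell; it got there by chaining the statements together, which is the whole point.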

Necessary and sufficient conditions for being a cow. The claim of the diagram is that in order to be a cow, you must have four legs, hooves, and no feathers. The claim is also that if you have four legs, hooves, and no feathers, that is enough to establish that you are a cow. Do you buy (translation of buy in this context: “accept the claim of”) this cartoon? Picture source:

Ontologies are great ideas, but in practice, it isn’t that easy to get them to work.  Let’s take mammals, since it’s a mammal that was stealing my friend’s chickens.  In an ontology, in order for something to be fully defined, you have to state the necessary and sufficient conditions for something to belong to a category.  That is, the conditions that must be met to belong to the category—the necessary conditions–and the conditions that, if they are met, are sufficient to let you belong to that category.  In French, we call these les conditions nécessaires et suffisantes, or CNS.  Let’s think about the necessary and sufficient conditions to be a mammal.  Nurse your young; three middle ear bones; hair; neocortex; endotherm; give live birth.  Damn–what about the platypus?  The platypus is a mammal, but it lays eggs.  That’s why the platypus—l’ornithorynque (n.m.)—came up in our conversation.  The fact that things like the platypus exist is a problem for ontologies (and ontologists).  Ontologies have to assume these really rigid boundaries for semantic categories, established by conditions nécessaires et suffisantes, and in practice, people don’t seem to think about semantics that way.
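Here is the platypus problem as a toy Python sketch. The list of conditions is the one from the paragraph above; the trait names and the sample animals are my own simplifications, not a real ontology.

```python
# The candidate necessary and sufficient conditions for being a mammal.
MAMMAL_CONDITIONS = {
    "nurses young",
    "three middle ear bones",
    "hair",
    "neocortex",
    "endotherm",
    "gives live birth",
}


def meets_definition(traits, conditions):
    """True only if every condition is met; any extra traits are ignored."""
    return conditions <= set(traits)


# Hypothetical trait lists.
dog = MAMMAL_CONDITIONS | {"barks"}
platypus = (MAMMAL_CONDITIONS - {"gives live birth"}) | {"lays eggs"}

print(meets_definition(dog, MAMMAL_CONDITIONS))       # True
print(meets_definition(platypus, MAMMAL_CONDITIONS))  # False
```

With a rigid definition like this, the platypus fails one condition and is thrown out of the category entirely, even though it meets all the others.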


Prototypical and peripheral birds. Picture source:

How do people think about semantics, then?  There’s decent evidence for what’s called the prototype theory.  The prototype theory posits that we have representations in terms of some prototypical member of the category.  Other things might be closer to the prototype, or other things might be farther from the prototype, but we can accommodate all of them within the category, since it doesn’t require rigid boundaries.  If you have feathers, and you’re bipedal, and you lay eggs, and you fly, then clearly you’re a bird–you’re like the prototype for a bird.  But, even if you don’t fly, you can still be a bird—and that’s how an ostrich gets into the conversation.  Last summer I was giving a talk about semantic representations, and I was reviewing prototype theory.  The ostrich is a classic example to use when you’re talking about prototype theory—unlike a prototypical bird, it doesn’t fly, but it’s still a bird.  I couldn’t remember the word for ostrich, which I constantly confuse with the word for Austria.  Mercifully, my host was sitting in the front row, and he told me: autruche. 
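For contrast with rigid definitions, here is a toy Python sketch of the prototype idea: category membership becomes a matter of degree. The similarity measure (the fraction of the prototype's features that an animal shares) is my own simplification for illustration, not a claim about how prototype theory is formalized in the literature.

```python
# Features of the prototypical bird, as listed above.
BIRD_PROTOTYPE = {"feathers", "bipedal", "lays eggs", "flies"}


def similarity(traits, prototype):
    """Fraction of the prototype's features that the animal shares."""
    return len(set(traits) & prototype) / len(prototype)


# Hypothetical trait lists.
robin = {"feathers", "bipedal", "lays eggs", "flies"}
ostrich = {"feathers", "bipedal", "lays eggs"}
dog = {"hair", "four legs"}

print(similarity(robin, BIRD_PROTOTYPE))    # 1.0
print(similarity(ostrich, BIRD_PROTOTYPE))  # 0.75
print(similarity(dog, BIRD_PROTOTYPE))      # 0.0
```

The ostrich comes out as a less typical bird rather than as a non-bird, which is much closer to how people seem to think about it.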

If you’re interested in reading about this kind of stuff in French, I’m a big fan of the book Initiation à l’étude du sens, “Introduction to the study of meaning,” by Sandrine Zufferey and Jacques Moeschler.  I don’t know of any book in English that’s better.

  • un ornithorynque: platypus
  • une autruche: ostrich.
  • Autriche (n.f.): Austria.
  • les conditions nécessaires et suffisantes: necessary and sufficient conditions
  • le modèle du prototype: prototype theory

Is a preposition a bad thing to end a sentence with?

To be, or not to be- that is the question:
Whether ’tis nobler in the mind to suffer
The slings and arrows of outrageous fortune
Or to take arms against a sea of troubles,
And by opposing end them. To die- to sleep-
No more; and by a sleep to say we end
The heartache, and the thousand natural shocks
That flesh is heir to.
–William Shakespeare, Hamlet

Is a preposition a bad thing to end a sentence with?  No: if you want to sound like a native speaker of English, then you need to end sentences with prepositions.  In his writing guide The sense of style: the thinking person’s guide to writing in the 21st century, the linguist Steven Pinker’s take on the alleged rule against ending sentences with prepositions is that “mockery is appropriate.”

If you teach introductory linguistics, you’ve probably had undergraduates show up in your class convinced that there’s actually some problem with ending sentences with prepositions.  They never seem to have any clue why, beyond the fact that someone told them so at some point.  It’s a belief that puzzles the hell out of linguists, since ending sentences with prepositions is clearly part of the English language–indeed, there are many constructions that require it.

Think for a minute about what the alternative to ending a sentence with a preposition is.  There are two options: one for when you’re asking a question, and the other for a non-question.  If you’re asking a question–not just any question, but one that uses one of what linguists usually call wh-words or Q-words, like what or where–you can move the preposition to the front of the sentence, preceding the wh-word:

Normal English option: Who are you going to give it to?
Formal English option: To whom are you going to give it?

Normal English option: Where are you going to get it from?
Formal English option: From where are you going to get it?

For non-questions, you can make a relative clause, and move the preposition to follow the relativizer:

Normal English option: That’s the store I’m going to buy it from.
Formal English option: That’s the store from which I’m going to buy it.

Normal English option: That’s the guy I’m going to give an ass-kicking to.
Formal English option: That’s the guy to whom I’m going to give an ass-kicking.

Linguists call the option that’s more common in formal English pied piping.  You might remember the Pied Piper of Hamelin.  He was hired to remove all of the rats from a little town in Germany.  When the townspeople didn’t pay him, he led all of their children away.  Similarly, we think of the wh-word and/or the relativizer as “leading away” the preposition from where it would normally go.

I’ve never really understood how anyone could believe that there’s anything “real” about the don’t-end-a-sentence-with-a-preposition thing. In fact, there are plenty of things that you can’t say in English without a preposition at the end of the sentence. Do you want to take a dip in the pool before lunch? Only if you’re going to.  I found a nice one on

What did you bring that book that I don’t like to be read to out of up for?

I tried to figure out a way to say this with pied piping:

For what did you bring up that book out of which I do not like to be read to?

For what did you bring up that book to which? whom? I do not like to be read out of?

I’m a native speaker, and I can’t come up with a way to do it.

At the top of this page, you’ll find a quote from Shakespeare featuring a sentence-final preposition.  My point in including it is to demonstrate that our greatest writers have used the construction.  However, you shouldn’t take a writer’s use of something as prima facie evidence that they approve of it–you need to look at who it’s used by (or by whom it’s used, if you prefer).  For example, when I translate my own speech from French into English, I typically do so using English in ways that I would never use it if I were speaking English, with the express purpose of trying to communicate how bad my French is: “This wants to say what, égout?,” I asked. I…um…likes Hawaii.  It’s OK—I is leaving early today.  Jane Austen puts some constructions only into the mouths of people who she wants to portray as idiots.  Who says the lines in the quote at the top of this page?  Hamlet, the protagonist of what is widely considered to be Shakespeare’s greatest play, the one that you’re likely to have read even if you haven’t read anything else by the man.  Point being: it’s tough to argue that Shakespeare put that preposition at the end of the sentence because he didn’t like it.

So, where did this whole “a preposition is a bad thing to end a sentence with” mishegas come from?  The linguist Steven Pinker attributes it to the seventeenth-century British poet and literary critic John Dryden, who he says originated it in an excoriation of playwright and poet Ben Jonson’s work.  According to David Thatcher’s book Saving our prepositions, it then found its way into Robert Lowth’s 1762 book Short introduction to English grammar, and insinuated its way into English-language pedagogy from there.

Are there similar phenomena in France–alleged rules that don’t actually reflect at all how the language is used by native speakers?  Probably, but I don’t know what they are, and indeed, mixing language from different registers–saying the colloquial je crève d’envie de… (“I’m dying to…”) in a social context in which I should say je meurs d’envie de… (also “I’m dying to,” but more appropriate for a formal situation) or failing to say ça m’est égal (“It doesn’t matter to me”) and instead saying je m’en bats les couilles (also “it doesn’t matter to me”, but more literally something like “I bang my balls about it”)–is exactly the kind of thing that I mess up all the time.

I’ll leave you today with another quote from a non-stupid Shakespearean character:

We are such stuff as dreams are made on.

–William Shakespeare, The Tempest

…and, yes, it’s on, not of.






“They” is the American Dialect Society word of the year: gender neutrality and gender inclusivity in English and French

How do you do gender neutrality in a language in which every noun and adjective is either male or female? Here’s the French approach.

“And whoso fyndeth hym out of swich blame,
They wol come up . . .”

  “Whoever finds himself not guilty of such, they should come up…”

—Chaucer, The Pardoner’s Prologue. Translation by me–I was an English major.

The American Dialect Society’s Word Of The Year for 2015 was the word they used as a singular pronoun.  The usage goes back to the 1300s, probably less than a hundred years after we borrowed the word itself from the Old Norse pronoun þeir.  In the dialect of English that I grew up speaking, it’s used to refer to a single person in the third person when their (wow–there it is–I didn’t plan that!) gender is not known or not relevant.  If someone lost their cell phone, Beverly has it.  If one more person tells me “God needed your mother for an angel,” I’m going to punch them right in the fucking stomach.  If you see a dog with a bone in their mouth, don’t try to take it from them.  There’s a beautiful analysis of Jane Austen’s use of singular they at this web page on the web site, your home for all things Austentatious.  The author points out that there are some grammatical constructions that you can tell Austen disapproved of because she only puts them in the mouths of characters who are idiots.  This isn’t one of them–Austen uses it narratively.

The American Dialect Society singled out they specifically for its conscious use as a gender-neutral or gender-non-binary pronominal referent for even a known person:

Picture source: screen shot from

As far as I know, France and the French language haven’t much gotten into the question of whether or not gender is binary and, if not, how we should do pronominal reference (i.e. using words like he/she/it/they/zhe/ix), but gender inclusivity is definitely an issue.  It’s an especially thorny issue in France because in French, every noun has a gender.  We have a very small number of such nouns in English–king/queen, actor/actress, man/woman, boy/girl, bachelor/bachelorette, etc.  In French, though, every noun has a gender.  Choix (choice)?  Male.  Liberté (liberty)?  Female.  Pied (foot)?  Male.  Main (hand)?  Female.  Many, many words referring to humans are gendered–director (directeur for a male, directrice for a female), actor (comédien for a male, comédienne for a female), dancer (danseur for a male, danseuse for a female), student (étudiant for a male, étudiante for a female)–on and on.  Some words only have one form, and French people struggle with those–for example, there’s a very current controversy over whether female ministers in the government should be referred to as Madame le ministre (with the male definite article le that ministre requires grammatically) or as Madame la ministre (with the female definite article that Madame seems like it ought to go with).  (I think that the ministers themselves prefer Madame la ministre, but the (female head of the) French Academy insists that it is Madame le ministre and that Madame le ministre it will stay.)

How do you go about being gender-neutral in French, then?  Here’s one attempt to do it.  It showed up in my email inbox yesterday.  What you’ll see is that the writer attempts to be not gender neutral, exactly, but rather gender inclusive: all of the nouns and adjectives have been modified so as to refer to both males and females.

Picture source: screen shot of an email advertising a “summer school” in computational and statistical textual analysis.

All of the hyphenated things are attempts to make the words cover both genders, rather than just one.  Most of these work.  A male PhD student is a doctorant, and a female PhD student is a doctorante; the writer has written doctorant-e-s to try to cover the plural of both male and female PhD students.  “Advanced” would be avancés for the male plural and avancées for the female plural; the writer tries to cover both of them with avancé-e-s.  This technique doesn’t always work smoothly.  The male plural of “desirous of” or “wanting to” would be désireux de, and the female plural would be désireuses de; the writer has tried to cover both with désireux-ses, which doesn’t work out as cleanly as doctorant-e-s, but one gets the idea.  It works out even less well for the plural of “researcher,” which would be chercheurs for males and chercheuses for females; the writer went with chercheur-e-s, rather than chercheurs-seuses, as they did (there it goes again–I don’t know the writer’s gender, so my dialect uses they as a singular pronoun) for désireux-ses. 

A very common way that I see people try to be gender-inclusive in writing is by repeating nouns and pronouns that refer to people in the male and female forms.  Here’s an email about a Meetup in Paris about machine learning (a technique for getting computers to learn how to do things):

Picture source: screen shot of an email from a Meetup group in Paris.

It’s saying “for those of us who stayed here in Paris,” but the word those is repeated: once in the female plural form celles, and once in the male form ceux.  It’s the same technique that we use in English if we write he or she.