Sometimes my mouth just stops moving

The hard part is not studying more than one language–the hard part is keeping them separate.

One of the more interesting books that I’ve read over the course of the past couple years was Michael Erard’s Babel No More: The Search for the World’s Most Extraordinary Language Learners.  It is a book about polyglots and polyglossia–people who speak a lot of languages (as opposed to linguists, who are people who study language in general).

Erard is an actual linguist, and knows what he’s talking about.  One of the points that he makes that I found interesting is that there’s no single recipe for learning a “second language”–in his travels amongst the polyglots, he found that people who are into this kind of thing figure out what works for them, and it’s not necessarily the same approach for everyone.

So: I’m going to show you how I prepare for my annual trip to Guatemala, where I volunteer with a wonderful group called Surgicorps.  (We provide free specialty surgeries for people for whom the almost-free national health care system is still too expensive.)  But, don’t feel like it’s a magic recipe (am I mixing metaphors here?) for success–just know that it has been working for me for the past few years, and there’s something that will work for you.  (Which might be this!)

For context: Spanish is a “second language” for me–one that I can function in for my daily life, and professionally.  But: because I spend at least half of my life in the French language and only speak Spanish when I go to Guatemala, it’s very difficult for me to not mix French into my Spanish incessantly.  (As I believe Erard also points out: the difficulty is not learning a bunch of languages–the difficulty is keeping them apart.)  Consequently, on July 1st of every year since I started spending As Much Time As Possible in France, I cut French out of my life completely.  En contrepartie, on July 1st I start doing the same kinds of things in Spanish that I would normally do in French–listening to the news on the way to work, learning my daily vocabulary words, reading The Walking Dead comics, etc.

I also put together a schedule of everything that I need to work on between July 1st and July 30th.  If you’re unfortunate enough to have been reading my blog for the past couple years, you saw me do this for the month before I took my French C1 test.  The main difference is that for the CEFR exams, I need to include “written production” in the things that I work on–for my volunteer work in Guatemala, I don’t need that, because I almost never need to write anything in Spanish.  So, for Guatemala preparation, I have four main categories of things to focus on:

  1. Vocabulary: technical (medicosurgical)
  2. Vocabulary: general
  3. Grammar
  4. Oral  production

Why do I have an entire “section” for general vocabulary?  Because as I’ve written about before, that’s the biggest challenge.  Medical vocabulary is finite–there are only so many body parts, surgical procedures, etc.  It’s the general vocabulary that gets you–remember that Zipf’s Law reflects the fact that languages are full of words that almost never occur, but, they do.  When the guy comes to the hand surgeon with two mangled fingers hanging there uselessly, the first question that the surgeon asks him is going to be what happened, and the answer to that could be anything.

  • A snake bit me
  • I got a cactus spine stuck in my palm
  • The fuel pump caught fire and exploded while I was in the passenger seat
  • Two guys tried to steal my car and they went after me with a machete

…all of which I have run into.
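For the numerically inclined, here is a minimal sketch of why general vocabulary never quite runs out.  It assumes an idealized Zipf distribution in which a word's frequency is exactly proportional to 1/rank, which real languages only approximate, and the 50,000-word vocabulary size is a number that I made up for illustration.  The function names are mine, too.

```python
def harmonic(n: int) -> float:
    """Return the n-th harmonic number: 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / rank for rank in range(1, n + 1))


def zipf_coverage(known_words: int, vocabulary_size: int) -> float:
    """Share of running text covered by the `known_words` most frequent words.

    Assumption: word frequency is exactly proportional to 1/rank
    (an idealized Zipf distribution, not a measurement of any real corpus).
    """
    if not 0 < known_words <= vocabulary_size:
        raise ValueError("known_words must be between 1 and vocabulary_size")
    return harmonic(known_words) / harmonic(vocabulary_size)


if __name__ == "__main__":
    # Hypothetical vocabulary size, chosen only for illustration.
    for known in (100, 1_000, 10_000):
        share = zipf_coverage(known, 50_000)
        print(f"{known:>6} words known -> {share:.0%} of tokens covered")
```

Under those assumptions, knowing the 1,000 most frequent words covers only roughly two thirds of what you hear; the rest is spread thinly over words like machete and fuel pump.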

So, I expand out my vocabulary study into these categories:

  • Vocabulary: technical (medicosurgical)
    • Areas of the hospital
    • Surgical techniques and equipment
    • anesthesia
    • anatomy
      • the hand (because I mostly work with a hand surgeon)
      • gynecology (because I don’t interpret for the gynecologists very often, and therefore like to make sure that I give the terminology a once-over since I don’t have occasion to use it much)
      • the face and head (because we always have multiple plastic surgeons with us)
  • Vocabulary: general
    • the Guatemalan regional dialect (lots of fun loan words, mostly from one or another of the 20+ Mayan languages spoken in the country)
    • professions (see this post for why that gets a day of its own)
    • farm work and other kinds of manual labor (because most of our patient population consists of children or manual laborers–see this post)
    • animals and plants (see above about “anything can happen to your hands”)

I split grammar into three topics:

  1. Conjugation (because when in doubt, I’ll conjugate Spanish verbs as if they were French, and that does NOT work)
  2. Usted forms of verbs (they get a day of their own because it’s the form that I should be using with patients and their family members, but I almost never use it in my daily life)
  3. The subjunctive (much easier in Spanish than in French because it gets used far more often in Spanish, so you don’t have to think about it as much–my French problem is that I use the subjunctive too often)

Now, I know you’re wondering: why do I have oral production on my list, and why don’t I have oral comprehension?  Oral comprehension is the hardest part of learning any language for most people, and oral production is what most anglophones find the easiest part of learning Spanish.  The answer goes back to Michael Erard: the hard part is not learning more than one language–the hard part is keeping them separate.

This comes into play for me in two ways.  One way will be familiar to anyone who has two foreign languages running around in their heads: when you don’t have a word that you need in one language, it’s hard not to substitute it with the word from the other.

The other way that French interference in Spanish works out for me is more subtle, and it’s purely a question of oral production: it’s very difficult for me to say sequences of sounds in Spanish that would not be possible in French.

A problem context that comes up quite often is possessive pronouns followed by vowel-initial nouns.  For example (English followed by formal/informal French and then formal/informal Spanish):

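English | French (formal) | French (informal) | Spanish (formal) | Spanish (informal)
your eye | votre œil | ton œil | su ojo | tu ojo
your artery | votre artère | ton artère | su arteria | tu arteria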

Francophones will note that artère is feminine, but it has the masculine form of the possessive pronoun–ton (and likewise mon).  No huge surprise to students of French–any vowel-initial noun takes the masculine, consonant-final, form of words like possessive pronouns.  Where the problem comes up: when I have to say one of those words before a vowel-initial noun in Spanish, my tongue stops.  It’s like it runs into a wall–my mouth just stops moving.  What the fuck??

From a linguist’s point of view: I’ve developed my own little foreign-language phonology.  In languages other than my native one (American English), that little phonology really does not like sequences of vowels at the end of one word and the beginning of the next.  So, I need to say tu abuelita, your grandma, but my phonology really, really wants it to be tun abuelita, or something of that ilk, which does not exist in Spanish… and my vocal apparatus just comes to a halt.

Solution: oral production drills.  Focussed drills, not just making myself speak–that will happen in Guatemala, where I’ll show up a week before the rest of the team to get those Spanish-language juices flowing.  I’ll put together exercises for myself that focus on the specific things that I know I have trouble getting out of my mouth, et voilà.  For example: ¿Le duele todavía su axila?  (Does your armpit still hurt?)  Ya hablamos con su abuela (we already spoke with your grandmother).  Both of those are short sentences that force me into saying the vowel + vowel sequences–in these cases, su axila (your armpit) and su abuela (your grandmother)–that are so hard for me.
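If you wanted to mine your own practice sentences for these sequences, something like the following rough sketch would do.  It is purely orthographic, which is an assumption on my part: it looks at letters, not sounds, so it ignores silent h (ya hablamos), glides, and the word y.  The function name is invented for the illustration.

```python
import re
import unicodedata

# Spanish vowel letters, including accented forms (letters, not sounds).
VOWELS = set("aeiouáéíóúü")


def vowel_boundaries(sentence: str) -> list[tuple[str, str]]:
    """Return pairs of adjacent words where the first ends in a vowel letter
    and the second begins with one.

    Simplification: this is spelling-based, so silent h and glides are
    not handled.
    """
    text = unicodedata.normalize("NFC", sentence.lower())
    # Keep runs of letters only; punctuation such as ¿ and ? is dropped.
    words = re.findall(r"[^\W\d_]+", text)
    return [
        (left, right)
        for left, right in zip(words, words[1:])
        if left[-1] in VOWELS and right[0] in VOWELS
    ]


if __name__ == "__main__":
    for drill in ("¿Le duele todavía su axila?", "Ya hablamos con su abuela"):
        print(drill, "->", vowel_boundaries(drill))
```

Run over a list of candidate sentences, this picks out the ones worth drilling: the ones that contain a su axila or a su abuela.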


So, you take all of those individual things to work on, mix ’em up to give yourself a little variety in your daily study.  Prioritize things in a way that makes sense for what you plan to be doing with the language–I have a day in there for learning the vocabulary of food and beverages, but that’s more so that I can translate the menu for my fellow volunteers than for the actual volunteer work, so it wouldn’t make sense to be working on that first, and I don’t.  Mix in some review days–review is essential, and you don’t want to do it all at the end.  Boum, as the French kids say–a month’s-worth of work.  I’ll start it on July 1st, and I’ll finish it sitting in the plane on the way to Guatemala on the 30th.  If I screw up and miss a day?  Not the end of the world–I’ll make it up.  If I just can’t stand anesthesia vocabulary on July 11th?  No problem–I’ll just switch a couple days around.  Is the list intimidating?  No–the opposite.  I know that if I prepare, everything will probably go fine, and I know that if I work my list, I’ll be prepared–so, it’s actually reassuring, not intimidating.
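For the spreadsheet-averse, here is one way that the mixing could be sketched in code.  This is not how I actually build my list: the round-robin ordering, the review day every fifth day, and the year 2018 are all assumptions made for the sake of the example, and the topic names are just the categories listed above.

```python
from datetime import date, timedelta
from itertools import zip_longest

# Study topics grouped by category, taken from the lists in this post.
TOPICS = {
    "technical vocabulary": [
        "areas of the hospital",
        "surgical techniques and equipment",
        "anesthesia",
        "anatomy: the hand",
        "anatomy: gynecology",
        "anatomy: the face and head",
    ],
    "general vocabulary": [
        "Guatemalan regional dialect",
        "professions",
        "farm work and manual labor",
        "animals and plants",
    ],
    "grammar": ["conjugation", "usted forms", "the subjunctive"],
    "oral production": ["vowel + vowel drills"],
}


def interleave(topics: dict[str, list[str]]) -> list[str]:
    """Take one topic from each category in turn, for day-to-day variety."""
    rounds = zip_longest(*topics.values())
    return [topic for round_ in rounds for topic in round_ if topic is not None]


def build_schedule(start: date, days: int, topics: dict[str, list[str]],
                   review_every: int = 5) -> list[tuple[date, str]]:
    """Assign one topic per day, with a review day every `review_every` days.

    Assumption: once every topic has had a day, the list starts over.
    """
    queue = interleave(topics)
    plan = []
    used = 0
    for offset in range(days):
        day = start + timedelta(days=offset)
        if (offset + 1) % review_every == 0:
            plan.append((day, "review"))
        else:
            plan.append((day, queue[used % len(queue)]))
            used += 1
    return plan


if __name__ == "__main__":
    # The year is an arbitrary choice for the example.
    for day, topic in build_schedule(date(2018, 7, 1), 30, TOPICS):
        print(day.strftime("%b %d"), topic)
```

Switching a couple days around is then just swapping two entries in the resulting list.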


Why no days for working on oral comprehension?  Because that’s what listening to the news on the way to work, podcasts while I stretch, etc., are for.  That really has to be part of your daily life–you can’t partition that off into specific days.  Gotta work, work, work your oral comprehension.  On the good side: not one second of the time that you spend doing it will be wasted.


English notes

a couple versus a couple of: this is controversial amongst English speakers.  People who prefer a couple of are likely to complain about those of us who say a couple.  Je les emmerde.  How I used it in the post: If I just can’t stand anesthesia vocabulary on July 11th?  No problem–I’ll just switch a couple days around. 

ilk: maybe acabit in French?  How I used it in the post: My phonology really, really wants it to be tun abuelita, or something of that ilk, which does not exist in Spanish… I think in French something of that ilk would be quelque chose du même acabit, or words to that effect.  Phil d’Ange?

The picture at the top of this post is from lolphonology.tumblr.com.  I picked it because in the post I carped about sequences of sounds, and the meme is about sequences of sounds (one in particular–the sound of the ch in English chat, but more on that another time, perhaps).  You don’t get it?  No worries–that just means that you’re cool, not nerdy like some stupid linguist.

Nightmare after nightmare: How to run a polyglot terrorist organisation

…nightmare after nightmare from which I woke up screaming only inarticulate sounds because I couldn’t come up with the words that I needed in ANY of my languages

In my daily life, I speak English (my native language) six months out of the year.  The rest of the time, I live my life in French, except for one week in August that I spend doing volunteer work in Guatemala.  In preparation for that week, I stop using French July 1st and spend the entire month trying to push French down and wake Spanish back up.  This morning I lay down for a nap after a couple hours of studying Spanish anatomy vocabulary, and had nightmare after nightmare from which I woke up screaming only inarticulate sounds because in the nightmares, I couldn’t come up with the words that I needed in any of my three languages.

If you read this blog, you’re probably more than a little bit interested in language, its powers–and its complications.  If so: you could do worse than to read Michael Erard.

Erard is a UTexas-Austin-trained linguist and writer who has been especially active in the area of polyglottism.  Besides being an excellent narrative writer, he also has some ideas that are outside of the linguistic conventional wisdom, which is why I find him interesting–in particular, he has made me think a lot about what it means to “speak” a language.  I have always had the standard linguist’s attitude about that–you “speak” the language(s) that you learned natively; anything else… you don’t speak, exactly.  So, ask me whether or not I speak French, and I will say no, despite the fact that it’s the only language that comes out of my mouth 6 months out of the year.  That’s the same thing that I’ll say if you ask me if I speak Spanish, despite the fact that I spend one week a year doing interpreting for a bunch of surgeons in Guatemala.  (You should donate some money–it’s a great group.)  Erard argues that in the global world in which we live today, where many people live their lives in languages that aren’t native to them, the definition of “to speak” a language would more usefully be broadened.  Erard is a smart guy who got his PhD at a hell of a lot better school than I did–he’s worth listening to.  (If you can figure out a way to end that sentence without a preposition: go for it.)

This 2016 article by Erard, currently writer-in-residence at the Max Planck Institute for Psycholinguistics, is a good example of how he thinks.  The first in a 3-part series, it addresses the question of how the Islamic State manages to function on the battlefield in the absence of a shared language.  The French Foreign Legion and the Israeli Army have always been (and remain) forces with large numbers of members who don’t speak the national language.  They have historically dealt with the problem via formal instruction.  ISIS has gone a different way–as Erard shows, one that is not without precedent.

Erard’s native language is English, and he writes nicely in it.  Let’s start with some of the more-obscure words and phrases that he uses:

As everyone knows by now, ISIL has attracted new and seasoned jihadis from all over the world. But many of its 30,000 recruits (a typical estimate) don’t speak Arabic. So without a common language, how do they fight?

  • seasoned: experienced.
  • common: shared; the same for everyone.

This question has become particularly interesting in light of a fairly recent change to ISIL’s fighting structure. The change was revealed, in passing, in “Confessions of an ISIS Spy,” a series of Daily Beast articles last November by Michael Weiss, based on his interviews with a supposed ex-ISIL intelligence officer named Abu Khaled.

  • particularly: especially; more than some other things.
  • fairly: somewhat; not entirely, but not just a little bit, either.
  • in passing: from Collins: If you mention something in passing, you mention it briefly while you are talking or writing about something else.
  • supposed: claimed to be.

(In passing, I will note that right at this moment, I can hear my cat snoring from the next room.  I didn’t even know that cats snored at all, let alone at high volume!)

Abu Khaled told Weiss that many of the ISIL battle groups, called katibas, were originally organized by language or ethnicity. But in mid to late 2015, ISIL began reorganizing fighters into mixed katibas, either combining muhajireen (foreign fighters) with ansar (local fighters) or mixing muhajireen from different places.

  • battle group is actually a technical term.  It refers to units of different sizes depending on whether you’re talking about the Army or the Navy, and those units are probably a lot bigger than what Erard is referring to here–although I couldn’t swear to that, as he doesn’t specify what he means.

Because I write about language, languages, and the people who use them, this piqued my interest. To confirm it I contacted Amarnath Amarasingam, a Canadian sociologist who studies foreign ISIL fighters. He told me he had heard from his contacts that linguistically heterogeneous katibas were indeed being assembled on a trial basis.

  • to pique (someone’s) interest: piquer la curiosité de quelqu’un, susciter l’intérêt de quelqu’un (WordReference.com).  I’m a little nervous about that second translation–intérêt in French rarely seems to correspond to interest in English, based on the odd looks that I get when I apparently misuse it constantly where I would have used interest in English…

It seems that the operational simplicity of having everyone in a katiba speak the same language also had a downside: It created an insularity among some fighters, who were perceived as running their own agenda. Abu Khaled, who also speaks French, told Weiss he had put together a proposal for a francophone katiba, but that it wasn’t approved. Previous problems were cited with an all-Libyan katiba that had proven more loyal to its own emir (leader) than to ISIL, and also with Russian-speaking katibas that had a tendency to go rogue. To avoid rifts and create a coherent army, ISIL now seemed to feel, it was better to get people to exchange cultures and languages.

  • downside: disadvantage.  Often used when you’re comparing the disadvantage to an/some advantage(s), which would then probably be referred to as upside(s).  

But how, in that case, does ISIL turn a gaggle of immigrants with no common language into an effective fighting force?

  • gaggle: the term for a group of geese.  You can use it metaphorically for any group that lacks organization.

…and for the answer to Erard’s question, I’ll refer you to his article.  One little point that I’ll mention in there is the appearance in a quote of the word strategic: 

“You don’t need to understand the strategic objectives to blow up a school bus. At that level, it’s easy: go there, kill everybody, end of discussion.”

Objectives here are goals.  Strategic, from strategy, is being opposed to tactical, from tactics.  See this post where we talked about that distinction, which is quite important in the quote that Erard gives, from the perspective of its use in Henry Reed’s poem Movement of bodies:

Those of you that have got through the rest, I am going to rapidly
Devote a little time to showing you, those that can master it,
A few ideas about tactics, which must not be confused
With what we call strategy. Tactics is merely
The mechanical movement of bodies, and that is what we mean by it.
Or perhaps I should say: by them.

Strategy, to be quite frank, you will have no hand in.
It is done by those up above, and it merely refers to,
The larger movements over which we have no control.
But tactics are also important, together or single.
You must never forget that, suddenly, in an engagement,
You may find yourself alone.

…and I’ll tell you this: in the American military, we are not merely allowed to refuse illegal orders (say, to blow up a school bus)–we are required to refuse illegal orders.  Think about that if you would like to understand how outraged my fellow veterans and I were by Trump’s comment, in response to an interviewer’s question about how he could possibly support Putin, a killer: “There are a lot of killers. We’ve got a lot of killers.  You think our country’s so innocent?”  Fuck you: the American military is not the moral equivalent of the Russian military.

The graphic showing the distribution of languages in ISIS comes from the temporalflight.tumblr.com blog.

What’s making me happy today: Inuktitut mining terminology

The big issues in the news in the US at the moment:

  • The shame of the Trump administration’s treatment of migrant children from Central America
  • Why reporters don’t use the word “lie” to describe the things that come frequently out of the mouth of the current president of our country

Depressing.  So depressing that the New York Times, the best-known newspaper in the United States and one of the most well-known newspapers in the world, has taken to publishing (to take to + present participle explained in the English notes below) a page of good news on Sundays.

I took a look at said good news this morning, and was underwhelmed–there’s just nothing there to remotely match the tragedy that I see unfolding around me here in the United States.  But, that doesn’t mean that I can’t find things to feel grateful for.  Other than my cat, the fact that my kid isn’t in a refugee camp in Syria, and the fact that I will have food available to eat for breakfast today (lots of people won’t), what’s making me happy this morning is this vocabulary of mining terminology in Inuktitut.


Inuktitut is an Inuit language spoken by a bit under 40,000 people in the north of Canada.  It is a polysynthetic language, meaning that words have many parts, to the point that it’s not clear whether you would want to say that they actually have words, versus sentences.  Here’s an Inuktitut word/sentence (from Wikipedia, citing an article on developing a screening tool for Inuktitut-language speech pathologists):

  • ᖃᖓᑕᓲᒃᑯᕕᒻᒨᕆᐊᖃᓛᖅᑐᖓ (in Inuktitut orthography)
  • qangatasuukkuvimmuuriaqalaaqtunga (transliteration)
  • “I’ll have to go to the airport” (English translation)

Illuminating points about Inuktitut: the language has three vowels–in the International Phonetic Alphabet, [i], [u], and [a].  Spoken languages have a minimum of three vowels (yes, my fellow linguists, I am leaving out one controversial case here–don’t hate on me), and if a language only has three vowels, they are [i], [u], and [a].  (Languages with four vowels add [e] or [o].  Languages with five vowels have [i], [u], [a], [e], and [o].)


How do you come up with new terminology for a language that doesn’t have it?  A workshop brought together a group consisting of:

  • Tribal elders
  • Inuktitut language specialists
  • A mining expert

…who spent three days hashing it all out.

So, what does mining terminology look like in Inuktitut?  Here are some examples from the glossary, maintained by the Department of Indigenous and Northern Affairs:

Latitude
imaginary lines that cross the surface of the Earth parallel to the equator used to determine location with longitude
(ᓄᓇᙳᐊᕐᒥ ᓴᓂᒧᐊᖓᓂᖓ)

Mineralization
the process by which a mineral is introduced into a rock, resulting in a valuable or potentially valuable deposit
(ᐅᔭᕋᖕᓂᐊᒐᒃᓴᕈᕐᓂᑯ)

Placer
a deposit of sand or gravel that contains particles of gold, gemstones, or other heavy minerals of value
(ᐃᒪᕐᒧᑦ ᓴᖅᑭᑕᐅᓂᑯ)

Stratum
a layer or bed of rock
(ᐃᑭᐊᕇᑦ)

…and that’s what’s making me happy today.  Does seeking happiness in les bizarreries of human linguistic, social, and technological behavior mean that I will ignore my citizenly duty to stay on top of the daily crimes of the crooks that are currently running my country?  No.  Does it mean that I actively and daily seek things to be grateful for?  Yes–and I recommend that you try it.  Can’t hurt, right?

Source of the picture at the top of the page: http://kiggavik.ca/tag/community-engagement/


English notes

to take to + present participle: to start a routine practice of doing something.  Examples:

How I used it in the post: The New York Times has taken to publishing a page of good news on Sundays.

to take to + noun or present participle: to become good at some activity, especially quickly. You can often differentiate this from the previous usage by the presence of an analogy along with it: like a…

to take to can also be used with a person as its object, or with a location, either physical (he took to the podium) or metaphorical (he took to Twitter to…) The meanings are different here–I’m too lazy to find a bunch of examples on this Sunday morning…

underwhelmed: not at all impressed. Note that underwhelmed is not the opposite of overwhelmed. (Lesson: do not look for “logic” in language.) Some examples:

(Love this one!)

How I used it in the post: I took a look at said good news this morning, and was underwhelmed–there’s just nothing there to remotely match the tragedy that I see unfolding around me here in the United States. 

Amazing two-headed baby

What would a linguist say about it? Pretty much nothing.

Getting divorced mostly sucks (speaking from experience here–I do it a lot), but it does have one good side: you clean your basement.  Picking through old files from my days of teaching Linguistics 101, I found this old photo from the cover of the National Enquirer, a tabloid that you flip through while waiting in line at the grocery store and then occasionally buy despite yourself.

I found the headline interesting because it touches on a couple of recurrent themes in the history of thought about language, but goes in an unusual direction with it.  The themes:

  • The original language
  • Language deprivation experiments

The original language

There is a very long history of wondering what the original language was.  The top candidate in the various and sundry ravings about this is Hebrew.  Why?  It’s the language of the Bible (specifically, the Old Testament to those of you who are Christianically inclined).  Latin often comes up, too.

What would a linguist say about the question?  Pretty much nothing.  From the Hominidés.org web site:

Depuis le 17e siècle la question se posait : depuis quand l’homme utilisait-il le langage articulé ? De nombreuses théories ont été avancées dont certaines très farfelues (voir ci-contre). En 1866 la Société de Linguistique de Paris (fondée en 1864) mit un coup d’arrêt à ces tentatives fantaisistes et interdit tout simplement la publication de textes relatifs à l’origine du langage.

My translation: Since the 17th century, the question has been asked: from when have humans used spoken language?  Numerous theories have been advanced, some of which are quite nutty (or even French French French [too lazy to look up ci-contre on a Saturday morning]).  In 1866 the Linguistic Society of Paris (founded in 1864) French French French [see preceding bracketed statement] and completely forbad (forbade?) the publication of papers on the origin of language.

Why forbid study of the origin of language?  Because your theories are not testable, and if something is not at least in theory testable, it’s not science.  Linguists are not even certain that language originated just once–one explanation that has been advanced for the astounding variability in human languages is the polygenesis hypothesis, which proposes that language originated multiple times in different human(-ish) populations.  (The single-origin hypothesis is the monogenesis hypothesis.)  Hell, we’re not even certain that language originated in spoken form–it could well have been signed.  (Yes: signed languages are languages, like any other.)

 

Language deprivation experiments

The idea behind a language deprivation experiment is to deny children exposure to language and see what happens.  I’m not totally convinced that any of the reported language deprivation experiments (see some listed on this Wikipedia page) actually ever happened, but their stated motivations frequently include the belief that children who are not exposed to any language would spontaneously speak “the original language,” and guess what?  Latin is often reported as one of the anticipated tongues.

Language deprivation tragedies

In fact there is a depressing number of cases in which children actually have been deprived of exposure to language, either through mishap or through horrific criminal misdeeds.  What doesn’t happen when they’re rescued: they don’t speak Hebrew; neither do they speak Latin.  They don’t speak anything, and if they’re rescued too late, they never do.  (This is often taken as evidence supporting the critical period hypothesis about child language acquisition.)

The weird direction in which the National Enquirer takes their story

…is that they talk not about children who are old enough to have acquired language, but rather babies; they then take the kid-speaks-Latin phenomenon as a way to talk about proof of reincarnation.  Not unheard of (see here and here, and here), but not run-of-the-mill, either.


There’s that part of me that wants to talk about the role of two-headed babies in the history of genetics, but my breakfast ice cream is melting, so we’ll have to wait for another time…  Breakfast ice cream–yum…


English and French notes:

despite oneself: En dépit de soi-même, I think.  …tous mes efforts sont vains, je t’adore en dépit de moi-même.  (Jean-Jacques Rousseau, Julie ou la nouvelle Héloïse, which I am reading at the moment and find hilarious.)

to suck: a borderline vulgar way of saying to be bad in the sense of undesirable.  I ran across craindre un max as a French-language equivalent once, but nobody seems to recognize that when I say it.

to forbid: a super-irregular verb.  In French: interdire, I think.  (Man, I am really lazy today…)  From the bab.la web site (and I don’t buy forbid as a past participle at all, although once again, I’m too lazy today to look for actual evidence):

[Screenshot of the bab.la entry for to forbid]
Source: bab.la web site

I don’t know–how many languages do YOU speak? Complexities of language and identity in Ukraine

I’ve often written here about how irritated linguists get when you ask them how many languages they speak. But, I’ve written much less about how difficult it actually is to say how many languages ANYONE speaks. Here’s an article from the Washington Post about the complexities of the linguistic situation in Ukraine and how your ability to understand non-linguistic phenomena there is affected by exactly how you pose questions about linguistic phenomena.

www.washingtonpost.com/news/monkey-cage/wp/2018/06/08/ukrainians-are-getting-less-divided-by-language-not-more-heres-the-research/

Now it’s time to have some fun!

Funny how nobody ever shows up to commit a mass murder at a school with an ax. Or a baseball bat. Or a samurai sword, or a trench knife (fatal as fucking hell), or a machete. We’re supposed to believe that firearms have nothing to do with anything, though.


The latest in early childhood education in America: instructions for school lockdowns sung to the tune of Twinkle, Twinkle, Little Star. Sing it and weep.

www.washingtonpost.com/news/morning-mix/wp/2018/06/08/lockdown-lockdown-is-a-kindergarten-nursery-rhyme-at-massachusetts-school/

Cabinet of Curiosities: Buying non-touristy stuff in Paris

The most common questions that people ask me about life in Paris:

  1. How come nobody in Paris speaks English?  (How come explained in the English notes below.)
  2. How come whenever I try to speak to people in Paris in French, they always answer me in English?
  3. Aren’t you afraid of terrorist attacks?
  4. Where can I buy non-touristy souvenirs?

(1) and (2) are, of course, contradictory, and I’ve written about them before (and will again, ’cause it’s super-complicated).  I’ve written about (3), too, and no, I’m not–every 3 days in the US, we have more gunfire deaths than Paris had in its worst terrorist attack in history.  I literally have a greater chance of being shot to death in a road rage incident on my way to work in the US than I do of dying in a terrorist attack in Paris.  Seriously.

(4): a question that I love to answer.  Today I’ll tell you where to buy non-touristy souvenirs in Montmartre.


Before there were museums, there was the cabinet of curiosities–le cabinet de curiosités.  If you were powerful, or maybe just really rich, your cabinet of curiosities was where you showed off your collection of … interesting stuff.  Mostly stuff from the natural world.  A narwhal’s tusk, say; rare stones; perhaps some fossils.  Showing it off was the point.  As Wikipedia puts it:

The Kunstkammer (cabinet of curiosities) of Rudolf II, Holy Roman Emperor (ruled 1576–1612), housed in the Hradschin at Prague, was unrivalled north of the Alps; it provided a solace and retreat for contemplation that also served to demonstrate his imperial magnificence and power in symbolic arrangement of their display, ceremoniously presented to visiting diplomats and magnates.

Montmartre is a neighborhood in the northern part of Paris.  As you might expect from the name Montmartre, it has an elevation, and at the peak of that elevation is one of Paris’s most popular tourist attractions: Sacré Coeur, “Sacred Heart,” France’s way of saying it’s sorry that Paris seceded from it in 1871.

I jest–bitterly: Sacré Coeur expresses France’s wish that Paris would say that it’s sorry that it seceded in 1871.  Sacré Coeur is reactionary France’s way of putting words in Paris’s mouth–specifically, an apology for having seceded from France in 1871.  As if it weren’t enough that the Versaillais (the soldiers of the national government) killed 20,000-ish Parisians when they retook the city.  La semaine sanglante, it’s called–The Bloody Week.


Descending from the aforementioned elevation on a Sunday-afternoon walk the other day, I came across Grégory Jacob and a truly delightful place to buy non-touristy stuff in Montmartre.  Curiositas is a charming little store in the style of a 19th-century cabinet of curiosities, complete with a nice selection of marlin snouts–far more practical in a little Parisian apartment than a narwhal tusk, and just as pointed.

Grégory spent 20 years as an optician before the insurance companies sucked the joy out of the profession, at which point he decided to become a boutiquier (see the French notes below for some subtleties of the terminology of shop-owners) and opened Curiositas.  His new profession lets him pursue his passions–la chine, la brocante, les curiosités, l’ostéologie, l’entomologie–in the very neighborhood where Gabriel loses his glasses and delivers his monologue in Zazie dans le métro.  

Picture source: http://www.curiositas.paris

And all of those passions are represented–the wares on offer include skulls, bugs, and the super-cool apparatus for drinking absinthe.  (Who knew that there are nifty devices for holding the sugar cube over which you pour la fée verte, “the green fairy”–absinthe itself.  Hell, I didn’t even know that you pour it over a sugar cube.  Hell, again: I didn’t even know that they still make the stuff.)  You need coasters with anatomical organs on them?  Grégory’s got them.  An emu egg?  No problem.  Skulls?  Curiositas has both carnivores and herbivores.  You’re tired of the Montmartre crêpe shops, wannabe artists, and fabric stores?  Step into Curiositas.  Tell Grégory the weird American guy says hi.  Scroll down past the pictures for the English and French notes.

How to prepare absinthe. Picture source: By Eric Litton – Own work, CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=845302


English notes 

how come: an informal way of saying why.  Examples:

 

 

 

“Trex” is “T-rex,” Tyrannosaurus rex. The dinosaur with enormous fangs and tiny little arms.


French notes

le boutiquier :  shopkeeper.

le commerçant : shopkeeper, retailer.

le magasinier : l’employé qui s’occupe d’entreposer, ranger des marchandises dans un entrepôt. (Definition courtesy of Grégory)

Which TWD character is your classifier? Bias and variance in machine learning

You don’t expect the zombie apocalypse to be relevant to research in computational linguistics–and yet it is; it so, so is.

Spoiler alert: this post about the TV show The Walking Dead–which, I will note, is as popular in France as it is in the US–will tell you what happens to Carol around Season 3 or 4.

In general, it’s the stuff that surprises you that’s interesting, right?  No one ever expects the arctic ground squirrel to have anything to do with computational linguistics–and yet it does: it so, so does.  No one ever expects to be confronted with problems with the relationship between compositionality and the mapping problem over breakfast in a low-rent pancake house–and yet it happens; it so, so happens.  (Low-rent as an adjective is explained in the English notes below.)  You don’t expect the zombie apocalypse to be relevant to research in computational linguistics–and yet it is; it so, so is.


You probably think that I just make this stuff up. I don’t! Picture source: https://www.slideshare.net/JenAman/large-scale-deep-learning-with-tensorflow

You’ve probably heard of machine learning.  It’s the science/art/tomfoolery of creating computer programs to learn things.  We’re not talking about The Terminator just yet–some of the things that are being done with machine learning, particularly developing self-driving cars, are pretty amazing, but mostly it’s about teaching computers to make choices.  You have a photograph, and you want to know whether or not it’s a picture of a cat–a simple yes/no choice.  You have a prepositional phrase, and you want to know whether it modifies a verb (I saw the man with a telescope–you have a telescope, and using it, you saw some guy) or a noun (I saw the man with a telescope–there is a guy who has a telescope, and you saw him).  Again, the computer program is making a simple two-way choice–the prepositional phrase is either modifying the verb (to see), or it’s modifying the noun (the man).  (The technical term for a two-way choice is a binary decision.)  Conceptually, it’s pretty straightforward.

Cats keep showing up in these illustrations because the latest-and-greatest thing in machine learning is alleged to have solved all extant problems and made the rest of computer science irrelevant, but the major reported accomplishment so far has been classifying pictures as to whether or not they are pictures of cats.  The “It uses a few CPUs!” part is a reference to the fact that in order to do this, it requires outlandish amounts of computing resources (a CPU is a “central processing unit”).  Picture source: https://doubleclix.wordpress.com/2013/06/01/deep-learning-next-frontier-01/

When you are trying to create a computer program to do something like this, you need to be able to understand how it goes wrong.  (Generally, seeing how something goes right isn’t that interesting, and not necessarily that useful, either.  It’s the fuck-ups that you need to understand.)  There are two concepts that are useful in thinking your way through this kind of thing, neither of which I’ve really understood–until now.

Picture source: http://daco.io/insights/2016/about-deep-learning/

I recently spent a week in Constanta, Romania, teaching at–and attending–the EUROLAN summer school on biomedical natural language processing.  “Natural” language means human language, as opposed to computer languages.  Language processing is getting computer programs to do things with language.  Biomedical language is a somewhat broad term that includes the language that appears in health records, the language of scientific journal articles, and more distant things like social media posts about health.  My colleagues Pierre Zweigenbaum and Eric Gaussier taught a great course on machine learning, and one of the best things that I got out of it was these two concepts: bias and variance.  

Bias means how far, on average, you are from being correct.  If you think about shooting at a target, low bias means that on average, you’re not very far from the center.  Think about these two shooters.  Their patterns are quite different, but in one way, they’re the same: on average, they’re not very far from the center of the target.  How can that be the case for the guy on the right?

Picture source: XX

Think about it this way: sometimes he’s a few inches off to the left of the center of the target, and sometimes he’s a few inches off to the right.  Those average out to being in the center.  Sometimes he’s a few inches above the center of the target, and sometimes he’s a few inches below it: those average out to being in the center.  (This is how the Republicans can give exceptionally wealthy households a huge tax cut, and give middle-class households a tiny tax cut, and then claim that the average household gets a nice tax cut.  Cut one guy’s taxes by 1,000,000 dollars and nine guys’ taxes by zero (each), and the average guy gets a tax cut of 100,000 dollars.  One little problem: nobody’s “average.”)  So, he’s a shitty shooter, but on average, he looks good on paper.  These differences in where your shots land are called variance.  Variance means how much your results differ from each other, on average.  The guy on the right is on average close to the target, but his high variance means that his “average” closeness to the target doesn’t tell you much about where any particular bullet will land.
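If you want to check the tax-cut arithmetic for yourself, here is a minimal sketch in Python.  The only numbers in it are the ones from the example above; the choice of the (population) standard deviation as the measure of spread is my assumption, since it is just the square root of the variance and comes out in dollars.

```python
from statistics import mean, pstdev

# Numbers from the tax-cut example: one household gets a 1,000,000-dollar
# cut, the other nine get nothing.
tax_cuts = [1_000_000] + [0] * 9

average_cut = mean(tax_cuts)   # the "average household" figure
spread = pstdev(tax_cuts)      # population standard deviation (square root of the variance)

# How many households actually received at least the average cut?
at_or_above_average = sum(1 for cut in tax_cuts if cut >= average_cut)

print(f"average cut: {average_cut}")
print(f"standard deviation: {spread}")
print(f"households at or above the average: {at_or_above_average} of {len(tax_cuts)}")
```

The average comes out to 100,000 dollars, but the spread is three times that, and only one household out of ten is at or above the "average."  That is the same story as the shooter on the right: a nice-looking average sitting on top of a lot of variance.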

Thinking about this from the perspective of the zombie apocalypse: variance means how much your results differ from each other, on average, right?  Low variance means that if you fire multiple times, on average there isn’t that much difference in where you hit.  High variance means that if you fire multiple times, there is, on average, a lot of difference between where you hit with those multiple shots.  The guy on the left below (scroll down a bit) has low bias and low variance–he tends to hit in roughly the same area of the target every time that he shoots (low variance), and that area is not very far from the center of the target (low bias).  The guy on the right has low bias, just like the guy on the left–on average, he’s not far off from the center of the target.  But, he has high variance–you never really know where that guy is going to hit.  Sometimes he gets lucky and hits right in the center, but equally often, he’s way the hell off–you just don’t know what to expect from that guy.

We’ve been talking about variance in the context of two shooters with low bias–two shooters who, on average, are not far off from the center of the target.  Let’s look at the situations of high and low variance in the context of high bias.  See the picture below: on average, both of these guys are relatively far from the center of the target, so we would describe them as having high bias.  But, their patterns are very different: the guy on the left tends to hit somewhere in a small area–he has low variance.  The guy on the right, on the other hand, tends to have quite a bit of variability between shots: he has high variance.  Neither of these guys is exactly “on target,” but there’s a big difference: if you can get the guy on the left to reduce his bias (i.e. get that small area of his close to the center of the target), you’ve got a guy who you would want to have in your post-zombie-apocalypse little band of survivors.  The guy on the right–well, he’s going to get eaten.

High bias: both of the shooters tend to hit fairly far from the center of the target. The guy on the left has low variance, while the guy on the right has high variance.
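For the four kinds of shooter, a rough sketch of how you might put numbers on "bias" and "variance."  Everything here is made up for illustration: the coordinates are hypothetical hits in inches with the bullseye at (0, 0), and the function name bias_and_variance is mine.  I am assuming that bias is the distance from the bullseye to the average hit, and that variance is the average squared distance from each hit to that average hit.

```python
from math import hypot
from statistics import mean

def bias_and_variance(shots):
    """Return (bias, variance) for a list of (x, y) hits; the bullseye is (0, 0).

    bias: distance from the bullseye to the average hit.
    variance: average squared distance from each hit to the average hit.
    """
    cx = mean(x for x, _ in shots)
    cy = mean(y for _, y in shots)
    bias = hypot(cx, cy)
    variance = mean((x - cx) ** 2 + (y - cy) ** 2 for x, y in shots)
    return bias, variance

# Hypothetical hit patterns, one per kind of shooter.
shooters = {
    "low bias, low variance": [(1, 0), (-1, 0), (0, 1), (0, -1)],
    "low bias, high variance": [(10, 0), (-10, 0), (0, 10), (0, -10)],
    "high bias, low variance": [(5, 3), (3, 3), (4, 4), (4, 2)],
    "high bias, high variance": [(14, 3), (-6, 3), (4, 13), (4, -7)],
}

for label, shots in shooters.items():
    bias, variance = bias_and_variance(shots)
    print(f"{label}: bias={bias}, variance={variance}")
```

With these made-up patterns, the two low-bias shooters both average out to the exact center, and the two high-bias shooters both average out to a point 5 inches away from it.  What separates the shooters within each pair is only the variance.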

A quick detour back to machine learning: suppose that you test your classifier (the computer program that’s making binary choices) with 100 test cases.  You do that ten times.  If it’s got an average accuracy of 90, and its accuracy is always in the range of 88 to 92, you’re going to be very happy–you’ve got low bias (on average, you’re pretty close to 100), and you’ve got low variance–you’re pretty sure what your output is going to be like if you do the test an 11th time.
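Here is a small sketch of that ten-run experiment.  The accuracy numbers are invented: the first list stays between 88 and 92 and averages 90, as in the paragraph above, and the second is a hypothetical classifier with the same average but a much wider spread.  I am using the sample standard deviation as the measure of dispersion, which is my choice and not the only reasonable one.

```python
from statistics import mean, stdev

# Hypothetical accuracies (percent correct) from ten test runs of 100 cases each.
steady_runs = [88, 92, 90, 89, 91, 90, 88, 92, 91, 89]
erratic_runs = [70, 100, 95, 85, 100, 80, 90, 100, 85, 95]

def summarize(runs):
    """Return (mean, sample standard deviation, min, max) for a list of accuracies."""
    return mean(runs), stdev(runs), min(runs), max(runs)

for name, runs in [("steady", steady_runs), ("erratic", erratic_runs)]:
    avg, sd, lo, hi = summarize(runs)
    print(f"{name}: mean={avg}, stdev={sd:.2f}, range={lo}-{hi}")
```

Both classifiers have the same average accuracy, so on that number alone you could not tell them apart.  The standard deviation and the range are what tell you how much to trust an 11th run.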

Abstract things like machine learning are all very well and good for cocktail-party chat (well, if the cocktail party is the reception for the annual meeting of the Association for Computational Linguistics–otherwise, if you start talking about machine learning at a cocktail party, you should not be surprised if that pretty girl/handsome guy that you’re talking to suddenly discovers that they need to freshen their drink/go to the bathroom/leave with somebody other than you.  Learn some social skills, bordel de merde !)  So, let’s refocus this conversation on something that’s actually important: when the zombie apocalypse comes, who will you want to have in your little band of survivors?  And: why? “Who” is easy–you want Rick, Carol, Daryl.  (Some other folks, too, of course–but, these are the obvious choices.)  Why them, though?  Think back to those targets.

Picture source: http://hubwav.com/moral-codes-walking-dead-characters-get-broken/

Low bias, low variance: this is the guy who is always going to hit that zombie right in the center of the forehead.  This is Rick Grimes.  Right in the center of the forehead: that’s low bias.  Always: that’s low variance.

Low bias, high variance: this is the guy who on average will not be far from the target, but any individual shot may hit quite far from the target.  This guy “looks good on paper” (explained in the English notes below) because the average of all shots is nicely on target, but in practice, he doesn’t do you much good.  This guy survives because of everyone else, but doesn’t necessarily contribute very much.  In machine learning research, this is the worst, as far as I’m concerned–people don’t usually report measures of dispersion (numbers that tell you how much their performance varies over the course of multiple attempts to do whatever they’re trying to do), so you can have a system that looks good because the average is on target, even though the actual attempts rarely are.   On The Walking Dead, this is Eugene–typically, he fucks up, but every once in a rare while, he does something brilliantly wonderful.

High bias: both of the shooters tend to hit fairly far from the center of the target. Picture source: XX
SPOILER ALERT! The Walking Dead’s Carol. She starts out as a meek, mild, battered housewife who can barely summon up the courage to keep her daughter from being sexually abused. Later… Yes, she’s my favorite TWD character. Picture source: https://goo.gl/8D6323

High bias, low variance: this guy doesn’t do exactly what one might hope, but he’s reliable, consistent–although he might not do what you want him to do, you have a pretty good idea of what he’s going to do.  You can make plans that include this guy.  He’s fixable–since he’s already got low variance, if you can get him to shift the center of his pattern to the center of the target, he’s going to become a low bias, low variance guy–another Rick Grimes. This is Daryl, or maybe Carol.

The Walking Dead’s Eugene. Picture source: https://goo.gl/CWsfqm

High bias, high variance: this guy is all over the place–except where you want him.  He could get lucky once in a while, but you have no fucking idea when that will happen, if ever.  This is the preacher.

Which Walking Dead character am I?  Test results show that I am, in fact, Maggie.  I can live with that.


Here are some exercises on applying the ideas of bias and variance to parts of your life that don’t have anything to do (as far as I know) with machine learning.  Scroll down past each question for its answer, and if you think that I got it wrong, please straighten me out in the Comments section.  Or, just skip straight to the French and English notes at the end of the post–your zombie apocalypse, your choice.

  1. Your train is supposed to show up at 6 AM.  It is always exactly 30 minutes late.  If we assume that 30 minutes is a lot of time, then the bias is high/low.  Since the train is always late by the same amount of time, the variance is high/low.
Cat pictures, cat pictures, cat pictures–do they talk of nothing but cat pictures?  ‘Fraid so… Picture source: https://www.slideshare.net/ITARENA/fishman-deep-learning
  1. The bias is high.  Bias is how far off you are, on average, from the target.  We decided that 30 minutes is a lot of time, so the train is always off by a lot, so the bias is high.  On the other hand, the variance is low.  Variance is about how consistent the train is, and it is absolutely consistent, since it is always 30 minutes late.  Thus: the variance is low.

Your train is supposed to show up at 6 AM.  It is always either exactly 30 minutes early, or 30 minutes late.  More specifically: half of the time it is 30 minutes early, and half of the time it is 30 minutes late.  Assume that 30 minutes is a lot of time: is the bias high or low?  Is the variance high or low?

“DL” is “deep learning,” the most popular name for the latest-and-greatest approach to machine learning. I hear that it’s really good at recognizing pictures of cats.  Picture source: https://www.slideshare.net/agibsonccc/deep-learning-on-hadoopspark-galvanize

Since on average, the train is on time–being early half the time and late half the time averages out to always being on time–the bias is low. Zero, in fact.  The variance, on the other hand, is high: any given arrival is a full 30 minutes away from that on-time average.  This gives you some insight into why averages are not that useful if you’re trying to figure out whether or not something operates well. The give-away is the variance: even when something looks fine on average, high variance gives away how shitty it is.
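If you would rather see the two train exercises as numbers, here is a minimal sketch.  The ten mornings are hypothetical, lateness is in minutes (negative means early), and I am assuming that "bias" is the average lateness and "variance" is the population variance of the lateness.

```python
from statistics import mean, pvariance

# Minutes late for ten hypothetical mornings (negative means early).
always_late = [30] * 10          # exercise 1: always exactly 30 minutes late
early_or_late = [-30, 30] * 5    # exercise 2: half the time early, half the time late

def bias_and_variance(minutes_late):
    """Bias: average lateness. Variance: average squared deviation from that average."""
    return mean(minutes_late), pvariance(minutes_late)

for name, train in [("always late", always_late), ("early or late", early_or_late)]:
    bias, variance = bias_and_variance(train)
    print(f"{name}: bias={bias} minutes, variance={variance}")
```

The first train has a bias of 30 minutes and a variance of zero: annoying, but you can plan around it.  The second has a bias of zero and a variance of 900 (a standard deviation of 30 minutes): perfect on paper, and you will still miss it half the time.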

Want to know which Walking Dead character you are?  You have two options:

  1. Take one of the many on-line quizzes available.
  2. Analyze yourself in terms of bias and variance.

English notes

low-rent: “having little prestige; inferior or shoddy” (Google) “low in character, cost, or prestige” (Merriam-Webster)

to look good on paper: “to seem fine in theory, but not perhaps in practice; to appear to be a good plan.” (McGraw-Hill Dictionary of American Idioms and Phrasal Verbs) Often followed by “but…”


French notes

From the French-language Wikipedia article on what’s called in English the bias-variance tradeoff:

En statistique et en apprentissage automatique, le dilemme (ou compromis) biais–variance est le problème de minimiser simultanément deux sources d’erreurs qui empêchent les algorithmes d’apprentissage supervisé de généraliser au-delà de leur échantillon d’apprentissage :

  • Le biais est l’erreur provenant d’hypothèses erronées dans l’algorithme d’apprentissage. Un biais élevé peut être lié à un algorithme qui manque de relations pertinentes entre les données en entrée et les sorties prévues (sous-apprentissage).
  • La variance est l’erreur due à la sensibilité aux petites fluctuations de l’échantillon d’apprentissage. Une variance élevée peut entraîner un surapprentissage, c’est-à-dire modéliser le bruit aléatoire des données d’apprentissage plutôt que les sorties prévues.

 

We stopped serving pizza at 2:30: Relevance, inference, and a mammoth

Waiting in line at the San Jose children’s museum just now, I overheard this conversation:

Customer: Do you have pizza?

Clerk: We stopped serving hot food at 2:30.

Clear enough to a human: the clerk’s answer was no. In fact, when the customer stared at him blankly in response, the clerk said this: We do not have pizza.

What did I have to do in order to understand that we stopped serving hot food at 2:30 meant “no”? Think about this: if you walked up to me in the street and asked Do you know where Notre Dame is? …and I responded We stopped serving hot food at 2:30, would you take that to mean “no” ? I think not. So, let’s think about what had to happen at that cash register:

  1. The listener had to make an assumption that we call relevance: that the clerk’s response was, in fact, relevant to his question. (For those of you who are into pragmatics and discourse: this is one of the Gricean maxims.)
  2. The listener had to know that pizza is a “hot food”.
  3. The listener had to know or consider that the current time was past 2:30.
  4. The listener had to make a number of inferences: I got a response about hot foods, pizza is a kind of hot food, so what is true of hot foods will be true of pizza.
  5. The semantics of to stop are such that when the guy says that they did it at 2:30, I should understand that they are still not doing it now. (Contrast to stop with the verb to pause, which doesn’t require the same inferences as to stop.)

…and of course it’s the fact that I get excited about the guy in line behind me at the museum snack bar not being able to get his fucking pizza that keeps me from ever getting a second date, and so I’m just gonna go talk about the mammoth skeleton with my niece and nephew. What will we say about it, exactly? See below.

The mammoth has two clavicle bones. They articulate with the sternum, but aren’t attached to it. Why do we care? Because that’s pretty characteristic of mammals. In contrast, in birds the clavicles have fused to form what’s called the furcula or fourchette in French, or the wishbone in English. Check it out next time you’re stripping the post-Thanksgiving turkey carcass, and see below for the French-language terminology.

Source: http://fauconeduc.biz/biologie-des-oiseaux/, a pretty nice web site for avian anatomy information in French.

The mammoth’s shoulder blade (omoplate in French–it’s the thing that looks triangular) is on the great beast’s side. Why do you care? Because that’s characteristic of quadrupeds. Humans, apes, and birds all have their scapulae on their backs, not their sides.

…and lacking a good ending for this post, I finish my coffee and head back to the mammoth room.

Oh: the guy got a bag of potato chips. Junk food is junk food, I guess.

Fashion!

Why do computational linguistics?  Fashion! 

When I was in graduate school–in the US–I had a colleague whose child was allegedly growing up francophone.  I think the father was an American professor in the French department, or something.  We were all very impressed.

One semester we had a visiting academic from France in our lab.  He had super-hip glasses.  Over lunch one day, the kid asked him: “why do your glasses have such tiny lenses?”  His response: c’est à la mode.  

The kid thought for a minute.  Then, another question: “why do you have ice cream on your glasses?”  I try not to be mean, but I thought to myself: this kid speaks French even worse than I do, and that’s an accomplishment…


Why do computational linguistics?  There are a lot of perfectly good reasons, but this guy has the best: Fashion!  Want to know more?  Check out the OpenMINTED project.  In the following material, “TDM” stands for text data mining.

In the French notes today (scroll down past the picture): my attempts to understand various words that could be used to translate the English word fashion.

 

TDM-story-sheet-Alan-Akbik-page-001
Source: openminted.eu

French notes

la mode : style, fashion, trend; the fashion industry itself.

la tendance : trend, fashion.

tendance (adj.) : trendy, fashionable, “in.”  Register: familier.

branché : trendy, cool, hip, “in.”  Register: familier.

le branché : cool person.

In trying to figure out the differences between la mode and la tendance via looking at examples on Linguee.fr, the trend (ha) seems to be that la tendance is not used to talk about things that are “in fashion” so much as tendencies/trends more generally.  The closest uses to “in fashion” are their adjectival examples:

Source: screen shot from Linguee.fr.

Compare some nominal (noun) examples–their translations are more about trends in general, versus trends in the sense of things being fashionable:

Source: screen shot from Linguee.fr.

Linguee.fr gives a number of examples of avoir tendance à, translated as “to tend to:”

From Linguee.fr.

For fashion in the sense of haute couture and the like (yes, that’s the English term, too), la mode seems to be more common:

From Linguee.fr.

Change the gender to masculine — le mode — and you have senses along the lines of “mode” in English:

From Linguee.fr.

…and some fixed expressions (all examples from Linguee.fr):

  • le mode d’emploi : operating instructions, instruction manual, user guide
    • J’ai lu le mode d’emploi avant d’utiliser l’appareil.  I read the instruction manual before using the device.
    • Le mode d’emploi est fourni en cinq langues.  The operating instructions are provided in five languages.
    • Avant de nous contacter, veuillez vous assurer d’avoir respecté le dosage des produits et suivi le mode d’emploi.  Before contacting us, please make sure that you take the right dosage of the products and follow the instructions for use.
  • le mode de vie : lifestyle
  • le mode aperçu : preview mode

It seems so simple that it makes one wonder: why was I ever confused about this?  As it happens, I have a pretty good memory for the contexts in which I run into words, so I can tell you that the source of my confusion is an advertising poster that I saw in the metro one day.  I interpreted it (possibly incorrectly) as meaning something like “so you think you know what’s cool?”, and my recollection is that it said something like tu penses que tu connais la tendance?  Maybe it’s just that the aforementioned kid (ledit marmot) spoke French better than I thought, and I speak French even worse than I thought…

 

 
