In which a cook thinks I’m an idiot because of some vowels.
French and English have pretty different sets of vowels. (Vowel inventories is the technical term in linguistics.) One of the basic facts of humans and languages is that we can be unable to hear differences between sounds that we don’t have in our native tongue, and each of the two languages has lots of vowels that the other doesn’t have. When I say that we can’t hear differences between sounds, that implies that there are sounds with which we confuse them, and which sounds those are is not random at all: people categorize the sounds of their language in pretty structured, principled ways, and when they fail to distinguish the sounds in other languages, that “failure to distinguish” manifests itself as (se traduit par, I think, in French) putting sounds from the other guy’s language into the same category as some sound in your language.
Two-tube models of the vowels [i], [u], and [a]. The third author of the paper from which I took this figure once left a note on my desk that had the effect of getting my office mates off my fucking back about the messiness of said desk for the remainder of my post-graduate education, but that’s a story for another time. Picture source: https://goo.gl/9u7BNp
The principles by which this kind of thing gets structured can be described in terms of the articulatory characteristics of the sounds (what you do with your mouth parts to make them), the acoustic characteristics of the sounds (what the waveform would look like if you graphed it), and the auditory perception system (how your brain and your peripheral nervous system interpret incoming sounds). I mention this not because I think that you’ll be fascinated by the details of the effects of, say, Helmholtz resonators versus two-tube models (see the picture) of vowels, but so that you know that there’s a reason that you (if you’re a native speaker of English), me, and all of our fellow “Anglo-Saxons” (a term which seems to be falling out of use in France today, but which I still find amusing, since if there’s anything that I’m not, it’s an Anglo-Saxon) are confusing the same vowels.
For English speakers (Americans, anyway–I don’t know very many of our friends from the Commonwealth and wouldn’t presume to speak for them), one problem pair in French is the vowels that are spelt ou and u. Technically, those are both what are called high tense rounded vowels (here’s a post with a link to a nice video about them from the Comme une française YouTube series). In English, we only have the vowel that’s written ou, which is more or less the same vowel that we have in the words who’d and boot. We tend to hear French words with the vowel spelt u as the vowel spelt ou. Both of them are super-common in French; here are some examples, from the amazing site MinimalPairs.net (y is the International Phonetic Alphabet symbol for the French vowel spelt u):
Words that differ only in having the French sound spelt u versus the French sound spelt ou. Picture source: screen shot from http://minimalpairs.net/en/fr.
Most of the time, even us Anglo-Saxons (see the disclaimer above) can get by on context: there just aren’t that many times when the situation doesn’t let you figure out whether your waiter is asking you about joue (cheek) versus jus (juice), or when the rest of the sentence won’t give you a pretty good guess as to whether your interlocutor just said coup (a blow, roughly) or q (the letter of the alphabet).
However: there’s one French “minimal pair”–set of two words that only differ by a single sound–that can pretty much always show up in the same context. To wit: au-dessus and au-dessous. What those mean: roughly, over and under. The only difference in the sounds of those is the ou (which we have in English) of under, and the u of over. Have you seen my cigarettes? Yeah, they’re (on top of/underneath) your sweater. Would you do me a favor and put this (on/under) that box? It happens all the time.
Still life with buckwheat: two foil-wrapped galettes, one on top of the other. Picture source: me, right before dinner.
To wit: I was feeling badly in need of an actual meal the other day, but too tired to cook after work. Not a problem, as there’s a little Breton place right across the street from the metro station that’s popular for take-out. I popped in on my way home and ordered a couple of galettes de sarrasin–buckwheat crêpes–one a complet (“with everything”), and one with zucchini and cheese. The nice lady brought them out to me in the bag that you see in the picture, and explained: The complet is on the (top/bottom), and the gratinée is on the (top/bottom).
Fuck: my old nemesis, au-dessus and au-dessous. I gave her a baffled look. She gave me a baffled look right back: what could I possibly not be understanding?? We’d just had an involved conversation on the topic of why I should really be topping off my dinner with her home-made apple crumble (her position on the topic) and why my general fatness suggested that I should not, in fact, be doing so (my position), so why would I suddenly be confused by something that any French toddler would understand? She looked at me for a bit, with that look on her face that means Is this bizarre foreigner jerking me around, or what?, and then finally tried again: en haut–gratinée. En bas–complet. No verbs, no pronouns, none of that fancy stuff–two prepositions, two nouns.
Message received. I left a good tip in hopes of maintaining some semblance of normalcy in the relationship, ’cause I am, in fact, de souche Bretonne (half, anyway), and I do love my cider and chicken gizzards, and that restaurant is the best place in the neighborhood to get them. It’s not like there aren’t other good Breton restaurants in Paris, but this one’s mine, damn it.
Quine wasn’t kidding: Gavagai really is a thing, and you don’t have to go any further than the grocery store to experience it.
Winters in Paris are nothing to write home to mom about, but they’re nothing to complain about, either. (To be nothing to write home to mom about explained in the English notes below.) If you can get past the crushing darkness, which descends on you at the relatively civilized hour of 5 PM but doesn’t lift until a quarter past 8 in the morning, the weather is relatively mild. Your mileage may vary depending on the strength of your heater, but overall, the winter weather here isn’t really that bad.
One of the beauties of life here is the produce in the markets. The stuff in the supermarkets is as crappy as the produce in the supermarkets in the US, but if you go to your neighborhood market, the situation changes totally. In my neighborhood, the market takes place Sunday and Wednesday mornings. The produce, eggs, meat, and dairy products are pretty local. In the US, the situation is quite different, for very specific reasons. Most food gets shipped long distances, and that has consequences for produce in particular. The plus side of the American supermarket is that you can buy any fruit or vegetable whatsoever 365 days a year. The downside is that to make those fruits and vegetables available year-round, they have to be shipped from distant climes, which means that they cannot be ripe, or they’ll bruise in transit. So: you can have any fruit or vegetable you want, but it will always be unripe and tasteless. That’s pretty much the situation in French supermarkets, too, at least in Paris. (I have no clue what goes on elsewhere. I avoid leaving Paris as much as possible, due to the whole lapins anthropophages issue in the countryside. Don’t say you haven’t been warned.)
The supply chain for Parisian markets is pretty local, which means that you don’t have the constraint against ripe produce–you don’t have to worry about everything getting ruined in transit because it doesn’t get shipped very far. That means that in the summer you can buy pretty much anything, but in the winter it’s mostly apples and potatoes, and you can tell how long they’ve been sitting in someone’s cellar. (I exaggerate here, but just a bit. I did say mostly.) The payback: in the summer, the produce is incredible. If you have never walked by a crate of strawberries that were so ripe you could smell them: it’s amazing. If you have never had a merchant ask you when you were going to eat your produce, and then pick it out for you so that it would last exactly as long as you needed it to before being overripe: it’s quite the service.
So: it’s Sunday, which means my market day, which means my weekly dose of vegetables. (I try to keep my vegetable consumption down to the minimum required for life. Where there are vegetables, there are des lapins anthropophages, and…well, like I said: you’ve been warned.) My marchand préféré (I suggest that you pick yours based on the length of their line–longer lines are better, and they get extra points for higher ratios of old ladies) had some cherry tomatoes, and they were lookin’ good. Price: 2.95 a barquette.
Seulement voilà (the problem is): what’s a barquette? If it’s a container, I’m in good shape. If, on the other hand, it’s a vine with attached little red things, then since there are several of those in one of those containers, we’re talking about more money than I’m willing to pay to run the risk of attracting the unwanted attention of the aforementioned lapins. (Some of you will recognize this as the classic Gavagai problem. In the language of Molière, you can also spell it Gavagaï.)
Solution: ask for just one barquette, and see what the guy hands me. That done, I took my purchases home. Once out of my shopping bag (you must carry a shopping bag–disposable plastic grocery bags are illegal here now), I set my barquette of cherry tomatoes on the table and took a whiff. Boom: right back to my childhood. A warm summer day, tomatoes warm in the sun. You get that kind of “sense memory” in an American grocery store exactly never. You get soft towels here in Paris exactly never, but oh, the produce…
You’ll find notes on the English and French vocabulary used in this post below. For more on the role of cherry tomatoes in Parisian life, check out Olivier Magny’s book Stuff Parisians like, which turns out to be accurate far more often than I ever would have thought it would be.
French notes
la barquette: small basket (of fruit or little vegetables), tub (of ice cream or margarine)
English notes
produce (noun): agricultural products and especially fresh fruits and vegetables as distinguished from grain and other staple crops (from Merriam-Webster). Note: this is a noun, and is pronounced with stress on the first syllable, not on the last syllable (as is the case with the verb). How it was used in the post: One of the beauties of life here is the produce in the markets.
to not be anything to write home to mother about: to not be particularly special (in the American sense of the word special, not the French sense).
“I mean it was fun, but nothing to write home to mom about”
Winter will be past before we know it. I’ll see the chestnuts blooming in the Place Cambronne on my way home from work (on my way to work, I study vocabulary, and don’t notice them), and rejoice in the knowledge that they will survive even the zombie apocalypse. Not far behind will be National Poetry Month. In anticipation of that, and after a long weekend of contemplating what exactly it means to have a thin-skinned assclown, a man who rages in response to tweets and threatens the press when he doesn’t like their reporting, with his fingers on the most powerful nuclear arsenal in the world, I propose a timely bit of Robert Browning. Follow this link if you’d like to hear a pretty good recording thereof. It’s pretty disturbing in and of itself, and all the more so with Trump in the presidency. I gave commands; then all smiles stopped together. There she stands as if alive….Notice Neptune, though…thought a rarity, which Claus of Innsbruck cast in bronze for me! (Rough translation: I had her killed. Hey, look at this great thing that I have!)
My Last Duchess
Robert Browning
That’s my last Duchess painted on the wall,
Looking as if she were alive. I call
That piece a wonder, now; Fra Pandolf’s hands
Worked busily a day, and there she stands.
Will’t please you sit and look at her? I said
“Fra Pandolf” by design, for never read
Strangers like you that pictured countenance,
The depth and passion of its earnest glance,
But to myself they turned (since none puts by
The curtain I have drawn for you, but I)
And seemed as they would ask me, if they durst,
How such a glance came there; so, not the first
Are you to turn and ask thus. Sir, ’twas not
Her husband’s presence only, called that spot
Of joy into the Duchess’ cheek; perhaps
Fra Pandolf chanced to say, “Her mantle laps
Over my lady’s wrist too much,” or “Paint
Must never hope to reproduce the faint
Half-flush that dies along her throat.” Such stuff
Was courtesy, she thought, and cause enough
For calling up that spot of joy. She had
A heart—how shall I say?— too soon made glad,
Too easily impressed; she liked whate’er
She looked on, and her looks went everywhere.
Sir, ’twas all one! My favour at her breast,
The dropping of the daylight in the West,
The bough of cherries some officious fool
Broke in the orchard for her, the white mule
She rode with round the terrace—all and each
Would draw from her alike the approving speech,
Or blush, at least. She thanked men—good! but thanked
Somehow—I know not how—as if she ranked
My gift of a nine-hundred-years-old name
With anybody’s gift. Who’d stoop to blame
This sort of trifling? Even had you skill
In speech—which I have not—to make your will
Quite clear to such an one, and say, “Just this
Or that in you disgusts me; here you miss,
Or there exceed the mark”—and if she let
Herself be lessoned so, nor plainly set
Her wits to yours, forsooth, and made excuse—
E’en then would be some stooping; and I choose
Never to stoop. Oh, sir, she smiled, no doubt,
Whene’er I passed her; but who passed without
Much the same smile? This grew; I gave commands;
Then all smiles stopped together. There she stands
As if alive. Will’t please you rise? We’ll meet
The company below, then. I repeat,
The Count your master’s known munificence
Is ample warrant that no just pretense
Of mine for dowry will be disallowed;
Though his fair daughter’s self, as I avowed
At starting, is my object. Nay, we’ll go
Together down, sir. Notice Neptune, though,
Taming a sea-horse, thought a rarity,
Which Claus of Innsbruck cast in bronze for me!
English notes
assclown: “someone who, wrongly, thinks his actions are clever, funny, or worthwhile.” “someone who seeks an audience’s enjoyment while being slow to understand how it views him.” A specific kind of asshole, defined as “A person counts as an asshole, when and only when, he systematically allows himself to enjoy special advantages in interpersonal relations out of an entrenched sense of entitlement that immunizes him against the complaints of other people.” Sources: John Kelly on the Strong Language blog, and Aaron James, in his book Assholes: a theory of Donald Trump.
Fra: “used as a title equivalent to brother preceding the name of an Italian monk or friar” (Merriam-Webster). My best guess is that it’s used here to suggest that the Duke thinks that the painter was overly familiar (brother) with his wife, and/or that his wife was overly familiar with the painter.
familiar: a word with at least two parts of speech (adjective, of course, but also noun). In the (attempt at an) explanation above, it’s used with this range of meanings, again from Merriam-Webster: a: being free and easy <familiar association of old friends>; b: marked by informality <familiar essay>
I can’t sleep, which leads to tokenization issues and the definition of “for my money.”
I don’t sleep well. That is to say: I don’t sleep very much. Not at night, anyway.
In the best-case scenario, the middle of the night, when in theory I should be sleeping, is my time to study vocabulary or to read. In the worst-case scenario, the middle of the night is when I return emails from people who are in North America, and therefore awake.
Tonight’s email brought a help-wanted ad from the School of Informatics at the University of Edinburgh, posted by the amazing Mirella Lapata. (I say “amazing” because her paper with Regina Barzilay at the Association for Computational Linguistics annual meeting in 2005 opened my eyes to the possibilities for inventive evaluation strategies in computational linguistics in a way that my eyes had not previously been opened.) For my money, the University of Edinburgh’s graduate program in computational linguistics is the best in the world, so I forwarded Mirella’s email to the students in our program, most of whom are not computational linguists, but most of whom would be quite suited for one of the advertised jobs in the School of Informatics. I added the following introduction to the email:
Picture source: me.
This got me the following response from one of my students in the US (and therefore awake):
Picture source: also me.
Now, I love getting this kind of question, for many reasons. It lets me repay the apparently endless patience of my colleagues in France for my crappy command of their language. It lets me be the person who knows the answer to a question about language, which in French happens exactly never. It gives me a socially acceptable excuse for talking about language, which I enjoy way more than is cool. It suggests that someone actually both read and thought about what I wrote. (You pick whichever one you think portrays me in the best light.) In fact, I love that kind of question so much that I will often go out and find naturally-occurring examples, which like any good linguist these days, I do on the Interwebs. A trip to the Sketch Engine web site and a search of the Open American National Corpus found me these:
Picture source: screen shot of the Sketch Engine web site.
…which, of course, like most things of interest, leads to a question. In this case, the question is: what’s wrong with the Sketch Engine web site? Where did all of those spaces come from?
The answer: there’s nothing wrong with the Sketch Engine web site. Part of any analysis of written data is choosing an answer to this question: what is a word? It’s not typically obvious what the answer is. Give students in a beginning language processing class this sentence, and ask them what the words are:
My dog has fleas.
(For reasons that are obscure to me but that I think have something to do with playing the ukulele, that is a famous sentence.) Ask them what the words are, and the first answer will be anything separated by white space:
My
dog
has
fleas.
…at which point they quickly realize that they’ve just posited that fleas. is a word, and they modify their hypothesis, to be anything separated by white space, with punctuation split off into a token of its own:
My
dog
has
fleas
.
(I’m not making this up–in fact, I did it in class last Tuesday.) Next they figure out that they probably want My and my to be considered the same word, which means that they need to do something about the case of letters, and if they speak any of the bazillion languages that have more inflectional morphology (example in a minute) than English does, then they might want to do something with aller/allais/allai/allasse, etc.
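The three successive hypotheses can be sketched in a few lines of Python. This is only a toy illustration, not how any real tokenizer works: the helper name split_punctuation is made up for the occasion, and it assumes that punctuation only ever shows up at the end of a chunk.

```python
import string

sentence = "My dog has fleas."

# Hypothesis 1: a word is anything separated by white space.
whitespace_tokens = sentence.split()


def split_punctuation(text):
    """Hypothesis 2: split on white space, then peel trailing punctuation
    off each chunk so that it becomes a token of its own.
    (Hypothetical helper; it ignores leading and word-internal punctuation.)"""
    tokens = []
    for chunk in text.split():
        trailing = []
        while chunk and chunk[-1] in string.punctuation:
            trailing.insert(0, chunk[-1])
            chunk = chunk[:-1]
        if chunk:
            tokens.append(chunk)
        tokens.extend(trailing)
    return tokens


punct_tokens = split_punctuation(sentence)

# Hypothesis 3: treat My and my as the same word by lowercasing everything.
lowercased_tokens = [token.lower() for token in punct_tokens]

print(whitespace_tokens)   # fleas. is still glued to its period here
print(punct_tokens)
print(lowercased_tokens)
```

Lowercasing everything is the bluntest possible answer to the My/my question, and it does nothing at all for inflectional morphology.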
Things get pretty complicated pretty quickly, though. Suppose that you’re dealing with English. What do you do with
wouldn’t don’t haven’t didn’t
Seems pretty straightforward–you want something like this:
would n’t do n’t have n’t did n’t
…except that it’s not straightforward at all, because then you have to propose
wo n’t
…which people generally aren’t happy about.
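To see where wo n’t comes from, here is a minimal sketch of the naive rule, assuming that the rule is simply “cut the token right before a final n’t.” The function name split_negation is hypothetical, and the regular expression accepts either a straight or a curly apostrophe.

```python
import re


def split_negation(token):
    """Naively cut a token just before a final n't.
    (Hypothetical helper for illustration only.)"""
    match = re.fullmatch(r"(.+)(n['’]t)", token)
    if match:
        return [match.group(1), match.group(2)]
    return [token]


for word in ["wouldn't", "don't", "haven't", "didn't", "won't"]:
    print(word, "->", split_negation(word))
```

The rule has no way of knowing that wo is not a word of English; fixing that would take a list of exceptions, which is exactly the kind of decision that tokenization forces on you.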
The table of contents of “Le mot,” by Maurice Pergnier. The point of the picture is that the first 46 pages of the book address the various arguments for and against the whole idea of the word. Picture source: me.
There are a variety of ways to answer these sorts of questions, and it does actually matter. From a practical point of view, the choices that you make about how you do this–the process is called tokenization–are important enough that they affect the performance of computer programs that do things with language. (Here’s a recent paper on the topic.) From a theoretical point of view, your choice takes a position on a hugely controversial topic in linguistics: what a word is. (The best discussion of the controversy that I’m aware of is in the book Le mot, by Maurice Pergnier.)
So, why are those spaces there in the Sketch Engine output? Let’s look at it again:
Picture source: screen shot of the Sketch Engine web site.
One of the immediately obvious things is that they have “tokenized” the punctuation off, so that “personal growth” becomes “ personal growth ” and (1995) becomes ( 1995 ). The next thing that you might notice is that there is some ambiguity in the output. Look at what happens to that’s and people’s…
…which become
that ‘s and people ‘s
Now we have two ‘s …and they are different, but look the same. What is a computer program to do with that? Welcome to my world. Nobody said that computational linguistics was going to be all about suicide prevention and curing cancer, right?
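The problem is easy to reproduce. The sketch below is not Sketch Engine’s tokenizer, just a hypothetical helper called split_s that cuts a final ’s off of a token. It assumes that the that’s in question is a contraction of that is.

```python
import re


def split_s(token):
    """Cut a final 's (straight or curly apostrophe) off as its own token.
    (Hypothetical helper for illustration only.)"""
    match = re.fullmatch(r"(.+)(['’]s)", token)
    if match:
        return [match.group(1), match.group(2)]
    return [token]


clitic = split_s("that's")        # assumed here to stand for the verb "is"
possessive = split_s("people's")  # marks possession

# Once they are split off, the two tokens are the same string.
same_surface_form = clitic[1] == possessive[1]
print(clitic, possessive, same_surface_form)
```

Anything downstream that looks only at the token itself cannot tell the verb from the possessive marker; it needs the surrounding context.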
The Regina Barzilay and Mirella Lapata paper that I mentioned above:
Regina Barzilay and Mirella Lapata. 2005. Modeling Local Coherence: An Entity-based Approach. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, 141-148. Ann Arbor.
The declaration of competing interests: I don’t have any. Sketch Engine doesn’t pay me–I pay them, and I get a hell of a lot of use out of it.
French notes
— Je vous regardais tout à l’heure, vous étiez marants tous les deux le flicmane et vous.
— A tes yeux, dit la veuve Mouaque.
— A mes yeux? Quoi, “à mes yeux”?
— Marants, dit la veuve Mouaque. A d’autres yeux, pas marants.
How would you say for my money in French, or more generally, label something as someone’s opinion, yours or otherwise? There are a lot of options, and unfortunately, I don’t know the status of any of them with respect to register of language, contexts in which they are or aren’t appropriate, etc. Here’s what I’ve come across so far, and I should point out that I also don’t know which of these can only gracefully be used to introduce your own opinion, versus which could also be used to introduce someone else’s opinion. I’ll also mention (and then I’ll shut up) that of all of these, I’ve heard the first one (à mon avis) the most, the second one exactly once (in Raymond Queneau’s Zazie dans le métro), and the rest never, as far as I know. If any of you native speakers out there can offer suggestions about when and where to use which of these, it would be great.
à mon avis
à mes yeux
selon moi
à mon gré
de mon point de vue
d’après moi
d’après mon point de vue
à mon sens
de l’aveu de qqn: I think that this one implies something negative, along the lines of “as Chomsky himself admits,” as opposed to relaying an opinion about which you’re not necessarily making any judgment one way or the other.
“Wakarimasen” means “I don’t understand” or “I don’t know.” Picture source: https://goo.gl/qImFiq
I was walking down the street in Tokyo this morning when a fellow foreigner acknowledged my existence.
This is a far rarer occurrence than you might think in this country with a very low immigration rate, where running into another “Western” foreigner is pretty uncommon outside of tourist areas, and you might expect that it would lead to at least a smile, if not an actual conversation. I’ve had many occasions when Japanese who spoke some English struck up random chats with me, but I’ve noticed that the few foreigners who you run into in Japan will, in general, resolutely avoid meeting your eyes. (Note that I’m talking about foreigners who live here–not tourists.) Why? I can only guess. OK, my guess: foreigners here in Japan struggle so very hard to integrate themselves into the culture that I suspect that they’re loath to, in some sense, admit that they are “others” by sharing in the otherness of some random visitor such as myself.
So, when a clearly foreign guy caught my eye and smiled at me this morning on my way back from a morning visit to the neighborhood shrine, I was so surprised that I don’t think I smiled back. Then I felt like a total jerk. Maybe being someone who lives here–you don’t come out of the very busy Ochanomizu station at that time of the morning unless you’re going to work, so I’m guessing that he does–he’s used to getting that reaction from other foreigners. Still: I felt like even more of an asshole than I usually do.
French notes
le sanctuaire shinto: Shinto shrine
English notes
to meet someone’s eyes: to look directly into someone’s eyes, acknowledging the contact.
I wonder if you meet my eyes out of kindness sometimes
How it was used in the post: I’ve noticed that the few foreigners who you run into will, in general, resolutely avoid meeting your eyes.
to be loath to: to be deeply unwilling to do something. (Definition adapted from Merriam-Webster.)
to loathe: to dislike to the point of disgust.
Keeping track of the difference between these two is actually quite difficult even for native speakers. You can read an article about the history of the problem here on the Merriam-Webster web site. There are two parts to it. One is keeping straight the fact that the verb ends with an e, and the adjective doesn’t. The other is that the verb is pronounced with the th of this and the, while the th of the adjective can be pronounced with the th of this and the, or with the th of thin.
I will NEVER understand why Elena picked Damon, I loathe her
How this showed up in the post: foreigners here in Japan struggle so very hard to integrate themselves into the culture that I suspect that they’re loath to, in some sense, admit that they are “others” by sharing in the otherness of some random visitor such as myself.
You can unscrew a lightbulb, you can unplug your monitor, and you can unbuckle your suspenders, so why can’t you unsee things? It has to do with the prefix un- when it’s attached to verbs. In order to be able to un- a verb:
The verb has to refer to changing the state of something. So, you can undress yourself (changing your state from being dressed to not), you can unclog a pipe (changing its state from being clogged to not), and you can unlock a door (changing its state from being locked to not).
The state has to be reversible. So, you can dress/undress yourself, you can clog/unclog a pipe, and you can lock/unlock a door. But: you can bake a cake, but can’t unbake it; you can dry a shirt, but as far as I know, you can’t undry it; you can break an egg, but you can’t unbreak it.
So: you can see something, but you can’t unsee it, because when you see something, you’re not changing its state, and that’s the sine qua non of verbs that can take un-.
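If you like your rules spelled out, the two conditions can be written as a toy check in Python. The little lexicon below is hand-built from the examples above, and the names VERBS and can_take_un are made up for the illustration; nothing here is a real linguistic resource.

```python
# Toy lexicon: verb -> (changes_state, reversible).
# The values are hand-coded from the examples in this post.
VERBS = {
    "dress": (True, True),
    "clog": (True, True),
    "lock": (True, True),
    "bake": (True, False),
    "dry": (True, False),
    "break": (True, False),
    # Seeing something doesn't change its state, so reversibility is moot.
    "see": (False, False),
}


def can_take_un(verb):
    """A verb can take un- only if it changes a state AND that change is reversible."""
    changes_state, reversible = VERBS[verb]
    return changes_state and reversible


for verb in VERBS:
    print(f"un{verb}:", "fine" if can_take_un(verb) else "nope")
```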
Ack–data! I almost forgot that I’m an empiricist! In fact, the verb to unsee occurs a lot. It occurs with a frequency of 0.02 occurrences per million words in the enTenTen13 corpus (19.7 billion words of English, available on the Sketch Engine web site). But, it’s cool: it doesn’t mean to undo the seeing of something. When we talk about unseeing things, we’re usually talking about the very fact of not being able to unsee them, and what that actually means is this: we can’t forget them, and/or we can’t move beyond whatever we learned from what we saw.
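For a sense of scale, the relative frequency can be turned back into a rough raw count. This is back-of-the-envelope arithmetic from the two rounded figures quoted above, not a number reported by Sketch Engine, and the variable names are just for illustration.

```python
# Both inputs are the rounded figures quoted in the post.
corpus_size_words = 19.7e9   # enTenTen13: 19.7 billion words
frequency_per_million = 0.02

# occurrences = (occurrences per million words) * (number of millions of words)
estimated_occurrences = frequency_per_million * corpus_size_words / 1_000_000

print(round(estimated_occurrences))
```

Since 0.02 is rounded to a single significant digit, the real count could sit a good way on either side of that estimate: a few hundred occurrences, give or take.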
In fact, the interwebs are full of talk about things that can’t be “unseen.” Some examples:
Old guy at the YMCA was pooping with the door open. C’mon man I can’t unsee that.
Why does unsee work so well for this use, when it can’t have the meaning that you would think it would? I suspect that it’s precisely because (a) it’s basically an impossible verb, and (b) it’s used only to describe an impossible action. And, the fact that the meaning of unsee is not the meaning of see plus the meaning of un- is important here. We’ve talked often about the basic principle of compositionality–the idea that meaning in language comes from something like “adding together” the meanings of different things. Here is a case where the meaning is clearly not compositional–to unsee something, were it possible, would not be what it is if it were compositional. (Were it possible explained below in the English notes.) So: cool, if you think that it’s cool to violate the expectations of linguistics, computer science, and philosophy. (I do think it’s cool, but maybe that’s why I’m single.)
What I can’t unsee: pierres d’attente. I took a guided tour of Haussmannian Paris the other day. What that means: the enormous redesign of Paris in the 3rd quarter of the 19th century, when huge swaths of the city were torn down and rebuilt into the stereotype that you’re thinking of when you visualize Paris today. (See here for a post about the typical Haussmannian streets and how they relate to your ability to survive the zombie apocalypse in Paris, as well as here for a post about the typical Haussmannian apartment buildings and how they, too, relate to your ability to survive the zombie apocalypse in Paris.)
The new Haussmannian buildings went up in the order in which their lots were appropriated, the old buildings torn down, and the new buildings financed. That meant that it was often the case that buildings were put up that one day would have neighbors, but didn’t yet. In anticipation of the need to line up with adjacent buildings–lining up with things was very important in Haussmann’s Paris–the front-facing walls of the buildings had projections that were meant to facilitate alignment with future neighbors. So, pierre d’attente: “waiting stone,” I guess. (I think they can also be called pierres d’accord.)
Pierres d’attente. Picture source: me, on the rue La Fayette, on a rainy day in January 2017.
Pierres d’attente. Picture source: me.
Now, at some point, architects realized that if you have pierres d’attente sticking out of the side of your building, they catch rain, and then it can run into your walls, and that is most definitely not a good thing for your building. So, people started cutting them off, which is why you will see things like this:
Apartment building with pierres d’attente removed. Picture source: me, on the rue La Fayette.
But: not everyone was happy about this. Haussmannian apartment buildings are part of our patrimoine, and pierres d’attente are part of Haussmannian apartment buildings, so those pierres d’attente are part of our patrimoine, and no asshole should be cutting them off, right? Point taken, and cutting off your pierres d’attente is apparently no longer allowed. But, hey, this is France, and we’re logical–so, what you can do is, you can cut them so that there’s a pente, a slope, on the top edge. (I just had to throw the French word in there, on account of the fact that when I memorized it, I thought that I would never, ever get to use it–and there, my friends, is a very concrete example of Zipf’s Law in action.)
The guided tour was great. Seulement voilà (the thing is)…the tour guide explained pierres d’attente to us, and now I can’t stop seeing them. It’s OK–frankly, the more there is to occupy my fevered little brain, the better…
English notes
Anglophone students of French whine about the French subjunctive, and frankly, I’m not sure that Francophone professors are thrilled about teaching it to us, but: the fact is, English has a subjunctive mood, too. Or, more accurately: it can. This varies quite a bit by dialect, but English can have a subjunctive, in at least the following circumstance: talking about things that are not real at the moment. For example, here are some options, with and without the subjunctive:
If I were you, I wouldn’t tell him to fuck off–he’s a lot bigger than you are.
If I was you, I wouldn’t tell him to fuck off–he’s a lot bigger than you are.
You can recognize the subjunctive by the weird agreement of If I were you, rather than If I was you. Both are correct, and most Americans would say If I was you, but If I were you is more natural in my dialect. (I come from a relatively obscure area in the northwest of the country.)
Would you prefer that he give you a pat on the back, or a kick in the ass?
Would you prefer that he gives you a pat on the back, or a kick in the ass?
Again, you can recognize the subjunctive by the weird agreement of he give you versus he gives you.
How the subjunctive was used in the post: Here is a case where the meaning is clearly not compositional–to unsee something, were it possible, would not be what it is if it were compositional. I chose obscenity-laden examples to make clear that this isn’t a formality thing–the subjunctive is just more natural in my dialect. Again, most American speakers of English would use the non-subjunctive forms of these sentences, but both are fine. I have no idea how this works in the United Kingdom–can any of you Brits comment on this?
Who among us has not looked across the majestic sweep of the Place de la Concorde, up the stretch of the Champs Elysées, or through the luxurious Luxembourg Gardens and wondered: what will this place look like when it’s overrun by zombies?
I first published this on November 13, 2015, from Denver, Colorado. Not long afterwards, phone calls and texts started coming in fast and furious: relatives who were hearing about the Islamic State terrorist attacks that would kill 130 people and injure another 368 that evening. The post didn’t seem so funny in that context, and I took it down after an evening of trying to reach family and friends in Paris. 14 months later, Paris has brushed off her shoulders and kept walking, as she always does, and I am ready to play my infinitesimally small part in that.
Who among us has not looked across the majestic sweep of the Place de la Concorde, up the stretch of the Champs Elysées, or through the luxurious Luxembourg Gardens and wondered: what will this place look like when it’s overrun by zombies? Who among us has not looked down an unending line of the 7-story Haussmannian apartment blocks that make Paris look like Paris and thought: it would really suck to have to clear 7-story building after 7-story building–with optional basement–of zombies…
The English Wikipedia page on zombies is quite long, and discusses zombies from every angle that one could think of–folklore, the evolution of the zombie archetype, the zombie in modern fiction, the significance of the zombie apocalypse, and the zombie in popular culture–each with its sections and subsections. In contrast, the French Wikipedia page on zombies is pretty much just this sentence:
Un zombie (ou zombi) est, dans le folklore, un mort-vivant ou un individu infecté d’un virus nuisible à certaines parties du cerveau.
Of course, even with just one sentence, Zipf’s Law brings us some new vocabulary items:
le mort-vivant: living dead.
nuisible: harmful, damaging, injurious; pest.
I have no idea what it means that there is a long English Wikipedia page on zombies and a very short French one. Probably something profound about France and America, but I don’t know what. I do know this: I hate zombies.
About 14 months later, the French Wikipedia page on zombies is considerably longer, and I’ve reached a new level in my thinking about the relationship between zombies and those Haussmannian apartment buildings: they will contain the zombies nicely, so they’re actually going to be a big help in recovering from the zombie apocalypse. However, I’m leaving this post as it was on November 13th, 2015–a fond memory of a more insouciant time.
One December night a couple years ago I came home from a pleasant evening spent wandering the Christmas market on the Champs Elysées with some friends to find a call from my mother. My father’s heart had stopped in the emergency room–twice. By an amazing stroke of luck, his cardiologist had been passing through at the time, and he had resuscitated him. They pumped on his chest, they shocked him–lots of times. Now he was on a ventilator (a machine that breathes for you) in the intensive care unit, and it wasn’t clear whether he would survive the night.
I got on the phone with the airline, threw some clothes in a suitcase, and the next morning I was on the first plane out of Paris. After crossing the Atlantic, and then North America, and then switching planes for the last leg from San Francisco to our home town in Oregon, I finally landed in Portland. I grabbed a rental car and sped to the hospital.
I got to my father’s room. He had survived the night. He was doing well, all things considered. The breathing machine had been removed. The number of IV tubes, monitors, and other beeping and buzzing things that he was attached to was not enormous, given the circumstances. He was still hoarse from the tube that had been in his throat as he greeted me in his own special way:
I see you’re wearing your fat clothes.
(Your “fat clothes” are the clothes that a person whose weight tends to go up and down wears when their weight is up.)
This didn’t feel anywhere near as bad as it must sound. In fact, my reaction was: Okey-dokey–looks like he’s doing fine! And, I actually wasn’t anywhere close to as fat as I usually am, so it seemed like a win, as far as I was concerned. It might not actually be the case that every cloud has a silver lining, but you can at least try to ignore the fucking cloud, right?
I’m guessing that you laughed at my story. Possibly you’re crying, remembering your own parents criticizing your weight, or your choice of clothes, or your choice of boyfriend, career, or political party–if so, I apologize. In either case, my story probably made an impression. Why? Because it is so entirely different from what one would expect.
It’s differences that make things interesting. I say that not as a statement about the value of diversity (although diversity is valuable) or about the value of surrounding oneself with dissenting opinions (although dissenting opinions are valuable), but as an assertion about why we are interested in things, and in particular, about why we read what we read. Presumably you don’t pick up the newspaper in the morning to see what was the same yesterday as the day before–you pick it up to see what was different yesterday from what usually happens. Roger Schank (famous artificial intelligence guy once upon a time, not-quite-so-famous Trump University guy more recently) has a whole theory about this being the reason that time seems to go by faster as we get older–the more we’ve already experienced, the fewer new things there are to notice, and so we just don’t notice time going by the way that we did when we were younger. The excellent book They say/I say is based entirely on the notion that academic writing—you could generalize it to what the authors call persuasive writing, in which you try to convince the reader of your particular take on something—is most convincingly done by starting out showing how the position that you’re going to take runs counter to positions that have been taken previously. Americans think that the French are rude, but they’re actually hyper-polite–you just have to know the differences between American etiquette and French etiquette to recognize it. Paris is always portrayed as the city where everyone strolls leisurely in a state of Zen-like relaxation, but Monday through Friday, we’re all just rushing to work as fast as we can and hoping that nobody gums up the works by throwing themselves on the train tracks. “French people are so nice” is not an interesting topic. “Americans think that French people are rude, but they’re actually very polite”–that’s a bit more interesting. Why? 
Because of the contrast between what you thought was true, and what’s going to be asserted.
Something that everyone knows is true: suicide is the coward’s way out. I went looking for pictures to illustrate this statement with on Google Images, and they are legion. Among the memes, posters, and tweets that I found were sentiments communicated by the following:
Albert Camus, second-youngest winner of the Nobel Prize for Literature
Seneca (not sure whether the Elder or the Younger–ironic, if the latter)
Some sort of fuzzy baby bird
Some entertainer I’ve never heard of
A pretty girl
A handsome guy
A guru
However: despite the fact that “everyone knows” that suicide is for cowards, this turns out to be bullshit. In fact, the distribution of suicide in society (in American society, at any rate) is not random: people with positions in life that we typically think of as requiring extra amounts of courage—police, military people and military veterans, prisoners, murderers–are more likely than the average person to kill themselves. Lemme run a list of occupations with elevated suicide rates by you (lemme and to run something by someone explained below in the English notes):
Sex workers (not sure why this paper gets cited a lot when people write about suicide rates in sex workers, but it does–my literature search kept leading me back to it. Google Scholar shows that it’s been cited 373 times. For comparison: my most heavily-cited article has been cited 275 times.)
Think about this: the instinct to preserve your life is pretty much the strongest instinct that any living thing has. It takes a tremendous amount of courage to go against it. Over and over, when you look at the statistics from the studies that I listed above, what you see is the following: people who are a lot tougher than you and me are at a higher risk for suicide than you are. It’s not a game for cowards–it’s really and truly a game for people who can look death right in the eye, and step forward, against every living being’s strongest instinct.
Thomas Joiner’s book Why people die by suicide explores this issue and its implications in depth. He makes the point that killing oneself requires overcoming what may be the strongest drive in human beings: self-preservation. (You don’t buy it? Pick up a razor and see if you can slit your wrists. No, not your wrists–just a little cut someplace where it won’t do you any damage. Did you do it? I thought not.) His theory is that self-harm is essentially a learned behavior—that you must, in essence, be trained, or train yourself, to have the capacity to kill yourself. You must be fearless, and you must be able to tolerate pain. See that photo of a Japanese army officer about to kill himself? See that diagonal straight line to the left of the photograph? It’s a rifle. After the officer opens himself up with his sword, that guy shoots him in the head to put him out of his misery. (Back in the day, after you cut yourself open, one of your samurai buddies took your head off with a sword.) Yes, I spared you the next picture in the series. If you want to see it, follow the link.
So, why do people kill themselves? I don’t claim to know. I’ll give you a quote from the Harvard University Press notes on Joiner’s book:
Among the many people who have considered, attempted, or died by suicide, he finds three factors that mark those most at risk of death: the feeling of being a burden on loved ones; the sense of isolation; and, chillingly, the learned ability to hurt oneself.
(You might notice that I’m not crediting the sources of any of the “suicide is for cowards” memes that I included earlier in this post–from my point of view, the authors are welcome to shove the lack of a citation up their butts.)
If you know someone who killed themselves, the two questions that you’ve been asking yourself ever since are probably:
Why did they do it?
Could I have prevented it?
With respect to the first question: probably nobody knows but the person themself. With respect to the second question: probably not. We can’t know what could have happened, right? But, I can tell you this: psychiatrists, psychologists, and licensed clinical social workers spend years learning how to prevent suicide, but they definitely cannot always do it. I don’t know how you could expect yourself to do any better than a trained psychiatrist.
You might be able to do something about somebody else’s suicide, though. For psychiatric disorders, language plays a central role in diagnosis. Applying language technology in this domain could potentially have an enormous impact.
Seulement voilà–the thing is–if you want to use computers to do things with language, then you need language data with which to train and evaluate the computer. Until recently, if you wanted to get your hands on actual data, here’s what was available: you could obtain a set of suicide notes collected and annotated by my colleague John Pestian at Cincinnati Children’s Hospital Medical Center (and me and a bunch of other people). That data has been revealing, and we’ve learnt things about suicide from that data that we didn’t know. But, that data was hard to come by. Putting that data set together took years (if you can read French, you can find a paper here on some of the issues), and if you want to get your hands on it, you need to go through some hoops to demonstrate that you have a legitimate research interest, that you will not be posting people’s suicide notes on Facebook or Pinterest, and so on.
Social media has completely changed the landscape of the availability of linguistic data, including linguistic data related to depression and suicide. In fact, the past couple years have seen an explosion of work on the linguistic characteristics of mental states associated with mental illness, including suicidality. But, you can’t just grab it–just because people post their lives on social media doesn’t mean that it’s OK for you to use that stuff for your own purposes. Ethical questions abound, and that’s just as true for the tweets, posts, or whatever of the psychiatrically healthy controls as it is for those with mental illness, suicidal behavior, or whatever. And that’s where you come in.
OurDataHelps.org is a group that collects social media data, particularly linguistic data, for use in doing research like the stuff that I’ve described here with the goal of suicide prevention. They want your data if you have ever flirted with suicide, but they want your data if you haven’t, too–you always need something to compare to, and people like me need data from non-suicidal people to compare to the data from suicidal people. That could be you! Check it out: OurDataHelps.org.
Not that you care about my point of view, but: I support people’s right to kill themselves. As the famous suicidologist Ed Shneidman put it in an interview with my colleague John Pestian: you ask me how many suicides I want? I want zero. But, I support the right to do it.
This is a pretty prevalent attitude amongst suicide researchers. My goal here is to give the person a chance to be shown that they have some options that they might not know they have–but, in truth, my motivation is less to prevent your death than it is to spare the people that you would leave behind the pain of losing you. Ultimately, you have the right to end your life, if you choose to do so. But: you probably won’t do it unless you believe that the lives of your loved ones will be improved by your death. They won’t be, and it’s actually for them that I do the work that I do in this area.
English notes (no French notes today)
lemme: an informal way of writing the informal pronunciation of “let me.” Don’t use this in work- or school-related emails, but it’s totally fine in casual written communication.
to run something by someone: to get someone’s input or permission. I’m going to run my abstracts by Pierre and see what comments he has.
In fact, there are a bazillion expressions with run and a preposition. (You might remember that bazillion is a word that means a large, but unspecified, number.) Off the top of my head:
to run through [a person]: to pierce completely, going in one side and out the other, as with a sword or spear.
to run through [information, instructions]: to discuss or present, typically all of it, but not necessarily in a lot of depth. “Before we get on the boat, let’s run through what to do in case someone falls overboard.”
to run by [location]: to go someplace, but not stay there very long. “I’m going to run by the 7-11 and pick up a lightbulb.”
to run [something] by [someone]: to get someone’s input, or permission, or opinion. “I’m going to run my abstracts by Pierre and see what comments he has.”
to run over [someone/something]: to pass over with a car. “Crap, I ran over a skunk, and now my car stinks to high heaven.”
to run over [information]: to “go over” information quickly. “I’ll just run over my notes quickly, and then I’ll go to the presentation.”
to run up [a bill]: to accumulate charges. “I ran up a phone bill like you wouldn’t believe in Guatemala–insane roaming charges…”
to run down: to locate by searching, with implication that the searching is long or laborious. “I finally ran down the guy who could issue my carte de séjour.” “The police finally ran him down.”
I live in the most boring neighborhood in Paris, but that doesn’t mean there’s nothing going on.
My little street, Christmas Eve or so.
The 15th arrondissement, where I live when I’m in France, is so boring that it typically doesn’t show up in guidebooks for tourists. A friend, Paris born and raised, once said this to me about the 15th: the rest of us don’t even think about it.
And yet: one of the things that makes Paris what it is to me is that anywhere you go, there’s a story. This morning on the way to the metro, I heard music and turned to see a taxi driver waiting at the stand–playing an electric guitar in the driver’s seat. Further down the block from my apartment is a little park. There used to be a château there, but after the revolution of 1789 it got turned into a gunpowder factory, and early one morning, it blew up. There were surprisingly few casualties–about a hundred–but they say that people found bits of clothes and body parts across the Seine in what is now the 16th.
Continue down the street and you get to the Dupleix metro station. It’s on the number 6 line, which follows one of the old city walls, and right outside the exit of the metro station was, for a long time, the place where you got taken to face the firing squad.
Turn left and you’ll soon find the rue du Commerce on your right. The famous British author George Orwell washed dishes there before he became a famous British author–if you are a Parisophile and you haven’t read his book about that time of his life, Down and out in Paris and London, you really should. And, although it would be tough to get further from an haute couture neighborhood than mine, this morning I was treated to the sight of a little old lady coming down the street in a full-length leopard skin coat. Matching high-heeled leopard skin boots. Oh–and matching leopard skin shopping bag.
Indeed, there’s a story everywhere you go in this city, and sometimes that story is personal. The 16th arrondissement (where the body parts landed when the gunpowder factory blew up in the 15th) really is the most boring arrondissement in Paris, but I never mind going there, because it’s where my grandfather lived.
I edited just a bit what my Paris-born-and-raised friend said. What she really said was this: people who live in the 15th love it, but the rest of us don’t even think about it. She’s definitely right about one thing–those of us who live here love it.
English notes (French notes follow)
To show up: to appear. Usually the subject is a human:
Party at my place Saturday night! Show up at 8…means that you should arrive at my house at 8.
Fifty percent of life is just showing up…means something like a lot of what it takes in life is to just try. (A riff on “Eighty percent of success is showing up,” a line usually attributed to Woody Allen.)
…but the subject doesn’t have to be human, by any means:
My dog ran off last night, but thank God, he showed up on the back porch this morning, smelling like a garbage dump and looking pretty pleased with himself.
I was freaked because I lost my wallet, but then it showed up on my desk.
How it was used in the post: The 15th arrondissement, where I live when I’m in France, is so boring that it typically doesn’t show up in guidebooks for tourists.
to be born and raised somewhere: to be completely native to a place, because of having been born there and also having grown up there. You can use it with a normal sentence structure:
I don’t understand how someone can be born and raised In Pennsylvania but hate the steelers.
(The Steelers are the football team of the city of Pittsburgh, in Pennsylvania. In reality, the Eagles are the best football team in Pennsylvania, of course.)
The north trash. I thank God everyday that I was born and raised in the south.
(“GF” is girlfriend. Oakland is a city in California.)
How it was used in the post: I edited just a bit what my Paris-born-and-raised friend said. What she really said was this: people who live in the 15th love it, but the rest of us don’t even think about it. She’s definitely right about one thing–those of us who live here love it.
French notes:
le parigot/la parigote: Parisian. Pejorative. I wear it with pride.
très 16e: “very 16th”–in English, we would probably say “bouge,” or “boozh,” or something.
All of that time you spend on Facebook isn’t wasted if you donate your social media data to suicide research.
Until the 1960s or so, there were basically two ways to do linguistic research.
If you were into historical linguistics and/or dead languages, you looked at ancient texts.
If you were into living languages, you went and camped out on a reservation, in a village, or whatever, and you sat with native speakers and your notebook and you collected data. You transcribed things, and then went home and copied out your notes, and then you thought about them a lot.
In either case, the underlying philosophy was that there was some body of data in your hands, and your task as a linguist was to come up with a description/explanation of what was in that body of data. Seems straightforward enough.
In the 1960s or so, the American linguist Noam Chomsky turned the world of linguistics upside down with the idea that what you should be doing is describing/explaining native speakers’ intuitions about their language. Intuition is a technical term here–it refers not to “what Kevin happens to think is the case about his native language,” but to native speaker judgements about questions like Is the sentence “I saw the man on the hill with a telescope” ambiguous? This changed the conception of what constitutes “data” enormously. On this view of linguistics, there’s no need to go freeze on the Siberian tundra to get your data–you can do it in your living room. Les données, c’est moi !
Today linguists are less likely to talk about binary “yes it is/no it isn’t” questions than they are about gradient judgments–“Sentence X is more acceptable than sentence Y”. For a really good discussion about the issues from a perspective I think you’ll like, see the work of John Sprouse under the general heading of “experimental syntax.”
–Philip Resnik
From a philosophical perspective, this was a radical shift–from empiricism (sometimes a very extreme empiricism, as for example in the case of Leonard Bloomfield (leading American linguist of the first half of the 20th century, author of the first article published in the journal Language, and Yale professor who was refused membership in the Faculty Club because they didn’t let Jews join in those days), who was of the opinion that mental states are not observable, and therefore semantics is not a fit topic for science) to rationalism, and a rather extreme rationalism at that. Not everyone was happy about this, and in fact most older linguists weren’t–from a methodological point of view, it’s hard to see how you could falsify a hypothesis when the evidence that’s being presented is some version of yes it IS ambiguous–but Chomsky took the grad students by storm, and linguistics underwent a radical change. It swept the world of academia in a way that it never had before, too. (Check out Randy Allen Harris’s The linguistics wars for details.)
Meanwhile, Henry Kučera and W. Nelson Francis were thinking about the potential for computerized analysis of language, and in 1967 they published Computational Analysis of Present-Day American English, based on the study of a bit over one million words of American English that they had had typed up by keypunch operators. They were at Brown, and called their data set, which they made available to the public–now you could check someone else’s data–the Brown Corpus. As far as I’m aware, that data itself didn’t lead to any earthshaking discoveries, but it did make people’s ears perk up: it was clear that there were possibilities for doing defensible studies of language when you could search a large body of text with a few keystrokes that just weren’t there if your data were whatever you happened to need to intuit that morning. Or whatever your grad student happened to need to intuit that morning. Or, in a pinch, whatever you happened to need to intuit in the heat of your dissertation defense.
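If you’re wondering what “searching a large body of text with a few keystrokes” looks like in practice, here is a minimal sketch of a keyword-in-context search (a concordance), which is one of the basic things that people do with a corpus. To be clear about what is made up: the function name `concordance`, the `width` parameter, and the three-sentence toy “corpus” are all my inventions for illustration, and this is not how Kučera and Francis did it.

```python
import re


def concordance(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`."""
    lines = []
    # \b keeps us from matching "corpus" inside a longer word such as "corpuscle".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up in a column.
        lines.append(f"{left:>{width}}[{match.group(0)}]{right}")
    return lines


# Toy "corpus", invented for illustration; a real one would be read from files.
toy_corpus = (
    "The corpus was typed by keypunch operators. "
    "Anyone could check the corpus for themselves. "
    "A bigger corpus lets you ask better questions."
)

hits = concordance(toy_corpus, "corpus", width=20)
for line in hits:
    print(line)
```

The point of lining the hits up in a column is that you can eyeball what tends to come before and after a word, and, unlike an intuition, anyone else can run the same search on the same data and check your answer.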
There were some issues with the Brown Corpus, or at any rate, with trying to make similar corpora (the plural of corpus) on your own. One was copyrights. The Brown Corpus was what is called a stratified sample: it deliberately tried to structure its contents. Those contents included fiction, non-fiction, personal correspondence, books, newspaper articles–all sorts of stuff, much of which required getting permission from someone or other. Then there was the matter of those keypunch machines–all 1,014,312 words had to be entered by hand. People continued to pursue the construction of corpora, and cool things came out of that work, both in terms of linguistic theory and in terms of designing computer programs that could do things with language. But, it was slow going–people realized that bigger and bigger corpora would let them do cooler and cooler things, but typing is neither fast, nor inexpensive.
Then a miracle happened: the Internet. All of a sudden random people around the world were vomiting forth massive quantities of linguistic data, and it was mostly copyright-free, and they were typing that shit themselves. Nectar! Now you can get access to billions of words of text in an amazing variety of languages. Is it necessarily clean, pretty, or legible? No. Is it real? Yes, and that’s what matters, at least to linguists. My colleague Graciela Gonzalez at Penn has done amazing things with social media data, ranging from monitoring medications for previously unknown adverse effects to monitoring prescription medication abuse.
Until recently, there was remarkably little data available on the language of suicidal people, or even on the language of people with psychiatric disorders in general. This is surprising, because with so many mental illnesses, the symptoms are, for the most part, expressed via language. As Philip Resnik, Rebecca Resnik, and Margaret Mitchell put it in the introduction to the proceedings of the first Association for Computational Linguistics workshop on computational linguistics and clinical psychology in 2014,
For clinical psychologists, language plays a central role in diagnosis. Indeed, many clinical instruments fundamentally rely on what is, in effect, manual annotation of patient language. Applying language technology in this domain, e.g. in language-based assessment, could potentially have an enormous impact, because many individuals are motivated to underreport psychiatric symptoms (consider active duty soldiers, for example) or lack the self-awareness to report accurately (consider individuals involved in substance abuse who do not recognize their own addiction), and because many people — e.g. those without adequate insurance or in rural areas — cannot even obtain access to a clinician who is qualified to perform a psychological evaluation.
Suppose you’re interested in the language of suicidal people. Until recently, if you wanted to get your hands on actual data, you could get your hands on a set of suicide notes collected and annotated by my colleague John Pestian at Cincinnati Children’s Hospital Medical Center (and me and a bunch of other people). That data has been revealing, and we’ve learnt things about suicide from that data that we didn’t know. But, that data was hard to come by. Putting that data set together took years (if you can read French, you can find a paper here on some of the issues), and if you want to get your hands on it, you need to go through some hoops to demonstrate that you have a legitimate research interest, that you will not be posting people’s suicide notes on Facebook or Pinterest, and so on.
Social media has changed all of that. In fact, the past couple years have seen an explosion of work on the linguistic characteristics of mental states associated with mental illness, including suicidality. Much of it has appeared in the proceedings of CLPsych, the above-mentioned Association for Computational Linguistics workshop. To give you some examples:
Glen Coppersmith and his colleagues at Johns Hopkins worked with tweets from people with post-traumatic stress disorder, seasonal affective disorder, depression, bipolar disorder, and psychiatrically healthy controls. They found that based on the contents of the tweets, they could do a pretty good job of classifying which of those categories the poster belonged to. They tried various methods of representing the contents of the tweets, and found that they got the best results with what are called statistical language models. In a later paper, they looked at six more conditions, and added exploratory analysis on the distributional characteristics of emotionally relevant language.
Margaret Mitchell of Microsoft and her colleagues worked with tweets from schizophrenics and healthy controls, and found some unexpected signals in the language of the schizophrenics. For example, the schizophrenic social media users were statistically more likely to use what linguists call hedging expressions like I think, I believe, or I guess.
In one of my favorite papers of this ilk, Munmun De Choudhury and their colleagues looked at the language of people who moved from a Reddit for people with mental health related diagnoses to a suicide watch Reddit. They found a number of differences in the language of those Reddit users who moved to the suicide watch group and those who didn’t, including differences in what is called accommodation: the ways that we (can) adjust our language to that of the people with whom we’re communicating. (A post on the subject is in the works.)
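For the curious, here is a minimal sketch of the general idea behind classifying text with statistical language models: you build one model per category, and then assign a new text to whichever category’s model finds it most probable. Everything specific here is an assumption on my part, not a description of what any of the papers above actually did: I’m using the simplest possible model (word unigrams with add-one smoothing), I’m assuming that the categories are equally likely a priori, and the names `UnigramModel`, `train`, and `classify` are hypothetical. The toy training texts are invented, and deliberately about weather and food rather than anything clinical.

```python
import math
from collections import Counter


class UnigramModel:
    """Word-level unigram language model with add-one (Laplace) smoothing."""

    def __init__(self, texts, vocabulary):
        self.vocabulary = vocabulary
        self.counts = Counter(w for text in texts for w in text.lower().split())
        self.total = sum(self.counts.values())

    def log_prob(self, text):
        # The "+ 1" in the denominator reserves one slot for any unseen word.
        denominator = self.total + len(self.vocabulary) + 1
        return sum(
            math.log((self.counts[w] + 1) / denominator)
            for w in text.lower().split()
        )


def train(labelled_texts):
    """Build one model per label, all sharing a single vocabulary."""
    vocabulary = {
        w
        for texts in labelled_texts.values()
        for text in texts
        for w in text.lower().split()
    }
    return {
        label: UnigramModel(texts, vocabulary)
        for label, texts in labelled_texts.items()
    }


def classify(models, text):
    """Pick the label whose model gives the text the highest log-probability."""
    return max(models, key=lambda label: models[label].log_prob(text))


# Toy data, invented for illustration only.
toy_data = {
    "weather": ["rain again this morning", "sunny and warm today",
                "cold wind and rain tonight"],
    "food": ["great bread at the bakery", "warm croissant this morning",
             "cheese and bread tonight"],
}

models = train(toy_data)
print(classify(models, "rain and wind"))
print(classify(models, "bread and cheese"))
```

Real systems use far more data and richer models than this, but the mechanics are the same: the words that one group uses more often than another group push the probabilities apart.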
Now, there’s an issue here: just because people post their lives on social media doesn’t mean that it’s OK for you to use that stuff for your own purposes. Ethical questions abound, and that’s just as true for the tweets, posts, or whatever of the psychiatrically healthy controls as it is for those with mental illness, suicidal behavior, or whatever. And that’s where you come in.
OurDataHelps.org is a group that collects social media data, particularly linguistic data, for use in doing research like the stuff that I’ve described here with the goal of suicide prevention. They want your data if you have ever flirted with suicide, but they want your data if you haven’t, too–you always need something to compare to, and people like me need data from non-suicidal people to compare to the data from suicidal people. That could be you! Check it out: OurDataHelps.org.
Work in this space is definitely emotionally taxing. I find myself with a rule similar to John’s “no more than 10 a day” rule — enough to constantly remind me of the importance of this work, without becoming emotionally oppressive. The emotional response to spot-checking the data is qualitatively different and far more visceral than something like sentiment analysis of beer reviews.
–Glen Coppersmith
One day I needed to read through some suicide notes. I set an afternoon aside to do it. I made it through about 150–an hour, maybe–before I read one that was like being punched in the stomach. I went out and bought a pack of cigarettes, and I didn’t even smoke (at the time!). I spent a lot of time over the course of the next couple weeks trying to forget it. I mentioned it to John. All afternoon? Man, you can’t do that–10 a day, max. He was, of course, right. In this kind of work, ethical issues come up with the researchers, too. We now provide free therapy for the people who transcribe data for us on suicide-related projects, our researchers who work directly with the data are required to visit a therapist or a clergyman once a month, and we rotate research assistants off of the project every quarter. Moral of the story: I don’t recommend that you go digging around in this data out of curiosity. But, you can be the data–suicidal or not, why not donate your social media data to OurDataHelps.org and maybe keep someone else from writing one of those notes?
Thanks to John Pestian, Philip Resnik, and Glen Coppersmith for their comments and contributions. French notes follow.
French notes
le suicide: suicide
le/la suicidé(e): person who kills themself
suicider: to drive someone to suicide; to make someone’s death look like suicide
se suicider: to kill yourself
se donner la mort: to kill yourself
la tentative de suicide: suicide attempt
maquiller un meurtre en suicide: to make a murder look like a suicide