May 2016 – Zipf's Law

How to flunk your rotation in informatics: insights from burrowing mammals

Trigger warning: this post contains graphic descriptions of Talpidae-phobic violence. Sorry, no French language stuff here–come back tomorrow (or so) for our usual exploration of the implications of the statistical properties of language for second-language learners.

Woodchuck poo. If you’d like to know how woodchuck poo can be relevant to your career in informatics, read on. Picture source: http://www.harpercollege.edu/ls-hs/bio/dept/guide/gallery/evidence/scat/original/Woodchuck_Scat.jpg.

Here’s some advice on how to flunk your rotation in informatics. I’ve written this with details that are specific to my particular field–natural language processing–but the broader ideas apply to informatics in general, to dissertation-writing in most academic fields that I can think of, and outside of academia, to software development jobs, to grant-writing, or to almost anything with a deadline at which you will be evaluated at some point. Following this advice won’t guarantee that you’ll flunk your rotation, but not following it is an excellent way to improve your chances of passing.

Be afraid to ask questions

This is the biggie. Afraid that people will think you’re stupid if you ask questions? Don’t be–they’ll definitely think that you’re stupid if you don’t, and then don’t figure stuff out some other way. The absolute best students I’ve known were two people who had weekly appointments with me while they were doing their studies, specifically to ask questions. One of them is a rapidly rising star at a government research institute now, and the other is running a bioinformatics program. If you can’t get over your fear of asking questions, your chances of professional success are low. (I don’t mean to imply that I’m any good at answering questions–but, something about the nature of that interchange seems to have made some sort of contribution to their educations.)

Don’t make a schedule

As soon as you figure out what you’re doing for your project, don’t do what we do in the military—if you want to flunk your rotation. What we do in the military: write down a list of every step that has to be accomplished to get from where you are now to where you need to be at the end of the rotation. Are you going to think of everything? No–but, you’re going to think of most things. Don’t obsess about that.

Now put the due date by the last thing in your list of things that have to be done. Work backwards, estimating the time by which you will hit each of the preceding steps.

Now ask a question: is the date by which you would need to have started in order to get done on time already past? If so: go back to your advisor, because you need to modify your project–now. If not: great! So far, you’re on track!

A good way to flunk your rotation is to not have any way to estimate whether or not you’re on schedule to finish on time. If you don’t want to flunk your rotation: make a realistic schedule that lists everything that you have to do, and by when each step needs to be finished. (See the sidebar for one way to do this.) Go back to your timeline frequently, and make sure that you’re on track to finish by the due date. If you’re not on track: figure out what you need to do differently to get back on schedule. If you are on track: great! Part of the beauty of working out your timeline early is that you find out quickly if you’re falling behind, but to my mind, the real beauty of working out your timeline is that if you see that you’re on schedule, you have a license not to be anxious. No point in sweating if you’re on track to finish on time at the moment. Schedules can be anxiety-inducing if you fall behind, but that’s OK–if you’re falling behind, you want to figure that out now, not a month from now. The thing is, schedules can also be reassuring–if you know that you’re not behind, then there is no reason at all to lie awake at night worrying.
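Here is a minimal sketch of the work-backwards step in Python. The step names, the day estimates, and both dates are invented for illustration; plug in your own list.

```python
from datetime import date, timedelta

def schedule_backwards(steps, due):
    """Work backwards from the due date.

    `steps` is an ordered list of (name, estimated_days) pairs.
    Returns a list of (name, start_date, finish_date) in forward order.
    """
    plan = []
    finish = due
    for name, days in reversed(steps):
        start = finish - timedelta(days=days)
        plan.append((name, start, finish))
        finish = start  # the previous step must be done by the time this one starts
    plan.reverse()
    return plan

# Hypothetical steps and estimates for a rotation project.
steps = [
    ("confirm that test data is available", 7),
    ("agree on scoring criteria", 3),
    ("build the system", 21),
    ("run the experiments", 21),
    ("write up and prepare the talk", 14),
]
due = date(2016, 8, 1)       # hypothetical due date
today = date(2016, 5, 23)    # fixed for the example; use date.today() in practice

plan = schedule_backwards(steps, due)
required_start = plan[0][1]

for name, start, finish in plan:
    print(f"{start} -> {finish}  {name}")

if required_start < today:
    print("The required start date is already past: go talk to your advisor now.")
else:
    print("So far, you're on track.")
```

Rerun it whenever an estimate changes. If the required start date slips behind today, that's your cue to renegotiate the project.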

Don’t establish immediately that there’s data available on which to test your system

This is the number-one informatics-specific rookie mistake. (The being-afraid-to-ask-questions thing is an indiscriminate killer of everyone.) Suppose that your rotation project is to build a system that whacks moles. (English note: the verb to whack means “to strike with a smart or resounding blow.” (Source: Merriam-Webster.) It can also mean “to kill,” especially when talking about organized crime.) You’re going to want to demonstrate that it does, in fact, whack moles: if you can’t actually get your hands on any moles, you’re going to be asking the faculty to just take your word for it that this would be a really, really great mole-whacker, and that’s not likely to happen. If you find out two weeks before your rotation ends/your conference submission deadline/your grant submission deadline that there’s no data available with which to test your interesting hypothesis, it’s probably game over–come back next semester/next year/next shift in national scientific priorities and try again. On the other hand, if you realize very quickly that there’s this interesting hypothesis but no existing data with which to test it, and then you propose a way to create the data and an associated evaluation methodology, that’s an excellent approach to doing a rotation/writing a paper/getting a grant. You can use the data to test your hypothesis in the next rotation/paper/grant proposal, and you’ll be the first one to do so (important in academia), ’cause there was never any data around that would have let anyone do the experiment before.

Neil Sarkar, the Founding Director of Brown University’s Brown Center for Biomedical Informatics, makes a related point that is crucial for people doing rotations in biomedical informatics: “One thing to also consider is importance of knowing when an Institutional Review Board protocol must be filed… And not trying to evade the process of getting Institutional Review Board approval…” It’s important to think about this up front, and if you need this kind of institutional approval, you want to ask for it early, because these things can take an amazing amount of time just to prepare the request, and then you have to wait through the approval process, too.

An aside: I’m guessing that all of you non-informatics people out there are thinking that I’m just making things up with this whole issue of mole availability or lack thereof–click here for the search page of Jackson Labs, which exists in large part to connect researchers with mice that have very specific genetic characteristics needed for an incredible variety of experimental investigations. You say that you need some Chinese hamster ovary cells? I ask: what kind? Click here for the CHO-K1 line. 575 euros. They’re super-important in research on therapeutic recombinant proteins. You say you’re a surgeon who does kidney transplants, and you want to do a better job of getting kidneys to survive between when you take them out of the recently-departed and put them into the recipient? You need to understand metabolism at low temperatures. You say you want to understand metabolism at low temperatures? You need to understand hibernation. You say you want to understand hibernation? You need a lab full of arctic ground squirrels. How does a surgeon who does kidney transplants get their hands on a bunch of arctic ground squirrels? Go to the Arctic Circle (during the summer, obviously, ’cause they hibernate in the winter) with a bunch of carrots–see here for an article about how fun this is (warning: graphic picture of an arctic ground squirrel on an anesthesia machine), here for how to figure out where to put your traps, and here for details on things like the trade-offs associated with large traps versus small traps, the relative effectiveness of selective site trapping versus grid trapping, how to use a girth hitch sling to allow a single person to handle an arctic ground squirrel alone, and some stuff about toe amputation that we don’t need to go into. This undoubtedly sounds like a lot of work, and it is. 
It could be worse, though–if what your research requires is woodchucks (useful for the study of a particular kind of liver cancer called hepadnavirus-associated hepatocellular carcinoma), you may have to raise them in the lab yourself. This is a huge big deal if you’ve got a deadline, because they only breed in March and April, and then they’re pregnant for a month, and then they don’t actually have very large litters after all of that. Now, if you’re reading this, you are probably studying some form of informatics, and thinking: this guy’s full of shit–I don’t need no stinking woodchucks. But, keep in mind that long time-lags are common in informatics research. For example, the CRAFT corpus took over three years to build, and PropBank has been growing for well over a decade. Data is precious, and sometimes it’s expensive, and it’s not always there when you need it–unlike Chinese hamster ovary cells, it’s often not possible to just go to a web site and buy what you need. So, if you don’t want to find yourself doing the informatics equivalent of scooping the woodchuck litter boxes while the rest of your classmates are giving triumphant rotation talks, the question of availability of data for testing your system has to be the very first thing that you resolve after you walk out of your new rotation supervisor’s office to go sit in your carrel with a warm feeling in your heart and visions of an endowed professorship at Stanford. Let me repeat the word available–the fact that your medical school has 10 petabytes of electronic health records with all of the data that you need in them does you no good whatsoever if you can’t get access to them.

Don’t establish scoring criteria up front

You want to have a conversation with your rotation supervisor very early in the process about what will constitute success. Suppose that your project is to build a system that whacks moles. What does it mean to have built a system that whacks moles? Does it have to be a successful system, or can it just exist? If it has to be successful: what does “successful” mean? Does it have to kill the moles, or is it OK to just tap them on the head? Maybe it’s actually preferable to just tap them on the head? If you don’t ask, you won’t know. Does it have to whack every mole, or is it OK if it focusses on whacking the moles that smell bad? If it whacks one mole one time, does that satisfy the requirements of the mole-whacking-system-building project, or does it need to continue whacking moles unto eternity, and if so, what are the requirements regarding the ability of the system to continue whacking moles when the zombie apocalypse comes and there is no more electricity? If it misses 1 mole out of 10, would that still constitute mole-whacking? What about if it misses 5 moles out of 10? 
Suppose that what’s really wanted is a system that whacks every mole, every time, exactly on the top of the head, with uniformly fatal results, all the way through the zombie apocalypse until the spirit of cooperation, mutual assistance, and recognition that we are all connected in a web of interdependence restores humanity to its rightful zombie-free position on the planet–but, your system is only catching 50% of the moles and sometimes it punches them in the stomach instead of whacking them on the head, and you don’t really have a good plan for the whole what-happens-when-there’s-no-more-electricity thing, but in the process of building the system, you’ve come across a really novel approach to thinking about mole-whacking that is likely to yield real insight into the nature of moles, the nature of whacking, and how to think about speciesist violence in terms of a general framework with applicability to subterranean mammals as a whole, and possibly also some of the smaller lizards–but, not until a couple months after your project is over and grades are submitted. This might seem persnickety, but I have most definitely seen the situation where the student (or software engineer, or grant writer, or whatever) thought that they were supposed to be whacking moles in the sense of small fossorial mammals, but what their rotation supervisor was looking for was a system that whacks moles in the sense of a spy who has integrated themself into an organization, and those situations most definitely did not end in a way that led to the student feeling happy. (See above for how you can use fear of asking questions about things like this to increase the chances of flunking your rotation.)

A pithier version of the preceding, very long paragraph: the great suicidologist Ed Shneidman used to say that “the most dangerous four-letter word in the English language is only.” (If you’re not a native speaker of English: a “four-letter word” is an idiom meaning a curse word–fuck, shit, piss, etc.) The biggest warning sign of an impending rotation-failure (or comprehensive exam, or missed grant deadline, or whatever) is the word something in your topic. If your description of your topic is I’m going to do something with mole-whacking/semantic role labelling/protein structure prediction, then you still have major gaps in your conception of the project, and you have no idea what will constitute success–or a failing grade, either. Seriously: sounds simplistic, but the presence of the word something is a strong diagnostic.

Spend a lot of time obsessing about minor details early in the process

Have you been tasked with building a mole-whacker? Put a lot of time into thinking about moles with bad breath, moles with nice breath, and moles that would be really cute if only they did something about their taste in Restoration essayists. Are you going to build a system that does deep analysis of subtle differences between different kinds of change-of-state verbs? Spend a lot of time thinking about how you’re going to detect the ends of sentences. (If you’re not a language processing person: getting a computer program to recognize the ends of sentences is a lot harder than you might be thinking. But, it’s not super-crucial to the bigger problem of deep analysis of subtle differences between different kinds of change-of-state verbs.) If there’s one thing that I’ve learnt from spending a lot of time around French people, it’s that minor details are important. But, you need to have the big picture in your mind all the time, and if you have a 10-week rotation and you spend two weeks of that time thinking about how to do a perfect job of finding the ends of sentences, then you have reduced your chances of successfully completing your project quite a bit, unless it’s about improving the ability of computer programs to find the ends of sentences. (If you’re not a language processing person and you think that I’m just making this shit up: click here for a paper on the role of finding the ends of sentences in the task of finding bacteria habitats, or here for a paper on event-related potentials as they relate to prospective and retrospective processes at sentence boundaries, or here for a paper on why you need a support vector machine with a linear kernel (or so the authors claim) to tell the difference between a period at the end of an abbreviation and a period at the end of a sentence in clinical documents (health records).)
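If you’d like to see why finding the ends of sentences is harder than it looks, here is a small Python sketch. The example text and the tiny abbreviation list are invented for illustration. This is not the method of any of the papers mentioned above, just the obvious first attempt and the obvious patch.

```python
import re

# Invented clinical-sounding text with several abbreviation periods.
text = "Dr. Smith saw the pt. at 3 p.m. on Tuesday. The pt. was given 5 mg. of drug X."

# Naive approach: a sentence ends at any period that is followed by whitespace.
naive = re.split(r"(?<=\.)\s+", text)

# Slightly less naive: never end a sentence on a known abbreviation.
# This list is a hypothetical stand-in; a real one would be far longer.
ABBREVIATIONS = {"dr.", "pt.", "p.m.", "mg."}

def split_sentences(text):
    """Split on whitespace, closing a sentence at any period-final token
    that is not in the abbreviation list."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token.endswith(".") and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep any trailing material that never hit a sentence end
        sentences.append(" ".join(current))
    return sentences

print(len(naive), "pieces from the naive splitter")
for sentence in split_sentences(text):
    print(sentence)
```

The naive splitter chops the two sentences into seven pieces. The patched version gets this example right, but it will happily glue two sentences together whenever one really does end in an abbreviation (“...at 3 p.m. The pt. was...”). That kind of ambiguity is how a minor detail can eat two weeks of a rotation.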

Don’t differentiate between aspects of the approach that do and don’t test your hypothesis

By now you might accept that it’s important not to spend a lot of time obsessing about minor details early in the process. But: how do you know what makes something a “minor detail”? Minor details are things that have very little to do with actually testing your hypothesis. Now, you’re thinking: I’ve discussed what counts as success with my rotation supervisor, and we reached the consensus that analyzing subtle details of different kinds of change-of-state verbs means reaching an F-measure of 0.80 on the Semantics Evaluation Conference Official Subtly Different Change-Of-State Verb Test Set. What if I pick the wrong find-the-ends-of-sentences system, and that reduces my performance to 0.79, when it could have been 0.81 if only I’d picked the right find-the-ends-of-sentences system? In that case, I would suggest that you renegotiate what you’re doing with your rotation supervisor. The question with which you would start the conversation: what’s interesting about getting an F-measure of 0.80 versus 0.79? How would that change our knowledge of the world, or software for analyzing subtle differences in the various and sundry kinds of change-of-state verbs, or moles, or whatever? Can we frame the project in terms of a question of some sort that might have broader implications for how one might approach this kind of task in the future, such that my career doesn’t succeed or fail on the basis of whether or not I’m good at finding the ends of sentences?
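For the non-language-processing people: an F-measure is the (weighted) harmonic mean of precision and recall. Here is a minimal Python sketch of the standard definition. The counts are invented, and the official scoring script for any given test set may handle details differently.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    With beta=1.0, precision and recall count equally (the usual F1).
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Invented counts: two systems that differ by one correct answer in a hundred.
print(round(f_measure(80, 20, 20), 2))
print(round(f_measure(79, 21, 21), 2))

# A mole-whacker that hits 5 of 10 moles and never whacks a non-mole:
# perfect precision, recall of one half.
print(round(f_measure(5, 0, 5), 3))
```

With a test set of this (invented) size, the gap between 0.80 and 0.79 is a single answer out of a hundred. That’s worth keeping in mind before you stake the project on it.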

Don’t have a hypothesis

If you would like to flunk your rotation, it’s helpful to not have a hypothesis. If you don’t have a hypothesis, then you’re less likely to know whether or not you’ve tested anything, which means that neither you nor the faculty who will be grading your rotation project will know whether or not you finished your rotation project. That’s not a guaranteed way to flunk your rotation–you’ll leave the faculty in the position of guessing whether or not you finished it, and maybe they’ll guess that you did–but, it’s a pretty good one.

Don’t know why you’re doing your project

On some level, you always know why you’re doing your project–you’re doing it because your advisor thinks that it would be a good idea. But, why? Let’s step back a bit. Suppose that you have a hypothesis in hand. From a practical perspective, you care about knowing why you’re investigating that particular hypothesis out of a universe of possible hypotheses because if you know why you’re investigating that particular hypothesis, you’re more likely to do a good job of investigating it, or so I assert. Some reasons that I assert that: we discussed above the importance of being able to differentiate between things that take up a lot of time but don’t actually test the hypothesis and things that do contribute to testing the hypothesis. In fact, if you know why you’re testing the hypothesis, then you might realize (hopefully early in the process) that your specific hypothesis isn’t actually going to contribute very much to achieving whatever it is that was your rotation advisor’s motivation for suggesting the project in the first place. That’s the practical reason. There’s a more general reason, too: you’re a graduate student. You want to get a graduate degree. In most fields, we give people graduate degrees when they have contributed some significant piece of knowledge to the stock of what we know. You can certainly contribute pieces of knowledge to the stock of what we know without having any kind of broader conceptual framework (say, a theory) for understanding why those pieces of knowledge would be relevant to someone somewhere, but it’s harder to contribute a significant piece of knowledge to what we know without some kind of broader conceptual framework. 
It’s that broader conceptual framework that establishes the context that defines your piece of knowledge as significant or not; your piece of knowledge consists, in some sense, of whether or not your results are consistent with your hypothesis; your hypothesis is more likely to be a useful hypothesis if you know why you’re evaluating it. There has been far more written about what makes a hypothesis a useful hypothesis (or not) than I will ever understand before I retire, but it’s worth your while to check out at least some of it. You can find relevant stuff in epistemology, or in philosophy of science, or in statistics–there’s something for every taste.

The epistemology of flunking rotations: Where I got all of this stuff

Some of this stuff comes from my own experience of flunking things–I left graduate school feeling like I knew a lot more about how to not get a PhD than I did about how to get one. I asked a number of people who teach in graduate programs of computer science, medical informatics, bioinformatics, and linguistics to look at the post, and incorporated their comments. The rest comes from years of watching people flunk rotations, as well as flunk master’s thesis defenses, comprehensive exams, prelims… Also watching people miss deadlines for conference submissions, grant submissions, software releases–and I’ve missed more than one of those myself. Learn from my mistakes–it’s a hell of a lot less painful than learning from your own!

The picture at the top of this post shows a hibernating arctic ground squirrel in the gloved hands of a researcher. It comes from https://www.independent.co.uk/news/science/arctic-squirrel-hibernation-recycle-nitrogen-b1767464.html.

Paris’s begging ecosystem

There are entire genres of begging in Paris, some unique to this city.

Picture source: https://mcfarlandcampbell.co.uk/tag/toblerone/

One evening I was on the RER (a regional train) on the way home from work when a woman of indeterminate age got on. She was eating a Toblerone. Excuse me, ladies and gentlemen, she said loudly. (If it’s in italics, it happened in French.) Could you give me some change, perhaps a euro? She pulled out another Toblerone and examined it closely, turning it from side to side. Sometimes I lure a man into a parking lot, and I bite him. She put it slowly into her mouth. Sometimes in Cameroon, I would eat a man. Another Toblerone, which she chewed on meditatively.

By this point, I was seriously questioning my ability to understand spoken French. I looked at my French coworker who happened to be sharing the train with me. Did she just say… Yep, he answered. Parisians most definitely do not speak to strangers on trains, but this time a young woman sitting next to him joined in: “She says she eats men.” (It’s pretty easy to tell that I’m not French, and she spoke English.) The lady examined another Toblerone before putting it in her mouth. I’m hungry. If you have some money, some spare change…

This was a very strange little speech to hear, and the whole box-of-Toblerone thing added a certain hallucinatory element to the experience. But, in a Parisian context, it made a certain amount of sense. Visitors to Paris usually notice pretty quickly that there are a lot of beggars here. We talked in a previous post about why there are so many beggars here, and there are perfectly good reasons for it. Although there are a lot of folks who are out there asking for money in this town, they actually fall into a finite number of classes, at least one of which is specific to Paris, and the cannibalistic Toblerone eater was an instance of one of them. Here in France we love to classify things, so let’s run through the categories. Beyond the intrinsic interest of the facts that there are categories at all and the nature of the categories themselves, it’s interesting to think about how the various and sundry categories manage to live together in an ecosystem of sorts–different kinds of beggars fill different niches in the city.

Métro: You will occasionally see someone–usually a man–get onto a métro car or a regional train and ask for money. There’s a set ritual for this. Basically, the guy makes a speech. It tends to follow a specific pattern.

Apology: Ladies and gentlemen, I’m sorry to disturb you during your trip.
Statement of problems to be solved: I am homeless/jobless/I have four children and a sick wife and need a hotel room/money for food/diapers.
Request: If you have some spare coins/restaurant tickets/a euro or two…

…and then they walk through the car with a paper cup or with their hand out. These guys don’t necessarily make much in a single car, but they typically do make something–more if they’re old, less if they’re young and look like they could be working for a living like the rest of us. Then it’s off of that car and on to the next one. In the light of the existence of this genre of begging, the Toblerone lady makes a certain amount of sense, and you have to give her credit for originality (or for insanity–I’m actually betting on the latter).

Roma woman begging on the Champs Elysées. Picture source: http://flickrhivemind.net/Tags/beggar,paris/Interesting.

Eastern European Roma women on the Champs Elysées: There’s a genre of begging which until recently I’d only ever seen in Eastern European countries. The way it works is that the beggar kneels on the bare sidewalk with his head on the concrete and his cupped hands held out to receive alms. It looks really, really painful. For the past couple years, I’ve seen Roma women doing this on the Champs Elysées. Only Roma women so far, and only on the Champs Elysées so far. Why them, and why there? I have no idea. Clearly, they’re Eastern European, but there are lots of Eastern Europeans in Paris, and I’ve yet to see any others begging like this. Occasionally the police will come by and roust them. They pick up their water bottles (this is, after all, 2016) and move on, then return later.

Disabled: One day this past winter I was on the metro on the way to work. I was bundled up like everyone else in Paris, as it was cold–hat, leather jacket, neck warmer (I still haven’t been here long enough to wear a scarf), gloves. Into the car climbed a guy in short-shorts. His legs were these skinny, twisted things–maybe as big around as my forearm, and oddly bent. He didn’t say a word to anyone–just struggled down the aisle with his hand out. For a year or so, there was a guy sitting on the ground outside my metro station all day–no feet. There’s a kid (I say “kid”–I would guess that he’s in his twenties) who has a spot outside the grocery store. He sits there, silent, his head hanging, with a paper cup in front of him. I’m pretty sure that he’s schizophrenic.

With kids: An Eastern European friend taught me that there’s a special place in hell for people who abuse their kids by using them for begging when they should be in school. As far as I can tell, it’s mostly a Roma thing in Paris. You park your family on the sidewalk under a blanket, children prominently displayed, and hold your hand out to passersby. You occasionally also see Roma women with a baby panhandling–be especially careful, as some of them do a trick in which they only appear to be holding the baby with both arms: it’s actually supported by a sling, which leaves one hand free and out of sight. That’s the hand that picks your pocket. (Let me point out that the vast majority of these ladies are just begging–but, the pocket-picking thing does happen, too.)

Parisian beggar with dogs. Picture source: http://www.newsner.com/en/2015/11/12-dogs-that-love-their-owners-no-matter-how-little-money-they-have/.

With animals to pet: You’ll see a lot of people with an animal or two on their lap. Drop some money in their cup and give doggie/kittie/bunny a scratch, if you feel like it. Most weeks petting beggars’ dogs and cats is my only physical contact with another living being, so a lot of my change goes into these folks’ cups. One of my favorite guys is usually in the Latin Quarter on weekend nights. He has these two little spaniel mixes, and it’s clear that he adores them and they adore him. The last time I saw him, I leaned over to drop a coin in his cup and pet the dogs. It’s Orthodox Easter tomorrow, you know, he said. (If it’s in italics, it happened in French.) Really?, I asked. Yeah, Easter–Orthodox Easter. Cabbage, I said. Have a good night. (My French continues to suck.) I still haven’t figured out why we had that particular conversation, other than the possibility that the next day might actually have been Orthodox Easter. Lately I’ve been noticing shiftless young people with ill-kempt animals trying to do the pet-my-animal thing. Their animals look like shit–not loved or cared for at all. You can tell the difference, I think. Note: be sure that the animal is there to be petted before you try to pet it! This sounds obvious, and I guess that it would be to any non-stupid person. However: I bent over to pet a kid’s pit-bull-looking dog one day without checking him out first, and he snapped at me. I had no clue whatsoever that I was capable of jumping that far that fast–backwards, no less. Obviously, if this dog had felt like ripping my arm off, he could have–he just gave me a little warning. Learn from my stupidity.

Finally, there are plenty of run-of-the-mill beggars. If they’re young, people mostly walk right by them, because there are plenty of frail old run-of-the-mill beggars that probably need your money even more.

Now, I’m not talking here about people who hustle–“hustle” in the good sense, or “hustle” in the bad sense. With the exception of the people with animals, the people that I’m describing here are straight-up beggars. Street musicians, mimes, comedians, dancers–that’s a whole nother genre. Pick-pockets, 3-card monte, the ring scam, the bracelet scam–that’s yet another genre, and they each have their niches in the hustling ecosystem of Paris.

English notes

Short-shorts: very, very short pants. Line from an advertisement for Nair, a leg-hair remover: Who wears short-shorts? Nair wears short-shorts. How it was used in the post: One day this past winter I was on the metro on the way to work. I was bundled up like everyone else in Paris, as it was cold–hat, leather jacket, neck warmer (I still haven’t been here long enough to wear a scarf), gloves. Into the car climbed a guy in short-shorts.

bunny: an informal/children’s word for rabbit. On my first visit to Belgium, I knew just barely enough French to order a meal in a restaurant. Seeing a meat on the menu whose name I didn’t recognize, and being an adventurous eater, I ordered it. It being pre-Internet, I had to ask a coworker the next day what I had had for dinner. His response (in English): You ‘ave eaten, ‘ow you say… Bugs Bunny. How it was used in the post: You’ll see a lot of people with an animal or two on their lap. Drop some money in their cup and give doggie/kittie/bunny a scratch, if you feel like it.

French notes

Cameroun: Cameroon. Pronunciation: the e is silent, so [kamrun].

Roma: there are many ways to say “gypsy” in French. In part, I know this because my favorite neighborhood bum gave me a lecture on the topic one day, with statistics. I have very little clue as to the current social acceptability of any of them; as far as I know, Roma or Rom is OK (just as it is in the US, where the word gypsy is definitely not OK in all circles), but I’m pretty sure that all of the others have varying levels of pejorativeness. How it was used in the post: For the past couple years, I’ve seen Roma women doing this on the Champs Elysées. Only Roma women so far, and only on the Champs Elysées so far.

Dictionary porn

The only things naked in this post are my foot, and a cat.

A surprise for you: linguists hate dictionaries. There are attitudinal reasons for this: one gets tired of undergraduates going on about how they must surely be The Official Source For What Words Really Mean. There are technical reasons for this: there’s an enormous amount of relevant information about words that dictionaries very rarely include–collocations (words that occur together more often than would be expected by chance–strong wind but heavy rain and stuff like that), argument structure (what kinds of things must occur with a word, e.g. to drink is transitive, except when it’s intransitive, in which case it means to drink alcohol specifically), crucial stuff like that.
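One common way to put a number on “more often than would be expected by chance” is pointwise mutual information (PMI). Here is a minimal Python sketch on an invented toy corpus. Real collocation work uses far more text, and usually a more careful statistic.

```python
import math
from collections import Counter

def pmi(pair, tokens):
    """Pointwise mutual information (in bits) of an adjacent word pair.

    Positive: the pair occurs more often than chance would predict.
    Negative infinity: the pair never occurs in this corpus.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    p_pair = bigrams[pair] / (len(tokens) - 1)
    if p_pair == 0:
        return float("-inf")
    p_first = unigrams[pair[0]] / len(tokens)
    p_second = unigrams[pair[1]] / len(tokens)
    return math.log2(p_pair / (p_first * p_second))

# Invented toy corpus, just big enough to show the contrast.
tokens = (
    "strong wind and heavy rain then strong wind again "
    "heavy rain and strong coffee with heavy traffic"
).split()

for pair in [("strong", "wind"), ("heavy", "rain"), ("heavy", "wind")]:
    print(pair, pmi(pair, tokens))
```

In this toy corpus, strong wind and heavy rain come out positive, while heavy wind never occurs at all. That’s the kind of fact about words that you will rarely find in a dictionary entry.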

Despite the fact that we’re not crazy about dictionaries, I would guess that most linguists probably deal with their distaste for them the same way that I do: I have a lot of them. How many, I couldn’t really tell you. In fact, I can’t even tell you how many English dictionaries I have. Do I count the dictionary of lumberjack language? How about my medical dictionaries (I have two)? My biology dictionary? My woefully-out-of-date dictionary of linguistics?

More information on dictionaries:

Descriptive and prescriptive dictionaries; strategies for ordering word senses (meanings)

Sexism and dictionaries

Choosing a bilingual dictionary

A good monolingual French dictionary app; observations about general deficiencies of dictionaries

Which dictionary do I use? Probably not a shocker to anyone who knows me: I have many monolingual English dictionaries lying around my place, and there are some electronic ones that I use, as well. Here are some of them, and when/why I use them:

This is my Macmillan Visual Dictionary. As you might guess, it’s been in my life for a while; I find it humorous that despite being a visual dictionary, it has no picture on the cover anymore, since it has no cover… Visual dictionaries are super-useful for some things. I used this one to do fieldwork. Since visual dictionaries group things thematically, they’re great for taking a structured approach to learning vocabulary in a foreign language. One of the more obscure recent additions to my dictionary collection is a bilingual French/Chinese visual dictionary–why not…

This is my beloved Webster’s 3rd–picture of my foot included for scale. When I was a young man, my father told me that if I ever saw one used, I must buy it. As it turned out, this was my college graduation present to myself. Based on the writing inside the front cover, I have reason to believe that it began its life as the property of the United States Navy: scrawled in heavy black marker are the words “Oil shack.” On a naval vessel, the *oil shack* is the control center for routing fuel to the boiler rooms and for monitoring its purity, or at least that was the case back when US naval vessels still had boiler rooms.

This is my beloved American Heritage College Dictionary. (“College” dictionaries are usually what are called “desk dictionaries”–as far as I know, it’s mainly a description of size.) Picture of a cat included for scale. Some things that I like about the American Heritage College, which I was introduced to by my second linguistics professor: for usage questions, they have a panel, and they give the statistics on the panel’s votes; in the back, there’s a dictionary of Indo-European roots; there are just enough pictures to be helpful without interrupting the flow of the whole thing. (Yes, dictionaries can flow–or not.)

Webster’s Seventh New Collegiate Dictionary. This one has a special purpose. It was published in the early 1960s, and it’s my go-to dictionary for American literature from the first half of the 20th century. You can find a review of it here. (*Of course* people review dictionaries!)

My beloved compact Oxford English Dictionary. Books have been written about this one. Books have been written about its first editor. You might like Simon Winchester’s The professor and the madman: A tale of murder, madness, and the making of the Oxford English Dictionary. Somebody clearly used mine as a resting place for paint cans.

Zsa Zsa Gabor in 1959. Picture source: Rogers and Cowan talent agency. Downloaded from Wikimedia.

This being the 21st century, there are also some very good online monolingual English dictionaries, as well as a couple dictionary apps that I like a lot. For the moment, I’ll just leave you with this Zsa Zsa Gabor quote:

The only way to learn a language properly, in fact, is to marry a man of that nationality. You get what they call in Europe a ‘sleeping dictionary.’ Of course, I have only been married five times, and I speak seven languages. I’m still trying to remember where I picked up the other two. Source: https://www.brainyquote.com/quotes/keywords/dictionary.html

Losing face: what cows, dogs, and Neanderthals can tell you about why you have wisdom teeth

The story of wisdom teeth is as interesting as wisdom teeth are unpleasant.

One of the characteristics of the modern human skull is that the face is located primarily under the eyes. What the hell does that mean? For comparison, let’s look at some not-terribly-exotic animals. We’ll start with a nice side view of a cow.

Side view of a cow head. Picture source: http://www.dreamstime.com/stock-photo-profile-cow-head-image2728710.

Check out the cow’s muzzle. Is there any sense in which you could say that the cow’s face is under its eyes? No–the muzzle protrudes out frontally quite a bit.

In fact, by definition, a muzzle (or snout) protrudes. From the Wikipedia article on the subject: “A snout is the protruding portion of an animal’s face, consisting of its nose, mouth, and jaw. In many animals the equivalent structure is called a muzzle, rostrum or proboscis.”

There’s quite a bit of variety in muzzle (snout) shapes in the animal kingdom. Here are some possibilities in dogs. Mouse-over the pictures for technical terms that describe these different skull shapes.

If you have a long, thin snout, you’re dolichocephalic. Picture source: https://upload.wikimedia.org/wikipedia/commons/thumb/e/ed/Lamtara_Golden_Spritzer.jpg/180px-Lamtara_Golden_Spritzer.jpg.

If you have a medium-length snout, you’re mesocephalic. Picture source: https://upload.wikimedia.org/wikipedia/commons/thumb/d/d5/Cocker_spaniel_angielski_zloty_photoshop.jpg/180px-Cocker_spaniel_angielski_zloty_photoshop.jpg.

If you have a very short, flat snout, you’re brachycephalic. Picture source: https://upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Pug_600.jpg/180px-Pug_600.jpg.

If we look at various and sundry apes, we see that they have protruding muzzles (or snouts), as well. (Scroll down past the pictures.) Compare the human, the chimp, the orangutan, and the macaque, and you’ll note that the three non-humans have protruding muzzles. The human: no. (BTW: the macaque is a monkey, not an ape.)

Human, chimp, orangutan, and macaque skulls. Unlike the other three, the macaque is a monkey, not an ape. Picture source: https://upload.wikimedia.org/wikipedia/commons/d/db/Primate_skull_series_with_legend_cropped.png.

Skulls of four canid species: a fox, a raccoon dog, a dhole, and a jackal. Picture source: https://upload.wikimedia.org/wikipedia/commons/thumb/1/18/MSU_V2P1a_-_Vulpes%2C_Nyctereutes%2C_Cuon_%26_Canis_skulls.png/220px-MSU_V2P1a_-_Vulpes%2C_Nyctereutes%2C_Cuon_%26_Canis_skulls.png.

We can see how this anatomy relates to the rest of the skull if we look at the skull from the underside. Let’s go back to dogs–or dog-like things, at any rate. Here are four different canid species. Look at the second row from the top–that’s the underside of the skull. The narrow thing sticking out towards the front of the skull is the palate, or roof of the mouth. That’s the bone of the muzzle.

Where this becomes relevant to humans is that over the course of human evolution, we’ve gone from having protruding snouts to not having them. It’s hard to find a single picture that illustrates the progression, so I’ll run some individual ones by you. Here’s an Australopithecus africanus. Australopithecus was around from about 4 million years ago to about 2 million years ago. They’re probably ancestral to us–if not, we share a common ancestor. Note the prominent protrusion.

Australopithecus africanus. Picture source: https://whatmissinglink.files.wordpress.com/2014/05/australopithecus-africanus-sts5-together.jpg.

Homo erectus and modern human

Here’s a nice side-by-side of a Homo erectus skull and a modern human skull.

Homo erectus was around from about 1.9 million years ago until about 70,000 years ago. It’s probably an ancestral species to modern humans. The frontal protrusion is nothing like the australopithecine one, but it’s still there. (Keep scrolling down–alignment problems…)

Side-by-side modern human skull and Neanderthal skull from the Cleveland Museum of Natural History. Picture source: https://commons.wikimedia.org/wiki/File:Sapiens_neanderthal_comparison.jpg.

Neanderthals were around from maybe 250,000 years ago until about 40,000 years ago. I’m not clear on the arguments as to whether or not they’re ancestral to modern humans, but we probably interbred with them. Not much protrusion left, at this point.

So, how does this relate to the question of why you have wisdom teeth? The thought is that as the muzzle of early hominids shortened down to what we (don’t) have today, it resulted in a crowding of the teeth into a smaller anterior-posterior (front to back) area.

Do we get anything out of all of this change in oral anatomy? Actually, modern humans do have a fairly unique oral cavity morphology (shape, in this sense of the word morphology). One of the results of that morphology is a lot more space in which to make different kinds of sounds, and those possibilities do indeed get exploited in the languages of the world. More on that another time. Until then, here’s some relevant French vocabulary.

le museau: muzzle, snout
le groin: pig snout
le boutoir: wild boar snout
la dent de sagesse: wisdom tooth

By the way: if you’re interested in this kind of thing, it’s worth checking out both the English-language Wikipedia page on wisdom teeth and the French-language Wikipedia page on wisdom teeth. Each one has interesting content that the other one doesn’t.

Who has a sagittal crest?

Before you hit your dog, remember that he can bite your hand hard enough to break it–but, he chooses not to.


what if i never find out whos a good boy — Picture source: https://twitter.com/m_pendar.

In America, we do love our dogs. A culturally common way for us to show our dogs affection is this: we pet them, while saying Who’s a good boy? (or Who’s a good girl?, depending on gender). In my family, we do it a little differently: we pet the dog while saying Who’s got a sagittal crest? Dogs don’t look at you with any more or less puzzlement regardless of which one you pick, so: feel free to go crazy with this one.

Badger skull. The arrow is pointing at the sagittal crest. Picture source: http://www.jakes-bones.com/2010/09/my-new-badger-skull.html.

What’s a sagittal crest? The next time you run into a dog, run your hand along the center of the top of his skull. That ridge that you feel is his sagittal crest. Sagittal means along a plane that runs from the front to the back of the body. A sagittal crest runs along that plane. This sense of crest means something sticking out of the top of the head–think the plume on top of a knight’s helmet. Many animals have a sagittal crest, but not us modern humans. You see them in species that have really strong jaw muscles. A sagittal crest serves as one of the points of attachment of the temporalis muscle, which is one of the main muscles used for chewing. If you have a sagittal crest, you can have a bigger temporalis muscle, which means that you can bite/chew harder.

gorilla skull — Gorilla skull. Picture source: http://alfa-img.com/show/new-gorilla-skull.html.

If you look at relatively close relatives to humans, you see sagittal crests on some of them. To the left, you see a gorilla. You wouldn’t want to get bitten by this guy. (Note that some gorilla species, especially their males, have really enormous sagittal crests–this is actually a pretty modest one, for a gorilla.)

pan troglodytes skull — Excellent replica of a Pan troglodytes (common chimpanzee) skull. Picture source: http://www.connecticutvalleybiological.com/product-full/product/chimpanzee-skull-pan-troglodytes.html.

Here’s (an excellent replica of) a Pan troglodytes (common chimpanzee) skull. This guy (I think it was a guy) had more of a sagittal crest than you (you don’t have any), but he didn’t have much, compared to that gorilla. Other chimps vary. Monkey species vary pretty widely regarding the presence or absence of a sagittal crest.

An Australopithecus robustus species. This specimen is known as “The Black Skull.” Picture source: https://commons.wikimedia.org/wiki/File:Paranthropus_aethiopicus.JPG.

Some hominids that were ancestral to us had sagittal crests, but they disappeared pretty early in the course of our evolution. Here is a picture of the “Black Skull,” about 2.5 million years old. It’s from a type of Australopithecus robustus. By the time Homo erectus comes along (starting about 1.9 million years ago and lasting until about 70,000 years ago), the sagittal crest is gone. Picture below.

So: feel free to express your affection for your dog any way you want–you can’t possibly be any geekier than my son and me. Scroll down past the picture for French vocabulary.

Homo habilis skull, dated at 1.9 million years ago. Picture source: https://commons.wikimedia.org/wiki/File:Homo_habilis-KNM_ER_1813.jpg.

Relevant French vocabulary (see the Comments section for more):

la crête sagittale: sagittal crest
le muscle masticatoire: chewing muscle (note: the “c” in muscle is pronounced in French)
le muscle temporal: temporalis muscle
la morsure (action de mordre): bite (noun)
la morsure (marque de dents): teeth marks

How we’re sounding stupid today: staggering, test tubes, and French health care

There’s probably a finite number of ways to BE stupid, but there seem to be an infinite number of ways to SOUND stupid, at least in French.

The passé simple (my current favorite tense) of the verb *tituber,* to stagger/stumble/reel. Picture source: screen shot of http://en.bab.la/conjugation/french/tituber.

I sat in a lab yesterday waiting to have blood drawn for some routine tests. If it’s in italics, it happened in French:

Lab tech: I’m going to take two you are staggering.

Me: (puzzled, miming staggering by walking my fingers randomly across the desktop) “To stagger” means to walk like this, right?

Lab tech looks at me for a minute, then laughs: I’m going to take two LITTLE TUBES.

Titubes is “you are staggering/stumbling/reeling.” Petits tubes is “little tubes.” Spoken casually, it comes out as p’ti tubes, and if you don’t hear the p, that sounds just like titubes.

There’s a lot we could say about the linguistic phenomena behind this, but at the moment, I’m feeling more impressed by the experience of interacting with the French medical system. The health care system here is one of the best in the world–there’s nothing you can get in America that you can’t get here. One of my foster brothers is a surgeon with a fascinating sub-specialty. He was sent here for a week during his training, because the surgeons in France were doing techniques that hadn’t made it to the US yet. (I find it ironic that Pasteur (the most important microbiologist of the 19th century) was French, and now America forbids French cheeses made with unpasteurized milk if they’re less than 60 days old. It’s going to go to 75 days soon, which will wreak havoc with the tiny bit of an artisanal cheese movement that we have in the US.)

Health care is universal here–it was declared a human right in 1948. In addition to being great, the health care here is not expensive. These routine blood tests cost me $200 in the US every time that I have them done; here, along with a visit with a friendly young doctor who giggled adorably at my crappy French, they cost me exactly nothing. You gotta laugh at those Trump-voting Americans who sneer at socialized medicine, and then want a socialized snow plow to clear their street before work in the morning…

le système de santé français: the French healthcare system
l’assurance maladie: health insurance
une analyse de sang: blood test
passer une radiographie: to have an x-ray
faire une radiographie: to take an x-ray
une radio: x-ray (slang).

It’s the little things that get you: how to say “yes” in French

ta-nehisi coates french composition — Ta-Nehisi Coates tries to write in French. The red writing at the top of the page says “30+ errors.” What you have to realize is that in English, Ta-Nehisi Coates is one of the most articulate people you’ll ever read–or hear. Picture source: http://www.theatlantic.com/education/archive/2014/08/acting-french/375743/.

It’s the little things that get you. It amazes and frustrates me that I can spend an evening sitting at home, happily reading a novel that uses the passé simple and the imparfait du subjonctif (two tenses that are used in literary French, but almost never in speaking–we aren’t even taught them in school). But, then I’ll go to work the next day, and someone will say “good morning” to me in a way that I haven’t heard before, and I just stand there like a blithering idiot.

The Lawless French web site just posted an article that shows just how difficult the “easy” things can be. It describes a wide variety of ways to say yes in French. You certainly don’t have to use all of them yourself, but you most definitely do need to understand them. And, as far as I can tell, it’s even more difficult than you would think from the wide range of yes-meaning expressions. For example, I’m told that ben, oui (“well, yes”) can have different meanings, depending on the intonational pattern. Say it one way, and it expresses uncertainty in your yes answer; say it another way, and it expresses confidence in your yes answer.

One of the ways of saying yes that the Lawless French web site talks about is, I suspect, the source of one of the most common mistakes that we Americans make in France. A thousand years ago when I was in college, I took a course on linguistic field methods–how to deal with a situation where you run into a language about which you have no information whatsoever. We did Hungarian for ours. We were all amazed when it turned out that Hungarian had two separate, non-interchangeable words, both of which meant yes, but which were used completely differently:

igen is what you might think of as the “usual” yes.
de is yes, but only when you’re contradicting something that someone has said. You don’t want any ice cream, do you? De. (Yes, I do.)

Although we were all fascinated by this, it’s not that unusual of a phenomenon. French also has a “usual” yes: oui. And, it also has a different yes that you use when you’re contradicting a previous assertion: si. Me, to my delightful office mate Brigitte (if it’s in italics, it happened in French): I can’t SSH into the server. Brigitte: Si–if you can connect to the internal network, you can SSH into the server. Si instead of oui because she’s contradicting what I said–I said I can’t, and her si means something like yes, you can.

Back to the classic American mistake: in America, if we have any knowledge of a second language at all, it’s most likely to be Spanish. Spanish has one word for yes, and it’s sí. Remember the “foreign language buffer” that I swore I would not tell you about? Put an American in a situation where they can’t communicate in English and the language that’s most likely to come out is Spanish, regardless of whether or not that’s the language that’s actually being spoken around them. So: ask an American in France a simple question, and if the answer is yes, they’re quite likely to say si, even if on some level they know that the French word is oui. I have made this mistake a thousand times, myself–I’m not any more immune to it than the next American.

So: check out the Lawless French web site for more ways to say “yes” in French than you ever could have imagined, and here’s hoping that you don’t sound as stupid as I do today.

Modern humans, the forehead, and how to kill a vampire

Anatomy of the human skull, frontal view. Picture source: http://www.acsneuro.com/patient_resources/anatomy.

I studied the head and neck with a Romanian anatomist. He had a delightful accent when he spoke English–think Andrei Codrescu. We spent a lot of time talking about the skull. There’s a lot to say about the skull–the 22 bones that make it up, the multitude of foramina through which blood vessels and nerves enter and exit it, the evolution of the middle ear from the characteristic multi-part jaw bones that you can still see in lizards, I believe. Regarding the forehead, though, about all that he talked about was a structure called the glabella–the little depression between the eyebrows and above the nose. The only known function of the glabella, he said, is to insert a stake to kill a vampire. Imagine that being intoned with a strong Romanian accent and you have a fine example of the humor that characterizes the typical anatomist.

One unpleasant characteristic of scientists: we can suck the joy out of pretty much anything. On the plus side, we can find something interesting to think about pretty much anywhere. Arguably, one of the least interesting aspects of the human face is the forehead. It’s easy to find poems that go into ecstatic descriptions of the eyes and the mouth of a loved one, but I don’t recall ever reading a poem that praised someone’s forehead. There’s a lot to say about the forehead, though.

If you look at skulls of non-human great apes and of various extinct non-Homo sapiens species, one of the most distinct differences is that modern humans have a forehead, while the aforementioned others don’t, or at least don’t have the typical modern human tall, vertical forehead. Here is a nice schematic illustrating the trend in changes to the forehead over the course of human evolution (scroll down past it):

forehead evolution — 1) Australopithecus robustus 2) Homo habilis 3) Homo erectus 4) Homo neanderthalensis 5) Homo sapiens Picture source: http://thebrain.mcgill.ca/flash/a/a_05/a_05_cr/a_05_cr_her/a_05_cr_her.html.

It’s also useful to look at the forehead across the range of great apes. Here is a nice picture showing frontal views of the skulls of a variety of apes, great and otherwise (scroll down past it when you’re done):

HominoidSkulls — 1) Hylobates hoolock, white-browed or hoolock gibbon 2) Pongo pygmaeus, the Bornean orang-utan 3) Male and female Gorilla gorilla 4) Pan troglodytes, the common chimpanzee 5) Homo sapiens (us) Picture source: http://www.nhc.ed.ac.uk/index.php?page=493.504.508.505.

The development of the skull over the course of growth from infancy to adulthood is especially interesting, as it’s a good illustration of the concept of neoteny. Take a look again at this picture that we saw in a recent discussion of the human chin:

Neoteny_in_humans — Anatomy of the human skull, frontal view. Picture source: http://www.acsneuro.com/patient_resources/anatomy.

The top shows the development of the skull of a chimpanzee, from infancy to adulthood. The bottom shows the development of the skull of a modern human, from infancy to adulthood. The thing to notice here is that the chimp starts out with a forehead, but it goes away over the course of development. The human starts with a forehead, too, but it doesn’t go away. This is an example of a phenomenon that is often observed in the course of evolution: new species may evolve through the retention of some characteristics of the infant. This is known as neoteny. So, a dog is in some ways like an immature wolf, a domestic cat is in some ways like an immature wild cat, and so on.

The changes in the forehead over the course of human evolution are associated with a larger brain size, but interpret this fact with caution. A larger brain size doesn’t necessarily mean more intelligence–a whale has a heck of a lot bigger brain than you do. I’m not aware of any evidence that a whale is in any sense smarter than you, though, with the possible exception of the fact that humans sometimes get forehead tattoos, while as far as I know, whales don’t:

Probably not a great tattoo for a guy who’s into armed robbery–follow the link. Picture source: http://mashable.com/2015/12/22/fk-cops-tattoo-man-arrested/#xHCa157W8Gqw.

I’m guessing that this guy doesn’t get a lot of call-backs after job interviews. Picture source: http://www.tattoobite.com/tattoos/forehead-tattoos/page/10/.

I’ll bet that this guy gets beaten up–a lot. Picture source: http://www.answers.com/article/1179616/16-hilarious-tattoo-fails.

I actually have an “FTW” tattoo myself, although mine is on my shoulder, where it’s not quite as obvious as this guy’s. Picture source: http://www.tattoobite.com/tattoos/forehead-tattoos/page/10/.

So: get out there and suck the joy out of something, but I recommend that you not get a forehead tattoo…

How to review an NIH informatics grant proposal

I wrote up this little squib on how to review a grant proposal for students in one of our classes. No French here, sorry–our usual exploration of the implications of the statistical properties of language for second-language learning will continue tomorrow.

Global issues in grant reviewing

One of the biggest effects that you can have on science and on the future of science is your work as a grant reviewer for the National Institutes of Health. Your activities in this area will affect not just what kinds of research get funded, but the kinds of approaches that people take to doing that research. To give one big example of the past couple of decades, it’s reasonable to say that some of the genomics-based research that is so fruitful today has been possible only because grant reviewers stopped classifying it all as “just a fishing expedition,” having seen that “fishing expeditions” can be as useful as more traditional hypothesis-driven biology. So, when you get the call (well, today, when you get the email) inviting you to join a “study section,” don’t see it as yet another burden–realize that this is one of your opportunities to have a real effect on science in this country.

What I’ll describe here is my approach to writing a grant review. I’m sure that there are lots of others. Objectively, I can say that (a) this is a more efficient approach than other ways that I’ve tried, and (b) I’ve gotten very positive feedback from multiple scientific review officers at NIH since I started doing it this way, so I have at least one data point that’s consistent with the idea that this is a reasonable approach to the task.

One of the hazards of grant-reviewing is that you may often be asked to evaluate proposals that are outside of your strongest area of expertise. If you really and truly don’t think that you can give something a competent evaluation that’s fair to the investigators, you should certainly tell the scientific review officer so. (Do NOT wait until three days before the reviews are due to do this! You need to take a quick look at all of the proposals that you’ve been assigned as soon as you have access to them, and this should be one of the things that you’re checking for–ensuring that the proposal is not so far out of your area that you couldn’t possibly give it a professionally acceptable review. The other thing to look for: double-check for conflicts of interest.) One of the things that I like about the approach to the task that I describe here is that it can help you in writing reviews of proposals in areas in which you’re not necessarily strong, as the criteria are pretty general to scientific computing research as a field. Science is science, and if the investigators make their points well, you’re going to get them. If they don’t make their points well, then that’s not your fault–point out that they’re not making a very strong case for whatever it is that they want you to accept, and you’ll have helped them to improve their proposal.

Practical approach: reading the proposal

You’re going to have to read the entire proposal thoroughly once, and then you’re almost certainly going to have to skim it at least once. One of your goals is to do that reading/skimming, and the subsequent writing, as efficiently as possible. To that end, you’re going to do two things while you’re reading the proposal. One is that you’re going to highlight things–I’ll tell you what you’re going to highlight in a minute. The other is that you’re going to write very short notes in the margins. So: grab your pen and a highlighter. (You’ll note here that I’m assuming that you’ve printed out your proposals. Yes, I am so old that I still print out proposals.)

First step: read the RFA (Request for Applications). You will occasionally see a proposal that just is not “responsive,” as we say, to what an RFA is looking for. This won’t typically be an issue with the “open calls,” but it’s not unusual with others. People have lots of approaches to deciding where to submit their proposals, and for some people, that boils down to submitting the same basic proposal to lots of different places with minor tweaks that try to make them seem relevant to the RFA when in fact, they really just aren’t. Also, some RFAs will have specific requirements that others don’t–maybe a dissemination plan, or a requirement to recruit students from specific under-represented groups, or what have you–and your reviewing responsibilities include ensuring that those specific points are addressed.

Having read the RFA, you’re now ready to start reading the proposal. When you do that, you’ll want to read it in such a way as to make it easier to write the review in the format in which it’s required to be written. At the moment, here’s what the required format is:

An “overall impression/summary.” This section describes what the proposal is about. Additional details summarize the strong and weak points very broadly. Overall, the picture that you paint in the impression/summary should make sense in terms of the overall numerical score that you give, and vice versa.
Specific scores for innovation, etc.
For each of those areas, specific lists of strong points and weak points.

So: you’re reading with your highlighter and your pen in hand. You’re going to highlight material that gives a good summary of what the proposal is about. You are completely free to use this material in your overall impression/summary write-up, and there’s some advantage to doing so, as it minimizes the chances of you mis-quoting the proposal. This might come from the introduction to the proposal, or it might be scattered throughout it, depending on the skill of the investigators who wrote it.

As you read, you’re going to mark in the margins anything that you think constitutes a strong point or a weak point, and which area of the review (innovation, approach, etc.) you think it’s a strong point for. You might also scribble a couple words in the margin to remind yourself what it was, exactly, that struck you as a strong or as a weak point. For example: the proposal says that Despite the obvious potential for accelerating biomedical research with better recognition and normalization of cell line names in GOA records, this problem has not previously been studied. In the margin, maybe you write + Inn. In other words: here’s something to list under “strengths” in the Innovation section of the review. Maybe the proposal says that We will write a look-up dictionary for every possible syntactic structure of the English language. Now you write – App in the margin, since this is something that you’ll want to mention under “weaknesses” in the Approach section–there is an infinite number of English sentences, so this approach is not very likely to work.
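If you take your margin notes electronically rather than on paper, the same bookkeeping can be sketched in a few lines of code. This is only an illustration of the marking scheme described above, not a tool that NIH or anyone else provides: the function name `group_margin_notes` and the abbreviation table are hypothetical, and only the Inn and App abbreviations come from the examples above.

```python
from collections import defaultdict

# Hypothetical abbreviation table; extend it with whatever review areas
# the current review format asks for.
SECTIONS = {"Inn": "Innovation", "App": "Approach"}


def group_margin_notes(notes):
    """Group (mark, reminder) margin notes by review section.

    A mark looks like "+ Inn" (a strength for Innovation) or "- App"
    (a weakness for Approach). An en dash is accepted as a minus sign.
    """
    review = defaultdict(lambda: {"strengths": [], "weaknesses": []})
    for mark, reminder in notes:
        sign, abbreviation = mark.split()
        if sign == "+":
            kind = "strengths"
        elif sign in ("-", "–"):
            kind = "weaknesses"
        else:
            raise ValueError(f"unrecognized mark: {mark!r}")
        section = SECTIONS.get(abbreviation, abbreviation)
        review[section][kind].append(reminder)
    return dict(review)


notes = [
    ("+ Inn", "cell line name normalization not previously studied"),
    ("– App", "look-up dictionary for every possible syntactic structure"),
]
review = group_margin_notes(notes)
for section, points in review.items():
    print(section, points)
```

The output is already organized the way the review has to be written: one list of strengths and one list of weaknesses per scored area.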

Talk about the work, not about the writer. Not the PIs do not make a convincing case that… but the proposal does not make a convincing case that… (The one exception to this is when you’re specifically asked to evaluate the investigator–more on that below.)
Try to think about the reviewing process in terms of being helpful, rather than fault-finding, and at least write as if that’s the case. So, for example, write The proposal could be improved by… rather than the proposal does not… Bear in mind here that you don’t just want to sound like you’re being helpful–you should actually try to be helpful! You don’t have to rewrite the proposal for them, but do approach this task from the good place in your heart.
Always try to find a strong point for every section of the review. They can be extremely generic–a strong point of the proposal is that cancer is a serious problem. They can also be extremely specific. For example, if reviewing a spectacularly bad proposal on predicting sub-cellular localization from amino acid composition from an investigator at the University of Lower Slobovia, you might say something like this: The University of Lower Slobovia is very strong in the field of marmot communication. The scientific review officer will realize just fine that marmot communication expertise is not relevant to the likelihood of success of a proposed project on predicting sub-cellular localization from amino acid composition. If you say in the Significance section that Cancer is a fatal disease, the scientific review officer knows that if your only strong point is that cancer is a fatal disease, then you absolutely weren’t able to find any strong point of importance. And, don’t worry that saying one good thing about a bad piece of work will lead to crappy science getting funded–the fact that you’ve got one strong point and eight weak points will make it clear that the proposal needs a lot of improvement. You should, however, worry that not saying even one good thing about a bad piece of work might give a troublesome investigator an opening for hassling the scientific review officer about objectivity, or even just be unnecessarily hurtful to the investigator. You can always find one strong point, or almost always:
- Innovation: Curing cancer would be novel and important.
- Approach: Java does a good job with memory management, so this is a good choice of programming language for the project.
- Distribution plan: SourceForge is a relatively stable platform for distributing code.
- …you get the idea. These might sound flip and sarcastic to you, and it could be accurate to describe them that way, but if read on the surface, they’re pretty anodyne. I have occasionally run this specific kind of thing by scientific review officers to make sure that they don’t come across as sarcasm, and I have never gotten negative feedback on them. (That’s not to say that I won’t during the next review cycle–but, so far, so good.)

Those are general approaches to writing the review as a whole. There are also some specific things to think about when reviewing specific sections of an informatics proposal. For the Innovation section: is there innovation both in terms of the intended application (finding drug side effects that might indicate potential new uses in electronic health records, mining causal mechanisms from scientific literature, whatever) and also in terms of the informatics aspect of the work? This is potentially one of the toughest areas to evaluate if you’re not an expert in the field, but ultimately, it’s up to the investigators to make a strong case for the innovation of the project. ‘Fess up to gaps in your expertise–let the scientific review officer decide whether or not it’s OK for you to review the proposal.

Regarding the approach: here you’re looking for things that affect the likelihood of success of the work. This is potentially another tough area in terms of dealing with not being an expert, but there’s still a lot that you can do.

Look for aspects of the approach that clearly require specialized expertise in development and in testing. If the proposal has budgeted for a doctoral student to program a system requiring extensive experience with database design and security issues–something that you might conceivably hire a $200-an-hour consultant for–that’s a very valid thing to point out as a weakness of the proposal, as it just isn’t very likely to lead to success.
Who defines the use cases for the software? You want to see someone who is a potential user doing this, not, say, a software developer, unless the intended users are software developers.
Does the proposal include explicit plans for software testing, for ensuring robustness of the applications, and for maximizing the possibility of repeatability and reproducibility of the work? This is a rapidly growing area of concern in biomedical science, and it is likely to become much more so in the immediate future.
Don’t forget about data. Larry Hunter’s definition of bioinformatics is “doing original science with other people’s data.” Doing this requires that the data be available to you. If data is required: have the investigators demonstrated that data is available? By this time in your career, you’ve probably discovered that difficulty getting access to data is one of the most common causes of failed course projects, missed deadlines, and the like.
Reviewers usually want to see back-up plans in case the proposed approach doesn’t pan out. Personally, I usually write these as things that are likely to be successful based on previous work, but that aren’t as innovative as what I’m actually proposing to do. You see proposals getting critiqued for not having back-up plans, but reviewers don’t typically find fault with the back-up plans themselves. (On that last point, your mileage may vary. I once wrote a proposal with a co-worker. We had very explicit back-up plans. One of the reviewers wrote: The back-up plans seem like a good idea. Why don’t the investigators just do that? Oh, well.)

There is a specific section of the review template for the investigator(s). This is the one place where you can’t avoid talking about the people involved (as opposed to my advice above to make your review about the work/the proposal, not about the investigators). This is an area where you can be really hurtful, which is not kind. It’s also a place where you can cause problems for the scientific review officer by saying things that a disgruntled investigator might be able to point to as evidence of lack of objectivity. You’re safe if you stick with the kinds of things that people usually address in this section:

Is this a senior person (in which case you need to check that they’ve actually committed time to the work)?
Is this a junior person (in which case you might want to point out specific evidence that suggests that they could manage the project, if you want to support the proposal, because lack of such evidence is a common way of trashing a proposal)?
Has the person worked in this area before, or are they trying to strike out in a new direction? The latter can be fine, if they have collaborators with suitable expertise.
Has the PI worked with the other members of the team before? If so: strong point. If not: weak point.
At some point, you need to look at the entire group of people involved. You might do this in the Investigator section, or in the Approach section. You’ll at least need to look at the types of people involved. (For an example of the latter: the PI might say that they would hire, say, a statistician, without specifying who it would be.) An informatics project that’s likely to be successful is going to have the right mix of researchers and technical people. At the risk of being redundant: a frequent problem that you’ll see is a researcher being tasked with implementing a super-complex computational platform that requires professional software development and testing experience more so than research abilities. Look for plausibility of software development goals given the number of people on the budget–an enormous project is not going to be built with just one software developer, but that may be all that you’ll find on the budget. Look for adequate software testing; look for requirements for special expertise in developing requirements, in developing software, and in testing software.
Another point on the entire group of people: if there is already evidence that they can work productively together, you should count that as a strength. Look for papers published together, previous grants involving the same group, etc. Conversely, if you are reviewing a big, complicated proposal with lots of dependencies between groups and potentials for lack of communication and the like, and the people have no history of ever having worked together before, then you should consider pointing that out.

UI design can be a big issue. Many people say that they plan to build a user interface, but don’t seem to think about how they will design the interface, or how they will evaluate it. In particular, people often don’t seem to plan ahead to involve users in the interface design and evaluation process. Having developers design user interfaces typically doesn’t work out well, and not involving users in this is a very legitimate weakness.
What counts as success, and how will you know if it’s been achieved? You don’t have to agree with the way that the investigators think that this question should be answered, but the investigators should at least tell you how they think it should be answered. Beware of statements like we will build a system that predicts protein subcellular localization that aren’t followed by some implicit definition of what build and predict mean. How will the program officer know whether or not the protein-subcellular-localization system has, in fact, been built? When the investigators go back for a renewal of the grant, how will the reviewers of the new proposal know whether or not this is something that was accomplished? How will the investigators themselves even know when they’re done? If I write a Python script with a predictProteinSubcellularLocalization() function that works for exactly one hard-coded protein, have I accomplished the aim? What if I build a beautifully engineered system with a user interface so intuitive that 5th-graders are now predicting subcellular localization just for fun from the comfort of their living rooms–but the predictions are never, ever right? What if I build a localization system, but it can only predict a subcellular localization once before the code auto-deletes? What if I build a system that has 2,000,000 users three weeks after I release it and quickly leads to the discovery of cures for every existing type of cancer, but I never publish a paper about it? Does that count? The investigators don’t have to define success the same way that you would, but they should be specific about what they’re going to produce.
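To make the “one hard-coded protein” scenario concrete, here’s a minimal sketch of what a stated success criterion might look like once it’s written down. Everything here except the function name predictProteinSubcellularLocalization is made up for illustration: the protein IDs, the labels, the held-out set, and the 0.8 accuracy threshold are all assumptions, not anything a real proposal or funding agency specifies.

```python
# Hypothetical held-out data: protein ID -> known subcellular localization.
# The IDs and labels are invented for this sketch.
HELD_OUT = {
    "P1": "nucleus",
    "P2": "cytoplasm",
    "P3": "mitochondrion",
    "P4": "nucleus",
}


def predictProteinSubcellularLocalization(protein_id):
    """The predictor from the text: works for exactly one hard-coded protein."""
    if protein_id == "P1":
        return "nucleus"
    return None  # no prediction for any other protein


def accuracy(predict, labeled):
    """Fraction of held-out proteins whose predicted localization is correct."""
    correct = sum(1 for pid, loc in labeled.items() if predict(pid) == loc)
    return correct / len(labeled)


def aim_accomplished(predict, labeled, threshold=0.8):
    """One possible definition of 'built': accuracy at or above a stated
    threshold on held-out data. The threshold is an assumed example value."""
    return accuracy(predict, labeled) >= threshold


if __name__ == "__main__":
    print(accuracy(predictProteinSubcellularLocalization, HELD_OUT))
    print(aim_accomplished(predictProteinSubcellularLocalization, HELD_OUT))
```

The hard-coded function exists and runs, so under a vague aim it arguably counts as “built.” Under a criterion like the one sketched here, it doesn’t. You might disagree with the metric or the threshold, but at least there is something specific to disagree with.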

The discussion session

There’s a rhythm to discussions of grants. The typical NIH routine is that:

Reviewer 1 presents a synopsis of the proposal, including the goal, the approach, and the main strong and weak points.
Reviewer 2 does not present a synopsis, and typically doesn’t repeat Reviewer 1’s points, although they may begin by saying something like I agree with Reviewer 1’s description of the proposal. Reviewer 2 then usually adds any additional things that they think need to be pointed out.
Reviewer 3 writes up only a short review, and doesn’t typically say much during the discussion, beyond adding any additional things that they think weren’t addressed by Reviewers 1 and 2.

When you’re Reviewer 1, be true to your analysis, but be as humble as you always are (or should be), and give as much real consideration to other people’s analyses as you always do (or should). (See this post for information about how to use a lack of humility and the failure to consider that other people might have valid analyses, too, to fully exercise your unemployment benefits.)

Learn from the experience

As with anything else, you want to learn from the experience of reviewing grants. What you’re going to learn: how to write better grant proposals yourself. Thinking seriously about the critiques that you get on your own proposals is an excellent way to improve them–paying close attention to the issues that an entire room full of reviewers raises about proposals on a wide variety of topics is even more efficient.

From astrophysicist to data scientist

People often sidle up to me at conferences, lab retreats, or receptions at the boss’s house. “I’m thinking about leaving academia and going into industry/Big Pharma/law school. What will that be like?” Here’s a response to that question, not from me, but from a colleague who went from an academic career in astrophysics to a job as a research associate in a biomedical informatics department. Here’s what that experience was like, both for him and for his spouse. Since he wrote this, he’s moved into a faculty position. His wife has gone into the private sector. French vocabulary at the bottom of the post, as usual.

As far as the transition from astrophysics, I’m terrible at giving advice, but I’ll tell you the experience of my wife and I. My wife was a liberal arts college professor and I was on my second postdoc. Both of us were working on high-profile cosmology experiments. Solving the 2-body problem and the stresses of starting a new family were coming into conflict with our careers, and so we realized we had to make a change. We had been doing astrophysics so long we literally had to mourn the loss of our future careers in the field – like we literally went through stages of grief. Part of it was thinking about all the time we had invested and all the connections we had made, part of it was that we couldn’t imagine doing anything else, and part of it was that we were doubtful we had the skill set to compete with statisticians, data analysts and computer scientists who had been doing what they were doing since they were freshmen in college.

Eventually, we realized that the time in physics and in academia wasn’t wasted, that we had learned a lot about how people and organizations function, and how to get things done. We also found that the physics data analysis methods and ways of approaching data were actually refreshing to people outside the field. We further realized that people with statistics degrees know a lot about logistic regression, sample size calculations, statistical tests, … but they are more clerics – they approach problems from the point of view of “I’ve spent years learning tests, which test best fits the problem?”. This is in contrast to the approach of physicists who are expected to think of things from first principles and build things from scratch – scouring the literature for ways to solve problems, downloading random bits of code in whatever language and modifying it, … This is a huge advantage, because people think we are brilliant when really we are just not doing what the usual statistician would do. As far as computer science, yeah, we realized we weren’t programmers (my wife knows more about programming than anyone I know, but she’d probably have a hard time getting a job as a programmer at Amazon). Further, in talking to people, we found that software certifications, in some circles, are actually taken as a joke. So depending on where you’re applying, they may be more valuable, or less. Obviously, it was good to show we had some knowledge of programming. My only bit of advice: learn R – it is used everywhere. It will literally take you 20 minutes and you can put it on your resume. Eric Feigelson has some nice tutorials (e.g., https://www.google.com/?gws_rd=ssl#q=eric+feigelson+r+tutorial ). Also, Hadoop and Spark are pretty much industry standards, so you might think about learning something about them.

We also found many jobs BUT NOT ALL require you to take ‘tests’ – almost like entrance exams filled with brain teasers and data science questions. I could never do them… but almost all of the interviews required us to give talks (an advantage, given our backgrounds in academia).

Anyway, I eventually obtained a job in bioinformatics analyzing language and speech production of patients with mental illness at a teaching hospital, and my wife eventually landed in a company doing consumer data analysis. The challenges are so absorbing, the only time I think about astrophysics is when my boss asks me about some new astronomical discovery.

We were also worried about things like time-flexibility, especially since we have kids– like having to work a standard set of hours and put in for time off, but (1) a huge number of work places now allow you to spend a lot of time working from home (it’s almost expected in some industries), (2) many work places have flexible times when you can come and go, and (3) putting in for vacation has enormous advantages. In astrophysics, I always felt like I had to be ‘on’ even during time off. Vacation is a way of telling everyone, “I’m gone don’t bother me” and they are forced to respect it. Also, speaking of the workplace, a lot of tech companies are still doing this ‘open office’ concept with no walls or anything. It’s annoying and counter-productive, but they often compensate by having spaces where you can hide, and often allow you to work from home much of the time.

We were also worried that we would be working in a place that looked like the movie “Office Space” – with people with ties speaking in cliché BS bureau-speak. There’s some of that (and it can be hilarious), but chances are you’re going to be working with other smart people who see through that stuff.

On a more superficial note, we also found that the word ‘astrophysicist’ carries a lot of weight. I have had Harvard-trained neurologists (the super-brilliant nerds of the clinical world!) who always introduce me as “an astrophysicist” as if I were a brain surgeon.

Anyway, the bottom line is, we realized we were way more valuable than we thought we were. Also, we realized we were looking for employers who were willing to take a chance with a person with a non-traditional background – i.e., non-cookie cutter companies run by people who understood that there would be a learning curve and respected our backgrounds. We discovered that they are few in number, but very much exist.

la science des données ou la science de données: data science.
les mégadonnées: “Big Data.” Le big data, littéralement « grosses données », ou mégadonnées (recommandé), parfois appelées données massives, désignent des ensembles de données qui deviennent tellement volumineux qu’ils en deviennent difficiles à travailler avec des outils classiques de gestion de base de données ou de gestion de l’information.
(Source: Wikipedia.)
