What to expect as a graduate student (in my lab, anyways): computational linguistics/natural language processing edition

When you’re looking for an advisor, it’s good to get a realistic picture of what their expectations are. Here’s a computational linguist’s take on that.

I’m meeting with a prospective student today.  That means that I’ll want to know a lot about her motivations and background.  It also means that I’ll want her to walk out of my office with a realistic picture of what my expectations are for students and post-docs.  I thought that I’d share them here not because I’m sure that they’re valuable, but so that I can get feedback from people in similar positions–and from students.  Some of the specifics of this are only relevant to people interested in computational linguistics or natural language processing, of course.  Some of the more general stuff might apply to anyone in graduate school.  You choose.

My expectations basically fall into three categories.

  1. Things that you need to do in order to become an independent researcher, find a teaching job after you graduate, and be professionally successful as an academic.
  2. Things that you need to do in order to be able to get a job in industry, if that’s the direction in which you decide to go, and to be an efficient and productive researcher if you stay in academia.
  3. Things that you will need to do in order to participate in the (intellectual) life of the lab.

Things that you need to do in order to become an independent researcher, find a teaching job after you graduate, and be professionally successful as an academic:

  • Learn something about language.  Learn facts about how it works (to the extent that we know how it works), and learn something about what the interesting open questions are–and how you might try to answer some of those questions from a computational perspective.
  • Take a course in natural language processing.  Most people don’t come to this field with a background in it already in hand, at least not in the biomedical world that I live in.  Formal coursework with homework, tests, etc. is a good way to get on your feet.
  • Publications, publications, publications.  Any time that you’re thinking about a project, think about how you’re going to publish it.  That means rotations, comps, and anything that you let yourself get dragged into just because it looks interesting.  If you can’t even imagine a way to publish the work, consider finding a new topic for your rotation/comps/whatever.
  • When you’re thinking about projects, don’t fall into the trap of “just” developing technology.  There’s almost always an interesting scientific question that your work could be relevant to–figure out what it is.  If you can’t figure out what it is, consider dropping the project.
  • Read David Sternberg’s How to Complete and Survive a Doctoral Dissertation, or some equivalent.  You need more advice than I will ever think to give you.  You also need better advice than I claim to know how to give.  Sternberg’s book is pretty much how I approached my own graduate school experience, and it worked.  It’s also a pretty fair approximation of what I’ll expect from you, and a good approximation of what you should expect from me (and demand, if I don’t give it to you).  I typically keep a couple of copies in my office, and hand one to anyone who wants to join my group as a doctoral student.

Things that you need to do in order to be able to get a job in industry, if that’s the direction in which you decide to go, and to be an efficient and productive researcher if you stay in academia:

  • Test and document your code and your projects.  Learn to use a unit testing framework and a version control system.
  • Learn something about databases.  This is the most common thing that I see in job ads that grad students often don’t have.
  • Learn one architecture.  Some choices are UIMA, GATE, and BioC.  Industrial-strength language processing often requires industrial-strength software architectures.  You should know one of them.
  • Learn some programming languages.  You’re very likely to need one object-oriented, compiled language, and one scripting language.  I don’t really care which, but you need to become comfortable in something.
  • Learn some of the useful open source tools of our profession.  You don’t need to know all of these by any means, but learn at least some of the following:
    • Lucene
    • R
    • NLTK, or alias-i, or Stanford CoreNLP (I use all three pretty routinely)
    • See above about databases
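
To make the testing and database points above a little more concrete, here is a minimal sketch of my own, not a prescription.  It assumes Python as the scripting language and uses only the standard library (unittest and sqlite3).  The tokenizer, the function names, and the counts table are all made up for illustration; the point is only what a small, documented, tested piece of code that talks to a database can look like.

```python
import sqlite3
import unittest
from collections import Counter


def tokenize(text):
    """Lowercase `text`, split it on whitespace, and strip edge punctuation.

    A deliberately naive tokenizer, here only to give us something to test.
    """
    tokens = (tok.strip(".,;:!?()\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def store_counts(conn, tokens):
    """Write token frequencies into a (hypothetical) `counts` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counts (token TEXT PRIMARY KEY, n INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO counts (token, n) VALUES (?, ?)",
        Counter(tokens).items(),
    )
    conn.commit()


class TokenizeTest(unittest.TestCase):
    def test_strips_punctuation_and_case(self):
        self.assertEqual(tokenize("The gene, BRCA1."), ["the", "gene", "brca1"])

    def test_counts_round_trip_through_database(self):
        # An in-memory database keeps the test fast and leaves no files behind.
        conn = sqlite3.connect(":memory:")
        store_counts(conn, tokenize("to be or not to be"))
        row = conn.execute(
            "SELECT n FROM counts WHERE token = ?", ("to",)
        ).fetchone()
        self.assertEqual(row[0], 2)
```

If you save this in a file, running python -m unittest on that file executes both tests.  Put the file under version control and you have touched three of the items on this list in one sitting.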

Things that you will need to do in order to participate in the (intellectual) life of the lab:

  • Plan to report to me and to everyone else in the lab what you’ve accomplished in the past week and what you plan to do in the week to come.  This will shame you into making regular progress, and if you go into industry, it is also the best tool that I know of for maintaining a good professional relationship with your boss and with your colleagues more generally.  Even today, I meet with my boss on a weekly basis to report what I’ve done, pass on what I plan to do, and ask their opinion about the latter.  This way, you both have the same conception of what the priorities are, and if you are running into trouble, everyone else will know early in the game, and can figure out how to be helpful.
  • Learn to be comfortable with asking questions.  Learn to be comfortable with asking for help.  Regarding the former: the absolute best graduates that our program has ever produced have been people who showed up in my office every week not to report on what they’d done, but to ask questions.  Regarding the latter: as a mentor once told me, it’s never a sin to ask for help, but it’s always a sin to wait to ask for help until it’s too close to the deadline for anyone to do anything to help. 
  • Participate actively in lab meetings and our reading group.  That means being prepared to explain what you’re doing and ask for input from your colleagues; taking an active interest in what your colleagues are doing and asking them questions about it; keeping an eye on the literature, finding stuff that you think is interesting, and volunteering to present it.
  • Be prepared to be independent to some degree.  I will do my best to keep on top of what you’re doing and how you’re progressing with it, but you need to be the one who makes sure that you show up for work every day (whether that’s in the lab or at your kitchen table) and put in a solid day’s effort.  (That can and should include spending time discussing your work and your fellow students’ work–that always counts, actually.  But, in order to do that, you usually need to be physically in the department.  Or, the campus pub.  Whatever, as long as you’re physically there, and you’re there to work.)  You need to go into the relationship with your advisor knowing that if they’re worth committing several years of your life to, then they’re probably pretty busy, and you need to be able to drive yourself.

I’d love to hear feedback on this stuff, both from people who educate grad students and from grad students themselves.  What works?  What doesn’t?  What am I leaving out?  As Kurt Vonnegut used to put it: thanks for your time and attention.
