Dead rock stars and the Poisson distribution

Is there a reason that so many rock stars have been dying lately? Here’s how to talk about it in French.

The Poisson distribution describes the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event (definition from Wikipedia.com).  Who cares?  As Wikipedia puts it, with some highlighting by me: “The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. How many such events will occur during a fixed time interval? Under the right circumstances, this is a random number with a Poisson distribution.”

If you’ve been reading this blog for a while, you know that (a) a language has a lot of words, and (b) most of the words in a language are rare.  That’s why we can use Zipf’s Law to describe the distribution of words in a language, and that’s why I write this blog, which keeps track of the obscure words that I learn in the course of my day.  (Just some of them: there are far too many in any given day for me to track them all.)  So, you could imagine using the Poisson distribution to predict things like how many new words I will run into today.
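To make the new-words idea concrete, here is a minimal sketch in Python of the Poisson probability, which for an average rate of lam events per interval is exp(-lam) * lam^k / k! for exactly k events.  The function name `poisson_pmf` and the rate of 4 new words per day are my own assumptions for illustration, not measurements from this blog.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average is lam per interval."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical rate: suppose I run into 4 new words on an average day.
rate = 4.0
for k in range(9):
    print(f"P({k} new words today) = {poisson_pmf(k, rate):.3f}")
```

The probabilities over all possible counts add up to 1, and most of the weight sits near the average, with a thinner and thinner tail for the unusually bad days.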

There are many practical applications of the Poisson distribution.  For example, most of my colleagues work with genomic data of one sort or another.  Say you’re looking at the number of mutations in a particular stretch of DNA.  Mutations are rare.  You have a stretch of DNA that you think has a lot of mutations, and you think that you know what caused them.  Before you draw conclusions about whether or not the mutations were, in fact, caused by that, you need to be sure that the stretch of DNA couldn’t have acquired that large (you think) number of mutations by chance.  The Poisson distribution lets you assign a probability to that many mutations (or more) occurring by chance in that one stretch of DNA.  If that probability is greater than, say, 5%, then you probably shouldn’t draw the conclusion that you were considering concerning what caused them.  On the other hand, if that probability is, say, 0.00001%, then you may be onto something.

Poisson distributions have been used in many fields; the most famous application was a study of the number of Prussian soldiers killed by horse-kicks.  Suppose that you suddenly have a large number of soldiers being killed by getting kicked by horses.  Do you need to be training your soldiers differently?  Has someone been selling you lousy horses?  If the incidence of deaths by horse-kicks follows a Poisson distribution (and deaths by horse-kick are rare events that are presumably independent of each other, so a Poisson distribution is a reasonable model), then you can calculate the probability of the aforementioned large number of horse-kick deaths having occurred by chance.  If the probability of them having occurred by chance is large, then you probably don’t need to retrain your soldiers or start looking for a lousy horse-dealer.  If the probability of them having occurred by chance is low, then you might want to look into retraining your soldiers, or reconsidering your horse-buying practices, or whatever.  (I don’t know how the study turned out; see this Wikipedia page for a reference to the book.)
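The "could this have happened by chance?" question above can be sketched in a few lines of Python.  The idea is to add up the Poisson probabilities of every count smaller than the one you observed and subtract from 1, which gives the chance of seeing that many events or more.  The function name `poisson_tail` and the numbers (2 mutations expected, 9 observed) are hypothetical, chosen only to illustrate the 5% rule of thumb.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam): the chance of k or more events."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below

# Hypothetical numbers: a stretch of DNA where chance alone would give
# about 2 mutations on average, but we counted 9.
expected = 2.0
observed = 9
p = poisson_tail(observed, expected)
print(f"P({observed} or more by chance) = {p:.6f}")
print("worth a closer look" if p < 0.05 else "could easily be chance")
```

The same calculation works for horse-kicks: plug in the long-run average number of deaths per year and the count from the alarming year.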

One of the practical consequences of the Poisson distribution is that even rare events will occasionally occur together.  The classic example: three rock stars die in the same month.  Here are some of the rock stars who died last month (January 2016):

…and there’s your classic three-rock-stars-in-one-month phenomenon.  Actually, it’s even weirder: three rock stars died on one day that month.  January 17th, 2016 saw the loss of Blowfly, Mic Gillette, and Dale Griffin.

What’s going on?  Is someone killing off the rock stars of the Anglophone world?  Probably not–the Poisson distribution tells us that such events, which are both rare and independent, will sometimes occur in bursts, despite their rarity and independence.
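If you want to see the bursts for yourself, here is a small simulation sketch in Python.  It takes a fixed number of deaths in a month, drops each one on a uniformly random day independently of the others (which is how independent, constant-rate events behave once you know the monthly total), and estimates how often at least one day ends up with three or more.  The figure of 30 deaths in the month is a made-up number for illustration, not a count from any list, and the function name is my own.

```python
import random

def prob_some_day_has_cluster(deaths_per_month, days=31, cluster_size=3,
                              trials=20_000, seed=0):
    """Estimate the chance that at least one day in the month sees
    cluster_size or more deaths, with each death landing on a uniformly
    random day, independently of all the others."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(trials):
        counts = [0] * days
        for _ in range(deaths_per_month):
            counts[rng.randrange(days)] += 1
        if max(counts) >= cluster_size:
            hits += 1
    return hits / trials

# Hypothetical figure: 30 notable musicians die in a 31-day month.
print(prob_some_day_has_cluster(30))
```

Under those assumptions, a three-in-one-day coincidence turns out to be the usual case rather than the exception, with nobody targeting anybody.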

Some implications for the world of Zipf’s Law:

  1. I have to admit that I’ve been mischaracterizing the Poisson distribution somewhat in previous posts.  Briefly: I’ve been ignoring the independence assumption.  More on that later, because it’s a really big deal in language in general.
  2. When you’re learning a second language, you’re going to have some good days and some bad days.  On the bad days, you’re going to run across a lot of words that you don’t know.  The Poisson distribution tells you to not get down on yourself about this fact: it’s just the nature of rare events (including words) to show up in clusters sometimes.
  3. All of these dead rock stars have brought a new word into my life: la disparition.  As you probably know, this can mean “disappearance.”  What you might not be aware of is that it can also mean “death, passing,” or “demise.”  So, on the radio this morning, the host of Les Matins de France Culture was talking about la disparition of Umberto Eco.

Reviewing some relevant vocabulary (definitions from WordReference.com):

  • disparaître: to disappear; to die out.
  • disparu (adj.): vanished
  • le disparu: missing person; the deceased.

 

 

5 thoughts on “Dead rock stars and the Poisson distribution”

  1. Weirdly enough, I was trying to conjugate “disparaitre” (can’t find my accents) at two am this morning. Really I was.
    Does this make me a bit odd, or does it just highlight how much I want to improve my French?


  2. Indeed, when the Poisson distribution applies, you will see clusters every so often. The other questions are:

    what is rock?
    how can you tell whether someone is a rock musician?
    what extra thing has to be true to turn a rock musician into a rock star?
    how many rock stars are there, and how old are they?
    given their age, number and profession, what is a rock star’s risk of dying tomorrow?


    1. Important definitional and homework questions!

      Regarding the first three questions, I appeal to authority: http://www.phoenixnewtimes.com/music/the-famous-and-forgotten-musicians-who-died-in-january-2016-8000391

      Regarding the 4th question: I don’t even know where to look.

      Regarding the last question: we need to hop over to the Bayesian world. In particular: if you are Keith Richards, your chance of dying tomorrow is indistinguishable from 0.0.
