How to figure out whether your data can be described by Zipf’s Law or not

It’s way harder than it should be to get Google to point you towards instructions for figuring out whether or not your data fits Zipf’s Law.  Since this blog is all about the effects of Zipf’s Law, this seems like a good place to publicize how to do that.  It turns out to be pretty easy, once you’ve learned what it is that you need to do!

1) You’ll want to use R and load the igraph package.

2) Put your data into a vector.  I sorted mine, but I don’t know whether or not that’s required.

3) Pass your vector to the method.

4) The output will include KS.stat, which is the Kolmogorov-Smirnov test statistic, and KS.p, which is the associated p-value.

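5) If your p-value is greater than .05, the test has NOT rejected the hypothesis that your data was drawn from the fitted power law, so a power law is a plausible description.  (That’s not quite the same as proving that it fits.)  If it’s less than .05, the test rejects that hypothesis, and your data does NOT fit the power law.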
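Here is a minimal sketch of the five steps above, in R. A few things in it are my assumptions rather than anything stated in the steps: I'm assuming the igraph function in question is `fit_power_law()` (called `power.law.fit()` in older igraph releases), the data is synthetic and generated purely for illustration, and the variable names are made up. Depending on your igraph version, `KS.p` may not be returned by default, so the sketch checks for it before using it.

```r
library(igraph)

set.seed(1)

# Step 2: put the data into a vector.
# Synthetic data for illustration only: 1000 draws from a continuous
# power law with exponent 2.5 and minimum value 1 (inverse-transform sampling).
alpha_true <- 2.5
x <- (1 - runif(1000))^(-1 / (alpha_true - 1))

# Sorting mirrors what the post did; it is kept here only for that reason.
x <- sort(x, decreasing = TRUE)

# Step 3: pass the vector to the fitting function.
# Assumption: fit_power_law() is the igraph function the post refers to.
fit <- fit_power_law(x)

# Step 4: look at the output.
cat("estimated exponent (alpha):", fit$alpha, "\n")
cat("estimated lower cutoff (xmin):", fit$xmin, "\n")
cat("KS statistic:", fit$KS.stat, "\n")

# Step 5: interpret the p-value, if this igraph version returned one.
if (!is.null(fit$KS.p)) {
  if (fit$KS.p > 0.05) {
    cat("p =", fit$KS.p, ": a power law is not rejected\n")
  } else {
    cat("p =", fit$KS.p, ": a power law is rejected\n")
  }
} else {
  cat("No KS.p in the result; check ?fit_power_law for your igraph version\n")
}
```

To try it on your own data, replace the synthetic `x` with your vector of counts (for example, word frequencies).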

For more information on igraph’s function:
