Zipf’s Law describes one aspect of the statistical distribution of words in language: if you rank words by their frequency in a sufficiently large collection of texts and then plot the frequency against the rank, you get a steeply falling curve in which a word’s frequency is roughly inversely proportional to its rank (or, if you graph both axes on a log scale, you get something close to a straight line). In other words, there is a small number of words that occur quite often, and then a very large number of words that occur at the statistical equivalent of zero, but they do occur. What this means for the second-language learner is that every single day you will come across words that you don’t know.
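If you would like to see the rank/frequency relationship for yourself, here is a minimal Python sketch, not a definitive implementation. The names `rank_frequency` and `share_of_rare_words` are made up for this illustration, the tokenization (lowercasing and splitting on anything that isn't a letter, digit, or underscore) is deliberately crude, and the sample sentence is far too small to show the pattern. Run it on a large plain-text file to see anything meaningful.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Tokenization is an assumption made for this sketch: lowercase the
    text and treat every run of word characters as one word.
    """
    words = re.findall(r"\w+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def share_of_rare_words(table):
    """Fraction of distinct words that occur exactly once."""
    if not table:
        return 0.0
    once = sum(1 for _, _, count in table if count == 1)
    return once / len(table)


if __name__ == "__main__":
    # Hypothetical sample; replace with the contents of a large text file.
    sample = "the cat and the dog and the bird"
    table = rank_frequency(sample)
    for rank, word, count in table:
        # If frequency is roughly inversely proportional to rank,
        # rank * count stays in the same ballpark across the list.
        print(rank, word, count, rank * count)
    print("share of words seen only once:", share_of_rare_words(table))
```

On a big enough collection of texts, the last column should stay in the same general range for the common words, and the share of words seen only once is typically large. Those once-only words are the ones that keep surprising the language learner.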
I’ve been studying French for the past couple of years, and this blog is primarily about new words that I come across in the course of my day, mostly when I’m in France, but also while listening to the radio in the US. I also occasionally write about Spanish, since I spend one week a year in Guatemala, and I write about medical vocabulary a lot, since I go there with a group of surgeons. When I started getting quite a few international readers, I began adding links to the definitions of English words, especially obscure and slang ones. If you’re only interested in French or Spanish, use the tags to find posts on one or the other. If you’re only interested in vocabulary or grammar, look for the vocabulary tag or the morphosyntax tag; any post that doesn’t have one of those tags is probably all cultural notes. I’m a computational linguist by training, and every once in a while I write about linguistics and/or about computer programs that process language in some way or another.
Most (although not all) of the definitions that I give are from the WordReference.com web site. I often put a link to the WordReference page–you can recognize linked words by the fact that they are underlined.
If you’re interested in reading more about Zipf’s Law, click here for the Wikipedia page. For some popular posts from this blog, check out the Best of Zipf’s Law page.
how very peculiar! I did “Math Elem” for my bac years ago (still exist?) but never thought of mixing maths and language. This might fit in with all the Big Data stuff we have today, but I won’t go into that, I’ll just follow this blog for a bit and see what’s interesting 🙂
Indeed, there are lots of “Big Data” applications for the kind of work that I do. But, as you said: “I won’t go into that.” 🙂
ah, I thought so!
OMG… I love this shit! Your blog is mindblowing! I like how you think! Keep it up!
That is fascinating! I like the idea of math but I never quite “got” it. Je suis nulle en maths! However, I love language 🙂 Glad I found your site.
I’ve never quite gotten it, either–I’m in my 50s, and I continue to struggle with the mathematical aspects of my professional life!
REALLY interesting! I had never heard of Zipf’s Law before, I’m excited to look through your blog 🙂
WOW just what I was searching for. Came here by searching for token
I’m glad that you found it useful!
I saw on your website an example of a paper that had been “evaluated” by someone: http://www.clipular.com/c/5009767192068096.png?k=IQA_ittEWl6rRwuC3f3J5n2b72Y As a French teacher and a native-speaker, I was astonished to see that the teacher would be correcting uses of words that are absolutely correct: gerunds are used in French and there is nothing wrong with them. Many famous authors used lists like these and there are examples of great literature.
Thanks! It’s nice to have another perspective. (That was a little story that I wrote.)
“There are three types of lies. Lies, damned lies and statistics.” – Benjamin Disraeli