Thursday, April 16, 2009

Don't just say Ah.

Mxrk sent me a link to a simple but interesting page.

Two details that jumped out at me:

It isn't until 79 A's that not a single token shows up. (And that'll change soon because of the page.) But it's surprising that every number from 1-78 is out there. Of course I expect several tokens with 8 A's and a few with 15 A's and a few with 42 A's. Because it makes sense that a bunch of writers decided to just sit on the keyboard and let out a good long 'damn' in response to something craaazy. But every number up to 79? I'm surprised there's no gap earlier than that.

The drop at two A's and the rise back up at three. I would have expected this if only because of my own intuition that when drawing out a word for effect two A's aren't enough. Doubled letters occur naturally in a lot of words that adhere to conventional orthography. (Is conventional orthography redundant?) So a scream of Eek! A mouse doesn't extend the word at all.

And let's say that I send a quick note to a friend expressing sudden realization. If I write 'ooh now i see what you meant' that could almost look like 'ooh' with the same vowel as 'ooze'. (In fact I would probably extend the word by adding the H's for that reason.) And altho <aa> isn't as common, it is out there. Aardvark. Saab. Baal.

But the point of extending the word seems to be that it isn't just a little longer. It's not just a merely lengthened pronunciation. It's a drawn out pronunciation. So while <daamn> does indicate a longer syllable, <daaamn> (to me) indicates a syllable that lasts at least a second and a half. But let's not focus on chronograph arguments here.

Exaggeration is by its nature obvious. If I want to joke about a little 12 pound newborn I'm not going to remark that it must weigh 15 pounds. I'll probably pick a number like 30 or 40 pounds to make it clear that I'm not venturing a guess. 'Cause I want to make it clear that I know I'm giving an incorrect number.

And I wonder if the same thing comes into play with the two A's. <Daamn> can look like a typo but <daaamn> is intentional. And consider that even the lower count of doubled-A spellings almost certainly includes several occurrences that are typos. I would bet that there are more typos in that number than there are in the tripled-A examples.


  1. Mxrk should be writing his prospectus or reading about economics and working on a definition of "Socialism" instead of playing on moderately interesting webpages.

  2. Minor observation:

    It's not surprising that there are no gaps from 1 to 65. I am, however, surprised there aren't any gaps from 65-79.

    You can assume that most of the people who entered a's in this category just sat on the keyboard for a duration of time, dt. The number of a's is then proportional to dt. dt is a random variable drawn from a probability distribution* where high dt's, which correspond to 65 and higher, are very rare. Thus, from 1-65 you don't expect any gaps, but for a's higher than 65 you would expect there to be a few numbers missing. Indeed, between 65 and 79 several numbers have only 1 hit, but it never gets to zero until you get to 79. This is strange. So strange, in fact, that I'm willing to bet that the creator of the website just searched until she found a number of a's that came up with zero results and stopped there. So there are probably a few more numbers (say, up until 90) that have a nonzero number of hits.

    *for those who haven’t been exposed to probability theory, the most common example of a probability distribution is the bell curve. The probability distribution that governs the above example (from say 2 a’s and on, since 1 ‘a’ is standard) is, I think, the Poisson distribution.

  3. Actually, it's a Poisson distribution from 3 a's and on. It doesn't apply to 2 a's for reasons explained in the post.

  4. that's excellent eric. do you have some literature on the poisson distribution?

  5. The wikipedia article is a decent source (

    I've spent only a few minutes trying to fit the data to the curve, but have thus far failed. I'll comment in when I've decided whether it's a poisson distribution or not.

    Also, I checked and sure enough, at the high end you get a few zeros here and there until you reach the point (98) beyond which there are no more hits (I checked up until 110 or so).

  6. A Poisson distribution can't fit the data. Now that I've read that wikipedia page, I don't think the Poisson distribution should describe this anyways. I'll get back to you if I find the right distribution.

  7. Sorry for the spamming, but I just tried it and a power law distribution ( fits the data very very well from 6 to 30. I'm not sure what the interpretation of this result is though. Power laws are ubiquitous features of complex/chaotic systems, take a look at the wikipedia page.

  8. not spam at all. thanks for the footwork. i'll take a look and see what i'm able to understand.
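
For anyone who wants to try the kind of fit described in comment 7, here is a minimal sketch of one common way to check for a power law: take logs of both the number of A's and the hit count, then fit a straight line by least squares. This is not necessarily how the commenter did it. The function name `fit_power_law` and the values `true_c` and `true_k` are made up for illustration, and the counts below are synthetic, generated from an exact power law. They are not the real hit counts from the page.

```python
import math

def fit_power_law(ns, counts):
    """Fit counts ~ c * n**(-k) by least squares on log-log axes.

    Returns (c, k). Assumes every n and every count is positive.
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(y) for y in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line
    intercept = mean_y - slope * mean_x    # log(c)
    return math.exp(intercept), -slope

# SYNTHETIC data (an assumption, not the page's real counts):
# an exact power law over the range 6 to 30 mentioned in comment 7.
ns = list(range(6, 31))
true_c, true_k = 1.0e6, 3.0
counts = [true_c * n ** (-true_k) for n in ns]

c, k = fit_power_law(ns, counts)
print(f"fitted exponent k = {k:.3f}, prefactor c = {c:.1f}")
```

With real counts the points won't sit exactly on the line, so the thing to look at is how straight the log-log plot is over the range being fitted. A Poisson distribution would curve downward on the same axes, because its tail falls off much faster than any power of n.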

