Fun with statistics 2: The slightly-less-mythical median liblog

A few days ago, I posted Fun with statistics 1: Mythical average liblog.

That post, which like this one is part of a non-series leading up to “But Still They Blog,” gave the metrics for the “average liblog” from the population of 521 fairly visible liblogs in the study. In that case, not only was there no such liblog, it would be impossible for there to be such a liblog, as some of the metrics conflicted with others.

So let’s look at the “median liblog,” taking the median instead of the average for those same metrics.

General characteristics

The blog began in November 2005. It has a Google PageRank of 5 (or had such a rank in the summer of 2009), and when checked on September 30, 2009, the most recent post was more than one week but less than two weeks old.


The blog had 19 posts in March-May 2007, totaling 4,028 words in posts averaging 206 words each. There were nine comments, or 0.5 comments per post.


The blog had 17 posts in March-May 2008, totaling 4,098 words in posts averaging 230 words each. There were 11 comments, or 0.7 comments per post.


The blog had nine posts in March-May 2009, totaling 1,990 words in posts averaging 202 words each. There were five comments, or 0.4 comments per post.


There were 23% fewer posts in 2008 than in 2007, 31% fewer in 2009 than in 2008, and 34% fewer in 2009 than in 2007.

While the blog as a whole was 6% shorter in 2008 than in 2007, 31% shorter in 2009 than in 2008, and 35% shorter in 2009 than in 2007, posts were within a percentage point of being the same length in each year.

While there were 31% fewer comments in 2009 than in 2008 and 34% fewer in 2009 than in 2007, comments per post were within one percent of being the same each year.

I think the individual-year metrics are internally consistent or close to it–but some of the changes aren’t internally consistent. There’s a simple reason for that, having to do with default values for empty cases.
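To see how that kind of mismatch can arise, here is a minimal Python sketch. Everything in it is an assumption for illustration: the post counts are made up (not data from the study), and the default of 0% for a blog with no posts in the earlier year is a guess at how an empty case might be recorded. It compares the change between two yearly medians with the median of the per-blog changes.

```python
from statistics import median

# Hypothetical post counts for seven made-up blogs (NOT data from the study).
posts_2007 = [30, 19, 10, 40, 0, 25, 8]
posts_2008 = [12, 17, 20, 18, 6, 15, 4]

# Assumption: an empty case (no posts in the earlier year, so no
# percentage change can be computed) is recorded as 0% change.
DEFAULT_CHANGE = 0.0

def pct_change(old, new):
    """Percentage change from old to new; negative means a decline."""
    if old == 0:
        return DEFAULT_CHANGE
    return (new - old) * 100 / old

# Change between the two yearly medians.
change_of_medians = pct_change(median(posts_2007), median(posts_2008))

# Median of the individual blogs' changes, defaults included.
per_blog_changes = [pct_change(a, b) for a, b in zip(posts_2007, posts_2008)]
median_of_changes = median(per_blog_changes)

print(f"change of medians: {change_of_medians:.1f}%")
print(f"median of changes: {median_of_changes:.1f}%")
```

With these invented numbers the two figures come out well apart, even though each is a perfectly good median of its own column.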


This is all silliness, of course–and the project itself does not include any of this. If I did describe a median blog, it would be done using trimmed medians (omitting empty cases)–but I doubt that I will.
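A trimmed median in that sense (omitting empty cases) could be sketched in Python as follows. The function name, the use of None to mark an empty case, and the comments-per-post figures are all hypothetical, chosen only to show the idea.

```python
from statistics import median

def trimmed_median(values):
    """Median of the non-empty cases only; None marks an empty case."""
    present = [v for v in values if v is not None]
    return median(present) if present else None

# Hypothetical comments-per-post figures (NOT data from the study).
# None means the blog had no posts in the period, so there is no ratio;
# 0.0 is a real value (posts, but no comments) and is kept.
comments_per_post = [0.5, None, 1.2, 0.0, None, 0.7, 0.3]

# What happens if empty cases are filled with a default of 0 instead.
filled = [0.0 if v is None else v for v in comments_per_post]

plain = median(filled)
trimmed = trimmed_median(comments_per_post)
print(plain, trimmed)
```

The sketch keeps genuine zeros and drops only the cases where no value exists, which is the distinction that matters here.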

One question is more interesting, and I don’t have an answer yet (and won’t for a while): When you divide metrics into quintiles (e.g., the 20% most prolific blogs, the next 20%, the middling 20%, etc.), will there be blogs that are “middling” on all measures–that fall into the third quintile for all metrics? We shall see…
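For what it's worth, the check itself is simple to sketch in Python. The ten blogs and three metrics below are invented, the quintiles helper is a hypothetical rank-based assignment (ties broken by list order), and a real analysis might draw the quintile boundaries differently.

```python
def quintiles(values):
    """Quintile number 1..5 for each value, by rank (1 = highest fifth)."""
    n = len(values)
    # Indexes sorted from largest value to smallest; ties keep list order.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    result = [0] * n
    for rank, i in enumerate(order):
        result[i] = rank * 5 // n + 1
    return result

# Hypothetical metrics for ten made-up blogs (NOT data from the study).
metrics = {
    "posts":    [40, 35, 30, 25, 20, 18, 12, 9, 5, 2],
    "words":    [9000, 2000, 7000, 8000, 4000, 4100, 1500, 3000, 1000, 500],
    "comments": [50, 3, 20, 30, 8, 10, 1, 40, 0, 2],
}

per_metric = [quintiles(v) for v in metrics.values()]
blog_count = len(metrics["posts"])

# Blogs (by index) that land in the third quintile on every metric.
middling_on_all = [
    i for i in range(blog_count)
    if all(q[i] == 3 for q in per_metric)
]
print(middling_on_all)
```

In this toy data only one blog is middling across the board, and it would take very little shuffling of the numbers to leave none at all.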
