Archive for October, 2009

Fun with statistics 1: Mythical average liblog

Wednesday, October 7th, 2009

[Non-series heading: Yet another in the series of posts related to But Still They Blog: Liblogs 2007-2009.]

Here’s a proper picture of The Average Moderately-Visible Still-Available Liblog (that is, the “average liblog” in my 521-blog study, where “moderately visible” means a Google Page Rank of 4 or higher in recent times and the blog had at least three posts in either March-May 2007, 2008, or 2009…)

General characteristics

The blog began in late 2005. It has a Google Page Rank of 4.6–but, of course, GPR is always a whole number. When checked on September 30, 2009, the most recent post was 17 weeks old.


In March-May 2007, this blog had 40 posts totaling 9,220 words, with 47 comments. Posts averaged 231 words each and there were 1.2 comments per post.


In March-May 2008, this blog had 36 posts totaling 8,574 words, with 41 comments. Posts averaged 277 words each and there were 1.4 comments per post.


In March-May 2009, this blog had 29 posts totaling 6,662 words, with 27 comments. Posts averaged 259 words each and there were 1.3 comments per post.


This blog had 148% more posts in 2008 than in 2007, 72% more in 2009 than in 2008, and 102% more in 2009 than in 2007.

It was 159% longer in 2008 than in 2007, 31% longer in 2009 than in 2008, and 116% longer in 2009 than in 2007. Posts, however, were 133% longer in 2008 than in 2007, 27% longer in 2009 than in 2008 and 119% longer in 2009 than in 2007.

There were 53% more comments in 2009 than in 2008 and 127% more comments in 2009 than in 2007–but there were also 72% more comments per post in 2009 than in 2008 and 152% more in 2009 than in 2007.

Impossible, you say?

Well, yes. It is, in fact, impossible for a blog to have these characteristics–there’s a bunch of internal contradictions. (In fact, not only is it impossible, it’s also atypical in almost every respect.)

But it is, in fact, what Excel will report as averages for each of those figures–and the problem isn’t an Excel problem.
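A minimal sketch of how averaging each spreadsheet column on its own produces an impossible blog. The three blogs and every number below are made up for illustration (they are not from the study), and Python stands in for the spreadsheet:

```python
from statistics import mean

# Hypothetical post and word counts for three blogs (not study data).
posts_2007 = [10, 30, 80]
words_2007 = [5000, 6000, 7000]
posts_2008 = [40, 30, 35]

# Averaging each column separately, as a spreadsheet would.
avg_posts_2007 = mean(posts_2007)
avg_words_2007 = mean(words_2007)
avg_posts_2008 = mean(posts_2008)

# "Words per post" averaged across blogs: the mean of each blog's own ratio.
avg_of_ratios = mean(w / p for w, p in zip(words_2007, posts_2007))

# What one real blog with the average totals would actually show.
ratio_of_avgs = avg_words_2007 / avg_posts_2007

# "Percent more posts" averaged across blogs: the mean of per-blog changes.
avg_change = mean((new - old) / old for old, new in zip(posts_2007, posts_2008))

print(f"average blog: {avg_posts_2007} posts, {avg_words_2007} words")
print(f"average words per post (column average): {avg_of_ratios:.1f}")
print(f"words per post implied by the totals:    {ratio_of_avgs:.1f}")
print(f"average change in posts: {avg_change:+.1%}, "
      f"yet average posts went from {avg_posts_2007} to {avg_posts_2008}")
```

Each figure is a legitimate average of its own column. The contradiction appears only when they are read together as a description of a single blog, because the average of ratios is generally not the ratio of averages.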

Soon: The Mythical Median Liblog, which is just a little less implausible than the Mythical Average Liblog.

50 Movie Comedy Classics, Disc 11

Wednesday, October 7th, 2009

Behave Yourself, 1951, b&w. George Beck (dir.), Farley Granger, Shelley Winters, William Demarest, Francis L. Sullivan, Margalo Gillmore, Lon Chaney Jr., Hans Conried, Elisha Cook Jr., Glenn Anders, Allen Jenkins, Sheldon Leonard, Marvin Kaplan. 1:21.

The plot: A CPA (Granger), somewhat browbeaten by his mother-in-law, realizes almost too late that it’s his 2nd Anniversary. He goes to a store to buy his wife (a svelte and wonderfully funny Shelley Winters) a nightgown. Meanwhile, a dog (trained to go to a certain spot) has come into town as part of some odd scheme—and, somehow, breaks free and starts following the CPA, in the process demolishing enough of the store so that the CPA flees. And, when the dog keeps following him, pretends that the dog is his present for his wife.

Then an ad shows up about the lost dog, with precise physical description. The CPA wants to do the right thing… and that’s just the beginning of a wonderfully funny, fast-moving blend of caper and farce, with lots of mistaken identities, bad guys getting shot (sometimes with the CPA’s business card in hand), mother-in-law stuff, counterfeit money (that wasn’t supposed to be counterfeit), overeager cops…and one charming dog. It’s a ’50s movie: The married couple have twin beds. But never mind…

The cast is remarkable—William Demarest as a cop, Lon Chaney, Hans Conried, Elisha Cook Jr., Glenn Anders, Sheldon Leonard and Marvin Kaplan as gangsters and other criminals, Margalo Gillmore as the mother-in-law. They all do good jobs (Farley Granger, the CPA, is probably my least favorite character of the lot—he’s OK, but so many others are better). Good print, good sound. Thoroughly enjoyable. $2.00.

The Sin of Harold Diddlebock, 1947, b&w. Preston Sturges (dir. & screenplay), Harold Lloyd, Jimmy Conlin, Raymond Walburn, Rudy Vallee, Edgar Kennedy, Arline Judge, Franklin Pangborn, Lionel Stander, Margaret Hamilton. 1:29.

How’s this for a movie that doesn’t worry about suspension of disbelief: This one begins with almost nine minutes from a Harold Lloyd silent movie, The Freshman, where a waterboy on a college football team somehow becomes the team hero—and that begins with an overlay acknowledging that it’s from an old Lloyd silent. At the end of the game, with sound inserted, a businessman says “Look me up when you’re through here, I’ll have a job for you.”

Cut to the much older Lloyd showing up for that interview. The businessman, owner of an ad agency, doesn’t remember the sport or the incident (apparently he does this a lot) but has a starting position: as an accounting clerk, where Lloyd (that is, Harold Diddlebock) can work his way up. Twenty years later, he’s done nothing but work on those books. At which point, the owner notes that he’s a failure and it’s time to cut him loose, with around $2,000. Diddlebock takes the money in cash (he doesn’t trust anybody at this point) and, as he’s leaving, tells a young woman his sad tale (which she already knows). He’d fallen in love with every sister in that family as they came to work, but never did anything about it, except that he finally purchased a ring with which to propose, and he gives it to the youngest sister so she can keep it for when she meets the right person. Exit this hapless and unmotivated character…

Who we next see chatting with a shifty guy who wants to buy him a drink, and Diddlebock’s never had one. The shifty guy’s also spotted the wad but is impeccably honest. So, into the bar they go (at 11 a.m.), and the bartender makes up a special creation, the Diddlebock, with no apparent alcoholic taste and enough of a kick that Diddlebock’s yelling out, then wondering who made all that noise. A bookie shows up to collect from the shifty guy; Diddlebock decides to bet half his savings on a longshot, wins, bets again…and next we see a brief montage of nightclubs and carousing.

When Diddlebock awakes two days later, he finds that he has no money—but does own a rundown circus with 37 hungry lions and no way to get rid of it. That sets up a lengthy set of scenes involving a well-trained lion, bankers and their reputation, and the kind of physical humor (and physical danger) we’d expect from Lloyd. And, to be sure, there’s an odd happy ending.

I had mixed feelings about this one. There’s some background noise on the soundtrack but that’s not the major issue. I’m not sure what it is—the movie’s amusing, as you’d expect from Sturges and the great cast, but maybe I expected more. Still, it’s not bad, and for fans of Lloyd it’s his last movie (and only movie after 1938). $1.25.

Beat the Devil, 1953, b&w. John Huston (dir.), Humphrey Bogart, Jennifer Jones, Gina Lollobrigida, Robert Morley, Peter Lorre, Edward Underdown, Ivor Barnard. 1:29.

I saw this picture in another public domain collection five years ago (the “DoubleDouble” set of 44 movies sent to subscribers of a long-since-defunct DVD magazine). In that collection, this movie was with a group of “Famous Directors, Cult Classics” flicks. Here, it’s classed as a comedy. Maybe it’s just hard to classify. Back then, I thought the acting was better than the “dubious plot.” I still do.

The plot, such as it is: In Ravello, waiting for a slow boat to Africa, are an odd group of four men (all from different countries), plus a jaded adventurer and his gorgeous Italian wife—and a stiff-upper-lip Englishman and his sharp but perhaps over-imaginative American wife. The adventurer (Bogart) is involved with the odd quartet, apparently out to acquire uranium-bearing lands in British East Africa on the sly: The quartet is providing the funds and Bogart has the contacts. The other couple is off to claim a coffee plantation the Brit has inherited—but if you believe his wife, he’s actually out for uranium as well. Let’s see. Both wives get involved with each other’s husbands. One of the quartet is a murderous type (not Peter Lorre). There’s some romance and lots of double-crossing. There’s a moderately funny sequence involving a broken-down, runaway car and two briefly-presumed deaths. The ship isn’t all it might be—the captain even less so. And, well…while there’s a resolution, I didn’t find it all that coherent. (The sleeve says the movie’s 100 minutes and it actually ran 89 minutes, so I thought there might be ten minutes of coherent plot missing—but IMDB and Wikipedia both show 89 minutes.)

Still…John Huston directing (Truman Capote and Huston writing). Humphrey Bogart. Gina Lollobrigida. Robert Morley. Peter Lorre. Jennifer Jones. All playing it straight and making for an amusing film. How far wrong can you go? Decent print; I’ll give it $1.50.

Passport to Pimlico, 1949, b&w. Henry Cornelius (dir.), Stanley Holloway, Betty Warren, Barbara Murray, Paul Dupuis, John Slater, Jane Hylton, Hermione Baddeley, Margaret Rutherford. 1:24.

While in some ways distinctly a film of its time—post-war rationing in England, unexploded bombs and lots of shortages—this is also a great plot idea, fairly well carried out. In short: In Pimlico (a small area in London, not nearly so grand in this movie as it’s made to sound these days), there’s an unexploded bomb in an excavation (in an open area where a visionary would like to see a Lido, with swimming pool, but the mercenary neighborhood leaders just want to sell it off). Kids playing nearby manage to set off the bomb—and in the process of one person sliding into the excavation and being pulled out, he spots an antechamber opened by the bomb. He goes out with a ladder, climbs down and discovers a treasure trove.

As things develop, the treasure trove includes a document that says the neighborhood was ceded to the Duke of Burgundy, a deed that was never reversed. The residents (19 families) decide this means they’re Burgundians, so they can ignore British pub closing laws, rationing etc. The British government can’t actually fault the finding (aided by authentication by Prof. Hatton-Jones, a winning performance by Margaret Rutherford), and things escalate from there. Let’s just say that Whitehall comes off as neither wise (or in any way reasonable) nor liked by Londoners, and the good guys win.

Quite charming, and occasionally a good laugh. I wondered about the “In Memoriam” at the start of the film, followed not by a name but by a wreath surrounding some odd documents—but by the end, I’d figured out that the documents were ration-related.

Very nice. Decent print. $1.75.

Commonalities: The followup

Sunday, October 4th, 2009

A couple of days ago, I posted “Commonalities and generalizations,” offering three lists of liblogs and asking what each list had in common.

I love the responses (see the comments on that post)–and one of them (Steve Lawson), either through deep memory or, ahem, doing a little research, had the right answer: Each list consisted of blogs using one blogging platform (combining MovableType and TypePad into a single software platform).

The first list is blogs using Blogger, most (but not all) hosted on Blogger’s own hosting service. The second list is blogs using MovableType or hosted at TypePad. The third list is blogs using WordPress, some of them hosted on WordPress’s own hosting service.

Generalizations? Just this: For at least the first and third list, it wouldn’t be all that improbable to do a reasonably broad sampling of liblogs, enough to be statistically valid, and conclude that the software in question was used by the vast majority of liblogs. (It would be a little tougher with TypePad/MovableType, but certainly not impossible.)

And it would be wrong.

Here are the actual figures for 519 of the 521 blogs in the project I’m currently working on–all of them blogs by library people, all with blogs still visible in September 2009 and having had at least three posts in either March-May 2007, March-May 2008, or March-May 2009, all primarily in English and, a serious limiting factor deliberately added to reduce the size of the universe, all of them having a Google Page Rank of 4 or higher in either September 2008 or sometime in Spring 2009. (I was hoping that set of limits would yield around 400 blogs, making the research a lot easier. In practice, it yielded 521…which is still better than the 650-700 I might have ended up with using last year’s rules.)

Program                Blogs  Percentage
WordPress                245       47.2%
Blogger                  190       36.6%
TypePad/MovableType       48        9.2%
Other                     24        4.6%
Drupal                     7        1.3%
LiveJournal                5        1.0%

Looking back at the 2007-2008 study, which included 607 blogs (but a different sample–with 127 removed and 41 added), the percentage using WordPress has jumped from 37.9% to 47.2% while the percentage using TypePad/MovableType has slightly increased from 8.8% to 9.2%–and the Blogger percentage is unchanged, at 36.6%. This may mean that a lot of “other” blogs changed platforms, it may mean I did a better job of identifying WordPress blogs (I don’t remember looking at source last year), or it may mean nothing at all.

At this point, if this set of 521 blogs (two disappeared between early September and September 30, when I did this scan) is in fact representative of more visible, more active liblogs as a whole–an assertion I am not willing to make–then:

  • You’d be wrong to say that any program is used by “most” libloggers.
  • You’d be awfully close in the case of WordPress, and it’s certainly a strong plurality.
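Both points can be checked directly from the counts in the table. A small sketch, with the counts copied from the table and the variable names chosen only for illustration:

```python
# Blog counts per platform, as listed in the table (519 blogs in all).
counts = {
    "WordPress": 245,
    "Blogger": 190,
    "TypePad/MovableType": 48,
    "Other": 24,
    "Drupal": 7,
    "LiveJournal": 5,
}

total = sum(counts.values())

# Percentage share of each platform, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# The plurality leader, and whether it is used by "most" blogs (over half).
leader = max(counts, key=counts.get)
has_majority = counts[leader] > total / 2

for name, pct in shares.items():
    print(f"{name:<22}{counts[name]:>5}{pct:>8.1f}%")
print(f"leader: {leader}, majority: {has_majority}")
```

With these counts the leader is WordPress at 47.2%, which is a plurality but short of the 260 blogs a majority of 519 would need.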

Angel’s comment on the other post intrigued me, so I looked at the subsets of these blogs with Google Page Rank of 5, 6, and 7. (There’s only one with GPR 8 and none with GPR 9). There’s no real correlation–in fact, the percentage of higher-profile blogs using WordPress is slightly lower (44%).

But I can understand how he could arrive at this assumption–because I grabbed each list in descending GPR order. Since there are more WordPress blogs than either Blogger or TypePad/MovableType, the sample for WordPress is skewed toward higher-profile blogs.

Anyway, there’s one table from But Still They Blog: The Liblog Landscape 2007-2009, a work very much in progress. Incidentally, while I use GPR4 as a cutoff, I don’t mention the GPR for any blog within the book; this time, I won’t even have an overall chart of blogs-by-GPR. I think a couple of the very high GPRs are flukes, and I know that moving a blog can wipe out your GPR for some time… It’s not a great metric, but right now it’s the only one available that’s easy enough to use.

Cites & Insights 9:12 (November 2009) now available

Saturday, October 3rd, 2009

Yes, I know it’s pretty early in October for the November issue–but it’s ready, and I wanted to stay well out of the way of Open Access Week, so…

Cites & Insights 9:12 (November 2009) is now available

This 34-page issue (PDF as usual, but an HTML version is available if you plan to read it online) consists of one essay:

Library Access to Scholarship

A year’s worth of source material and commentary, organized into:
Mandates, Policies and Compacts
The Colors of OA
Framing and Mysteries
The Problem(s) with Green OA
Quality, Value and Progress

Chances are, this is the last hurrah for Library Access to Scholarship and my semi-active independent commentary on open access. To coin a phrase, this may be the optimal and inevitable conclusion to close to a decade of work in this area.

One note (repeated at the start of the HTML version): Please don’t use the HTML version if you plan to print more than a small portion of the essay. The PDF issue prints out as 34 pages. Depending on your browser and other settings, the HTML version will require 48 to 51 pages, possibly more. (I happen to think the PDF version is a lot more readable as well, but that’s probably only true if you’re reading in print–which is why I make the HTML version available.)

Commonalities and generalizations

Friday, October 2nd, 2009

This is a two-part post: A question today, the answer in a day or two (or three…)

The question: What do each of these lists have in common, other than all being liblogs, and what could you reasonably generalize from each of them taken in isolation?

  • Catalogablog, Commentary from Carl Grant, The Imaginary Journal of Poetic Economics, Information Literacy Weblog, OA Librarian, Open Access News, CogSci Librarian, Coyle’s InFormation, Digitization 101, Filipino Librarian, The Handheld Librarian, The Information Literacy Land of Confusion, It’s all good, Killin’ time being lazy, The Laughing Librarian, A LIBRARIAN AT THE KITCHEN TABLE, Librarian on the edge, Library Marketing-Thinking Outside the Book, Library Stories: Libraries & Librarians in the News, LibraryThing, Museum 2.0, OPL Plus (not just for OPLs anymore), Out of the Jungle, Peter Scott’s Library Blog, Rambling Librarian :: Incidental Thoughts of a Singapore Liblogarian, ricklibrarian, School Librarian in Action, Stephen Gallant Review, UK Freedom of Information Act (FOIA) Blog, User Education Resources for Librarians, AbsTracked, Baby Boomer Librarian, BookBitchBlog, Borderland Tales, Carolyne’s pages of interest, The Centered Librarian, A Chair, A Fireplace & A Tea Cozy, Connie Crosby, The Cool Librarian, DigiCMB, Eagle Dawg Blog, Friends: Social Networking Sites for Engaged Library Services, frontier librarian, Game On: Games in Libraries, Gather No Dust, The Gypsy Librarian, Heretical Librarian, The In Season Christian Librarian, info NeoGnostic, Information Junk
  • Stephen’s Lighthouse, 025.431: The Dewey blog, ©ollectanea, Academic Librarian, beSpacific, Confessions of a Science Librarian, The Days & Nights of the Lipstick Librarian!, eFoundations,, The Good Library Blog, Government Info Pro, Hectic Pace, Law Librarian Blog, Library 2.0: An Academic’s Perspective, LibraryLaw Blog, mamamusings, Outgoing, RSS4Lib, The Ubiquitous Librarian, Bluestalking, bookshelves of doom, Cataloging Futures, Christina’s LIS Rant, The Distant Librarian, DrWeb’s Domain, The Kept-Up Academic Librarian, Loomware – Crafting New Libraries, Metalogue, Professional-Lurker: Comments by an academic in cyberspace, reeling and writhing, Science Library Pad, SciTech Library Question, Superpatron – Friends of the Library, for the net, Tinfoil + Raccoon, Tom Roper’s Weblog, Creating the One-shot Library Workshop, Feel-good Librarian, Lady Crumpet’s Armoire, Librarian of Fortune, Marcus’ World, Phil Bradley’s weblog, Slow Library, T. Scott, Weibel Lines
  • Arnold Digital, David Lee King, Everybody’s Libraries, Free Range Librarian,, ResourceShelf, The Shifted Librarian, ACRL Insider, ACRLog, alliance virtual library, Be openly accessible or be obscure, Beyond the Job,, BlogJunction,,, Caveat Lector, checking out and checking in, Disruptive Library Technology Jester, The FRBR Blog, The Galecia Group,, iLibrarian, In the Library with the Lead Pipe, Info Career Trends, Infoblog, Information Wants To Be Free, j’s scratchpad, Librarian In Black,,, Library clips, Library Garden, Library Juice, Library Monk, Library Web Chic,,, LITA Blog, Nodalities blog, Online Insider, The Other Librarian, Panlibus, PLA Blog, PomeRantz, ResearchBuzz, The Search Principle blog, Slaw, Tame the Web, UK Web Focus

Guesses? Certainties? Would you be right or wrong about any of the generalizations?

Added October 4, 2009: And here’s the answer

Not giving credit where credit is due?

Thursday, October 1st, 2009

Warning: This is another in what’s likely to be a very long set of posts, over several months, related to the project I’m currently calling But Still They Post: The Liblog Landscape 2007-2009. If you don’t give a damn about liblogs (blogs by library people) or blogging in general, you can stop reading now.


Yesterday, I did a “secondary metric” pass on the 521 liblogs in the study (I finished the primary metrics pass a little while back, and am slightly procrastinating on starting up the long, drawn-out analysis and commentary portion of the project).

The secondary pass involved accessing each blog and determining two things:

  1. Currency of the most recent post, rounded up to specific week limits (for ease of graphing, interpretation and, frankly, recording).
  2. Blogging software used for the blog, if obvious or readily knowable.
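For the first item, rounding up to week limits might look something like the sketch below. The specific limits are hypothetical (the post doesn’t say which ones were used), as are the function and variable names:

```python
from bisect import bisect_left
from datetime import date
from math import ceil
from typing import Optional

# Hypothetical week limits; the actual limits used in the study are not stated.
WEEK_LIMITS = [1, 2, 4, 8, 13, 26, 52]

def currency_bucket(last_post: date, checked: date) -> Optional[int]:
    """Round the age of the most recent post up to the next week limit.

    Returns None when the post is older than the largest limit.
    """
    # Age in whole weeks, rounded up; a post from today counts as week 1.
    weeks = max(1, ceil((checked - last_post).days / 7))
    # Index of the first limit that is >= weeks.
    i = bisect_left(WEEK_LIMITS, weeks)
    return WEEK_LIMITS[i] if i < len(WEEK_LIMITS) else None

checked_on = date(2009, 9, 30)
print(currency_bucket(date(2009, 9, 25), checked_on))
print(currency_bucket(date(2009, 9, 1), checked_on))
```

Rounding up rather than to the nearest limit means a blog is never recorded as more current than it really is.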

I’ll probably talk about the first in a later post. So far, I haven’t even looked at the overall results (but I have results for 519 of the 521–two of them now seem unavailable, although they were available when I was doing the detailed metrics earlier in September).

This post is about the second item.


On the first pass, I recorded only five one-letter values:

  • b for Blogger–that is, any blogs with Blogger’s hosting domain as part of their address and any others with the “B” favicon, the “I power Blogger” graphic or, for that matter, the Blogger toolbar at the top of the blog.
  • l for LiveJournal–and those are always obvious.
  • t for TypePad and MovableType (that could be s for SixApart, but never mind…), either because the TypePad domain is part of the address or because MovableType is mentioned somewhere.
  • w for WordPress, which usually shows up either as a favicon or in a direct credit and/or link on the page (of course, blogs hosted on WordPress’s own hosting service are also obvious).
  • u for Unknown, which actually meant “everybody else.”

At the end of that pass, setting aside the two blogs that seem to have gone into hiding, I had 65 “u” cases. I decided to spend another hour (actually, less than an hour) investigating these a little further–by looking at the page source.

In a few cases, “u” literally stood for “other”–blogs that are explicitly, on the home page, based on some other software (square-space, blog-city, serendipity, etc.). In 54 cases, you really did need to go to the page-source view to see what was being used (and in 17 of those, I left the value as “u”).
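The two passes were done by hand, but the same logic could be roughed out in code. This is only a sketch under stated assumptions: the domain names and page-source hints below are illustrative guesses rather than the checks actually used, and the function name is hypothetical.

```python
from urllib.parse import urlparse

# Assumed hosting domains for each one-letter code (illustrative only).
HOSTED_DOMAINS = {
    "blogspot.com": "b",
    "livejournal.com": "l",
    "typepad.com": "t",
    "wordpress.com": "w",
}

# Assumed page-source hints for the second pass (illustrative only).
SOURCE_HINTS = (
    ("wp-content", "w"),
    ("movable type", "t"),
)

def classify(url: str, page_source: str = "") -> str:
    """Return a one-letter platform code, or 'u' when nothing matches."""
    # First pass: look at the address alone.
    host = urlparse(url).hostname or ""
    for domain, code in HOSTED_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return code
    # Second pass: look for telltale strings in the page source.
    lowered = page_source.lower()
    for hint, code in SOURCE_HINTS:
        if hint in lowered:
            return code
    return "u"

print(classify("http://example.blogspot.com/"))
print(classify("http://example.org/blog/",
               '<link rel="stylesheet" href="/wp-content/themes/x/style.css">'))
```

Anything that falls through both passes stays “u”, which matches the way the leftover cases were recorded.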

Anyone care to raise a hand as to the most prominent software used that wasn’t identifiable on the home page?

That’s right, you in the second row: WordPress.

In fact, 29 of the 54–or 29 of 50, if you leave out four Library Journal/School Library Journal blogs–use WordPress software but don’t say so on the blog home page itself. (I won’t ask the architect of the LJ/SLJ blog software to come forward…some things are better left anonymous, although if the goal is to maximize the number of clicks and page exposures for a given amount of actual blog reading, it’s brilliantly successful software.)

I wonder about that result. Seems to me you have to go to some modest effort to expunge all direct evidence of WordPress from most of the templates (although some of them might already do that for you).

WordPress is open source software. It’s damn good open source software (I am so happy to be back on WordPress after a few months using MovableType…). It’s good enough that it’s been used as the basis for a contemporary online catalog interface. It’s free…

Seems to me that, given all that, it’s reasonable to leave a credit line in the blog. I guess I don’t really know why people would go to the effort of removing credit. (Maybe commenters can explain.)

Not a big deal. Not to give away figures, but more than 85% of liblogs using WordPress software do include explicit credit lines or favicons or the like. But I do wonder…

(The actual breakdown? Later, certainly in the book–if there is one–and probably in a post and/or C&I article. Maybe soon.)

For anyone who pays attention to categories (I would say “both of you” but that may overestimate), I’m tagging some of the posts for the new project with Liblog Landscape, which recognizes that it is, to some extent, a sequel to the current book.

Update 10/2: When I say “I wonder…” I should clarify that this is mostly idle curiosity.

Idle enough that I didn’t save the list of 29–as soon as I’d gotten that count, I re-sorted the spreadsheet, and the only way I could restore that list of 29 would be to recheck several hundred blogs.

A couple of people gave me perfectly acceptable reasons that their blogs don’t credit WP on the home page (but do elsewhere). I don’t wish to pursue the matter with all 29 bloggers (and have no desire or ability to even name them!)–and I suspect any attempt to pursue the matter would feel like “blame” to at least some of those bloggers, a blame that would not be intended.

Comments I get here will satisfy the idle curiosity.