Archive for July, 2006

What’s the point?

Monday, July 17th, 2006

Here’s a truly random lunchtime post–or maybe it’s not so random.

An odd interchange of comments took place on one of the most “random” posts on this random blog.

I objected to an absurd statement about a very expensive car (a statement which I quoted and which wasn’t taken out of context). A person who used his full name took issue with part of my comment…sort of.

Foolishly, perhaps, I responded (I truly don’t believe that rich people can reasonably ignore gas mileage; it’s a resources and environment issue, not a money issue). It went back and forth, and in the process the commenter, still using his full name, managed to inform me that I was (a) poor, (b) ignorant or uneducated, and (c) a whiner with an annoying little blog.

No argument about the “little blog” part of it, and annoying is in the mind of the reader, so that’s fine too.

But here’s the thing…

Why on earth was this person, who doesn’t appear to be part of my plausible readership, here in the first place?

The post doesn’t show up in the first 100 results for “CL65” or “Mercedes CL65 ” on Yahoo or Google. Nor should it.

I wasn’t saying that CL65 owners were stupid–only that it was ridiculous for a magazine to say “There is not a single aspect to the vehicle that a reasonable person could find fault with.”

But somehow this person felt such a need to defend this $186K car that he had to comment…

Here’s the other thing: Why would a commenter go out of his way to insult the proprietor of a blog? Does he believe I’m going to stop writing it because I’ve been labeled a whiner or because he doesn’t regard the blog as important? Is he trying to make friends and influence people?

Looking back at it, I realize something a little more bizarre: The comment was on a post that was considerably more than a year old. And has not, of course, been followed by a string of denunciations of this car model or any other car model.

I dunno. As you all know, I avoid controversy and contentiousness at…well, maybe not at all costs. (I still don’t use emoticons.) But even in my feistiest moods, I can’t imagine commenting on a 14-month-old post that didn’t address me by name and that I had to dig pretty hard to find. Go ahead: Write a post saying Honda Civics are for wusses or that Randy Newman is a talentless hack, and post it in some non-library-related blog. If you don’t call me out by name, I’m sure not going to go back a year later and comment on your post, wrong as I may believe it to be. Life really is too short.

By the way: I do owe one minor apology. I assumed that the car in question, given its sheer weight and gas mileage, probably had a mediocre turning radius as well. As the commenter said, that’s wrong: The tested turning radius (37.6 feet) is excellent for a 16-foot-long car.

The general question, I guess: Why would you go to lengths to object to something on a blog by someone you don’t know in a field you’re not involved in? Do people have that much time on their hands?

Bloglines 3: The saga continues

Saturday, July 15th, 2006

This continues the ongoing story that began here and continues here. This blather will be summarized to some extent in the introduction to “Looking at Liblogs: The Great Middle,” scheduled for the September 2006 Cites & Insights.

I actually did the same set of “reach” measures as last year for the full set of 282 potential candidates in the “great middle” category–well, more or less. Having a spreadsheet that includes name and URL for each candidate blog, prepared from the OPML output from Bloglines, made the process of searching for links quite rapid (much more so than however I did it last year).

This time around, I checked the link: count for Google (which is known to be somewhat meaningless–Google admits it’s only a partial result), MSN Search, and Yahoo! (in lieu of last year’s AllTheWeb, noting that Yahoo! owns AllTheWeb). But I also recorded one other figure, which I believe is much more meaningful than link: returns–that is, how many results Yahoo! would actually show me, using its default deduping “very similar” algorithms.

Before offering up some quick ratios for the 282 candidate blogs, remember that these blogs exclude around 90 librarian blogs with more than 196 Bloglines subscriptions as well as close to two hundred with fewer than 19 Bloglines subscriptions. Thus, most of the blogs likely to have the highest “reach” were excluded up front.

  • Google: Results ranged as high as 5,370, with some having no Google links; no ratio is possible.
  • MSN: Results ranged as high as 34,669–and again, some had no links, so no ratio is possible.
  • Yahoo!: Every blog in the great middle had at least five Yahoo! links, with a high of 179,000 or a ratio of 35,800:1, within this “middle” group.
  • Yahoo Results: This is, to be sure, artificially constrained (Yahoo always stops at or before 1000), but only three candidate blogs reached the 1,000 limit. The smallest number was 2, for a 500:1 ratio. This number seems to be much less influenced by blogrolls and other factors that artificially inflate link results.
  • “Reach” using 2005 formula: The highest was 13,497; the lowest, 84. That’s a ratio of 161:1, considerably smaller than last year’s 7,778:1.
  • “Reach” using modified formula: When I adjusted deflators for the three link counts to match this year’s totals, the highest came down to 10,590, while the lowest only declined to 82, reducing the ratio to 128:1. Note that the “top 60” last year had a ratio of 65:1 between highest and lowest reach.
  • Plausible reach: I calculated a new ratio based on twice the Bloglines count plus the Yahoo Results count. That yielded a high of 1,387 and a low of 51, for a ratio of only 27.2:1.
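For anyone who wants to check the arithmetic in the list above, here is a minimal Python sketch. The function names are invented for illustration, and the 25-subscription blog at the end is hypothetical; only the high and low figures come from the list.

```python
def ratio(high, low):
    """Return high:low as a float, or None when low is zero (no ratio possible)."""
    if low == 0:
        return None
    return high / low


def plausible_reach(bloglines_subs, yahoo_results):
    """Twice the Bloglines count plus the Yahoo Results count."""
    return 2 * bloglines_subs + yahoo_results


# High and low figures quoted in the list above.
print(round(ratio(179_000, 5)))    # Yahoo! links: 35800
print(round(ratio(1_000, 2)))      # Yahoo Results: 500
print(round(ratio(13_497, 84)))    # reach, 2005 formula: 161
print(round(ratio(1_387, 51), 1))  # plausible reach: 27.2
print(ratio(5_370, 0))             # Google, where some blogs had no links: None

# Hypothetical blog with 25 Bloglines subscriptions and 40 Yahoo results.
print(plausible_reach(25, 40))     # 90
```

A zero at the low end is why no ratio is given for Google or MSN: there is nothing to divide by.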

I then trimmed the candidate set slightly, by dropping 9 blogs with “plausible reach” counts above 700 and 21 below 70, leaving 253 candidate blogs. Note that the ratio between highest and lowest “plausible reach” is only 10:1, a fairly narrow range–and the same as the ratio between most and fewest Bloglines subscribers.

I’ll make the “reach spreadsheet” available when I publish the article, for those who want to play with the sorting and formulas.


I’m not using any reach factor, including Plausible Reach, as part of the metrics for individual blogs. These factors were only used to narrow the group of candidate blogs to a somewhat manageable number. I believe the new number is more, well, plausible than last year’s–but I don’t believe any “reach” numbers (specifically including Technorati, PubSub, et al) really tell you that much about a blog, particularly if the blogger isn’t aiming to be a superstar.

Now begins the interesting part: Looking at the blogs themselves, preparing brief comments, and preparing a set of metrics that highlights interesting aspects of individual blogs without attempting to rank them. I’ve dropped some metrics (Technorati, BlogPulse, link density). I’m dropping any comments about the “voice” of a blog, which makes particularly good sense since I’ve included blogs in languages other than English.

I’ve added a couple of “interesting items,” one of which is a metric of sorts: The topic of the first post within the test period (March through May 2006), and the topic and comment count for the post with the most comments during that period. The latter (which could disappear during the investigation) is based on a suggestion made on a WebJunction forum.

What order will blog notes appear in? Silly as it may be, the best choice turns out to be alphabetic: It’s not hierarchical and it makes sense to most people who use the Latin alphabet. (Hey, my blog–which won’t be part of the study–certainly isn’t advantaged by an alphabetic sequence!)

Two notes in closing, for now:

  • I’m still soliciting feedback from bloggers who can ascertain the average daily number of sessions during May 2006 (or the total for the month) and the number of unique IP addresses during that month. Comment here or send me email; include the blog name. So far, frankly, there are no apparent correlations between these two factors and anything else–and maybe that’s the true result. Deadline: July 31.
  • Is it feasible to investigate 253 blogs? I honestly have no idea. I’ve allowed six weeks, but that’s only evening and weekend time, and there’s at least one column to be written, probably one little issue of C&I to do, maybe some other C&I essays, and maybe even a little vacation within that time. If I can do 10 an hour, I’ve got time. If I can’t do at least 6 an hour… Of course, I don’t know what the final count will be. Of the first six, one turned out to be an official blog, one didn’t begin until April 2006, and one ended in February 2006 (so the candidate pool is already down to 250)–and of the other three, one took five minutes to review and one [in French] took half an hour. So your guess is as good as mine. I’m sure I won’t lose half the candidates across the board, but I wouldn’t be surprised if the total declined to 200, maybe fewer.
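The feasibility question is mostly division. A rough sketch, using only the blog counts and hourly rates mentioned above (the function name is made up, and the number of hours actually available over six weeks isn’t stated, so none is assumed):

```python
def hours_needed(blog_count, blogs_per_hour):
    """Hours of review time for a given pool size and pace."""
    return blog_count / blogs_per_hour


# Pool sizes and paces taken from the note above.
for count in (253, 200):
    for rate in (10, 6):
        print(f"{count} blogs at {rate}/hour: {hours_needed(count, rate):.1f} hours")
```

At 10 an hour the full 253 come to about 25 hours; at 6 an hour, about 42. One half-hour blog in French is enough to show why the slower pace is the safer planning number.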

Cites & Insights has a new home

Saturday, July 15th, 2006

Cites & Insights: Crawford at Large has moved to

or, if you want to save a little typing,

All issues and essays are now available at the new site (which is a site, as you might guess).

Please update your bookmarks and links accordingly. For the foreseeable future, issues and essays will continue to be available at the old site (but not via the home page).

Thanks to Boise State University Libraries for hosting Cites & Insights over the past few years!

Update: It appears that the HTML essays are showing up oddly in default browser settings (Unicode). I haven’t the vaguest idea why that is, since they were copied unaltered in exactly the same way they were copied to the old site (where they look fine). Suggestions welcome; I’ve asked Blake C. for help. Meanwhile, the PDF issues should be fine.

Update 2: Thanks to helpful readers, we know what the problem is (a character set problem related to the server being used). There may be a quick solution, also suggested by a helpful reader. If so, expect the problem to “resolve itself” this evening (Monday, 7/17). If not, it’s likely to take quite a while for the problem to go away on older essays, as I’ll have to rebuild each .htm file (to get rid of “smart quotes” and dashes). Note that Bibs & Blather in C&I 6.9 has been rebuilt (except for bullets), so it’s no longer an example of the problem.

Update 3, Monday afternoon (7/17): Thanks (enormously!) to Keith Gilbertson, and a little quick Notebook and FTP work, the problem appears to be solved, at least for Windows users. (If you’re a Mac user, you’ve probably had this problem all along. If not, then it should be solved now.) You may need to clear your cache for the solution (adding Windows-1252 character set support) to take effect. I really like smart quotes, so even as I go through and fix the internal URLs in the .html files, I don’t plan to get rid of them.
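For the curious, here is a small Python sketch of the general kind of mismatch involved. It is an illustration under assumptions, not a record of what the server was actually doing: it takes smart quotes and a dash stored as Windows-1252 bytes and shows what happens when those same bytes are read as UTF-8 instead.

```python
# Curly quotes and an en dash; Windows-1252 stores each as a single byte.
text = "\u201csmart quotes\u201d \u2013 and dashes"
raw = text.encode("cp1252")

# Bytes 0x93, 0x94, and 0x96 stand for the two quotes and the dash.
print(raw)

# Read with the right character set, the text comes back intact.
assert raw.decode("cp1252") == text

# Read as UTF-8, those three bytes are invalid on their own, so a lenient
# decoder substitutes the replacement character U+FFFD for each one.
garbled = raw.decode("utf-8", errors="replace")
print(garbled)
print(garbled.count("\ufffd"))  # 3
```

Plain ASCII text survives either way, which is why only the quotes and dashes looked odd.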

Losing the Big Bucks

Friday, July 7th, 2006

Those who visit directly (as opposed to reading posts via RSS) may note something missing.

In the sidebar.

Those little text ads from Google.

I just removed that section from the sidebar template.

Why? Because I only get revenue when someone actually clicks on an ad, but those ads show up every time someone’s here. And, it turns out, almost nobody ever clicks on an ad: I believe there were zero clicks so far during July, and since I turned on the ads, there can’t have been more than a dozen total.

Oh, and I don’t get to collect revenue until there’s $100 worth. At the current rate of earnings, that would be some time in, oh, 2010.

I concluded that, for me, given a total lack of interest in being a Big Deal Blogger, given that my ethics conflict with suggesting that people click on ads they’re not really interested in or clicking on the ads myself (both of which are, morally if not legally, click fraud), it’s a form of commercialization that makes no sense: Ads with no revenue.

This weekend, I’ll try to figure out how to inform Google that I’m no longer participating. Maybe they can contribute my $8 (or whatever) to some good cause. Or it can just sit there…

Am I suggesting that anyone else should or should not have ads? Nah.

(I’m not crazy about getting long text ads interspersed with RSS posts, which I’ve seen from a couple of blogs, but I’m not going to lose any sleep over it either. If the nuisance of the intrusive ads [I don’t consider the sidebar ads intrusive] exceeds the worth of the blog, unsubscribing is even easier than subscribing.)

Am I going to put up some self-congratulatory “ad free zone” icon? Not bloody likely. Hey, I own stock (via mutual funds and CREF). I know full well that most of the magazines I enjoy most, all of the TV I enjoy, and most of the daily paper are paid for by advertising–and I continue to regard local advertising as one virtue of metro daily newspapers. There’s lots wrong with some forms of advertising. There’s lots right with advertising as well.

This is a narrow decision: For this blog, for this time, the Google ad box was pointless.

Bloglines continued: The starting list

Wednesday, July 5th, 2006

Since I seem to be keeping a running log of progress on “Looking at Liblogs: The Great Middle,” I may as well note the next step.

I’ve completed the initial scan of candidates, after establishing a first-stage low and high cutoff for total Bloglines numbers based on the 240-odd candidates already on my Bloglines list (excluding “corporate,” “official,” non-library blogs). That first cut reduced the 240 to 200.

Here’s the rest:

  • LISWiki Weblogs page, blogs that I hadn’t already looked at (Individual and non-English only): 112 below the low-sub. cut (and a bunch of zero-sub. Persian blogs, and some other zero-sub. blogs), 7 above the high-sub cut, 63 “dead” (no post since March 1) or with no feed, or not blogs; 149 added to the candidate pool.
  • DMOZ/Open Directory, those not looked at in the first two steps: 7 below the low-sub, 0 above the high-sub, 23 dead/no feed/missing in action; 4 added to the pool. (I’d already considered most DMOZ blogs.)
  • PubSub library list that hadn’t already been looked at, plus a handful of blogs whose creators sent me information about them: 18 below the low-sub cutoff, 0 above the high-sub., 8 dead/no feed; 15 added to the pool.

Note that the handful of blogs whose creators told me to leave them out–and it’s a small handful–were included in these steps to maintain some integrity; I’d eliminate them later.

That left me with 368 candidates–way too many even for the expanded “look” I was planning. It also means that I checked out something like 650 liblogs in all, of which 554 are still active, aren’t official/corporate/large group, have an RSS feed, and have at least one subscription.

As should be obvious from the above, the high and low cuts weren’t symmetric, and that’s not surprising: I try to subscribe to interesting new blogs, but it’s natural that I’d have more very-popular blogs than light-sub. blogs.

After trying a few possibilities, and noting that I had to make cuts at numbers (how do you decide between two blogs that both have, say, 23 Bloglines subscriptions–without doing the kind of extended “reach” investigation I did last year and don’t want to repeat?), I wound up deciding on a “great middle” that’s skewed slightly towards more established blogs.

To wit, my new candidate pool, which will shrink slightly as I do more checking and may be cut more sharply if I just decide I can’t deal with 280 blogs (that would be a long story, but maybe that’s OK), is an arbitrary “half of the upper middle,” eliminating the top 90 and bottom 184 (based purely on Bloglines subscriptions).

Interestingly (or not), that results in just over a 10:1 ratio in subscription counts between the top (196) and bottom (19).
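The tallies in this post can be checked in a few lines of Python. This is only a sketch: the variable names are invented, and subtracting the top 90 and bottom 184 from the 554 active blogs is one reading of the cut that happens to land on the 280 mentioned above.

```python
# Candidates after the first cut, plus those added from each later source.
first_cut = 200
added = {"LISWiki": 149, "DMOZ": 4, "PubSub and submissions": 15}
candidates = first_cut + sum(added.values())
print(candidates)  # 368

# Active blogs with a feed and at least one subscription, minus both tails.
active_with_subs = 554
great_middle = active_with_subs - 90 - 184
print(great_middle)  # 280

# Ratio between the top and bottom subscription counts in the middle group.
print(round(196 / 19, 1))  # 10.3
```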

So what happens now? As time permits, and in addition to other writing and a little of that summertime fun I was promising myself, a small amount of “reach correlation” checking early on, and a large amount of metrics over the next month or so. (The metrics will mostly involve posts from March through May, so timing isn’t that important; I’ll try to do the reach correlation within the next week or so, to make it more-or-less comparable to the three-day Bloglines testing.)

Another reassurance: The look itself will not be hierarchical: It won’t be “Walt’s Middle 200” or whatever, and certainly not in reach order. I may not even include reach calculations in the supporting spreadsheet; unclear at this point. Whatever the final order, this is intended as a look at the “Great Middle” of librarian/library person weblogs in the first half of 2006, offering some interesting metrics and possibly pointing out a few that you’ve missed. A few from last year’s study will show up this year (but, I’m guessing, not many, since I cut more than 60 from the high end of the list).

Of course, I could give the whole thing up as too much effort, but…well, it could happen. Don’t bet on it.

Note that people still have a week or so to say they don’t want their blogs included (and since I’m always a little sloppy, the absence will be presumed to be my sloppiness if you’re in that great middle) and a few weeks to send May unique-IP count and average daily sessions. I think there may be some interesting correlations, which won’t be offered by individual blog.

Resolved, that debates are a terrible way to run programs

Wednesday, July 5th, 2006

I didn’t attend the ACRL debate on information literacy. Several of those who did have had snarky things to say about it, apparently well deserved. Here’s a follow-up to an earlier post about the session at A Wandering Eyre–not to pick on Jane, but because she writes well and garnered some interesting comments. (The debate’s been debated elsewhere…)

I did go to the LITA debate on the future of search. And left after 15 minutes…

And then recalled that I’ve turned down more than one speaking invitation for a debate format, after accepting one such invitation (one of only three speeches I’ve done that I regard as failures).

I’m less hard-nosed than some. I’ll be on a panel, as long as it’s not a cry-and-response panel, and I’ve been the speaker being responded to by a panel (and don’t much care for it, not because I don’t like disagreement but because I don’t like being required to write a speech in advance and stick with what I wrote…but that requirement is almost essential for responders to work effectively).

The more I think about it, the more I think I just don’t care for debates as content programs. As carnivals/sideshows, sure; bring on the powdered wigs and gongs to cut off the speakers at the 3-minute mark. Cheer, boo, throw vegetables: Just don’t think you’re communicating meaning or changing anyone’s mind.

Actually, for me, this should come as no surprise. I was never a football player (as anyone who’s seen me could guess), but I spent four years in the NFL–the National Forensic League, that is. That’s the high school public speaking association, a good place for geeks like me to spend weekends. I “topped out” point eligibility in debate, impromptu, and extemp, which means I did a lot of debating. And what struck me as the years went on was that NFL debate is a great way to train value-neutral lawyers: That is, you’re required to be equally effective in arguing for and against a set proposition. Crucial to doing that is not believing either side. (One year, I used the same very effective anecdote on both sides of the same issue. That was the year I realized that treating debate as anything other than a stunt was demeaning my personal ethical sense.)

Maybe it’s just me, but maybe not. Disagreement can be good. Serious discussion can, rarely, change minds: I’ve changed my mind thanks to informed discussion. But debates? I think they’re artificial, tend to force extreme positions, and are valuable only as entertainment, not when there’s something serious to be said. At least that’s been my recent experience.

[Not that anyone was planning to in any case, but I guess this serves as a warning that you shouldn’t invite me to participate in a debate. I’ll turn you down.]

Bloglines upheaval: What’s happening

Tuesday, July 4th, 2006

I swapped out a “selective blogroll” quite a while ago, in favor of the “Blogs I read” link in the right-hand column. That link brings up the public portion of my Bloglines subscriptions, which is about 99% of my total Bloglines set.

If you happen upon that link over the next few weeks, the results may seem more bizarre than usual–and more variable than usual. I wouldn’t be surprised if the list swelled to 400 or 500 entries at some point.

No, I haven’t suddenly gone blog-crazy (or more so than usual). If anything, I have less time for blog-reading: As of yesterday, I’m back to full-time work from the 75% time imposed last fall.

What’s happening is the lengthy process of data gathering for “Looking at Liblogs,” this year’s version of “Investigating the Biblioblogosphere.”

Right now–starting Sunday and, I hope, ending today [we don’t do road trips on long weekends, and I was at work yesterday anyway] or tomorrow–I’m gathering candidates. This year’s version is going to be very different from last year’s (not hierarchical for one thing, and a few people have opted out, for another), and one major difference is that I’m looking at “the great middle,” excluding not only blogs with the fewest Bloglines subscribers but also those with the most Bloglines subscribers.

I’ve already made the first cut, based on checking total Bloglines subscribers for the 240 candidate blogs already in my Bloglines set, assuming that–at least at the high end–these are representative of the field as a whole. [“The field” is roughly as defined last year: Blogs by individual library people and small groups of library people, excluding “official” blogs from libraries, clearly sponsored blogs, and large group blogs.] The current version of Bloglines makes it much easier to estimate total subscriptions, as the subscription window shows counts for each feed Bloglines can identify. (I exclude comment feeds, and if there are more than half a dozen non-comment feeds, I may give up and just take the highest group.)

After determining the apparent subscription count for those 240 candidates (which may or may not have included some that have opted out; that’s irrelevant to this initial calculation), I looked at a first cut in two different ways: the top and bottom 10% in real terms, and the top and bottom 10% in normalized-subscription terms. (That is: For the second cut, I did a quick pivot table on the Bloglines #, thus collapsing multiple cases of a single number.)

I took the outer limit in both cases–actual blogs for the lower limit, # of subscriptions for the upper limit.
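The difference between the two kinds of 10% cut is easier to see with numbers. A minimal sketch, with loud caveats: the subscription counts below are made up, the function name is hypothetical, and “10% in from each end” is one reading of the method rather than a description of the actual spreadsheet.

```python
def decile_cutoffs(values):
    """Return (low, high): the values sitting 10% in from each end of the sorted list."""
    ordered = sorted(values)
    k = len(ordered) // 10  # how many entries fall in each 10% tail
    return ordered[k], ordered[-k - 1]


# Hypothetical subscription counts for 20 blogs (not real data).
subs = [3, 3, 3, 5, 8, 12, 12, 19, 19, 19,
        23, 23, 40, 40, 75, 75, 75, 196, 400, 689]

# "Real terms": every blog counts once.
print(decile_cutoffs(subs))       # (3, 196)

# "Normalized" terms: collapse repeated counts first, as a pivot table would.
print(decile_cutoffs(set(subs)))  # (5, 400)
```

Because many blogs share the same small subscription count, collapsing duplicates shifts both cutoffs, which is why the two approaches give different limits.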

Now I’m doing the second pass, checking blogs that I wasn’t already subscribing to in three different sources, although I don’t anticipate picking up much past the first new source. The sources: The LISWiki Weblogs page; then the Open Directory Libraries page (if there are any new ones there); then the Pubsub libraries list (again, if there’s anything new left).

For any blog that’s had at least one post since February 2006, that meets my other criteria, and that has between 16 and 689 Bloglines subscriptions, I’m subscribing and jotting down the subscription total. Then, I’ll do a second cut, since the first cut will clearly leave more blogs than I can possibly deal with.

So the link will yield an ever-growing list, which will include some blogs that aren’t candidates. Then, the list will shrink somewhat, until I start the second, much more extended portion of the data gathering (looking at other reach measures, then looking at metrics for the blog). I’ll delete blogs (or make them private) little by little during that process. Chances are, I’ll wind up with more subscriptions than I started out with.

Note that this year I’m including non-English blogs, at least initially. I may not be able to describe the blogs as well, but this year’s project may not include much descriptive material anyway.

One wholly unanswered question at this point: How I’ll arrange the blogs for the article itself. It won’t be by apparent reach. Alphabetical also favors certain bloggers (not me, to be sure!). Since the article won’t appear until mid-August or later, I can figure that out a whole lot later.

Meanwhile, happy 4th of July to all readers (except those for whom it’s already the 5th). It may be a holiday in the U.S., but it’s the 4th of July everywhere, right?

Oops: Two things I’d intended to mention:

  • Early and maybe unsurprising finding: If given the choice, Bloglines users–at least library types–tend to prefer Atom feeds to RSS feeds.
  • Turns out I have a lot more subscribers here than I realized…336, where I was counting 137.

50-Movie All Stars Collection, Disc 9

Saturday, July 1st, 2006

They Call It Murder, 1971, color, Walter Grauman (dir.), Jim Hutton, Lloyd Bochner, Jo Ann Pflug, Edward Asner, Jessica Walter, Leslie Nielsen, Vic Tayback. 1:35

Based on an Erle Stanley Gardner story, this appears to be a pilot for a show featuring Jim Hutton as a DA—but not Ellery Queen. Apart from the fine cast, it’s a well-done murder mystery with enough red herrings to keep it interesting. Good picture and sound. $1.75.

Firehouse, 1973, color, Alex March (dir.), Richard Roundtree, Michael Lerner, Paul Le Mat, Richard Jaeckel, Andrew Duggan, Vince Edwards. 1:14

Roundtree plays the first black in a New York firehouse—replacing a firefighter who died in a fire set by black arsonists. Roundtree’s character lets a black arsonist get away at one point, which doesn’t help matters. A great cast, but the script doesn’t work nearly as well as it could. $1.25.

James Dean, 1976, color, Robert Butler (dir.), Michael Brandon, Stephen McHattie, Brooke Adams, Katherine Helmond, Meg Foster, Amy Irving, Jayne Meadows, Heather Menzies. 1:34.

Michael Brandon plays William Bast, an actor who was Dean’s roommate; Bast wrote the biopic and Brandon narrates. While lauding Dean’s acting ability, the picture certainly doesn’t whitewash his character issues. The only reason this doesn’t get a full $2 is some sound distortion early in the flick. Well done, worth watching. $1.75.

Moon of the Wolf, 1972, color, Daniel Petrie (dir.), David Janssen, Barbara Rush, Bradford Dillman. 1:15.

David Janssen makes a great upstanding sheriff in a Louisiana bayou town, coping with odd murders and a town that’s distinctly Upper Crust and Everyone Else. The returned-home daughter of the Upper Crust family has eyes for him, which her patrician brother doesn’t appreciate. Good cast, well acted, a little talky but compelling, good picture and sound. I’m giving it full value despite one slightly implausible running plot issue: The half-crazed dying old man keeps saying something like “lukearuke,” and nobody recognizes what he’s saying until the upper-crust lady visits him and hears “loup garou,” which is to say “werewolf,” which [SPOILER] is, of course, who’s been doing the murders. Maybe back in the 1970s, you could reasonably assume that Cajuns wouldn’t recognize that word. I picked it up the first time I heard “lukearuke,” and I sure don’t speak French, but then, I had the title of the TV movie as a clue. $2.