Archive for the ‘Liblogs’ Category

66,504 Facts & Formulas in 1.98MB

Monday, October 11th, 2010

Very quick update on what’s now The Liblog Landscape 2007-2010 (I’m reserving The Way We Blog for a possible five-year overview, maybe, perhaps, if I don’t cure this obsession).

  • I think, believe, hope I have the derivative pages populated on the master spreadsheet–that is, all of the calculated data as well as the observed data.
  • After problems I had last year, I’ve had the good sense (?) to: a. Not try to put everything on one massive page, b. Save a copy of the master spreadsheet, and…c…perhaps most important: Save another copy where every page is values & number formats, without any formulas or references. (The master sheet is lousy with both of them.) This should minimize, maybe even negate, problems with screwing up data while sorting, summarizing, etc…particularly since I won’t actually use the “fixed copy” (the copy with no formulas), I’ll use a working copy of it, feeling free to not only hide but delete columns for convenience…’cuz I can always restore the whole thing.
  • Damn, but “If” formulas with four levels of testing are clumsy to get right…but that’s what I needed, and I got it, eventually. (That is: There are in all four “If” statements within the nested overall statement. Trust me, it’s necessary…partly because I need to distinguish between “0 because none there” and “0 because unable to count” or “0 because the blog didn’t exist yet.”)
  • Excel does reasonably well on compactness. The master spreadsheet includes six pages, each with 1,305 rows (labels and 1,304 blogs), with–respectively–24, 9, 12, 20, 20, and 28 columns. All cells are populated (frequently with “dummy numbers,” which are always negative). There’s a lot of duplication among the columns, in order to make analysis a little less screwy (that is, each major segment of analysis has its own page), but there are, in fact, 51 distinct columns among the–lessee–113 columns. Of those, 24 are observed items, 27 are calculated items.
  • So, depending on how you look at it–and ignoring column headers–there are either 66,504 data items (including blog names and URLs, both of which can be long) or 147,352 data items in the master worksheet.
  • All of that stores in Excel2007, including format information, as a 1.98MB spreadsheet with formulas–and a mere 1.05MB spreadsheet without formulas. That strikes me as pretty efficient storage.
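
For readers who want to check the counts in the list above, here is a minimal Python sketch. It is not the spreadsheet's actual formulas, which the post does not show. The sentinel codes (-1 and -2) and the function name comments_per_post are assumptions made for illustration; the post says only that dummy numbers are always negative and that the nested "If" has to tell a real zero apart from "unable to count" and "didn't exist yet."

```python
# Hypothetical sentinel codes: the post says only that dummy numbers are negative.
NOT_COUNTABLE = -1   # assumed code for "0 because unable to count"
NOT_YET_BORN = -2    # assumed code for "0 because the blog didn't exist yet"


def comments_per_post(comments, posts):
    """Derived metric that passes sentinels through instead of dividing by them."""
    if posts == NOT_YET_BORN or comments == NOT_YET_BORN:
        return NOT_YET_BORN
    if posts == NOT_COUNTABLE or comments == NOT_COUNTABLE:
        return NOT_COUNTABLE
    if posts == 0:
        return 0.0  # a real zero: the blog existed but had no posts
    return round(comments / posts, 1)


# Cell counts, using only figures stated in the post.
columns_per_page = [24, 9, 12, 20, 20, 28]
blogs = 1304                        # 1,305 rows minus the label row
distinct_columns = 24 + 27          # observed + calculated

total_columns = sum(columns_per_page)
distinct_items = distinct_columns * blogs
all_items = total_columns * blogs

print(total_columns, distinct_items, all_items)
```

Because a real ratio of comments to posts is never negative, a negative result can only be one of the sentinels, which is what makes the "dummy number" convention safe for this kind of metric.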

Now, to start messing with the working copy of the “fixed” spreadsheet…and writing it up (well, I’ve already written up most of the metric definitions). In case anybody cares, I’m currently tending toward a “hybrid solution”–publishing most chapters (excluding the first, which will have all the hot overall items and will be developed as I write the book) in C&I, also appearing as 6×9 PDF separates, and–when it’s ready–making the whole book, with index and first chapter, available through Lulu for the few who want it.

Oh, and finishing up/publishing C&I for November 2010–which will not include any of this project.

How many posts? New deadline and re-request

Monday, September 27th, 2010

If you run a liblog or know someone who does, and if you haven’t seen this post, I’d appreciate it if you could check the post and, if appropriate, respond by October 7, 2010.

This post is partly a re-request, partly a new deadline. Other projects are moving rapidly enough that I might be able to start data analysis on the liblog project on October 8, 2010; since most responses to a post come within the first week, I think it’s reasonable to change the original one-month deadline to a three-week deadline.

(For those not ready to click through, the request is for the total number of blog posts from a blog’s inception through May 31, 2010, for blogs on a list of 180-odd where I don’t already have that information.)

Thanks!

Update October 29, 2010: Comments closed, since the deadline has long since passed–and I’ve done this phase of the analysis.

Liblogs: What am I missing?

Monday, August 30th, 2010

I’ve done all the scanning I’m planning to do, looking for liblogs (that is, blogs by library people, as opposed to official library blogs). I’ve found 1,277 liblogs, and checked and excluded another 1,308 things that aren’t liblogs.

It’s Your Turn

I know I’m not going to get everything. That became fairly clear when, on checking 83 possibilities from eight blogrolls added while checking mostly-defunct liblogs, I came up with 31 legitimate liblogs…

I also know I’m not going to go through any more iterations.

So: If you know of liblogs I haven’t included, you’re invited to let me know between now and September 15, 2010.

What’s a liblog?

For the purposes of the current project, the most inclusive I’ve done, here are the criteria:

  • The blog must be available on the web without using passwords or special permissions.
  • Most posts in the blog must be in English.
  • There must be at least one post prior to June 1, 2010 (that is, no later than May 31, 2010).
  • The blog should be by a self-identified “library person” or group of such people (and not explicitly identified as having nothing at all to do with libraries), or be somehow related to libraries…but:
  • The blog should not be an “official blog” from a library or a library-related group. (If you’re in doubt, include it.)

Please, Before Submitting Any Candidates…

  • Check this page: Liblogs 2010 (with exclusions) — DRAFT. Use your browser’s Find function to check the name. (The list is in alphabetic order, but it’s idiot alpha order, with a few “A ” entries and a lot of “The ” entries. And, of course, cute punctuation can change sorting.)
  • [Added 8/31/10:] If you see a blog twice, under a current name and an old name, that’s OK: In order to maintain my sanity while checking candidates, I include older names of renamed blogs in the exclusions list, under the “Renamed” category. There are 115 older names in the list as of now.

If You Have Candidates…

  • Add a comment, with the blog name and URL–but give the URL as text, not as a link (omit the http://), and don’t combine the blog name with a link. (Why not? Because, particularly if you have more than one, it will cause Spam Karma 2 to flag it as spam–and with more than 100 spamments today, I’m not sure I’ll be able to sort through all the spam looking for legit posts.)
  • Or send me email, waltcrawford at gmail dot com, using the same rules.

Sending candidates doesn’t guarantee they’ll be included–and since this project won’t involve blog profiles, it may not matter as much. Still, it will almost certainly be the most comprehensive look at English-language liblogs ever done, so it wouldn’t hurt…

Thanks. Now back to the Open Access project (and dealing with a keyed car, and, and, and…)


Postscript, September 16, 2010:

I’ve turned off comments, since it’s now past September 15, 2010.

Between comments here and direct email, and excluding blogs that obviously didn’t fit my criteria, I received 33 candidates. Of those, 25 met the criteria and have been added to the study, bringing the new total to 1,296 liblogs (that includes one more blog, noted in a response to another post, that had one single post some years back…). Six of the candidates are too new, having begun in June, July or August 2010. One is an official blog. And one just doesn’t seem to be there.

Thanks!

Progress (regress?): A quick update

Tuesday, August 3rd, 2010

The good news: I’ve started in on The New Project (a fast-turnaround, relatively brief book for a real library publisher, on a topic I’m quite comfortable with–more later). First of six chapters has a good rough draft in place. Second of six chapters has the first half of a very good rough draft, and I expect to do the second half tomorrow.

The odd news: I haven’t entirely set aside the Liblog Project (here’s the most recent post, which links to the others). It was the kind of thing I could work on after mowing the lawn in 88F weather (which was too tiring to focus on real writing) and in logical pauses during the writing. Here’s what’s happened:

  • I decided to change the boundaries for the “deep look” so that it includes blogs with GPR of 3 (of which there were apparently 83, but really 81) and, after looking at them more, blogs with only one post during March-May 2010 (of which there were 67). So I’ve added the comment counts and length totals for 2010 and, where not already there, earlier post counts, comment counts and length totals for March-May 2007, 2008 and 2009 as appropriate.
  • While doing that, I started cutting-and-pasting blogrolls that appeared to be library-oriented and consisted only of blog links (there’s a new breed of blogroll that includes the latest headline and date for each blog; WAY too much work to strip down to links, and glancing at most of them says all or nearly all are already in the study).
  • So far, 22 of the 148 blogs had usable blogrolls (that weren’t obviously all repeats). The list of unsorted, unchecked candidates is 543 blogs. After Chapter 2 is done (and the Wednesday hike, and other stuff, and maybe Chapter 3), I’ll do the sort/dedupe/check step and, assuming anything’s left, check the remaining candidates. If I get more than, say, 4 or 5 new-to-me liblogs out of that process, I might continue picking up blogrolls. If not, not.

I also realized that I really have four groups of liblogs and that some portions of the analysis and narrative ought to treat them as four separate groups, not just two groups. As things stand, the four groups–which don’t include the 720 “not liblogs” that will be treated summarily–look like this:

  • Group 1: Fairly active and fairly visible blogs. Liblogs with a Google Page Rank of 4 or higher that have at least three posts between March 1 and May 31, 2010. So far, there are 394 of those.
  • Group 2: Less active or less visible blogs. Liblogs with Google Page Rank 3 or that had only one or two posts during the quarter. So far, there are 156 of those. The combination of Groups 1 and 2 constitutes the Deep Look pool, 550 liblogs in total. (So the comment somebody made that “there seem to be around 500 active liblogs” seems to be right enough for jazz…)
  • Group 3: Probably alive but relatively inactive or invisible. Basically, these are all the liblogs that don’t qualify for the Deep Look but that seem to still be around. That includes active blogs with Google Page Rank lower than 3, blogs with no posts during the quarter but at least one post since December 1, 2009, and blogs with no posts during the half-year between December 1 and May 31, but with at least one post since May 31, 2010. There are 210 of these.
  • Group 4: Canceled or deeply moribund. (Canceled blogs only appear here if they didn’t qualify for Group 1 or Group 2: A liblog that’s explicitly canceled during or after March 1-May 31 is considered active for that quarter. There are a small handful of those.) These are blogs with either explicit cancellations prior to March 1, 2010 or no sign of activity for at least seven or eight months (that is, nothing since November 2009, given that testing was done in late July and early August 2010). There are 310 of these–and that doesn’t include, to be sure, a few hundred blogs that have disappeared entirely or gone behind firewalls.
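
The four groups above can be read as a small decision procedure. The following Python sketch is one interpretation, not the method actually used for the study. The function name liblog_group and its parameters are hypothetical, and two readings are assumptions: that a blog with Page Rank 4 or higher but only one or two posts in the quarter falls into Group 2, and that "moribund" means no post since December 1, 2009.

```python
from datetime import date
from typing import Optional

QUARTER_START = date(2010, 3, 1)   # start of the March 1-May 31, 2010 test period
ALIVE_SINCE = date(2009, 12, 1)    # assumed cutoff for "probably alive"


def liblog_group(gpr: int, quarter_posts: int, last_post: date,
                 canceled_on: Optional[date] = None) -> int:
    """Return 1-4 for a liblog, following one reading of the group definitions."""
    if quarter_posts >= 1:
        # Active during the quarter, even if canceled during or after it.
        if gpr >= 4 and quarter_posts >= 3:
            return 1               # fairly active and fairly visible
        if gpr >= 3:
            return 2               # less active or less visible
        return 3                   # active, but Page Rank below 3
    if canceled_on is not None and canceled_on < QUARTER_START:
        return 4                   # explicitly canceled before the quarter
    if last_post >= ALIVE_SINCE:
        return 3                   # quiet in the quarter, but some sign of life
    return 4                       # nothing since November 2009


print(liblog_group(gpr=5, quarter_posts=10, last_post=date(2010, 5, 20)))
```

Checking quarter activity first mirrors the note that a blog canceled during or after the quarter still counts as active for that quarter.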

I may be almost done with the data gathering. Data analysis and writing up an interesting narrative–and I think there will be lots of interesting things out of this very broad look–will come, well, when it comes.

Now, back to the book project…

As for Cites & Insights, where I’m still apparently on a writing break: There will certainly be a September 2010 issue. Whether it will appear in late August or early September: Still unknown. Whether it will be new material or a large chunk of But Still They Blog: Still unknown.

[Whether even the tiny trip we were planning this month will happen: Still unknown. That’s a whole different can of worms. Meanwhile, I think we’re now safely above 95% electric generation for the year, since the total from PG&E is now down below 150 kWh and the photovoltaic system has generated considerably more than 3,000 kWh…even though, thanks partly to a neighboring palm tree that keeps feeding lots of pollen to our cars and to the panels, we never did reach 18 kWh per day.]

Liblogs 2010: Update 2

Wednesday, July 28th, 2010

What’s that you say? How can this be Update 2 when you haven’t seen “Liblogs 2010” as the title of any previous post? All I can do is quote ol’ Ralph Waldo:

“A foolish consistency is the hobgoblin of little minds…”

Yes, I know, I’d thought it was the more general (and more nonsensical) “Consistency is the hobgoblin of small minds”; this version has the virtues of (a) apparently being the correct quotation and (b) making a lot more sense.

Anyway…this is an update to “The new project: update 1,” which was an update to “We grow too soon old...”

I guess calling it “Liblogs 2010” means I’m fairly certain I’m not going to abandon the project. Which, at this point, is true–barring some wonderful new paying opportunity that requires too much time. (I hope to start on one paying project very soon and to convert something else into a paying project–but both of those would leave enough time to continue this project after some delay.)

The candidates

I’ve completed the scan of 868 “candidates” from LISWiki, the ODP list of librarian blogs, the LISZen source list, Meredith Farkas’ “Favorite Blogs” list, Davey P’s “HotStuff” list and the Salem Press list–which began as 2,911 possibilities that boiled down to 868 after eliminating duplicates, items among the 606 liblogs (and 71 exclusions) that were either in previous studies or already identified from my Bloglines subscriptions, and items obviously not in English. The breakdown of those directories before deduping:

  • Salem Press: 318
  • LISWiki: 743
  • ODP: 138
  • LISZen: 795
  • Hot Stuff: 748
  • Farkas: 179.
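
The combine-and-filter step described above (merge several directories, drop duplicates, drop anything already known) can be sketched in Python. This is an illustration only: the post describes doing the work by cut-and-paste in a spreadsheet, and the URL normalization rule, the function names, and the example.org addresses below are all made up for the sketch.

```python
def normalize(url: str) -> str:
    """Reduce a blog address to a comparable key (assumed rule, not the author's)."""
    key = url.strip().lower()
    for prefix in ("http://", "https://"):
        if key.startswith(prefix):
            key = key[len(prefix):]
    if key.startswith("www."):
        key = key[4:]
    return key.rstrip("/")


def new_candidates(directories: dict, known: list) -> list:
    """Return addresses found in any directory that are neither known nor repeated."""
    known_keys = {normalize(u) for u in known}
    seen = set()
    for listing in directories.values():
        for url in listing:
            key = normalize(url)
            if key not in known_keys:
                seen.add(key)
    return sorted(seen)


# Toy data with hypothetical addresses.
directories = {
    "Directory A": ["http://www.example.org/blog-one/", "http://example.org/blog-two"],
    "Directory B": ["example.org/blog-one", "https://example.org/blog-three"],
}
known = ["http://example.org/blog-two/"]

print(new_candidates(directories, known))
```

A key-based comparison like this catches only duplicates that differ in scheme, "www." prefix, case, or trailing slash; renamed blogs still need a manual check, as the 27 leftover cases in the results suggest.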

The results

I found 415 more liblogs for the “broad look”–that is, English-language (or predominantly English-language) blogs by “library people” or somehow related to libraries, that aren’t official blogs (or at least don’t function as official blogs) and have at least one post visible on the web, no matter how old. The running total is now 1,021 liblogs for the broad look.

I also added 426 excluded candidates–names/addresses that aren’t visible (either deleted or password-protected), blogs not in English, official blogs, blogs with no apparent relation to libraries or library people, and blogs that have been incorporated into newer, renamed blogs.

Hmm. 426+415=841. The other 27 cases were either my mistakes (failing to delete blogs already there) or naming differences (the same blog appearing two or three times under different names).

Slightly more detailed results

Here’s how the 497 excluded candidates break down by reason, as I recorded them:

  • Ten empty, blank, or dummy pages–some of them potential blogs but with no content at all. One had a single picture, nothing else.
  • One malware site, popping up all sorts of windows asserting that my PC has viruses: Thanks a lot, Information Knot.
  • 26 sites that aren’t blogs–including aggregators, newsletters and a variety of other things.
  • 58 blogs that aren’t in English (as judged based on the first page of posts, except for all the posts in Farsi where the page title was enough… in one or two cases, where there are a handful of English posts among mostly-non-English posts, it was a judgment call)
  • 62 blogs for which I could find no plausible library or librarian connection, either in the author info or categories or posts. I tended to err on the side of inclusion.
  • 164 not viewable–mostly sites that have simply disappeared, but with a large handful of password-protected sites. I’m guessing that nearly all of these were liblogs at some point.
  • 99 official blogs, including both library blogs and association/company blogs that appear to function as official publications. I also tended to err on the side of inclusion here, that is, if a blog had a library abbreviation (or ALA division or whatever) in its URL but was clearly the work of individuals who disclaimed organizational responsibility, I left it in the broad look.
  • 77 renamed–blogs that have been incorporated into, or in a few cases, followed by, newer liblogs with different names.

I think that adds up to around 241 “former liblogs,” but that number might be high, since some not-viewable blogs may also be excludable for other reasons.
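
As a cross-check on the figures in this section, here is a short Python tally. It uses only numbers stated in the post; the dictionary labels are shortened, and treating "former liblogs" as the not-viewable plus renamed categories is an assumption that happens to reproduce the 241 figure.

```python
# Exclusion counts by reason, as listed above (labels shortened).
reasons = {
    "empty or dummy pages": 10,
    "malware": 1,
    "not blogs": 26,
    "not in English": 58,
    "no library connection": 62,
    "not viewable": 164,
    "official blogs": 99,
    "renamed": 77,
}

total_excluded = sum(reasons.values())
former_liblogs = reasons["not viewable"] + reasons["renamed"]  # assumed definition

# Reconciling the scan of 868 candidates.
candidates = 868
new_liblogs = 415
new_exclusions = 426
prior_exclusions = 71
leftover = candidates - (new_liblogs + new_exclusions)

print(total_excluded, former_liblogs, leftover)
```

The total of 497 equals the 71 earlier exclusions plus the 426 added in this scan, and the leftover of 27 matches the mistakes and naming differences mentioned above.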

What about the broader look? Here’s an early summary:

  • 597 liblogs have two or more posts during the March 1, 2010-May 31, 2010 test period. Another 63 have exactly one post during the quarter; at present, I don’t plan to include those in the deeper look.
  • Of that 597, 417 have a Google Page Rank of 4 or higher–and, currently, those are the 417 I plan to use for the deeper look. Which is to say: I’ve recorded count, length, and comments for March-May 2010 (and going back to 2007) to the extent feasible for each of those blogs. I could add another 71 blogs with GPR 3, if I’m willing to do the extra metrics for those (in 19 cases, most of the metrics are there from earlier studies). It’s very unlikely that I’d add the 21 blogs with GPR 2, the six with GPR 1 (something I’ve almost never seen before, actually), or the 75 with GPR 0 (which can happen because a blog changes platforms or because it’s a corporate-platform blog and gets no link love on its own).
  • So it appears that at least 660 liblogs are at least marginally active in 2010–and that a deep look could involve anywhere from 417 to 492 liblogs. There were 449 liblogs with countable posts in the 2009 study (including some two dozen with just one post), as a point of comparison.
  • I have recorded blogging software when that was visible, but I’m going to recheck two dozen of the blogs where I recorded “other” as the software, before I started viewing source in cases where the software wasn’t obvious….and maybe 26 “unknown” that didn’t seem to be using any canned package.
  • I recorded the country in which the blog was being written, when that was clear, and show 25 different countries.

What’s next?

Right away, nothing at all–I’m going to do some other work.

Then, well, it depends on other projects and energy.

  • I’m likely to do the cut-and-paste trick with some or all of the directories noted in LibWorld – library blogging worldwide. Although it’s fair to assume most of those blogs are either already in the spreadsheet or are non-English, there might be some exceptions.
  • If I have loads of energy, I might cut-and-paste the first, say, 50 library-related blogrolls from blogs already in the deep study (or otherwise current), and see whether there’s enough yield to be worthwhile.
  • I know that it’s not possible to say “here’s the universe”–but I suspect it will be fair to say that the final broad look will represent a very large majority of the English-language non-official liblog universe, at least of those blogs that have left any trace at all…
  • And then, probably late this fall, possibly in early 2011, I’ll start working with the spreadsheet to prepare a new report, one that will probably come out as a (not quite so thick) book, with some details emerging here or in Cites & Insights. There’s quite a bit to be said about the broad range–after all, all I’m missing is length of posts and comment count for 2010, and pre-2010 metrics–and even more about the deep range and comparisons between the two.

Casual observations

I seem to have encountered a lot of blogs this year that I never encountered or at least didn’t include in earlier studies. I’m guessing that’s partly because blogs tend to gain GPR over time, partly because the Salem list actually includes quite a few blogs that aren’t in other directories, partly because the other directories have improved.

Quick observations:

  • “PLN” seems to be a term that’s automatically understood to mean Personal Learning Network by many (most?) school library bloggers–and not, I think, by most others. I can assure you that the PALINET Leadership Network would not have had “PLN Highlights” as its alert blog name if that TLA (three-letter acronym in this case) was universally understood–or even prevalent outside of school librarianship.
  • Either Will Manley or one of the commenters on his blog made a comment about this being the golden age of book reviews. Quite apart from Amazon and LibraryThing, I think that’s true, based on the number of high-quality, prolific book-review liblogs I’ve encountered this time around…particularly for YA and children’s literature, but also for books in general.
  • Oh, in case you didn’t catch that: I did not require “active since December 2009” this time around, and I’ve recorded the starting month and lifespan for a good many blogs that weren’t around very long. Yes, Will Unwound is part of the deep study despite its January 2010 start date…and a total of ten liblogs that began in 2010 are part of the broad study. Of course, they had to have begun by May 2010, since May 31 is the cutoff point for all observations.
  • What I’ve derided as The World’s Worst Blogging Platform, the LJ/SLJ construct, now turns out to be built on what I regard as the world’s best blogging software–but also the software that can be used to screw up a blog’s presentation perhaps more thoroughly than any other. That’s right: the LJ/SLJ blogs now use WordPress. So does this one, and I have no intention of changing. (I’ve encountered exactly one liblogger who explicitly moved away from WordPress–to Blogger. There may be others.)

Whew. That’s a lot more than I intended to say. So far, my promotional posts for But Still They Blog have been a total washout, with zero additional copies sold. So, you know, I’m aware that doing this 2010 study is unlikely to be remunerative…but it is fascinating.

The new project: Update 1

Friday, July 23rd, 2010

I thought I’d update this post, now that I’ve spent a few days on the scan of 868 possible candidate blogs.

Then

I started out with 606 liblogs (English, visible–that is, reachable on the web, not official, somehow related to libraries or librarians) and 71 “rejects” (blogs that had been on my radar at some point but were either non-English-language, had wholly disappeared from the web, or were actually official blogs)–and 868 candidates, combined and filtered from 2,911 listings in six directories.

I guesstimated that I’d find 200 to 400 more liblogs from among those 868, but had no idea what the number would actually be. I also had no idea whether the process would be so grueling that I’d give up partway through–after all, there’s no economic incentive to complete this, just curiosity.

Now

I’m now almost through the “I”s in that crude-alphabetic list of 868 (“crude-alphabetic”: there were several “A something” blogs and there will be a LOT of “The something” blogs).

  • There are 551 candidates left to check, so I’ve apparently done 317. In other words, I’m a little more than a third of the way done.
  • It’s clearly feasible to do this. It’s not fast–I haven’t been doing any other writing this week–but it’s not so grueling as to be hopeless. I’ll certainly finish this scan, although not necessarily this month.
  • It’s essentially impossible to estimate the time required, particularly since I’m backfilling data for newly-discovered blogs that go back more than one year. I might be able to check 30 blogs in one hour (if half of them are non-English, official, or disappeared and most of the rest were only there briefly); I might require more than half an hour just to handle one blog (say a 5-year-old Kidlit or YA lit blog with an enthusiastic audience). I’m guessing it averages about 20/hour overall, but that’s a very crude guess.
  • At this point, there are 749 blogs on the broad-survey list and 229 excluded candidates. That’s an increase of 143 blogs and 158 exclusions. If the same ratio runs through the rest of the candidates, I’ll wind up adding just about 400 total (plus another almost-300 exclusions). That would mean roughly a thousand liblogs in the broad survey.
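
The projection in the last bullet is a simple proportional extrapolation. Here is a minimal Python sketch of it, using only the numbers given above; the variable names are invented for the sketch, and it assumes (as the post does) that the hit rate so far holds for the rest of the list.

```python
candidates = 868
remaining = 551
checked = candidates - remaining             # candidates scanned so far

start_liblogs, start_excluded = 606, 71
now_liblogs, now_excluded = 749, 229

added = now_liblogs - start_liblogs          # new liblogs found so far
excluded = now_excluded - start_excluded     # new exclusions so far

# Scale the rates observed so far to the whole list and to the remainder.
projected_added = added / checked * candidates
projected_more_excluded = excluded / checked * remaining
projected_total = start_liblogs + projected_added

print(round(projected_added), round(projected_more_excluded), round(projected_total))
```

That gives roughly 390 added liblogs, about 275 further exclusions, and a broad survey of close to a thousand, in line with the "just about 400," "almost-300," and "roughly a thousand" estimates.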

Future

Since I’ll start working on C&I again soon, and I hope to begin another (more lucrative) project in a week or so, this might go on the back burner–but I’m also interested in seeing how it goes (e.g., what percentage of those 1,000-or-so liblogs will turn out to be currently active?).

Assuming I come back to this, it now seems likely that I’ll make up a new list from the various national liblog directories in LibWorld (assuming some of them are still around and updated) and check that list. It’s less certain that I’ll try blogrolls, but who knows? It’s clearly not possible to be sure I’ve seen the whole universe; it’s not clear whether assembling blogrolls from 100 or 200 or 500 liblogs will yield any significant number of blogs not otherwise discovered.

Meanwhile, I suspect that I will include portions of But Still They Blog in Cites & Insights–but almost certainly not the whole non-profile manuscript, at least not in one big issue. So far, my new attempts at publicizing the book have yielded exactly zero sales, but that could change…

But Still They Blog: Platforms, Currency and a few more profiles

Thursday, July 22nd, 2010

More bits & pieces from But Still They Blog: The Liblog Landscape 2007-2009–this time, pages 9-10 and part of 11, plus three blogs from page 19.

Blogging Platforms and Programs

How do library people blog? Any blanket statement will be wrong, particularly since the whole universe of liblogs may be unknowable. For the blogs in this study, here’s the breakdown.

Program               Blogs   Percentage
WordPress               245        47.2%
Blogger                 190        36.6%
TypePad/MovableType      48         9.2%
Other                    24         4.6%
Drupal                    7         1.3%
LiveJournal               5         1.0%

Table 1.2: Blogging platforms and programs

WordPress is close to a majority and certainly accounts for a large plurality of the blogs. Oddly enough, as compared to the 2008 study, Blogger has precisely the same percentage (36.6%) albeit of a smaller universe, but WordPress has jumped from roughly 38% to roughly 47% (and increased in real numbers), while TypePad and MovableType decreased in real numbers but increased in share, from 8.8% to 9.2%. (I’ve lumped MovableType and TypePad together since both are similar products from the same company.) “Other” includes a few identified programs and platforms with no more than two blogs each, plus a number of blogs that are either handcrafted or shorn of brand identity.

My guess—and it’s only a guess—is that a broader study, including short-lived and less visible blogs, would show a higher percentage of Blogger blogs, most of them on blogspot.com. While it’s trivially easy to set up a hosted blog at Blogspot.com, WordPress.com or Typepad.com, I believe Blogspot is perceived as being the fastest and easiest of the three. (Some Blogger blogs above are not hosted on Blogspot.com—and I’d guess most of the WordPress liblogs use WordPress software on other sites.)

There has been a slow migration of blogs to WordPress. I believe it’s safe to say that WordPress software is the preferred blogging platform for most long-term “serious” bloggers. I see very few people migrating elsewhere (except, for example, forced migrations to MovableType because people move blogs to shared services such as ScienceBlogs). But I have nothing more than anecdotal evidence—that, and the near-majority numbers above.

If you’re a numbers person, you may note that the numbers above don’t add up to 521. In the two weeks between completing the scan of blogs for metrics and doing a second scan for blogging platforms, two blogs had become unavailable, temporarily or permanently.
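
For the numbers people, the percentages in Table 1.2 can be reproduced from the counts. This is a minimal Python sketch with invented variable names, taking the counts straight from the table.

```python
platforms = {
    "WordPress": 245,
    "Blogger": 190,
    "TypePad/MovableType": 48,
    "Other": 24,
    "Drupal": 7,
    "LiveJournal": 5,
}

total = sum(platforms.values())       # blogs still reachable at the platform scan
missing = 521 - total                 # blogs that became unavailable in between

# Each platform's share of the reachable blogs, to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in platforms.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```

The counts sum to 519, two short of 521, and the shares are computed against 519 rather than 521.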

Currency

How current are liblogs? I used March-May 2009 for metrics—but I recorded those metrics in September 2009. To get a checkpoint, I checked each blog on September 30, 2009, looking for the most recent post but rounding down to week intervals—and beyond that, to spans that seem indicative.

Here are the results, which require some explanation.

Weeks    Blogs   Percentage   Cumulative
1          218        42.0%        42.0%
2           51         9.8%        51.8%
4           56        10.8%        62.6%
8           49         9.4%        72.1%
13          24         4.6%        76.7%
17          14         2.7%        79.4%
26          22         4.2%        83.6%
52          34         6.6%        90.2%
99          29         5.6%        95.8%
Ceased      22         4.2%

Table 1.3: Currency of most recent post as of September 30, 2009

I marked a blog as Ceased if there was an explicit declaration that there would be no new posts—no matter how recent that declaration was. (Here again, the universe is 519, missing two blogs that seem to have vanished.) Other than that:

  • More than 40% of the blogs are robust—they had a post within the most recent week.
  • Just over half the blogs are active—with a post somewhere within the most recent fortnight.
  • Stepping back at larger intervals, it’s interesting that the number with posts sometime during the month (but not in the most recent fortnight) and those with posts sometime in August are fairly close to the “week before last” group.
  • “13” indicates sometime within the last quarter (13 weeks). More than three-quarters of the blogs had a post within the summer quarter (July-September).
  • I include “17” (actually four months) because Technorati uses that cutoff for blogs that could be considered alive. Roughly 80% of liblogs had a post between June 1 and September 30, 2009.
  • The next two levels are half-year and year marks—in both cases representing blogs that are neither active nor clearly dead.
  • “99” really means “more than 52”—that is, blogs that haven’t explicitly ceased but haven’t had a post in more than a year.
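
The Percentage and Cumulative columns of Table 1.3 can be rebuilt from the blog counts alone. The Python sketch below does that; the variable names are invented, and the counts are copied from the table.

```python
from itertools import accumulate

# (weeks label, blogs) for each currency bucket in Table 1.3.
buckets = [("1", 218), ("2", 51), ("4", 56), ("8", 49), ("13", 24),
           ("17", 14), ("26", 22), ("52", 34), ("99", 29)]
ceased = 22

total = sum(count for _, count in buckets) + ceased
running = list(accumulate(count for _, count in buckets))

# Cumulative share comes from running counts, not from adding rounded percentages.
rows = [
    (label, count, round(100 * count / total, 1), round(100 * so_far / total, 1))
    for (label, count), so_far in zip(buckets, running)
]

for label, count, pct, cumulative in rows:
    print(f"{label:>2}  {count:>3}  {pct:>5}%  {cumulative:>5}%")
```

Working from running counts matters here: adding the rounded percentages for the first four rows gives 72.0%, while the running count (374 of 519) gives the 72.1% shown in the table.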

Profiles

Lady Crumpet’s Armoire

Began July 2002.

Metrics             2007   2008
Posts                 12      2
Quintile               4      5
Words per post       130     40
Quintile               5      5
Comments per post    0.8    0.0
Quintile               3      5

No announcement of hiatus or dropping the blog, but the most recent post was on August 20, 2008.

beSpacific

“Accurate, focused law and technology news.” By Sabrina Pacifici. Began August 2002.

Metrics          2007   2008   2009   C08-09   C07-09
Posts             736    770    733      -5%       0%
Quintile            1      1      1        2        1
Words per post    122    133    137       3%      13%
Quintile            5      5      5        3        3

Purely professional with no distinctive authorial voice. Does this belong in a liblog study? It’s a prolific set of very brief descriptions of, and links to, news items done by a special librarian. But it’s in blog form, Pacifici calls it a blog, and she is a librarian, so. Note the remarkable consistency over the years.

etc.

“the last, since 2002” By Amanda Etches-Johnson. Began August 2002.

Metrics             2007   2008   2009   C08-09   C07-09
Posts                 11     10      1     -90%     -91%
Quintile               4      4      5        5        5
Words per post       257    289    250     -13%      -3%
Quintile               3      3      3        4        3
Comments per post    3.2    6.8    4.0     -41%      26%
Quintile               1      1      1        4        2

When Etches-Johnson posts, she usually has something interesting to say. Lately, she hasn’t posted much (the single post for this quarter is about the lack of posts and an experiment in “lifestreaming”).

Go buy it!

Lots more in the book–PDF or paperback. And free shipping during the summer.

We grow too soon old…

Tuesday, July 20th, 2010

…and too late smart.

The setup

Four times, I’ve done analyses of liblogs (blogs by library people, as opposed to library blogs)–twice within Cites & Insights, twice as books.

Still available, still great bargains: The Liblog Landscape 2007-2008 and But Still They Blog: The Liblog Landscape 2007-2009. Note that Lulu’s still offering free shipping for any order over $19.95, making these even better bargains.

If you’re wondering: The two C&I analyses were Investigating the Biblioblogosphere in September 2005 and Looking at Liblogs: The Great Middle in August 2006.

In each case, and particularly for the two books (which attempted to cover a very large portion of the English-language “liblog landscape”), one of the biggest time-sinks in the project was the process of finding new liblogs–ones I hadn’t already included in a previous study.

There are several sources for such blogs, and the sources tend to repeat one another (as you’d expect)–and once I was dealing with more than a hundred blogs or so, there was no way I could remember which blogs I’d already looked at. I used a variety of techniques to make the situation somewhat manageable–after all, we’re talking about several thousand listings in the primary sources–but it still took scores or hundreds of hours, particularly when I started looking at blogrolls.

Last year, I concluded that, if I ever did do another similar study, I’d probably give up on blogrolls altogether: Too much work for too few new discoveries.

The occasion

Based on sales to date of the two books, it would be insane to do another study.

On the other hand… well, there were still things I wanted to know about the progress of (English-language) liblogs.

So I decided to start another, somewhat different, project and carry it out if it seemed feasible and didn’t get in the way of more directly-useful projects (such as a non-self-published book I hope to be doing later this summer and early fall).

The new project differs from the last two in two key respects:

  • If there’s a book, it’s going to be much shorter–and the obvious way to do that is by leaving out individual blog profiles. Clearly, at least 90% of blog owners aren’t going to pay for a book that includes a profile of their blog, and those profiles are a lot of work (and take up a lot of space).
  • The new project has two levels of inclusion, the first of which makes the project particularly interesting to me.

The two levels

  • The broad look: As comprehensive a survey as possible of English-language blogs by library people (excluding official blogs) that have any visibility at all on the web in mid-2010. I can’t claim it will be a comprehensive survey of liblogs, because (a) quite a few have disappeared entirely, (b) a few are password-protected and won’t be included, (c) there will certainly be dozens or hundreds of blogs that I won’t encounter. But it will be the broadest look I’ve taken–albeit with less information on each blog:
    Birthdate: When it began (year and month)
    Lifespan: How many months it operated (through May 2010)
    Currency: The most recent post (prior to June 1, 2010)
    Nationality: The country (when obvious)
    Program: The blogging software (when obvious)
    Frequency: The number of posts from March 1, 2010 through May 31, 2010.
  • The deep look: A deeper look at a large subset of those blogs, defined as:
    Blogs that have a Google Page Rank of 4 or higher (fairly visible blogs)
    that have at least two posts between March 1, 2010 and May 31, 2010 (active blogs).
    For those blogs, I’m also tracking the same metrics as in last year’s study (when available): Frequency, comments, and total post length–for March-May 2010 and, for blogs new to this year’s study, going back to March-May periods in 2009, 2008 and 2007.
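The two deep-look tests can be written down as a simple filter. This is only a sketch: the `Blog` record, its field names and the sample entries are made up for illustration, and only the two thresholds come from the description above.

```python
from dataclasses import dataclass

MIN_PAGE_RANK = 4  # "fairly visible" threshold from the text
MIN_POSTS = 2      # "active" threshold for March 1 through May 31, 2010


@dataclass
class Blog:
    # Hypothetical record layout; the real spreadsheet columns may differ.
    name: str
    page_rank: int
    posts_mar_may_2010: int


def qualifies_for_deep_look(blog):
    """True when a blog passes both the visibility and the activity test."""
    return (blog.page_rank >= MIN_PAGE_RANK
            and blog.posts_mar_may_2010 >= MIN_POSTS)


# Invented sample rows, not real survey data.
blogs = [
    Blog("Example Blog A", 5, 12),
    Blog("Example Blog B", 3, 40),  # active but not visible enough
    Blog("Example Blog C", 4, 1),   # visible but not active enough
    Blog("Example Blog D", 4, 2),
]

deep = [b.name for b in blogs if qualifies_for_deep_look(b)]
print(deep)
```

Blogs that fail either test still belong in the broad look; they just get the six basic facts and nothing more.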

The first part is more ambitious, in that I’m including–potentially–a lot more blogs.

The second part is less ambitious, both because I’m not doing blog profiles (a decision I could only change with up-front sponsorship–it’s a lot of work) and because I’m limiting that level of statistical analysis to blogs that are currently active.

(One difference: I’m not requiring that blogs have started before January 1, 2010. They must have started before June 1, 2010.)

To do this project, I once again need to dive into the directories, at least as a first cut, recognizing that I’d probably pick up some additional blogs from blogrolls…but only if I could take the time.

The breakthrough (the forehead-slap moment)

Last year, I used some teeny-tiny printouts to try to cut down the amount of extraneous checking, but it was still an enormous pain. This year, I was determined to avoid superfluous printouts, even if they only used a page or two of paper.

I had one small bright idea–at each stage (where I’ve finished a pass against a source of blogs), peel off copies of the blog names and excluded blog names to a separate spreadsheet, sort them, and use that spreadsheet in a narrow little column alongside the browser window when I’m looking at a new source. That worked nicely to add new blogs from my own Bloglines list–the process took half a day or less and yielded 43 new blogs (and nine new exclusions).

Well, so, I could do that with the other primary sources (LISWiki, the ODP list of librarian blogs, the LISZen source list, Meredith Farkas’ “Favorite Blogs” list and Davey P’s “HotStuff” list, the Salem Press list)–but that would still be an ordeal.

Or… I could cut-and-paste each of the directories, with HTML included, into a Word document; use global edits to normalize them, sort the blogs…and trim that document by comparing it to my existing list of already-included blogs. Then cut-and-paste the document back to a webpage to make it easy to check new candidates.
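For anyone who would rather script this than use Word, here is a rough Python equivalent of the same idea: pull the links out of each directory page, drop the ones already on the list, sort what is left and write it back out as a small webpage. It is a sketch under assumptions, not what I actually did. The function names, the sample pages and the exact-name matching (ignoring case and extra spaces) are all illustrative.

```python
from html import escape
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from the anchor tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = " ".join("".join(self._text).split())  # collapse whitespace
            if name:
                self.links.append((name, self._href))
            self._href = None


def new_candidates(directory_pages, known_names):
    """Return sorted (name, href) pairs whose names are not already known."""
    known = {" ".join(n.split()).casefold() for n in known_names}
    found = {}
    for page in directory_pages:
        parser = LinkCollector()
        parser.feed(page)
        parser.close()
        for name, href in parser.links:
            key = name.casefold()
            if key not in known:
                found.setdefault(key, (name, href))  # keep first occurrence
    return sorted(found.values(), key=lambda pair: pair[0].casefold())


def to_html(pairs):
    """Render the candidates as a bare-bones list of clickable links."""
    items = "\n".join(
        f'<li><a href="{escape(href)}">{escape(name)}</a></li>'
        for name, href in pairs
    )
    return f"<ul>\n{items}\n</ul>"


# Invented sample input standing in for two directory pages.
pages = [
    '<ul><li><a href="http://example.org/a">Alpha Blog</a></li>'
    '<li><a href="http://example.org/b">beta  blog</a></li></ul>',
    '<p><a href="http://example.org/a2">ALPHA BLOG</a> '
    '<a href="http://example.org/c">Gamma Blog</a></p>',
]
new = new_candidates(pages, known_names=["Beta Blog"])
print(to_html(new))
```

Matching on the name alone is crude: two different blogs can share a title, and one blog can be listed under slightly different names, so the output is a list of candidates to check by hand rather than a final answer.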

Why didn’t I think of this last year or the year before? Maybe because I never thought of Word and HTML in the same space…maybe because I’m getting old.

The results (so far)

This morning–after the usual Friendfeed time and editing for another project–I did the cut-and-paste for these six sources (the Salem Press list required more work than the others, but still not much); within an hour, I had a sorted Word document with–gasp–2,911 candidates.

This afternoon, after lunch and some errands, I trimmed that sorted document by comparing it to the spreadsheet, including special passes for Idiot Sorting (I’m being lazy this time, so there are lots of blogs in the “A ” and “The ” areas–and some directories normalize those articles away). The process took about two hours, maybe less…and I now have a webpage (private) with 868 liblog candidates.
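The special passes for leading articles are the sort of thing a comparison key handles in one go. A minimal sketch, assuming names only need to match after dropping a leading “A”, “An” or “The” and ignoring case; `match_key`, `trim` and the sample names are hypothetical, and I did the real trimming by hand.

```python
import re

# Leading English articles that some directories drop and others keep.
_LEADING_ARTICLE = re.compile(r"^(?:a|an|the)\s+", re.IGNORECASE)


def match_key(name):
    """Key for comparing blog names: no leading article, no case, tidy spaces."""
    collapsed = " ".join(name.split())
    return _LEADING_ARTICLE.sub("", collapsed).casefold()


def trim(candidates, known):
    """Drop candidates that match a known blog; sort the rest by match key."""
    known_keys = {match_key(n) for n in known}
    return sorted(
        (c for c in candidates if match_key(c) not in known_keys),
        key=match_key,
    )


# Invented names for illustration only.
candidates = ["Quiet Stacks", "The Example Librarian", "A Hypothetical Liblog", "zed blog"]
known = ["The Quiet Stacks", "Example Librarian"]

remaining = trim(candidates, known)
print(remaining)
```

Sorting by the same key puts “A Hypothetical Liblog” under H rather than A, which is what the directories that normalize articles away are doing.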

Which is still a lot of checking to do, but little enough to be feasible. How many of those 868 will I add to the 606 (not including “excluded blogs”) in the current list? I have no idea; I’d guess somewhere between 200 and 400, but I could be wrong.

If this process does turn out to go reasonably smoothly, I might–after taking an appropriate break and working on other stuff such as C&I–even change my mind about blogrolls. After all, they mostly use a consistent format, and I could cumulate a whole bunch of them in a Word/HTML document and… well, we shall see.

No promises

Am I certain there will be a 2010 survey? Not really. I’d say the odds are pretty good, but if paying gigs come up or there are other things that interfere, it could take a long time–and, frankly, I haven’t invested so many hours in it that I couldn’t just abandon it. (Although my track record for abandoning projects doesn’t suggest that this is highly probable.)

And for those of you who say “You idiot, you could have done this much more easily this time and the last two times by…” Well, you may be right. I certainly could have saved a lot of boring and annoying work in 2008 and 2009 if I’d thought of this. There may be an even better way, but this is a good start.

But Still They Blog: Four more profiles and a rationale

Saturday, July 17th, 2010

Here’s what you’ll find at the bottom of page 15 and all of page 16 in But Still They Blog: The Liblog Landscape 2007-2009 (with one subheading eliminated):

The Rabid Librarian’s Ravings in the Wind

“Born, like other comic book characters, out of an otherwise trivial but life-changing animal bite, the Rabid Librarian seeks out strange, useless facts, raves about real and perceived injustices, and seeks to meet her greatest challenge of all–her own life.” By Eilir Rowan. Began October 2001.

Metrics 2007 2008 2009 C08-09 C07-09
Posts 211 170 176 4% -17%
Quintile 1 1 1 2 2
Words per post 170 237 204 -14% 20%
Quintile 4 3 4 4 2

Impressively active, varied, personal blog on all aspects of life and libraries. (There were comments in 2009, but slightly fewer than 0.05 per post.) These aren’t high quarters, by the way: Annual posting totals were 663 for 2008, 653 for 2007, 808 for 2006…and more than 1,000 in 2004.

wiredfu

“another wretched hive of scum and villainy” Began December 2001.

Metrics 2007
Posts 5
Quintile 5
Words per post 53
Quintile 5

The most recent post is dated February 1, 2008. A sidebar item is a little clearer—wishing people Happy 2008 and saying: “I’ve made a New Years resolution to post more. But then again, I made that same resolution in 2007 and 2006. So, at this rate, I hope you enjoy 2009 as well.”

rawbrick.net

“A personal weblog.” Began January 2002.

Metrics 2007 2008 2009 C08-09 C07-09
Posts 58 40 35 -13% -40%
Quintile 2 2 2 2 2
Words per post 225 277 213 -23% -5%
Quintile 3 3 4 4 4
Comments per post 0.1 0.2 0.03 -84% -79%
Quintile 4 4 4 5 5

Mostly, but not entirely, movie (on DVD) reviews, including placeholders.

The Shifted Librarian

“Shifting libraries at the speed of byte!” By Jenny Levine. Began January 2002.

Metrics 2007 2008 2009 C08-09 C07-09
Posts 86 34 8 -76% -91%
Quintile 1 2 4 4 5
Words per post 546 374 342 -9% -37%
Quintile 1 2 2 4 5
Comments per post 2.2 5.9 6.3 6% 184%
Quintile 1 1 1 3 1

This blog has shifted over the years—from a primary focus on “shifted” librarianship, to a primary focus on gaming in libraries, to a mixture of topics.

Who cares? Won’t there be a replacement?

As to whether you should care or not, that’s your call. I’m doing some additional promotion and excerpting from the book to remind people that it’s still around. I believe it’s the best look at liblogs (or the biblioblogosphere, if you prefer)–and the only one with brief comments on most, but not all, blog profiles.

To some extent, this book does replace The Liblog Landscape 2007-2008–but only to some extent:

  • The 2009 book does not include measurements on use of illustrations in liblogs.
  • The 2009 book has somewhat more stringent requirements for blog inclusion.
  • If you’re looking for a particular blog profile, you may find the 2008 book–with its 145-page final chapter consisting of all the profiles in straight alphabetic order–easier to use. Profiles are scattered throughout the 2009 book, for logical and stated reasons; you can find them through the index, but that’s a little slower.

As to there being a replacement: Maybe and no.

  • Maybe: I’m thinking about (OK, working on) a new liblog survey that is in some ways more ambitious than the 2009 project, but in other ways much less ambitious. Phase 1 is nearly complete. Phase 2 might take a few weeks or several months (depending on other projects). This survey might yield a (considerably smaller) book. It might not.
  • No: One of the ways in which the new project is much less ambitious is that there will not be profiles of individual blogs, and the index will not include the names of bloggers (I’m not even recording those). The only way there could be a new set of profiles, covering up to four years of blogging activity, is with direct advance sponsorship for the work required; the probability of that happening is somewhere close to the probability of, say, the Dow reaching 20,000 by the end of Summer 2010.

When the 2008 book emerged, a couple of people said they’d find it a lot more interesting if they had my comments on individual blogs. The 2009 book has those comments–brief ones, and omitting any absolutely damning comments, but still. And, of course, I did not include the 2008 profiles in the Cites & Insights version of the 2008 book.

But Still They Blog: Brief Excerpts

Tuesday, July 13th, 2010

When did liblogs begin?

Year Blogs Percentage
1998 1 0.2%
1999 1 0.2%
2001 6 1.2%
2002 20 3.8%
2003 58 11.1%
2004 71 13.6%
2005 127 24.4%
2006 123 23.6%
2007 103 19.8%
2008 11 2.1%

Table 1.1: Blogs by year of origin

Comparing this table to the same table for last year’s larger set of blogs, I note that two of three blogs from 1998 are gone, as are one of two from 1999 and the only one from 2000. Other than that, the pattern is similar—with, again, the peak for new liblogs being in 2005, declining slightly in 2006 and somewhat more in 2007. There’s a huge decline in 2008, down almost 90%. That may mean that few new blogs gain readership, that bloggers aren’t bothering to add their blogs to LISWiki—or that there are simply a lot fewer new liblogs.
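The arithmetic behind Table 1.1 is easy to check. The sketch below takes the counts from the table as given and recomputes each year’s share of the total, plus the 2007-to-2008 drop; the variable names are illustrative only.

```python
# Counts copied from Table 1.1 (there is no row for 2000).
blogs_by_year = {
    1998: 1, 1999: 1, 2001: 6, 2002: 20, 2003: 58,
    2004: 71, 2005: 127, 2006: 123, 2007: 103, 2008: 11,
}

total = sum(blogs_by_year.values())

# Each year's share of all blogs, formatted to one decimal place.
shares = {year: f"{100 * n / total:.1f}%" for year, n in blogs_by_year.items()}

# Relative change in new blogs from 2007 to 2008.
change_2008 = (blogs_by_year[2008] - blogs_by_year[2007]) / blogs_by_year[2007]

for year, n in blogs_by_year.items():
    print(year, n, shares[year])
print(f"2007 to 2008: {change_2008:.1%}")
```

The counts sum to 521 blogs, and the 2008 figure is about 89% below 2007, which is the “almost 90%” decline mentioned above.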

Founding year for liblogs

Figure 1.1: Liblogs by year of origin

…and here are two more liblog profiles, this time from pages 13-14

ResearchBuzz

“News about search engines, databases, and other information collections.” By Tara Calishain. Began August 1998.

Metrics 2007 2008 2009 C08-09 C07-09
Posts 58 50 48 -4% -17%
Quintile 2 1 1 2 2
Words per post 259 186 379 104% 46%
Quintile 3 4 2 1 2

What it says in the tagline—and, if you can get past the daily tweet summaries in recent days, the blog includes some fascinating essays.

librarian.net

“putting the rarin back in librarian since 1999” By Jessamyn West. Began April 1999.

Metrics 2007 2008 2009 C08-09 C07-09
Posts 68 34 51 50% -25%
Quintile 1 2 1 1 2
Words per post 308 286 185 -35% -40%
Quintile 2 3 4 5 5
Comments per post 5.9 3.6 4.0 13% -32%
Quintile 1 1 1 3 4

If not the oldest liblog, this is certainly close—and West (OK, Jessamyn) continues to write a wide range of interesting commentaries on the field.

For more…

Buy the book!