Archive for October, 2006

Spam: Better and worse

Posted in Writing and blogging on October 31st, 2006

Just a quick update to this post (and others regarding spamments and spam linkbacks):

The good: In general, Blake Carver’s changes have worked. Virtually no trackback attempts wind up in Spam Karma’s spam report, which–on most days–means that total spamments are down to two-digit numbers, frequently only a dozen or so per day. Sometimes even fewer.

Have I mentioned that LISHOST.org is a great hosting service? And that WordPress is great blog software?

The not-so-good: What does wind up in the Spam Karma list–and, fortunately, it’s almost always there, not on the blog (I’ve had to delete two fairly tricky spam comments in the last two months; that’s not bad)–is much nastier than before, at least in the portion (poster’s “name” and first couple words of comment) that shows up in the summary list. This is stuff I wouldn’t repeat here…at best degrading, at worst illegal. I’m not a prude, but “filth” sums up most of it. Still, it takes less than a minute a day to check and make sure real comments didn’t get flagged as spam (which has also happened two or three times in the last two months).

And I still haven’t had to go to total moderation or CAPTCHA. It’s fortunate that this is only a midrange blog in terms of traffic (averaging 1275 sessions/day since 9/1/06), and only has pagerank 6, the vast middle ground of Google pagerank, making it a less attractive target. I do not envy “A-listers,” even in the modest realm of library blogs!

Safety and numbers

Posted in Stuff on October 30th, 2006

This morning’s San Francisco Chronicle included a brief piece about the most dangerous/safest cities. You can legitimately argue the methodology of such reports, but at least the publisher that issues them uses consistent methodology and bases reporting on the FBI’s Uniform Crime Reporting numbers.

Thanks to ResourceShelf, I found myself at the summary report itself, and saw something that I’m a bit surprised was not included in the Chronicle piece–but maybe I shouldn’t be, since it’s an AP report.

To wit, the safest large city (population half a million or more) is decidedly within the Chron’s circulation area: San Jose. It doesn’t come as a surprise that San Jose ranks that high (actually, I naively expected Honolulu to be 1st; it’s 2nd), given the crime rate in general in Silicon Valley. (Mountain View’s just a bit too small to be included, with around 72,000 population, but I’d guess the local crime rate is even lower than in San Jose.)

The lists broken out by very large, medium-size, and smaller cities are, I think, more interesting than the overall lists–and particularly the 32 largest, those half a million and over (12 in the middle, including San Francisco and Los Angeles, don’t show up).

A couple of caveats: Because of problematic rape reporting, Chicago isn’t included in the overall rankings–and because of understandably lousy crime reporting in general in [the second half of?] calendar 2005, neither is New Orleans (which is now way under the half million mark in any case).

Contemplating the nature of authorship

Posted in Books and publishing, Scholarly publishing, Writing and blogging on October 30th, 2006

Just another quick link, this time to this post at Improbable Research.

As a usually-solitary author (well, writer: author’s pretty hifalutin’ for my stuff), fortunately not in STM, I’m blown away by the notion that some gargantuan number of “authors” actually wrote an article in ScienceNature. (No, I’m not going to count the list.)

Update: Science, Nature, one of those… And I count (rather, Word’s replace function, replacing a comma with a comma, counts) between 190 and 192 authors for a six-page paper. Which, as Wow!ter’s comment notes, is tiny compared to an IgNobel Prize Winner.
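For anyone without Word handy, the comma-counting trick above can be sketched in a few lines of Python. This is only a rough illustration: the function name and the sample byline are made up, and a comma count is an estimate, since nothing guarantees that every comma in a real byline separates exactly two names.

```python
def count_authors(byline: str) -> int:
    """Estimate the number of authors in a comma-separated byline."""
    # Split on commas, then ignore empty pieces left by stray or trailing commas.
    pieces = [piece.strip() for piece in byline.split(",")]
    return sum(1 for piece in pieces if piece)


# Hypothetical byline, not the list from the paper discussed above.
sample_byline = "A. Author, B. Writer, C. Scribe, D. Contributor"
print(count_authors(sample_byline))
```

Paste in a real author list and the result should land in the same neighborhood as Word's replacement count plus one.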

Firefox 2, part 3: Routing one-click feed subscriptions to Bloglines

Posted in Technology and software, Writing and blogging on October 30th, 2006

David Rothman (the library David Rothman) explains it so well in this post that I have very little to add.

Except: Geez, I’m glad there are visually-oriented people who will put together instructions like that. I’m text-oriented and would find it difficult to do so–but I believe that many people will “get it” a whole lot faster from a post like Rothman’s. (As long as you don’t expect your Tools pull-down to look exactly like Rothman’s…chances are, yours will be shorter unless you have a lot of add-ins and extensions.)

Great stuff. Bloglines isn’t your only option. And nobody’s requiring you to call that orange thingy an “RSS subscription icon.” “Feed getter” works just fine for me.

Have I mentioned lately how impressed I am by quite a few libloggers? Not quite impressed enough to give up W.a.R. as a weak effort by comparison; fortunately, the field doesn’t work that way.

What’s on your Firefox search dropdown?

Posted in Technology and software on October 27th, 2006

Part two of a very short “FF2 happy happy joy joy” series, of which part one is right here.

I’m sort of an intermediate customizer–I tend not to use a lot of special hotshot browser addons, etc., so I don’t usually mess with options too much. (OK, except for insisting that pages use my choice of font; I just get so tired of the dreary typefaces that dominate the web).

But with FF2, since the one-click feed subscribe (with Bloglines or your favorite reader–ah, and there’s the real-time spellcheck kicking in, not liking either “Bloglines” or “spellcheck”), I’d already customized the toolbars somewhat–moving the web-search box down to the bookmarks toolbar (along with Gmail, form-fill, search highlight toggle, spellcheck, Google info, and–until I deleted it–the now redundant “Sub with Bloglines”) so that the location bar was wider (I’d already moved PageRank to the navigation toolbar). With the search box more visible, I figured I’d use it more–and that meant customizing the dropdown list.

So here’s my list, after two minutes’ work:

The big four:

  • Google
  • Yahoo!
  • Ask
  • Live Search (formerly MSN Search)

–and I note that FF still calls Ask “AskJeeves,” which is quaint.

Then:

  • Answers.com, which I have yet to try
  • Wikipedia (of course I use Wikipedia)
  • Worldcat.org
  • IMDB
  • Creative Commons
  • Amazon.

The last two are part of the default list (as is Answers.com–but, curiously, not Ask or Live Search). It’s trivially easy to add a site, assuming the creators want it to be added.

I do try to rotate web searches among the big four. I’m hearing that Live Search (vastly improved over MSN Search) is getting “newer” content (but haven’t attempted to prove that). I like Ask for answering questions directly and for a number of other features. Yahoo! and Google are, of course, Yahoo! and Google, and pretty competitive.

Sure I use Wikipedia. Why wouldn’t I? Not as a definitive answer, but as a great starting point, taken with a couple bushels of salt depending on the topic.

One mild annoyance/curiosity: If I’ve used Google as the search engine [apparently, any search engine in the dropdown, since it just happened with Worldcat.org], then go to Gmail, Gmail seems to assume that I want to search my mail archive with the Google search term [I’d used in the Firefox search box]….

So what’s on your search dropdown? Do you send all your searches to one engine? Have you tried Yahoo! or Live Search lately? (Or Ask, but Sarah Houghton-Jan has already–and correctly–noted that you really should give it a try, and I assume you take her more seriously than you do me.)

Here’s a question: What’s the fifth-place open web search engine?

Now to eliminate a bunch of bookmarks, since those are all redundant…

Liblog mortality rate: An interim note

Posted in Cites & Insights, Libraries, Writing and blogging on October 25th, 2006

Given my planned (ha!) “liblog investigation” for 2007 (or one of them, anyway), I’m periodically checking the 213 liblogs from the great middle that I discussed in the August C&I.

Just finished the second such check, and plan to check again every two months or so.

The good news: Only a few of the blogs have explicitly ceased, and even fewer have simply disappeared, being replaced by spam ad pages or other web graffiti.

The perhaps less good news: Taking, say, two months without posts as a sign of possible morbidity, then 31 of 213 have either ceased or gone seriously idle since the end of the scanning period (6/30). That’s just under 15%.

“Why, at that rate, nearly half of them will be gone by the end of June 2007. That’s awful”

(Yes, this is a strawman: The voice of the linear progression believer. Nobody said this. People have certainly assumed either linear or, worse, algebraic progressions in other cases that make no more sense. See the NEA and “the end of leisure reading,” just as one bad example.)

I find that highly unlikely. Actually, even if the same percentage disappeared in each third of the year (10/25 is close enough to the end of October, isn’t it?), that would mean that another 26 would go silent over the next four months and another 23 after that, leaving 133–62% of the original 213–not the 55% you might extrapolate from the loss of 15% over four months.
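The difference between the two extrapolations is easy to check. Here is a minimal sketch, using only the figures above (213 blogs, 31 gone quiet in the first four months, three four-month periods in all); the function names are invented for illustration. The exact linear figure comes out near 56%; rounding the loss to 15% per period gives the 55% mentioned above.

```python
def linear_survivors(total: int, lost_first_period: int, periods: int) -> int:
    """Assume the same NUMBER of blogs goes quiet in every period."""
    return total - lost_first_period * periods


def compound_survivors(total: int, lost_first_period: int, periods: int) -> float:
    """Assume the same PERCENTAGE of the remaining blogs goes quiet in every period."""
    survival_rate = 1 - lost_first_period / total
    return total * survival_rate ** periods


TOTAL = 213   # liblogs in the original study
LOST = 31     # ceased or idle after the first four months
PERIODS = 3   # three four-month periods, ending June 2007

linear = linear_survivors(TOTAL, LOST, PERIODS)
compound = compound_survivors(TOTAL, LOST, PERIODS)
print(f"linear:   {linear} left ({linear / TOTAL:.0%})")        # 120 left (56%)
print(f"compound: {compound:.0f} left ({compound / TOTAL:.0%})")  # 133 left (62%)
```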

But that’s also unlikely. Some of the 31 will come back to life. Chances are, some significant portion of the 31 went quiet because bloggers graduated or otherwise went through end-of-spring life changes.

Here’s a crude guess: I’ll guess that at least 150 (70%) of the 213 will still be active during the April-June 2007 period. I’m hoping for closer to 165 or 170, but that may be too optimistic.

No halo effect expected: If being mentioned in C&I didn’t encourage people to keep posting (and why on earth should it?), this W.a.R. post certainly shouldn’t have that effect.

The Firefox that didn’t bite

Posted in Technology and software on October 25th, 2006

I installed Firefox 2.0 at work this morning and at home this afternoon.

Nothing happened.

This, in case it isn’t obvious, is Very Good News. My heavily-customized bookmarks toolbar (combining some Bloglines options and some Google Toolbar options) and somewhat-customized navigation toolbar were the same (except, I think, for changes in a couple of icons). My options settings were the same, with sensible options for the new features in 2.0.

The new version replaced the “old” one (1.5.0.7, not exactly ancient) smoothly. The download was reasonably fast at work, very fast at home on our “slow DSL” (we’re too far from the nearest switching center for fast DSL).

I haven’t run into the phishing protection, but I’m pretty careful about suspicious email and the like, so there’s no reason I should.

I do have to get used to the fact that Bloglines now automatically opens posts in a second tab instead of a second window, but since Firefox warns you before letting you close more than one tab, it’s a painless learning process. Otherwise: No news is good news; this seems to be a clean upgrade.

The one oddity, as others have already noted: Firefox 1.5.0.7 does not recognize Firefox 2 as an update. You have to go get it directly. Not hard to do…

Just another vote of confidence from one of the 20%. I’m guessing very few of us are in a hurry to rush back to IE, no matter how good IE7 is. A little diversity in the software arena can’t hurt.

Addendum 10/26: Didn’t bite, clearly an improvement. Subscribing to a feed is now one click faster than with the “Sub with Bloglines” bookmark-tool–and checking today’s gmail, I got a clever phishing attempt (something about finding a flickr photo by accident). I was 99% certain it was phony, particularly since “flickr.com” was not the primary domain–but, given Firefox 2’s phishing protection, I thought I’d see what happened. Sure enough: Big bold warning message and “probable forgery” popped up. Good work, Firefox! (And, CW, thanks for adding your note.)

Printability: It’s not just for Firefox anymore

Posted in Cites & Insights, Net Media, Writing and blogging on October 24th, 2006

The current Cites & Insights begins with a brief Bibs & Blather (the secret real name for C&I, but you already know that) grumping about bloggers who use Six Apart software, write posts more than a few hundred words long, and don’t realize (or care) that, without some tweaks to the templates, Firefox users can’t print the posts except by copying the text into Word or some other program. I questioned whether such writers really didn’t want to be taken seriously…and noted that, of course, one solution was to mark-as-unread and once in a while use IE instead.

Whoops. Along the way, I ran into one interesting blog where that doesn’t work–where, apparently, the width of the banner (or some other setting) causes printed lines in IE to be about half an inch wider than the margins of the paper. And you typically won’t notice that th enough missing every s in the text (sample of phenomenon: “that there’s just enough missing every so often in the text” is what should be there) so as to make the document useless until after you’ve printed it off.

It’s happened again, this time on a very long post with loads of comments (pointed out by StevenB at ACRLog).

I did print preview in IE: 15 pages. Then I looked closely…at the missing ends of lines. Sigh. Mark, copy, paste to Word, print the resulting 17 pages. (8 for the post, 9 for the comments).

I really, truly don’t get it: Do these bloggers never actually look at their own pages? Do they assume that eight-page posts won’t ever be printed out? That advice that’s clearly been thought through and carefully worded isn’t worth printing and saving/savoring?

Of course, Six Apart’s mostly at fault. TypePad doesn’t have to work this way. WordPress certainly doesn’t (although, sigh, I’m seeing more bloggers who manage to tweak (that is, screw up) their templates sufficiently that the text of a printout won’t start until the second or third page).

End of followon grump.

By the way, I thought I’d start my series of posts commenting on presentations at Internet Librarian, based on what I see in the blog postings on those presentations.
And now I’ve finished my series of posts doing third-hand commenting. Live and learn.

A touch of irony in the morning

Posted in Books and publishing, Net Media on October 23rd, 2006

More than a touch of fog, too, but that’s just part of the strange late-October weather (85 on Saturday, 81 or so yesterday, foggy and cold this morning; in Monterey, 70 miles south, it’s supposed to be foggy much of the time).

Meanwhile, there’s this essay by Jaron Lanier entitled “DIGITAL MAOISM: The Hazards of the New Online Collectivism.” (It actually dates from May, and I should have picked it up in the current Wikipedia essay, but I don’t read Edge and only encountered this third-hand, because Lawrence Lessig noted a post by Nicholas Carr referring to it…the wonders of meta!)

As to the essay itself, apart from the red-baiting title, I have no comment at the moment; it’s long, with a twice-as-long set of responses, and I’ll read it at my offline leisure. I did want to comment on an interesting juxtaposition.

Here are the first two sentences of the essay itself:

My Wikipedia entry identifies me (at least this week) as a film director. It is true I made one experimental short film about a decade and a half ago.

Lanier doesn’t like to be identified as a film director. He disses the short film he made so many years ago. He’s tried changing the Wikipedia entry several times and it keeps being changed back.

And here’s the one-line author bio from Edge at the very end of the long essay (emphasis added):

Jaron Lanier is a film director. He writes a monthly column for Discover Magazine.

Bwahahah…or is the proper response “Doesn’t anyone actually read this stuff before posting it?”

Update: See comments. Apparently I was too obtuse to get the joke.

This shears manifest themselves in uncounted PowerPoint foils

Posted in Scholarly publishing, Technology and software, Writing and blogging on October 19th, 2006

Another vaguely whimsical coffee-break post…

Peter Suber wants to keep us in the know. When an available article about open access isn’t in English, he does the only economically viable thing: Post a link to a machine translation (in this case Google). Here’s the translation (and here, I hope, is the Suber item–I got it today because he updated it today).

Some of the language is simply charming (noting that this is Google, not the writer):

“this shears manifest themselves in uncounted PowerPoint foils” - haven’t you wanted to say that?

“I know myself in the bibliothekarischen Community mentioned not out, do not see it however in principle not as disadvantage that one completely unideologisch, also for purely monetary reasons for open ACCESS its kan.” - Would that I could know myself this well!

“That is common, with thesis 3. that, pardon, stupidest and most arrogant Palaver, which I read in recent time in the academic surrounding field!” - No comment.

“Oh, is hypocritical that! Universities, scientist, studying may get the output of publicly financed research not free of charge, because the bad industry would then get that also free of charge.” - Also no comment.

Leading to the summary:

“Mr. Ball summarizes that open Acces represents definitely no revolution in scientific communication and it at the time is that the bibliothekarische Community is concerned with more important things.
I summarize for me that the Mr. Ball arranged arguments, illusory arguments, prejudices, Ignoranz and half truths to a high song in the production way of the established publishing houses.”

Two serious comments:

  • Despite the sometimes-charming problems of machine translation, it’s not hard to figure out what’s being said in this long and vigorous refutation of an anti-open-access piece.
  • The writer appears to know their stuff, and it’s clear that the myths of anti-OA argumentation are not restricted to the English language.

And one whimsical comment: Google’s translate tools may have a future in Joyce simulation, particularly given that “I know myself…” comment.

850GB for $48!

Posted in Cites & Insights, Technology and software on October 19th, 2006

If you believe this item at PC Magazine’s online comparison service, we have two miracles in one:

  • The biggest single internal disk just jumped from 750GB to 850GB, and it’s not from Seagate, it’s from IBM (also, it’s not SATA, it’s IDE/EIDE).
  • It costs a whopping $48, or less than six cents a gigabyte–quite a drop from the fifty cents or so that you’ll pay for that shrimpy little Seagate 750GB drive.

Ready to order a few? Not so fast. As with most things that look to be too good to be true, there’s a problem here. Set aside the 1.5-checkmark rating for the store. Drop down to the ads below the detailed description. Click on the link for the same model number–from the same vendor.

Hmm. It says 850, to be sure, and it says $48. But the suffix is MB, not GB. (And it’s IBM Lenovo, not IBM, but that’s irrelevant.)

$48 (including shipping) for a teeny-tiny drive (it’s 3.5″, notebook size, but by today’s standards 850MB is pretty scrawny, and you could get a higher-capacity USB flash drive for that money) doesn’t seem all that great: $56 a gigabyte.
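The per-gigabyte arithmetic is simple enough to sketch. The helper name below is made up, and the sketch assumes decimal units (1 GB = 1000 MB), which is how drive vendors usually count.

```python
def dollars_per_gb(price_dollars: float, capacity_mb: float) -> float:
    """Price per gigabyte, assuming decimal units (1 GB = 1000 MB)."""
    return price_dollars / (capacity_mb / 1000)


PRICE = 48            # dollars, including shipping
ADVERTISED_MB = 850_000  # the "850GB" in the listing
ACTUAL_MB = 850          # the 850MB the drive really holds

print(f"as advertised: ${dollars_per_gb(PRICE, ADVERTISED_MB):.3f}/GB")  # $0.056/GB
print(f"as delivered:  ${dollars_per_gb(PRICE, ACTUAL_MB):.2f}/GB")      # $56.47/GB
print(f"real capacity is {ACTUAL_MB / ADVERTISED_MB:.1%} of the claim")  # 0.1%
```

Same price, same model number, a factor of 1,000 apart.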

This would simply be amusing–except that, if you do a search on the model number, most of the early results show the same “850GB” capacity, even though they lead to lots of “different” shopping services and review compilations. All of which seem to offer up the same item at the same price, some (but not all) with the same single review noting that it’s not as described.

As noted in a commentary on Wikipedia in the new Cites & Insights, at least one writer has praised Wikipedia for making it clear that truth is whatever most people think it is. (I’m paraphrasing, but not by much.)

On that basis, then, if we can substitute “most first-page results from a search” for “most people,” this drive really is an 850GB drive. That’s the truthiness of the situation.

Just don’t try to store more than one-tenth of one percent of that much data on it: That gets involved with that old-fashioned truth, the kind that has to do with physical facts.

Cites & Insights 6:13 available

Posted in Cites & Insights, Copyright, Libraries, Music, Net Media, Technology and software, Writing and blogging on October 18th, 2006

Cites & Insights: Crawford at Large 6:13 (November 2006) is now available for downloading.

The 26-page issue (PDF as always, but major essays are also available as HTML separates from the home page) includes:

  • Bibs & Blather: Should I Care About What You Write? - printability revisited
  • Net Media Perspective: What About Wikipedia? - The saga of Wikipedia, Britannica, and Nature; various commentaries on Wikipedia; and early stuff on Citizendium (plus two good notes on library-related wikis)
  • Trends & Quick Takes - three mini-essays, four quicker takes.
  • Old Media/New Media Perspective: Tracking Hi-Def Discs - what’s happening with HD DVD and Blu-ray and why you should(n’t) care
  • PC Progress: February-October 2006 - 27 group reviews in 14 categories
  • Copyright Currents - catching up on fair use and infringement, DMCA, orphan works and the analog hole.
  • My Back Pages - three snarky little essays (one of them not really snarky at all)

Editing other people’s words

Posted in Language, Writing and blogging on October 17th, 2006

…is so much easier than editing your own, at least in some cases.

Trivial example: (this is a trivial post, with probably more language-related posts to come)

Reading the October 13 Chronicle of Higher Education–the fun part, Section B–I get to the letters, one of which is about deadly sins of bad writing and adds two more, one of which is “wordiness”:

(For example: “Here it is very important to note that in this case the hippopotamus in question was a midget.” How about: “Note that, in this case, the hippopotamus was a midget.”) [Stacey C. Sawyer]

Very good–and I wish I could consistently do as good a job with my own prose. But looking at the particular string of words and any plausible context or meaning, I found myself saying:

This hippopotamus was a midget.

From 18 to 10 to 5. Don’t expect me to do as well on my own stuff. But then, neither have outside editors (although they almost always improve “my” prose).

Sophisticated argumentation

Posted in Language, Libraries on October 17th, 2006

New headnote: I’m reverting most of the other changes because the post gets too confusing. I’ll add my caveats at the end. However, it is now clear, thanks to this excellent comment from Phil Bradley, that I misinterpreted the situation based on sketchy reporting. I’m restoring the original post so that the comment stream makes sense [End of new headnote]:

It seems that a big-name speaker in a big-name conference settled the issue of whether terminology matters, at least within one current movement/set of tools/hypefest/truly good idea set, by displaying a slide containing the Answer: “I don’t care.”

Presumably implying that nobody else should either. Where I’ve seen this noted in reports, it’s with considerable enthusiasm.

It strikes me that sophisticated argumentation at this level deserves appropriate response. To wit, those who think that language doesn’t matter are, to some extent, telling us that their words don’t matter. So an appropriate response to their posts, articles, whatever, might well be

“I don’t care.”

Or is it only language that they disagree with that should be dismissed in such a manner?

Actually, I’m charmed by librarians arguing that language and wording don’t matter. It sets such an interesting tone for the future.

OK, that’s the original post. I did not name the speaker, deliberately…in part because I saw this as another example of what I’d seen much earlier from another source (see the comments for links), and thought it was possible that I was misinterpreting the speaker. It is now clear that this was the case. It’s also clear that the misinterpretation was based in part on the reporting of the session, specifically this commentary:

“My favorite slide was Phil Bradley’s, in response to all the discussion about semantics and buzzwords. It simply said:

“I don’t care”

I LOLed”

[The link is in the comments.] Note “in response to all the discussion about semantics and buzzwords.” Note the lack of “After a slide saying ‘So what do I think?’ and a commentary that made it clear that both sides had merit.” At that point, as Bradley says, the slide wasn’t intended as argument; it was a personal comment. And entirely appropriate as such. I probably would have laughed too.

Note that I did not name Phil Bradley, deliberately. It was a blind item because I was noting a problem I’ve seen more than once. This did not happen to be an instance of the problem.

As for the courtesy of always asking someone before commenting on anything they’ve said in public, or that has been reported that they’ve said, or before interpreting what someone says…well, that’s an interesting idea. It’s certainly not a courtesy I’ve been provided. In fact, I’ve seen deliberate rewordings of what I said. For example, the post above does not say “someone at some conference in some speech attempted to preclude discussion of the language/term.” Nor did I “deny the man a slide with his personal opinion”–where above do I say “The speaker should not have been allowed to put up that slide”? Those are both deliberate misstatements, not just misinterpretations.

To sum up: I misinterpreted what went on at the conference based on (a) selective reporting and (b) my own long experience with the person who’d done the selective reporting. It was a reasoned comment that happened to be wrong. I did not mention the speaker by name because it was used as an example (and because I knew I might be wrong). I wrote a short and angry post because I’m tired of the real instances (which this wasn’t) of argument-by-trivialization.
Again, my genuine apologies to Phil Bradley–not for failing to contact him, but for misunderstanding the sketchy report. And my genuine thanks for his clear, calm, lucid commentary. Next time I see reporting on his speeches that seems askew, I will check first.

PageRankled: A Friday post

Posted in Cites & Insights on October 13th, 2006

Update: This essay is mostly pointless, for reasons explained by Seth Finkelstein. See his comments or the update in midstream. (A pointless essay at W.a.R.? Well, no earthshattering surprise here…) For the record, though, I’ll leave the essay in place.

It’s been almost three months since Cites & Insights moved to its new home.

Three months and three issues.

Readership? OK. Hard to be sure how it compares; Urchin and Weblog Expert measure things differently. I know that the old site’s still getting a fair number of hits–actually, the average visits per typical day hasn’t dropped all that much, but there aren’t the usual issue-publication spikes. That’s reasonable: The new issues aren’t available at the old site. (Overall traffic on the old site is only down about 20% in the months since the move as compared to the months prior to the move.)

I think there are fewer readers for the most recent issues; it’s hard to be sure. There are more than enough to keep me writing. And readership for any given issue continues to grow over time. It’s likely to be a very long time before I ever have another essay that appears to have more than 19,000 readers (most of you can guess which 23,000-word, entire-issue essay that was).

But I was reminded today of just how much of one non-negotiable currency went away with the move. I got around to loading the Google Toolbar on my current version of Firefox (and have since moved the desired Google Toolbar items to the Bookmark toolbar or the Navigation toolbar, so I can keep the number of open toolbars down).

One amusing/impressive/terrifying portion of the Google Toolbar is the PageRank item.

This blog has a surprisingly high PageRank (6 at the moment), just as it has a whole bunch more daily visits than make any sense to me.

Cites & Insights had an even higher PageRank: 7, the number that search engine optimizers are supposedly willing to donate limbs to reach. Why not? Librarians are heavy linkers, and C&I has been around for a while.

The new site? Zero. Nada. Not even up to 1. See next paragraph: It’s really dropped from 7 to 6.

Update, Sunday, October 15: While I’ll leave this essay in place, turns out that the toolbar PageRank is out of date. Seth Finkelstein pointed me to a tool that checks Google’s data centers; they consistently show a PageRank of 6 for http://citesandinsights.info. I’m not quite sure what 6 means in the scheme of things, but I know it’s plenty good enough. (This blog and my home page have the same rank. Eventually, I expect that the new C&I site will make it back to 7. No hurry.) Thanks, Seth–and as for the essay in general, the right summary may be “Never mind.”

I’ll check every six months or so and see how long it takes to reach a nominal PageRank.

Fortunately, PageRank really is non-negotiable in this case. I’m not planning to add external ads to the C&I home page (if there are ads, they’ll be for my own books, if I ever get around to doing them). I took the ads off W.a.R. because they were taking up space and not yielding worthwhile revenue. For the highest-readership issues, most people don’t arrive via the front page in any case: They go directly to a PDF download or an HTML essay.