Public library data analysis: Is there a need?

As I’ve thought about ways to keep my hand in–that is, ways to regain some level of earned income that justifies my continued interest in and work on library issues–I wondered for a while whether public library data analysis might be such a way. This rambling post is about that question. At this point, I think the answer is “probably not, but maybe I’m too pessimistic” rather than the “possibly so, if there were a plausible way to get funding” that I started out with. So instead of a neat & tidy prospectus, it’s a messy ramble.


For those who like context, this post grows out of several related backgrounds:

  • The 38-state universe study of public library use of social networks I did as research for Successful Social Networking in Public Libraries–brand-new research that I would dearly love to continue on a whole-nation basis, but that I can’t justify doing (it would take a few hundred hours to do and report) without five-digit revenues. That issue’s documented in various posts–and I’d still love to get some guidance on how this can work (other than “give it up. nobody cares” which I may figure out for myself).
  • My recent query as to how many public library systems have actually closed and remained closed, the conversations I’ve had with a number of library researchers in attempting to answer that question, my actual research for the [crude] answers for 2008 and 2009 (which will appear in the April 2012 Cites & Insights) and my future research, with considerable assistance from others, for the [also crude] answers for 1999 through 2008, which will probably appear in the May 2012 Cites & Insights unless I can find a paying market for the results.
  • Time spent with the IMLS annual reports and databases on public libraries in the United States while doing both of these projects, and with some state library reports as well…and going back to some comments from library researchers in email and blog post comments.
  • Time spent examining the HAPLR Index and other sources (the LJ 5-star site seems to be unreadable at the moment, so I won’t link to it) and pondering their strengths and weaknesses, from my admitted bias of not wishing to Point Out Top Achievers so much as to provide overall patterns that allow libraries to gauge themselves.

Why Me?

Recent success in getting people to pay for Cites & Insights at all, or in selling my writing directly to librarians (C&I Books) as opposed to indirectly through publishers (admittedly with professional editing, indexing, cover design and marketing), is nudging me toward this feeling:

Any idiot can write, and with the web most idiots do.

But there’s something else:

Most idiots aren’t any good with numbers–and very few of us are good with both numbers and words.

While I’ve been a writer longer than I was a systems analyst/designer/programmer, I’ve always been a “numbers person” to some extent: At Cal, I had an informal math minor along with my formal Rhetoric major.

I love to find patterns in numbers. I like transparency and honesty. I dislike chartjunk–and I’m probably more hardassed about chartjunk than serves me well, specifically the most common form of misleading charts: Those with non-zero axes. Even with labels, such charts inherently exaggerate differences and trends: You may glance at the numbers on the left axis, but what you see first is the enormous change.

As perhaps the most widespread and one of the worst examples: Stock market daily-change charts. Most stock-market quick charts I see are about as bad as they can get, as the axis and scale are always designed to make the day’s movement–no matter how minor–take up the full chart. The only leavening factor is that most charts are hour-by-hour during the day.

Yes, I know, given that stock charts tend to be fairly small, they’d be too boring to even publish if they used zero axes, and excitement usually trumps meaningfulness. Such are media.
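To put a number on how much a non-zero axis exaggerates, here is a minimal sketch in Python. The index values and axis ranges are made up for illustration, and `fraction_of_chart_height` is a hypothetical helper, not anything a charting package provides.

```python
def fraction_of_chart_height(start, end, axis_min, axis_max):
    """Share of the chart's vertical space taken up by a move from start to end."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return abs(end - start) / (axis_max - axis_min)


# Hypothetical day: an index moves from 10,000 to 10,050, a 0.5% change.
start, end = 10_000, 10_050

# Axis fitted tightly around the day's movement (assumed range).
truncated = fraction_of_chart_height(start, end, axis_min=9_990, axis_max=10_060)

# Same data on an axis that starts at zero.
zero_based = fraction_of_chart_height(start, end, axis_min=0, axis_max=10_060)

print(f"Truncated axis: move fills {truncated:.1%} of the chart height")
print(f"Zero-based axis: move fills {zero_based:.2%} of the chart height")
print(f"Visual exaggeration: roughly {truncated / zero_based:.0f}x")
```

Under these assumed numbers, a half-percent move fills most of the truncated chart and is nearly invisible on the zero-based one, which is both the complaint and the reason nobody publishes the honest version.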

I believe I could do both derivative and longitudinal studies (and I hear some minds snapping shut right there–“derivative” in this case is things like circulation per capita or calculated return on investment; “longitudinal” is looking at change over time) that would be honest, transparent, and written well enough to be usable by librarians who are nervous about numbers–but perhaps not by those who are “anti-numerate” rather than just innumerate.
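To make “derivative” and “longitudinal” concrete, here is a minimal sketch in Python. The library, its figures, and the names `YearRecord`, `circulation_per_capita` and `percent_change` are all invented for illustration; real numbers would come from the IMLS or state data files.

```python
from dataclasses import dataclass


@dataclass
class YearRecord:
    year: int
    circulation: int   # total items circulated that year
    population: int    # legal service area population


def circulation_per_capita(rec):
    """A derivative measure: one reported figure divided by another."""
    return rec.circulation / rec.population


def percent_change(old, new):
    """A longitudinal measure: change from one year to the next, in percent."""
    return (new - old) / old * 100.0


# Made-up figures for one hypothetical library.
records = [
    YearRecord(2007, 400_000, 50_000),
    YearRecord(2008, 420_000, 50_000),
    YearRecord(2009, 450_000, 50_000),
]

per_capita = {r.year: circulation_per_capita(r) for r in records}
years = sorted(per_capita)
changes = {
    later: percent_change(per_capita[earlier], per_capita[later])
    for earlier, later in zip(years, years[1:])
}

for year in years:
    line = f"{year}: {per_capita[year]:.2f} circulation per capita"
    if year in changes:
        line += f" ({changes[year]:+.1f}% from prior year)"
    print(line)
```

Nothing here is harder than division; the work in a real study is in choosing which measures matter and explaining them plainly.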

I know there are interesting numbers that I don’t believe are well-reported.

I suspect that one issue in getting librarians to pay attention to numbers might be that some of the best reports–including those from IMLS (although they’re prone to non-zero axes)–are overwhelming. A 230-page PDF may be what’s needed to report properly, but that’s huge.

I wonder whether librarians would be well-served by, and would be responsive to, relatively brief reports, one for each state and library size group (the 10 HAPLR divisions work well). Each report would provide some quick longitudinal charts and would look not only at totals and averages but, more usefully (in my opinion), at medians and 80% figures (and the like), applying both to some derivative measures that could help you see where your own library stands. (The brief reports would be backed up by a clear, honest, transparent description of what’s behind them.)
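As a rough illustration of why medians and 80% figures can say more than averages, here is a minimal Python sketch. It assumes an “80% figure” means the 80th percentile (the value that 80% of libraries in the group fall at or below) and uses a simple nearest-rank rule; the visits-per-capita values and the `percentile_nearest_rank` helper are made up for the example.

```python
import math
from statistics import mean, median


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up visits per capita for ten libraries in one hypothetical size group.
visits_per_capita = [2.1, 3.4, 3.9, 4.2, 4.8, 5.0, 5.5, 6.1, 7.3, 12.0]

group_mean = mean(visits_per_capita)
group_median = median(visits_per_capita)
group_p80 = percentile_nearest_rank(visits_per_capita, 80)

print(f"Mean:            {group_mean:.2f}")
print(f"Median:          {group_median:.2f}")
print(f"80th percentile: {group_p80:.2f}")
```

In this invented group, one very busy library pulls the mean above the median, so a library sitting right at the median would look below average even though half its peers are at or below it.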

And I think I could do those, and do them well, with clear language supplementing a relatively small number of graphs and even smaller number of tables in each report.

On the other hand…

I wonder whether there’s much demand for this sort of thing.

Using the table of state public library data sources provided by Colorado’s Library Research Service, I clicked through to data for the first 30 or so states. I found a wide variety of results–from Connecticut’s first-rate (but long) reports with strong longitudinal work, Colorado’s excellent (if limited) reporting, California’s very good 5-year longitudinal work and Kentucky’s 25-year graphs and generally solid discussions, to a number of states with no summary reports at all, and a few with no apparent data. I’d say six of the 30 (or so) had some sort of longitudinal data.

And I wonder whether the rest don’t have such reports or data because there’s really no demand for it. (I wonder how many librarians actually read the IMLS reports–you don’t really need to go through the full 200+ pages!)

Not that I don’t see things missing in general–but I come away thinking that there’s really not enough interest to pursue this idea at all.


Am I wrong? If so, and if there’s a path to actual compensation, I’d love to hear about it (via email to waltcrawford at–spamments are still averaging >100 per day, so I don’t check them carefully).

Otherwise, well, “Walt, idiot, nobody really wants this, and they certainly won’t pay for it.” Heck, I can tell myself that.

[Note that “nobody wants this” isn’t quite the same as “nobody needs this,” but it might as well be from my perspective. In either case, I’d be a greater idiot to keep pursuing it.]

2 Responses to “Public library data analysis: Is there a need?”

  1. kathleen says:

    Maybe insight from–“Robert E. Molyneux, library data hero”–see this post:

    or here:

  2. walt says:

    Bob Molyneux, who is indeed a library data hero, has been in touch with me (already) and has been helpful.