Two days ago (Monday, May 7, 2012), I asked for instant feedback on a possible quick-book project providing a detailed set of charts and percentiles related to a “value ratio” for public libraries–that is, the ratio between library expense per capita and library value to its users and community (per capita).
A number of states have done studies of this ratio (usually defined as ROI, return on investment), some of them using fairly sophisticated models and surveys. Consistently, however, such studies yield a single number.
What I have in mind is both cruder (far less sophisticated and relying entirely on public information) and more detailed–namely, yielding a variety of percentile ranges that break down the overall picture into more detailed forms. To quote a little of the other post:
The four axes or, if you will, chapters, following an overall look:
- Clumps of libraries by LSA size (using the 10 HAPLR divisions)
- Clumps of libraries by expense/budget ranges
- Clumps of libraries by per capita expense ranges
- State-by-state analyses (one clump for states with few libraries, probably three by broad size categories for states with many libraries)
For each clump, as for the overall figures, I’d provide correlations as appropriate, plus mean, median, and percentile levels in two ways–the 90th, 80th, 70th, etc. percentiles, but also the percentage of libraries exceeding certain value ratio set points.
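As a sketch of what those two kinds of figures look like in practice–with invented ratios, and Python standing in for what will really be spreadsheet work:

```python
import statistics

# Two ways of summarizing a clump of value ratios: fixed percentiles,
# and the percentage of libraries above chosen set points.
# These ratios are invented purely for illustration.
ratios = sorted([1.2, 2.8, 3.5, 4.1, 4.9, 5.0, 5.6, 6.3, 7.8, 9.4])

print(f"mean {statistics.mean(ratios):.2f}, "
      f"median {statistics.median(ratios):.2f}")

def percentile(sorted_vals, p):
    """Nearest-rank percentile of an already-sorted list."""
    k = max(0, round(p / 100 * len(sorted_vals)) - 1)
    return sorted_vals[k]

# Way 1: the 90th, 80th, 70th ... percentile levels.
for p in range(90, 0, -10):
    print(f"{p}th percentile: {percentile(ratios, p):.2f}")

# Way 2: percentage of libraries at or above each set point.
for setpoint in (3, 4, 5, 6, 8):
    share = 100 * sum(r >= setpoint for r in ratios) / len(ratios)
    print(f"ratio >= {setpoint}: {share:.0f}% of libraries")
```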
I wanted feedback on whether this might be useful, useless or even a bad idea. I wanted that feedback by the end of today (May 9, 2012), since my plan was to have the book available by ALA Annual.
Early feedback
I’ve received half a dozen responses. While most of them are mildly positive, by far the most detailed response–and the only one clearly coming from a state library employee–was less so. I’ll quote that feedback, from Steve Matthews, here:
For what it’s worth, it seems it would only be useful IF the Value Ratio to which you refer is an empirical statistically useful number. Simply calling some number “value” doesn’t make it valid. Are you using contingent valuation (CV), either willingness-to-pay (WTP) or willingness-to-accept (WTA)? Or are you using some arbitrarily assigned “value”?
It also seems like a no-brainer premise to say that “libraries that are better funded provide better service.” One of the pitfalls of using data collected for one purpose to apply to a different purpose is that it requires numerous questionable assumptions to make the numbers meaningful. How would this data be more valid or more valuable than the results from several states that conducted their own ROI study within the past few years?
The 21st Century Library is More
I replied, probably defensively, after looking at all of the ROI studies available from that link and from links within those links. My plan is to do something entirely different, possibly more valuable only because it’s something more than One Big Number for an entire state–but certainly less valid based on that commentary.
Extending the feedback request
If Steve Matthews’ response is typical of what well-informed librarians would say, or typical of how state libraries would feel about what I have planned, then I should abandon the idea: Way too much work for what may be perceived as valueless or even harmful.
I’m not quite ready to do that. Instead, I’ll ask for more feedback, through the weekend (that is, through Sunday, May 13, 2012). Apologies in advance if I respond in what seems a defensive manner; maybe I’ll try to avoid responding at all. Except, of course, to come to a decision on Monday.
So, either by mail to waltcrawford at gmail.com or in a comment here (noting that comments with multiple links sometimes get trapped as spam, and I get WAY too much spam to check each one):
Good idea? Lousy idea? Pointless idea?
[If you think my investigation into public library closings was actually a bad thing, you will most assuredly think this project is a bad idea as well–but you can think the first was good and that the second is pointless.]
Additional information (added 5/11/12)
I’ve now read or at least skimmed all of the studies linked to from the page Michael Golrick identifies (in one of the comments below), re-read some other sources, and modified (or clarified) the numbers I’ll use if (and it’s still if) I do this instabook–which, it should be clear, is intended to be complementary to the state ROI studies, the HAPLR ratings, the LJ Star ratings, and all those other things, not a competitor to any of them.
Here’s what I would be using to establish the Value Ratio and the two subratios (Explicit Value and Implicit Value); a rough sketch of the arithmetic follows the two lists:
Explicit Value
- $8.30 times the sum of Circulation, ILLTo, and ILLFrom. (This is one of ~~the two~~ Big Numbers–although ILLTo and ILLFrom are a small part of it. After looking at a wild diversity of arguably sound amounts for average circulation worth, I used the average price of mass market paperbacks in 2008.)
- $15 per reference question. (Assumes that, these days, there are relatively fewer but relatively more important/difficult reference questions.)
- $10 per program attendance. (Given the relatively small number of reported program attendances, changing this number wouldn’t have much effect on anything.)
- The larger of: $8 times the number of counted PC uses or $2.66 times (the number of Internet-connected computers available for users times ~~the number of open hours~~ the average number of open hours per outlet). ~~This is the other Big Number.~~ However calculated, it’s also a partial stand-in for uncounted items–database use, wifi use, etc. [If you’re wondering: That $2.66 figure is based directly on one of the carefully-done ROI studies, and assumes $4/hour value and that 2/3 of computers, on average, are in use. The $8 figure assumes an average of 2 hours per use, also at $4/hour.]
Implicit Value
- $60 per open hour (partly a stand-in for in-house use and the many uses of library as place–arguably, this figure should be much higher).
- $5 per visit (also partly a standin for the above).
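Putting the two lists together, the arithmetic comes down to the sketch below. The field names are stand-ins for the IMLS data elements (not the actual codes), hours are assumed to be total annual open hours summed across outlets, and since both sides of the ratio are per capita, population cancels out and totals work just as well:

```python
# Sketch of the Value Ratio model described above. Field names are
# stand-ins for IMLS public-library data elements, not real codes.

def explicit_value(lib):
    circ = 8.30 * (lib["circulation"] + lib["ill_to"] + lib["ill_from"])
    ref = 15 * lib["reference_questions"]
    programs = 10 * lib["program_attendance"]
    # The larger of counted PC uses or PC availability, using the
    # average open hours per outlet (the May 14 correction).
    avg_hours = lib["open_hours"] / lib["outlets"]
    pcs = max(8 * lib["pc_uses"],
              2.66 * lib["internet_pcs"] * avg_hours)
    return circ + ref + programs + pcs

def implicit_value(lib):
    # $60 per open hour plus $5 per visit.
    return 60 * lib["open_hours"] + 5 * lib["visits"]

def value_ratio(lib):
    # Benefits per capita divided by expense per capita: population
    # cancels, so library-wide totals give the same ratio.
    return (explicit_value(lib) + implicit_value(lib)) / lib["expenditures"]

# A made-up library, just to show the shape of the calculation.
example = {
    "circulation": 250_000, "ill_to": 2_000, "ill_from": 3_000,
    "reference_questions": 20_000, "program_attendance": 8_000,
    "pc_uses": 40_000, "internet_pcs": 30, "outlets": 1,
    "open_hours": 3_000, "visits": 150_000, "expenditures": 900_000,
}
print(f"Value Ratio: {value_ratio(example):.2f}")  # about 4.16
```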
Right now, it looks as though the median is a little ~~more~~ less than 5 and the mean is very close to the median (which is a good thing)–and EV is anywhere from two to three times as much as IV overall (about 2 for median, more than 3 for mean).
This information may or may not help late commenters.
And no, I still haven’t made up my mind. The three or four additional hours spent yesterday in checking sources I hadn’t previously encountered, and the half-hour redoing overall numbers, don’t matter much in the overall scheme of things–they’re sunk costs. If this wouldn’t be valuable to libraries and the library community, I won’t do it.
Strikeouts and changes, Monday, May 14, 2012
As I was going over the various comments, asking for help with a good title for this study (from the Library Society of the World, and I’ve received some excellent ideas), and checking one item for interest (namely: how often is “actual PC use” a larger figure than “potential PC use”), I realized that there was one fundamental error in my ratios–it became fairly obvious when one multibranch library had more than $2 billion in PC use.
Namely, the original formula only works for central libraries with no branches. For multibranch libraries, it overstates the availability of PCs by a factor equal to the number of outlets. That is: If a three-outlet library (central and two branches) has 200 PCs in the central library (open 70 hours per week) and 100 each in the two branches (each open 50 hours per week), the original formula yields $2.66 * 400 * 170, or $180,880–and assumes that there are 400 PCs available 170 hours per week. (I would have discovered this error as soon as I started playing with the spreadsheet for the first chapter: It would be really obvious that something was wrong.)
The new formula is the best I can do, and it somewhat understates availability for libraries in which Central has many more PCs and is open many more hours than branches. To wit, it divides hours per week by the number of outlets–so, in the three-outlet case, the formula is $2.66 * 400 * 56.7, or $60,293.
If that seems like a big difference, consider a library with 89 outlets, open a total of 242,424 hours and with a total of 3,609 PCs: The original formula overstates the availability of PCs by close to 88 times. (Yes, there is such a library. As it happens, once the correct formula is applied, the two possible PC use benefit numbers–that is, $8 times reported PC uses and $2.66 times PCs times average hours–are very similar for that library.)
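To make the correction concrete, here’s the before-and-after for the three-outlet example in quick Python form (the 56.7 is just 170 hours divided by 3 outlets):

```python
# Three-outlet example: 200 PCs at Central (open 70 hrs/week),
# 100 PCs at each of two branches (each open 50 hrs/week).
pcs = 200 + 100 + 100      # 400 PCs total
hours = 70 + 50 + 50       # 170 open hours/week, summed across outlets
outlets = 3

# Original formula: wrongly treats all 400 PCs as available 170 hrs/week.
old = 2.66 * pcs * hours
print(f"old: ${old:,.0f}")   # $180,880

# Corrected formula: average open hours per outlet.
new = 2.66 * pcs * (hours / outlets)
print(f"new: ${new:,.0f}")   # $60,293
```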
This change reduces the median ratio to 5.00 for 9,102 libraries or 4.88 for the 8,535 libraries included in most of the analysis (excluding 415 “libraries” with less than one quarter of a librarian, 152 libraries with less than $5 per capita funding, and 27 libraries with more than $300 per capita funding). Notably, the correlation between per capita expenditures and per capita benefits is now 0.63, which is a very strong correlation–stronger than it was with the erroneous calculation. It also means that circulation is The Big Number, representing 71% of direct benefits.
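For the record, those exclusions amount to a filter along these lines–again a sketch with stand-in field names, not the actual IMLS element codes:

```python
# Stand-in field names; the real filters run against the IMLS table.
def included(lib):
    per_capita = lib["expenditures"] / lib["population"]
    return (lib["fte_librarians"] >= 0.25   # drops the 415 "libraries"
            and 5 <= per_capita <= 300)     # drops the 152 and the 27

# Three made-up records, just to exercise the filter.
sample = [
    {"fte_librarians": 1.0, "expenditures": 250_000, "population": 10_000},
    {"fte_librarians": 0.1, "expenditures": 40_000,  "population": 2_000},
    {"fte_librarians": 2.0, "expenditures": 9_000,   "population": 3_000},
]
kept = [lib for lib in sample if included(lib)]
print(f"{len(kept)} of {len(sample)} records kept")  # 1 of 3
```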
Does all this sound as though I’m almost certainly going ahead? That’s true, based on the whole set of comments received here and directly via email. And my thanks to all who commented. Look for an announcement, significantly before ALA Annual if all goes well–and probably in the next Cites & Insights.
Comments

It is tricky to thread the needle here. I think there is great value if you are able to discern actual value provided by libraries in a way that compares them equally. Most of the studies put out fall victim to “libraries that have more money have a higher score”. The exception, of course, is Ohio, since a great deal of their funding comes from the state. How does one value a library in a poor area? How can you come up with a fair statistic?
I’d love to see it though. I am a stats guy so I figured there must be a way. Perhaps the State Librarian staff member was venting frustration on this issue?
I looked at library value calculators right when I started as an SDC and came to the conclusion that they can be a good PR tool, but are not particularly valid. For things such as programs, values were pretty much “estimated”–plucked out of the air, seemingly. (Not to mention that there are few parallels in the private sector to a library storytime.) Even assigning values to books was problematic. Sure, you could simply use the average cost of a book, but that cost can vary widely depending on where it is purchased. In addition, it has a residual value to the purchaser–they can take it to a used bookstore and get something out of it, where they cannot with a library book. Do you then subtract that from the value of the library book? I never did come up with a good formula that I felt I could stand behind from a statistical standpoint.
I’m delighted to receive this feedback (and the email I’ve received). I won’t comment–other than to agree that validity (esp. statistical validity) is a tricky issue. Transparency, I can guarantee.
You got me to thinking, and I did let my fellow SDCs know about your request, about whether you had found all the state produced ROI sites. I had compiled them at one time, but that library has taken them down. Wisconsin has a listing here: http://dpi.wi.gov/pld/econimpact.html which seems pretty complete.
There is a massive amount of data out there (in the IMLS collection); having some of it “pre-sorted” by size groups is probably most useful to the smallest libraries, unless they are blessed with a particularly talented staff member. I will be interested to see what you decide to do.
Michael: Thanks. That link yields a few studies that I hadn’t seen. I’ll read through them.
I’ll be interested to see what I decide to do also. Right now, it’s up in the air.
I have now read (or at least skimmed) all of the links on Wisconsin’s page that I hadn’t already read, and looked at some other sources. After all this, I’ve refined the model I was planning to use. A new section of the post provides that model.