PPPPredatory Article Counts: An Investigation, Part 1

November 9th, 2015

If you read all the way through the December 2015 essay Ethics and Access 2015 (and if you didn’t, you really should!), you may remember a trio of items in The Lists! section relating to “‘Predatory’ open access: a longitudinal study of article volumes and market characteristics” (by Cenyu Shen and Bo-Christer Björk in BMC Medicine). Briefly, the two scholars took Beall’s lists, looked at 613 journals out of nearly 12,000, and concluded that “predatory” journals published 420,000 articles in 2014, a “stunning” increase from 50,000 articles in 2010—and that there were around 8,000 “active” journals that seemed to meet Jeffrey Beall’s criteria for being PPPPredatory (I’m using the short form).

I was indeed stunned by the article, because I had finished a complete survey of the Beall lists and found far fewer articles: less than half as many. I didn’t think there were anywhere near 8,000 active journals either: if “active” means “actually publishing Gold OA articles,” I’d put the number at roughly half that.

The authors admitted that the article estimate was just that—that it could be off by as much as 90,000. Of course, news reports didn’t focus on that: they focused on the Big Number.

Lars Bjørnshauge at DOAJ questioned the numbers and, in commenting on one report, quoted some of my own work. I looked at that work more carefully and concluded that a good estimate for 2014 was around 135,000 articles, or less than one-third of the Shen/Björk number—and my estimate was based on a nearly 100% actual count, not an estimate from around 6% of the journals.

As you may also remember, Björk dismissed these full-survey numbers with this statement:

“Our research has been carefully done using standard scientific techniques and has been peer reviewed by three substance editors and a statistical editor. We have no wish to engage in a possibly heated discussion within the OA community, particularly around the controversial subject of Beall’s list. Others are free to comment on our article and publish alternative results, we have explained our methods and reasoning quite carefully in the article itself and leave it there.”

I found that response unsatisfying (and find that I’ll approach Björk’s work with a much more jaundiced eye in the future). As I expected, the small-sample report continued (continues?) to get wider publicity, while my near-complete survey got very little.

The situation continued to bother me, because I don’t doubt that the authors did follow appropriate methodology and wonder how the results could be so wrong. How could they come up with more than twice as many active OA PPPPredatory journals and more than three times as many articles?

So I thought I’d look at my own work a little more, to see whether sampling could account for the wild deviation.

First Attempt: The Trimmed List

I began by taking my own copy of Crawford, Walt (2015): Open Access Journals 2014, Beall-list (not in DOAJ) subset. figshare. The keys on each row of that 6,948-row spreadsheet are designed to be random. The spreadsheet includes not only the active Gold OA journals but also 3,673 others, to wit:

  • 2,045 that had not published any articles between 2011 and 2014, including eight that had explicitly ceased.
  • 183 that were hybrid journals, not gold OA.
  • 413 that weren’t really OA by my standards.
  • 279 that were difficult to count (more on those later).
  • 753 that were either unreachable or wholly unworkable.

There were two additional exclusions: I deleted around 1,100 journals (at least 300 of them empty) from publishers that wouldn’t provide hyperlinked lists of their journal titles, and I deleted journals that are in DOAJ because there were even more reasons than usual to doubt the PPPPredatory label. (Note that the biggest group of that double-listed category, MDPI, has more recently been removed from Beall’s list.)

I wound up with 3,275 active gold OA journals, what I’ll call “secondary OA journals,” since I think of the DOAJ members as “serious OA journals” and don’t have a good alternative term.

As I started reworking the numbers, I thought there should be some accounting for the opaque publishers and journals. In practice, I knew from some extended sampling that most journals from opaque publishers were either empty or very small—and my sense is that most opaque journals (usually opaque because there are no online tables of contents, only downloadable PDF issues, but sometimes because there really aren’t streams of articles as such) are also fairly small. But still, they should be included. Since these two groups (excluding the 300-odd journals from opaque publishers that I knew were empty) added up to 32% of the count of active journals, I multiplied article and revenue counts by 1.32. (I think this is too high, but feel it’s better to err on the side that will get closer to the Shen/Björk numbers.)

I did not factor in the DOAJ-included numbers, but the total of those and other already-counted additional articles (doubling 2014 since I only counted January-June) is around 43,000 for 2014; around 39,000 for 2013; around 37,000 for 2012; and around 28,000 for 2011. You can add them to the counts below if you wish—although I don’t believe these represent questionable articles.


Since 613 was the sample size in the Shen/Björk article, I took a similar size sample as a starting point, then adjusted it so I could take five samples that would, among them, include everything: that is, a sample size of 655 journals.

For each sample (sorting by the pseudorandom key, then starting from the beginning and working my way down), I took the article count for each year, multiplying by appropriate factors, and the revenue counts for 2013 and 2014 (determined by multiplying the 2014 APC by the annual article counts, then applying the appropriate multipliers—I didn’t go back before 2013 because APCs were too likely to have changed). I calculated average APC per article for 2014 and 2013 by straight division—and also calculated the average article count (not including zero-count journals because the cells were blank rather than zero) and median article count (also excluding zero-count journals). I also calculated standard deviation just for amusement.
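The per-sample projection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every function and constant name is hypothetical, and the factor of five (each sample being one fifth of the list) is inferred from the sample design rather than given as a formula in the text.

```python
# Hypothetical sketch of the per-sample projection; names are illustrative only.
OPAQUE_MULTIPLIER = 1.32  # allowance for opaque publishers and journals
NUM_SAMPLES = 5           # each sample covers one fifth of the 3,275 journals (assumed scaling)
HALF_YEAR_FACTOR = 2      # 2014 counts cover January-June, so they are doubled


def scale_factor(year):
    """Combined multiplier applied to a raw sample sum for the given year."""
    factor = NUM_SAMPLES * OPAQUE_MULTIPLIER
    if year == 2014:
        factor *= HALF_YEAR_FACTOR
    return factor


def project_articles(counts, year):
    """Project a full-list article total from one sample's counts.

    Blank cells (None) stand for journals with no articles that year.
    """
    counted = sum(c for c in counts if c is not None)
    return round(counted * scale_factor(year))


def project_revenue(rows, year):
    """Project maximum revenue: 2014 APC times article count, then scaled.

    rows is a list of (apc_2014, article_count) pairs; blanks are None.
    """
    raw = sum(apc * n for apc, n in rows if apc is not None and n is not None)
    return round(raw * scale_factor(year))


def mean_nonzero(counts):
    """Average article count over journals that actually had articles."""
    present = [c for c in counts if c]
    return sum(present) / len(present)
```

Skipping blank cells in `mean_nonzero` mirrors the spreadsheet behavior mentioned above, where zero-count journals are left out of the average because their cells are empty.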

“Zero-count journals? Didn’t you eliminate zero-count journals?” I eliminated journals that had no articles in any year 2011-2014, but quite a few journals have articles in some years and not in others—including, of course, newish journals. For example, there were only 2,393 journals with articles in the first half of 2014; 2,714 in 2013; 1,557 in 2012 and 996 in 2011.

I also calculated the same figures for the full set.

Looking at the results, I was a little startled by the wide range, given that these samples were 20% of the whole: the 2014 projected article totals (doubling actual article counts, of course) ranged from 5,755 to 180,299! Now, of course, even that highest count is still much less than half of the Shen/Björk count, and just a bit over half if you add in the DOAJ-listed count.

So I added another column and assigned a random number to each row, using Excel’s RAND function, then froze the results and took a new set of five samples. The results were much narrower in range: 99,713 to 136,660. The actual total: 121,311 (including the 1.32 multiplier but not DOAJ numbers).
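The reshuffle can be mimicked outside Excel. What follows is a rough Python analogue, not the spreadsheet procedure itself: `random.Random` stands in for the RAND function, the journal identifiers are placeholders, and the function name is hypothetical.

```python
import random


def five_random_samples(journal_ids, seed=None):
    """Assign each journal a random key, sort by it, and cut five equal blocks."""
    rng = random.Random(seed)  # a fixed seed plays the role of freezing the RAND column
    keyed = sorted((rng.random(), jid) for jid in journal_ids)
    ordered = [jid for _, jid in keyed]
    # Assumes the count divides evenly by five, as 3,275 does (655 per sample).
    size = len(ordered) // 5
    return [ordered[i * size:(i + 1) * size] for i in range(5)]


# Placeholder identifiers 0..3274 stand in for the 3,275 active journals.
samples = five_random_samples(range(3275), seed=2015)
```

Cutting consecutive blocks from the sorted list follows the "start from the beginning and work down" approach, so the five samples never overlap and together include every journal.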

Table 1 shows the projected (or actual) article totals year-by-year and sample-by-sample, sorted so the lowest 2014 projection appears first. Note that samples 1-5 use the assigned pseudorandom keys, while samples 6-10 use Excel’s RAND function for randomization. Clearly, the latter yields more plausible results.

Sample     2014     2013    2012    2011
4         5,755   21,734  15,959  10,223
5        91,067   85,734  66,594  51,473
8        99,713   84,797  55,209  33,733
7       115,368   91,964  57,664  27,595
Total   121,311   99,994  64,325  34,543
6       123,050  104,808  57,295  22,605
9       131,762  106,181  82,790  53,869
10      136,660  112,220  68,666  34,914
3       159,284  121,097  75,933  27,628
1       170,148  138,890  87,371  56,027
2       180,299  132,515  75,768  27,364

Table 1. Estimated article counts by year
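One quick sanity check on Table 1: each set of five samples covers every journal exactly once, so if each projection is simply the sample count scaled up by five, the five projections in a set should average out to the Total row. A minimal sketch of that check, with the figures copied from the table and all names hypothetical:

```python
# (2014, 2013, 2012, 2011) projections copied from Table 1, keyed by sample number.
TABLE_1 = {
    1: (170148, 138890, 87371, 56027),
    2: (180299, 132515, 75768, 27364),
    3: (159284, 121097, 75933, 27628),
    4: (5755, 21734, 15959, 10223),
    5: (91067, 85734, 66594, 51473),
    6: (123050, 104808, 57295, 22605),
    7: (115368, 91964, 57664, 27595),
    8: (99713, 84797, 55209, 33733),
    9: (131762, 106181, 82790, 53869),
    10: (136660, 112220, 68666, 34914),
}
TOTAL = (121311, 99994, 64325, 34543)  # the Total row


def set_means(sample_ids):
    """Average each year's projections over one set of samples."""
    columns = zip(*(TABLE_1[s] for s in sample_ids))
    return tuple(round(sum(col) / len(sample_ids)) for col in columns)


keyed_means = set_means(range(1, 6))   # samples 1-5: assigned pseudorandom keys
rand_means = set_means(range(6, 11))   # samples 6-10: RAND-based keys
```

Both sets of means match the Total row for every year, so the wild spread in samples 1-5 reflects how the journals were split up, not a different underlying count.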

Adding the 43,000-odd articles from DOAJ-listed journals would bring these totals (ignoring samples 1-5) to around 143,000 to around 180,000 articles, with the most likely value around 165,000 articles: more than one-third of the Shen/Björk estimate but a lot less than half.

Note that “120,000 plus or minus 25,000” as an estimate actually covers all five samples that used the RAND-function randomization. Figure 1 shows the same data as Table 1, but in graphic form.

Figure 1. Estimated article counts by year

How much revenue might those articles have brought in, and what’s the APC per article? Keeping the order of samples the same as for Table 1 and Figure 1, Table 2 and Figure 2 show the maximum revenue (not allowing for waivers and discounts).

Sample         2014         2013
4        $2,952,893  $10,473,269
5        $1,677,496   $3,322,988
8       $30,184,480  $23,906,771
7       $35,939,416  $35,825,909
Total   $31,863,087  $28,537,554
6       $31,010,206  $27,926,897
9       $31,165,754  $29,071,218
10      $31,015,578  $25,956,975
3       $82,610,167  $65,930,614
1       $34,247,360  $32,892,328
2       $37,827,517  $30,068,570

Table 2. Estimated maximum revenue, 2014 and 2013

This time there are two extremely low figures and one extremely high figure, with samples 6 through 10 all within $4.1 million of the actual maximum figure for 2014 (for 2013, the deviation is as much as $7.3 million). Compare the $31.86 million calculated costs here with the $74 million estimated by Shen/Björk: the full-survey number is less than half as much.
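The deviation figures can be reproduced directly from Table 2. The sketch below uses hypothetical names and only the numbers that appear in the table and the paragraph above.

```python
# Maximum-revenue figures (2014, 2013) for the RAND-based samples, from Table 2.
RAND_SAMPLES = {
    6: (31010206, 27926897),
    7: (35939416, 35825909),
    8: (30184480, 23906771),
    9: (31165754, 29071218),
    10: (31015578, 25956975),
}
FULL_SURVEY = (31863087, 28537554)  # the Total row
SHEN_BJORK_2014 = 74_000_000        # estimate quoted in the text


def max_deviation(column):
    """Largest absolute gap between any RAND-based sample and the full survey."""
    return max(abs(row[column] - FULL_SURVEY[column]) for row in RAND_SAMPLES.values())


dev_2014 = max_deviation(0)   # 4,076,329, i.e. about $4.1 million
dev_2013 = max_deviation(1)   # 7,288,355, i.e. about $7.3 million
share_of_estimate = FULL_SURVEY[0] / SHEN_BJORK_2014  # about 0.43
```

In both years the largest gap comes from sample 7.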

Figure 2 shows the same information in graphical form.

Figure 2. Estimated maximum revenue, 2014 and 2013

Looking at APC per article, we run into an anomaly: where the Shen/Björk estimate is $178 for 2014, the calculated average for the full survey is considerably higher, $262.66. The range of the ten samples is from a low of $18.42 to a high of $513.08, but the five “good” samples range from $226.95 to $302.71, a reasonably narrow range.

Finally, consider the mean (average) number of articles per journal in 2014, in journals that had articles. The Shen/Björk figure is around 50; my survey yields 36.8. In fact, I show only 327 journals with at least 25 articles in the first half of 2014 (and only 267 with at least 50 articles in all of 2013).

The median is even lower—12 articles, or six in the first half—and that’s not too surprising. The standard deviation in most years was at least twice the average: as usual, these journals are very heterogeneous. How heterogeneous? In the first half of 2014, three journals had more than 1,000 articles each (but fewer than 1,300); six more had at least 500 articles; 16 had 250 to 499 articles—but at the same time, only 819 of the total had at least 11 articles in the first half of 2014, and only 1,544 had at least five articles in those six months.


I could find no way to get from these samples to the Shen/Björk figures. Not even close. They show too many active journals by roughly a factor of two, too many articles by a factor of close to three, and too much revenue by a factor of two—and too many articles per journal as well.

[Part 1 of 2 or 3…]

Note: This and following posts will also appear, probably in somewhat revised form, in the January 2016 issue of Cites & Insights.

Gunslinger Classics Disc 12

November 7th, 2015

As usual for these 12-disc fifty-movie sets, one disc has six short movies: this one. These are all oaters, B-movie programmers of an hour or less, mostly low-budget short-plot flicks. Four with John Wayne; one each with Bob Steele and Crash Corrigan.

Texas Terror, 1935, b&w. Robert N. Bradbury (dir. & screenplay), John Wayne, Lucile Browne, LeRoy Mason, Fern Emmett, George Hayes. 0:51.

Wayne’s the newly-elected sheriff. The man who pretty much raised him comes by the office, shows the wad of cash he’s withdrawn from Wells Fargo to restock his ranch now that his daughter’s coming home in a few months, notes that he’d tied his horse up behind Wells Fargo, and rides off. Almost immediately thereafter, three gunmen rob Wells Fargo; in chasing them, Wayne winds up in a shootout with results that make him believe (a) that he, Wayne, shot the old man (we know it was one of the gunmen) and (b) that the old man might have been one of the bandits, since they dumped the money bag and one wad of bills on his corpse. After the town (jury?) concludes that the old man had to have been a bandit (after all, people saw him tie up his horse behind Wells Fargo), Wayne resigns his position, turning it back over to the old sheriff (George Hayes, not in the Gabby persona). Wayne goes off, grows a beard, and becomes…well, that’s not clear.

Lots’o’plot, much of it involving the daughter, and most of it makes just as much sense as the idea that Wayne wouldn’t mention during the court hearing that the old man had told him his horse was tied up where it was. But hey, if you like lots of riding, some shooting, and a band of friendly Indians saving the day, I guess it’s OK. Generously, $0.75.

Wildfire, 1945, color. Robert Tansey (dir.), Bob Steele, Sterling Holloway, John Miljan, Eddie Dean. 0:59

An unusual entry: late (1945) and in color, but still a one-hour flick with lots of riding, lots of shooting, a couple of good fights—and a singing cowboy (actually sheriff in this case, Eddie Dean) who gets the girl. The plot, not in the order it unfolds: a gang is rustling all the horses from ranches in one valley and blaming it on Wildfire, a wild stallion—and it turns out horse theft is a sideline: the motivation is for one gang member to buy up the ranches cheap, since he already has a contract to sell them to a big ranch for a big profit. Two itinerant horse-traders with a tendency to stay on the right side of the law wind up in the middle of this and expose it.

The color’s a little faded, but the whole thing’s good enough that I’d probably give it six bits—except for one thing: however they “digitized” this, at several points it looks like a projector losing its grip on film sprockets, losing chunks of the action and disrupting continuity. With that, it goes down to $0.50.

Paradise Canyon, 1935, b&w. Carl Pierson (dir.), John Wayne, Marion Burns, Reed Howes, Earle Hodgins, Gino Corrado, Yakima Canutt. 0:53.

John Wayne again, this time as a government agent sent to investigate counterfeit traffic that may be connected to a medicine show. (One person went to jail for ten years for counterfeiting, and may be running such a show.) He finds the show—which has a habit of leaving towns suddenly, either for not paying debts or because the proprietor tends to drink his own tonic, go to town, bust things up and not pay for them (his tonic is “90% alcohol,” which is 180 proof and should make it flammable). For that matter, he helps the show evade arrest by getting them across the Arizona/New Mexico border just ahead of the law, and joins the show as a sharpshooter.

The next town is a New Mexico/Mexico border town, and it turns out the medicine show’s not really involved any more: instead, the counterfeiter, who framed the medicine man, is now operating out of a saloon on the Mexican side. One thing leads to another with lots of riding, lots of shooting and some true sharpshooting, and of course both the good guys winning and John Wayne getting the girl, with a mildly cute surprise ending.

The highlight is probably the medicine man’s pitch, a truly loopy piece of speechifying, including his assurance that he once knew a man without a tooth in his head…and that man became the best bass drum player he ever knew! All it takes is determination, and Doc Carter’s Famous Indian Remedy.

Not great, not terrible. Once again we have Yakima Canutt doing something more than trick riding: he’s the villain in the piece. (Wayne does not sing; the two singing entertainers in the medicine show are…well, that’s six minutes I’ll never get back again.) I’ll give it $0.75.

The Lucky Texan, 1934, b&w. Robert N. Bradbury (dir. & writer), John Wayne, Barbara Sheldon, Lloyd Whitlock, George Hayes, Yakima Canutt. 0:55.

This time, John Wayne’s Jerry Mason just out of college and returned to the ranch of old geezer Jake Benson, who more or less brought him up—and finds that the ranch’s cattle have all been rustled, but Benson’s opening up a blacksmith shop in town. Wayne immediately starts working there, and an early customer’s horse had picked up a stone—a stone that, when Wayne looks at it, seems to have gold in it. (It must have been a thriving smithy, since the geezer refuses payment for dealing with the horse’s problem…) Oh, and Benson’s pretty young granddaughter’s about to finish college (thanks in part to the geezer’s monthly checks) and returning soon.

One thing leads to another, and we have Wayne and Benson (not a TV series, but it could be) getting really good pure gold out of the site where they figured the horse had been; when they go to sell it, the assayer pays them…and then notes to his sidekick that he now “owned” most of Benson’s cattle.

More plot; the villains trick the geezer into signing a deed to the ranch; the sheriff’s son shoots the banker in a holdup just after Benson pays off the loan for the blacksmith shop (and Benson seems like a likely culprit until John Wayne Saves the Day)…and more. As always, it all works out in the end, which involves the usual Wayne-and-the-girl wedding. No singing; lots of fist fights (with no phony sounds—lots of grunting, but not much more); oddly enough, although two men are shot (and two others are shot at), there’s not a single death in the movie. There is, on the other hand, Wayne surfing down a sluice riding on a tree branch—and a chase scene involving Hayes semi-driving a car (he’d never driven before) and the villains on a powered railway car, in an almost slapsticky sequence. (That long chase is also the only time in an old Western I’ve ever seen The Hero, Wayne in this case, jump from his horse to tackle the villain on his horse…and miss, tumbling down a hill.)

George Hayes gets to show his dramatic abilities pretending to be his sister (you’d have to see it; he’d played the lead in Charley’s Aunt many years before, and does a good job in drag), and although he now has Gabby Hayes’ intonation and look, he’s not playing the fool by any means, and not even the sidekick: after all, it’s his ranch and his blacksmith shop. Another one with Yakima Canutt doing more than stunt riding (although he did plenty of that, apparently chasing himself at one point), once again playing a bad guy (something he was very good at). (I would note that many of the reviews at IMDB call George Hayes “Gabby” or “Gaby” Hayes, but he didn’t become Gabby Hayes until later on in his career.)

Maybe I’m getting soft as I near the end of this marathon, but this one seemed pretty good; I’ll give it $1.

Riders of the Whistling Skull, 1937, b&w. Mack V. Wright (dir.), Robert Livingston, Ray Corrigan, Max Terhune, Mary Russell, Roger Williams, Yakima Canutt, Fern Emmett, Chief Thundercloud. 0:58 [0:53]

A few archaeologists and a trio of cowboys known as The Three Mesquiteers are out to plunder a lost Indian city, or as they put it, rediscover it and recover all the golden treasure. A bunch of Native Americans don’t like this idea, and attempt to discourage them. One half-Native American, who passes himself off as one of the party, had previously kidnapped the father of the beautiful young (female) anthropologist and has been torturing him to reveal the location of the treasure.

Of course, this being a B Western from the 1930s, the plunderers are the heroes and it’s a great thing that they manage to shoot at least half a dozen Native Americans and bury more of them under a wildly implausible collapse of half a mountain. Naturally, it all ends “well,” with the most handsome of the Mesquiteers getting the girl and an older and plainer woman (another sort-of archaeologist) getting the less handsome of the Mesquiteers. (In this one, Yakima Canutt plays the American Indian guide who’s in cahoots with the half-Native American.)

Reasonably well staged and with continuous action, but it’s also blatantly offensive. If you can ignore that, maybe $0.75.

Randy Rides Alone, 1934, b&w. Harry L. Fraser (dir.), John Wayne, Alberta Vaughn, George Hayes, Yakima Canutt, Earl Dwire. 0:53.

This cowboy riding along tops a ridge and spots the roof of a building—a halfway house saloon. He hears the honky-tonk piano and goes in…only to discover that everybody’s dead and the piano is a player piano. As he looks over the situation, including an open safe, the sheriff and his posse show up…and, naturally enough, arrest the cowboy. But we saw eyes moving in a painting on the wall…and after they’ve gone, a young woman steps out and inspects the scene.

Thus begins a story involving a hearing mute who runs a local store, the young woman breaking the cowboy out of jail so he can find the real killers, a gang hideaway for a gang run by…oh, let’s not give it all away. Lots of riding, a fistfight or two, some shooting, and of course all ends well. This time, George Hayes (not at all in the “Gabby” persona) plays the lead villain (and the—spoiler—mute shopkeeper) and Yakima Canutt plays the chief henchman.

The flick seems padded at 53 minutes, and Wayne is notable mostly for his young good looks. Generously, $0.75.

Double digits!

November 6th, 2015

I am delighted to say that The Gold OA Landscape 2011-2014 is now in the double digits, with two Ingram paperback sales and one Amazon paperback sale reported. (I’m guessing that I only see Ingram and Amazon numbers once a month. In terms of progress toward $ goals, three Ingram/Amazon sales equal 1.3 Lulu sales, but I’m nonetheless delighted to see them.)

The balance still heavily favors print: ten paperback copies, two PDF site-licensed ebooks (http://www.lulu.com/shop/walt-crawford/the-gold-oa-landscape-2011-2014/ebook/product-22353903.html). (The ebooks are only available through Lulu because the global marketing channel will only accept ePub ebooks. Don’t ask me.)

Added a bit later: And thanks to worldcat.org, I see that five universities have the book–and that it’s available from Barnes & Noble as well. I think Ingram, B&N and Amazon are the totality of Lulu’s global marketing arrangements…

Linguistics, OA, $430 and $1,400–and a bit about The Gold OA Landscape 2011-2014

November 5th, 2015

I thought it might be interesting to glance at some existing gold OA journals at least partly devoted to linguistics in light of editorial goings-on at a notable subscription “hybrid” journal in the field.

This is a very incomplete group: it’s only journals I’d grouped into Language & Literature and that showed “linguis” somewhere within the DOAJ record (usually in the subject or keyword fields). That omits journals partly devoted to linguistics that fell into any number of other primary subject areas such as anthropology. But it’s a start…

The Basic Numbers

This group of journals consists of 275 journals (including only those graded “A” and “B” in The Gold OA Landscape 2011-2014). The journals published 5,954 articles in 2011; 6,725 in 2012; 6,973 in 2013; and a slight drop to 6,415 in 2014.

Article Processing Charges

Twelve of the 275 journals have article processing charges; the remaining 263 are funded through other means.

Those twelve journals did publish more articles per journal than the others: in total, 1,007 in 2011; 1,298 in 2012; 1,418 in 2013; and 1,493 in 2014.

APCs range from $37 to $600, but only one journal charged more than $400 and only three charged more than $300. (The only fee-charging journal with more than 200 articles in 2014 charged $40.)

The maximum paid for APCs in the twelve fee-charging journals in 2014 was $364,146; that comes out to a weighted average of $244 per article in those journals. (Spread across all 6,415 articles in the 275 journals, the average is $56.76.)
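For anyone who wants to check the arithmetic, both averages follow from the counts given earlier in this post. The variable names below are hypothetical; the figures are the ones quoted above.

```python
MAX_APC_REVENUE_2014 = 364146   # dollars, twelve fee-charging journals
ARTICLES_FEE_JOURNALS = 1493    # 2014 articles in those twelve journals
ARTICLES_ALL_JOURNALS = 6415    # 2014 articles in all 275 journals

# Average cost per article within the fee-charging journals only.
per_article_fee_journals = MAX_APC_REVENUE_2014 / ARTICLES_FEE_JOURNALS

# Average cost per article when the same revenue is spread over every journal.
per_article_all_journals = MAX_APC_REVENUE_2014 / ARTICLES_ALL_JOURNALS
```

The first division rounds to $244. The second rounds to $56.76, which indicates that the lower figure is the average over all 6,415 articles in the group, not just the articles in the fee-charging journals.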

Grades and Fees

Of the 263 no-fee journals, 250 don’t have any obvious problems. Of the thirteen graded B, two have problematic English; three have garish sites or other site problems; one features a questionable impact factor; six have minimal information; one had other issues.

Of the dozen fee-charging journals, seven don’t have obvious problems. Of the five graded B (obviously a much higher percentage than for no-fee journals), one has a questionable impact factor and four make questionable claims–actually, the same questionable claim in all four cases: they claim to be Canadian but show no indication of significant Canadian editorial involvement.

Anyway…that’s a little information about a few existing gold OA journals that are at least partially devoted to linguistics.

The Gold OA Landscape 2011-2014: Language and Literature

Just a few notes in addition to what’s in the excerpted version–hoping this might encourage a few people and libraries to buy the paperback or site-licensed PDF, or find ways to help me continue this research.

  • Most journals in this field are small, even by the standards of humanities and social sciences: 350 published 18 articles or fewer in 2014, as compared to 91 with 19 to 30 articles, 51 with 31 to 50 articles, 24 with 51 to 120 articles…and eight journals with more than 120 articles in 2014. (Seven of those eight journals charge APCs–but the one that doesn’t published one-quarter of all the articles in the big eight journals.)
  • Journals in 55 countries published articles in 2014. Only one country–Brazil–accounted for 1,000 or more articles. The United States and Canada followed (with more than 900 articles each–although that includes the Canadian journals that aren’t very Canadian). Spain was the only other country with more than 660 articles.

As always, there’s more in the book.

Quick status report, as of this morning (November 5, 2015):

  • The Cites & Insights issue has been downloaded at least 2,306 times
  • Seven copies of the book have been purchased, in addition to my own copy: Six paperback, one PDF ebook. That’s one copy for every three hundred downloads. [Note added November 6, 2015: PDF ebook sales have now doubled–another copy was purchased. Total sales are still single-digit, but it’s progress.]




Why I’m not joining AAAS (a silly little post)

November 4th, 2015

Once in a while–maybe twice a year, and only since we moved to Livermore–I get a shrink-wrapped copy of Science that’s perhaps a month old, with an envelope enclosed inviting me to join AAAS for the super-low introductory price of $99. (Note that “join AAAS” is pretty much synonymous with “subscribe to Science,” and the discount seems to be honoring my nonexistent status as a scientist.)

Wonder why this has only happened since we moved to Livermore? I’m sure it has nothing to do with being in a small city of 85,000 people that includes two major labs–Lawrence Livermore and Sandia–employing more than 10,000 scientists and support staff between them. Maybe it’s purely coincidental.

Anyway, it happened again this week. After looking at the offer, I recycled it…and kept the magazine to read. (You can call Science a journal if you wish; to me, it comes off as a serious science-oriented magazine that happens to include a few peer-reviewed papers.)

I recycle the offers for two reasons:

  • It offends me that I’m offered Science for $99, with a renewal price that wouldn’t be higher than $153 (and probably lower), while if my library wants to subscribe to the print edition, it will cost them $1,282. I don’t know of very many magazines with the effrontery to charge a library nearly nine times as much for a print magazine as they charge an individual, although for scholarly journals that may be typical. Or not.
  • The less serious reason: I love magazines. I love books. I love some TV and movies. I love doing stuff on the computer. If I took Science with its weekly schedule and fairly meaty content, I’d have to stop taking at least half of the other magazines I read or give up on books altogether. Not gonna happen. (If anyone wonders why I don’t subscribe to The New Yorker, just reread this bullet. Also one reason I didn’t renew The Economist, although in that case going from free-for-airline miles to $100 or so made the decision easy.)

No deeper message. Just a quick note.


Cites & Insights 15 now available in paperback

November 3rd, 2015

Cites & Insights 15: 2015 is now available as a 354-page 8.5″x11″ paperback, combining all eleven issues plus indices (exclusive to the book).

As usual, the price is $45.00 (of which roughly half goes to support Cites & Insights).

This year is especially strong on open access (including the most complete survey ever done of gold OA activity) but also includes major essays on the Google Book Project, books, social networks, fair use and more.

(If you buy it today or tomorrow, you can get free shipping by using coupon code USMAIL11–capitals do count and the last two characters are ones. The coupon code is good through November 4, 2015.)

As close as I’ll get to NaNoWriMo

November 2nd, 2015

Or, as I like to think of it, the Misspelled Robin Williams Memorial process…

Anyway, you could think of the December 2015 Cites & Insights as my NaNoWriMo with just tiny little deviations. After all, it is novel-length (as defined by the Science Fiction & Fantasy Writers of America, SFWA, as far as I know the only list of lengths for this sort of thing: if it’s over 40,000 words, it’s a novel), and it’s appearing in November.

The only little tiny deviations from NaNooNaNooNoWriMo:

  • The issue isn’t quite 50,000 words long–it’s 48,012.
  • It’s nonfiction.
  • Although it appears in November, I wrote it in October, and OctNonWriMo doesn’t exist. Yet.
  • A large portion of it isn’t my writing, it’s excerpts from other writing. (How large a portion? To my surprise, apparently less than half–deleting every quoted paragraph that’s not quoting me brings the word count down to 26,851 words.)

But hey, other than those four tiny quibbles…

In any case, it’s as close as I’m ever likely to get to NaNoWriMo.

Cites & Insights 15:11 (December 2015) available

November 2nd, 2015

The December 2015 issue of Cites & Insights (15:11) is now available for downloading at http://citesandinsights.info/civ15i11.pdf

This issue is 58 pages long. If you plan to read it online or on an ereader (ebook, tablet, whatever), you may prefer the single-column 6″ x 9″ edition, 111 pages long, at http://citesandinsights.info/civ15i11on.pdf

This issue contains one essay:

Intersections: Ethics and Access 2015  pp. 1-58

No weird old tricks for reducing belly fat, but 102 items worth reading in a baker’s dozen of subtopics related to ethics and access (open and otherwise)–and #25 may astonish you! Or not.

No, it’s really not a listicle–otherwise I’d have to find 102 ads and free (or plagiarized) illustrations. It’s a bigger-than-usual roundup, with just a little humor (and a few exclamation points–and one interrobang).


Gold OA: the basis for going on (2 of 2)

October 27th, 2015

I’ll keep this one relatively short, as it’s about more direct appreciation of the gold OA research: namely, money. I’ve already responded to two people who might, conceivably, have money available for this research (neither one even suggested that it could happen), giving the amount I’d want–so I might as well be up-front and provide the options here.

1. The Donations + Purchases Route: Milestones

  • $1,500 total: the 2011-2014 spreadsheet, anonymized slightly, goes up on figshare.
  • $2,500 total: I give serious thought to renewing the project for 2015 data, using DOAJ’s journal list as of the first week of 2016.
  • $5,000 total: I’d definitely do the 2011-2015 version and make the spreadsheet available on figshare.

That total includes donations to Cites & Insights since the 2011-2014 project was announced and net proceeds from sales of all of my self-published books since September 1, 2015 (and, for that matter, the honorarium portion of expenses-paid speaking engagements related to this work, but I’m not holding my breath for any of those).

As previously noted, through right now, we’re more than one-third of the way but less than halfway to the first milestone.

(If the second milestone isn’t reached by April 2016, I don’t think this would happen–I’d have moved on to other things by then.)

2. Direct Grant Funding or Consulting Contract: Annual Costs

This is the set of numbers I sent back to two interested parties. It would cover another round of research, including rechecking APC status and amount for all listed journals, tweaking the grading criteria slightly, writing up the research, and making the anonymized spreadsheet available on figshare and the PDF version of the results available for free. (The paperback version would be priced at very close to production costs, quite probably less than $10.)

My price would be, at minimum, $0.50 per journal in DOAJ in the first week of 2016, plus $1,000 for the analysis/writeup phase. Right now, that would come to about $6,332.
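As a rough check on that figure, the pricing formula can be written out. This is only a sketch: the function name is mine, and the journal count of 10,664 is not a number stated here but one back-calculated from the $6,332 total, so treat it as an assumption.

```python
def grant_price(journal_count, per_journal=0.50, writeup_fee=1000.0):
    """Minimum price: a per-journal rate plus a flat analysis/writeup fee."""
    return journal_count * per_journal + writeup_fee

# Assumed DOAJ journal count, back-calculated from the $6,332 total above.
assumed_doaj_journals = 10_664

print(f"${grant_price(assumed_doaj_journals):,.0f}")
```

The actual price would depend on whatever DOAJ's count turns out to be in the first week of 2016.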

I’d be delighted to discuss this with any possible agency or agencies (actually, there’s one exception–not the one in Ohio–but I don’t think that’s likely to be an issue). If the money was secure before 2016, I could do some of the APC/site rechecking before 2016. If more discussion and tweaks are desired, the price might be higher.

Obviously, the sponsor(s) would or could have their names on the results or could even handle distribution.

3. Part-time Consulting Research

I believe this project will require at least 500 to 600 hours to do properly, so if somebody wanted to hire me as a quarter-time consulting researcher to carry on this project (for one or more years), I’d certainly consider it. (I’m assuming that nobody hiring a consultant or researcher in California pays less than $26,000/year, esp. since California minimum wage is likely to be $30,000 before too long.)

Obviously, I’d expect to discuss possible expansions and tweaks, and the agency could release the report under its name, with me credited somewhere.

Oh, one more thing:

4. Redoing the Beall’s Lists Investigation

That would cost a lot of money because it’s neither interesting nor fun nor, I believe, especially useful. If someone was determined, I’d consider it for $1 per journal within Beall’s lists plus $2,000 for analysis and writeup–that is, a minimum of $13,000 (and going up all the time!). But I’d probably turn it down even then: life really is too short.

[Oh, by the way: if you’re interested in funding this research, contact me at waltcrawford@gmail.com]

Gold OA: The basis for going on (1 of 2)

October 27th, 2015

At this point–seven weeks after The Gold OA Landscape 2011-2014 was published–it seems like a good time to discuss the issues surrounding possible continuation of this full-survey research for another year (that is, covering 2015, done in 2016).

Part 2 will deal with finances: what it would take to make it happen.

This part deals with a related question: Since I’m not depending on this revenue to keep meals on the table or a roof over our heads, why do I need any revenue for it at all?

[No, nobody’s said that quite so flatly. Still: every time somebody says “there’s something wrong with charging for a writeup about open access or the research it took to do that writeup, because OA’s supposed to be free,” or something of the sort–which has happened every time I or ALA (or MIT) has published something on OA that carries a price–once I calm down, I turn it into the question above.]

Turns out, this is a philosophical question of sorts: Namely, what motivates me to do anything (other than lie around the house, do some housework, read books, watch TV, go for walks and like that)?

That question’s been clarified in my own mind over the years since it’s become clear that Cites & Insights itself is unlikely to attract significant contributions (the total has never reached the high three figures in a year, much less four figures). Here’s how I’ve worked it out in my own head, although I’m sure it’s an incomplete model.

I see four factors: Fun, Interest, Worth/Usefulness/Effectiveness, and Appreciation. Two are internal, two external.


Fun

I do some essays in Cites & Insights because they’re fun or amusing to me. Certainly true of The Back, The Middle, most Media essays (esp. old movies). That’s part of why I started looking at liblogging, library blogging and library slogans (and, for that matter, library use of social media): it was fun.

“Fun” and “interesting” can overlap in slightly unpredictable ways. It was, initially, fun to unveil the realities behind Beall’s lists, and in some ways it’s been fun to see how well Chrome/Google does or does not translate non-English journal websites (and to appreciate some of the blank verse generated by some translations).


Interest

I have lots of interests, and I’ll pursue an interest to what might possibly be considered extremes–I’m a completist in some areas. It has certainly been interesting to examine the Gold OA landscape in detail, and once I got well into it I realized that I wanted to see it through.

Interest certainly explains some ongoing features in Cites & Insights. I don’t find copyright discussions particularly amusing, but they’re interesting, just as one example.

But I have lots of interests, and could readily cultivate more. And time eventually does become a limiting factor. At this point, I don’t expect to live for more than 30 years or so–possibly quite a bit less, probably not much more. (For a long time, I’d pegged 93 as my desirable stopping point; I’ve moved that to 98–which gives me 28 more years–as long as I’m in good mental and reasonable physical health. I have no desire to live to 103 or 108 or some extreme old age–but ask me again 20 years from now, I suppose.) There are a lot of books I’d like to read and quite a few I wouldn’t mind rereading; there are a lot of movies I want to watch; I read and enjoy quite a few magazines (and one daily “paper”); there’s a fair amount of TV I enjoy watching (although probably very little by most people’s standards); lots of music to pay attention to; and… and… and…

So at a certain point I have to balance competing interests, especially since time is finite and some significant portion of it is taken up with household maintenance, family life, sleep (yes, I get 7.5 to 8 hours a day; no, I’m not willing to reduce that much), vacations, exercise and long walks/hikes, etc…

Balance isn’t much of an issue when I’m choosing a book that may take 4-5 hours to read or an essay that may take 5-10 hours to write. It’s a lot more of an issue when I’m contemplating a project that would probably take 500 to 600 hours over the course of six or seven months.

Which is to say: I find the ongoing story of gold OA interesting. Do I find it interesting enough to give up 500-600 hours per year of other stuff? Which brings us to:


Worth/Usefulness/Effectiveness

When something’s fun and not too time-consuming, this and the final factor don’t come into play.

When it’s a question of balance and which projects are worth starting or continuing, this and the final factor definitely do come into play.

To wit: what is this worth (and how useful is it) to me and other people?

(Yes, this and the final factor overlap a lot. That’s how life is.)

I look at readership, citations, and things like that as indications of worth and usefulness. If an issue of C&I is only read 200 times over the course of three months, it apparently wasn’t found to be worthwhile or useful; if it’s read 2,000 times over three months, it apparently was worthwhile or useful.

Of course, worth can also have a financial aspect, which gets more into appreciation: do people find something sufficiently useful or worthwhile to pay for it?

I recognized that my series of books on liblogging had ceased to be worthwhile/useful about a year too late, when sales declined to pretty much nothing and readership for related C&I issues declined substantially. But I did eventually recognize it and stopped doing the series. (A ten-year recap might or might not happen; if it does, it will be at a “this might be fun/interesting” level, not a “people might be willing to buy this” level–there wouldn’t be a book.)

There have been other themes in Cites & Insights that have disappeared because it appeared that people didn’t find them useful or worthwhile. Indeed, I stopped doing individual HTML essays because there didn’t seem to be much demand for them (and it was clear nobody found them worthwhile enough to pay for) and they were never interesting or fun to do–while the single-column version of C&I has proven to be useful enough to keep doing.

As to effectiveness: that’s so hard to measure that I generally ignore it–but I do have to mention it within this discussion.

So how does the OA research fall on the interesting/worthwhile axis?

Journal Readership

Looking at OA-related issues of Cites & Insights over the past two years, including research-based ones and others, I find the following download numbers through this morning at 5:30 a.m. (but missing most of the last day of each month):

  • April 2014 (“The Sad Case of Jeffrey Beall”): 10,576, one of the highest total download figures ever — but in terms of effectiveness, I look at how often the lists continue to be used as the basis for policy or, sigh, “research,” and have to wonder whether there’s been any real effect at all.
  • May 2014 (“The So-Called Sting”): 4,126 downloads, a high figure.
  • July 2014 (“Journals, ‘Journals’ and Wannabes”): 5,121–a high figure, and since this was a full-issue essay, I can reasonably assume that the readership was entirely related to this essay.
  • August 2014 (“Access and Ethics 3”): 1,643, a decent-but-not-great figure.
  • October/November 2014 (“Journals and ‘Journals’: Taking a Deeper Look”): 1,704, another decent-but-not-great figure.
  • December 2014 (…Part 2): 1,669, another decent-but-not-great figure.
  • January 2015 (“The Third Half”): 2,783, a good solid figure, especially since it represents less than a year.
  • March 2015 (“One More Chunk of DOAJ”): 2,281, a good solid figure, but in this case the essay taking up most of the issue–“Books, E and P, 2014”–probably accounts for much of that, since that’s always been a hot topic.
  • April 2015 (“The Economics of Open Access”): 2,476, a good solid figure–and this one’s a single-essay issue.
  • June 2015 (“Who Needs Open Access Anyway?”): 1,595, a decent figure for five months.
  • July 2015 (“Thinking About Libraries and Access, Take 2”): 839 downloads–and this one’s a little disappointing because that essay was my own take on/beliefs about OA. This suggests that people are a lot less interested in what I think than in what I’ve found out through research. That’s OK, of course…but…
  • October 2015 (“The Gold OA Landscape 2011-2014”): 2,169 downloads in the first seven weeks or so, which I regard as very good numbers, especially for the first couple of months.



Appreciation

This shows up in citations elsewhere, tweets and the like, but also in donations and sales (and, heck, speaking invitations–one of the coins of the realm, but there haven’t been any in a couple of years–certainly none related to this research).

When it comes to citations, I don’t have any real complaints; ditto tweets.

As far as donations: still in the low three digits, and that was mostly when I was offering a free ebook and production-priced paperback. None since the project was completed (other than two very small recurring donations that are for C&I, not OA research).

As for sales…

Book Purchases

For the same period–the books appeared a couple of days before the October 2015 issue did–here’s what I show, not including my own copy: Seven paperback copies, one site-licensed PDF ebook. Total: Eight copies.

In other words, not even one-half of one percent of those who’ve downloaded the October 2015 issue have, so far, found the research sufficiently worthwhile to buy the full story.
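For anyone who wants to check the arithmetic, here is a minimal sketch using only the two numbers given above (2,169 downloads and eight copies). The variable names are mine, and it treats each download as one potential buyer, which is a simplifying assumption.

```python
downloads = 2169   # October 2015 issue downloads, from the list above
copies_sold = 8    # seven paperbacks plus one site-licensed PDF

# Share of downloads that turned into a purchase.
conversion = copies_sold / downloads

print(f"{conversion:.2%}")  # prints 0.37%
```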

Of course, there could be dozens, nay, hundreds of orders just waiting to go to Amazon or Ingram.

So where does this leave me? Wondering whether the effectiveness and demonstrated worth are enough to justify doing it again.

(If you’re wondering, I’d say total revenue counted toward this project–including all donations and all self-published book sales of any sort since September 1, 2015–is more than one-third of the way, but considerably less than halfway, toward being enough to make the anonymized spreadsheet available on figshare. It’s a bit more than one-fifth of the way toward making me think seriously about doing it again.)
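Those fractions can be turned into rough dollar bounds. This is only a sketch of what the wording implies, using the $1,500 and $2,500 milestones from Part 2; it is not an actual revenue figure, and the variable names are mine.

```python
first_milestone = 1500    # spreadsheet goes up on figshare
second_milestone = 2500   # serious thought about a 2015 round

# "More than one-third, less than halfway" to the first milestone.
low = first_milestone / 3
high = first_milestone / 2

# "A bit more than one-fifth" of the way to the second milestone.
second_floor = second_milestone / 5

print(low, high, second_floor)  # prints 500.0 750.0 500.0
```

Both descriptions share the same floor, which is why they can both be true at once.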

Which brings us to Part 2, later today or maybe another day.