Mystery Collection, Disc 46

December 9th, 2015

Murder Once Removed, 1971 (TV movie), color. Charles Dubin (dir.), John Forsythe, Richard Kiley, Reta Shaw, Joseph Campanella, Barbara Bain, Wendell Burton. 1:14.

A junkie vet (Burton) who’s trying to kick the stuff and go to college; a doctor (Forsythe) who’s helping out, and who’s got the hots for the wife (Bain) of a local businessman (Kiley); and a police detective (Campanella, of course). Those are the key players—well, those and the doctor’s nurse (Shaw) and the nurse’s dog (uncredited), who howls whenever there’s been a death.

See, the wife and the doctor are seeing each other—innocently, so far, but the doctor wants to change that—and the businessman has looked into the doctor’s past in another town, where the doctor’s mother-in-law died of a heart attack and, not much later, his wife died of a heart attack as well, leaving him the money to come back home and buy out his father’s medical practice. The businessman—a patient of the doctor, as are all the other characters—believes the doctor did it and tells him so, thinking he’s taken precautions to ensure that the same fate doesn’t befall him.

That’s the setup. The rest involves the doctor murdering the businessman (but not by inducing a heart attack), his careful framing of the young vet, the detective being suspicious of it all being too pat…and a little stage acting that results in the doctor confessing all.

Except…well, there are two more twists in the last five minutes of the flick (which has all the characteristics of a TV movie). I won’t give them away, but will note that one of them makes an earlier scene seem entirely phony and implausible. Incidentally, the plot summary on IMDb is wrong: the wife did not plot the murder with the doctor. At least not directly…

As I write this review, I don’t know for certain that it’s a TV movie, but I can’t explain this one any other way. Good cast, decent movie. $1.25.

Hollywood Man, 1976, color. Jack Starrett (dir.), William Smith, Jennifer Billingsley, Ray Girardin, Jude Farese. 1:37 [1:24]

This seems to be a no-budget movie about making a no-budget biker movie and the perils of getting most of your absurdly inadequate financing from someone you know is out to screw you, and who can claim all of your assets if the flick doesn’t get made rapidly. (Really: the obviously-connected “financier” turns them down, hands them another guy’s card and says “If I was you, I wouldn’t call him.” Sounds like a sure winner to me! On the other hand, that was the dramatic highlight of the portion of the film I watched.) It was written by four of the “stars” with assistance from the cast and crew; it was produced by two of the “stars.” (OK, maybe William Smith really was a star at some point, famous for Grave of the Vampire and Nam’s Angels, two other flicks I’ll probably never see.) It seems to be mostly a bunch of badly-filmed stunts done by people who don’t much give a damn.

Within ten minutes, I realized that I couldn’t tell which group of mumbling lowlife asshats were the good guys and which group were the bad guys and that I didn’t care one way or the other. Within 20 minutes, I recognized that this was one of those just plain incompetent movies, not one that’s so incompetent—but with such good intentions—that it’s amusing (e.g., Plan 9 from Outer Space).

Apparently, the stupidity escalates to beatings, murders and rapes further in the movie; I didn’t encounter that (well, maybe one murder: it was hard to tell, frankly) because the movie was such crap that I didn’t get that far. Maybe it’s because I’m now officially Old (at 70): With only 25-30 years to go, life really is too short for this garbage.

I never look at IMDb reviews until I’ve written mine—but this “review,” from Ray Girardin, may say all that needs to be said about the flick:

Hi, I’m Ray Girardin. I wrote “Stoker” (which became “Hollywood Man”) along with my friend Bill Smith in 1976. We wrote it mainly so we could do a movie together, and it worked out. He played the lead, Rafe Stoker, and I played the heavy, Harvey. There were problems along the way, as there always are with low-budget films, but we enjoyed doing it. If you’ve seen it, I’d welcome your comments, pro or con.

I stopped watching about 20 minutes in, and have no plans to resume. If you’re so inclined, you can apparently watch it for free on YouTube or download it from the Internet Archive. As the first financier might say, “If you’re smart, you won’t.” $0.

Dominique, 1979, color. Michael Anderson (dir.), Cliff Robertson, Jean Simmons, Jenny Agutter, Simon Ward. 1:40 (1:35)

The wealthy (but nervous) wife of a stockbroker (who seems to need money, although they live in a mansion with several staff members) witnesses some odd incidents—she’s apparently being gaslighted by her husband. Eventually, she commits suicide—but then her husband starts having incidents that lead him to believe that her ghost has returned. An oddly substantial ghost, capable of paying for a dual headstone (with his side having “soon” as the death date), playing piano and more.

Lots of odd incidents, eventually involving the murder of the family doctor (who certified the wife as being dead) and the semi-accidental death of the husband. Both wills are read at the same time, and other than minor bequests, her money all goes to the chauffeur and his all goes to the half-sister, despite his business partner’s assurance that most would go to the business.

The reveal, such as it is, is mostly annoying, especially as it winds up badly for everybody (and leaves a number of key plot points unresolved). Perhaps the missing five minutes would have helped.

Slow-moving, plodding at times, not terrible but certainly not great. Good cast; odd that it’s in this set, although it was apparently never released in the U.S. Maybe $1.25.

Julie Darling, 1983, color. Paul Nicholas (dir & screenplay), Anthony Franciosa, Sybil Danning, Isabelle Mejias, Paul Hubbard, Cindy Girling. 1:40 [1:30]

Julie just wants to be with her father. Not so much her mother, and she finds a way to take care of that, thanks to a delivery boy who finds the mother hot enough to turn him rapist and, more or less accidentally, killer.

Ah, but the father’s been seeing somebody else, a young widow, and soon enough…well, Julie fails to kill off the widow’s son, but is determined to do in the woman who’s now her stepmother. I won’t go through the whole plot, except to note that some stepmothers ought not to be messed with (and the last thing you want to be is Julie’s girlfriend from school!).

A tawdry little movie (badly panned-and-scanned) that earns its R with nudity, both gratuitous and not quite so gratuitous, plus of course violence. The missing ten minutes might help but wouldn’t make it less tawdry. After watching this, I really feel the need for a shower—but lovers of tawdry noir might give it $0.75.

Cites & Insights 16:1 (January 2016) available

December 2nd, 2015

It’s an odds-and-ends issue, and what may be oddest of all is that it’s still around…

The January 2016 Cites & Insights (16:1) is now available for downloading at

The two-column print-oriented issue is 26 pages long. If you’re reading it online or on a tablet (or whatever), you might prefer the 51-page single-column 6×9″ version at

The issue includes:

The Front  pp. 1-2

Starting the Volume: notes on the annual edition of Volume 15, The Gold OA Landscape 2011-2014, and “plans” for the year.

Intersections: PPPPredatory Article Counts: An Investigation  pp. 2-10

The series of four blog posts, put together and slightly edited. Why I believe the numbers in a published study of “predatory” article volume are wrong and how they might have gotten that way–with the lagniappe of a first-cut study as to how often the lists of ppppredators actually make a case.

Media: 50 Movie Gunslinger Classics, part 2  pp. 10-19

After a mere two years, here’s the second half. Roy Rogers, Gene Autry, John Wayne, George Hayes (before and after his “Gabby” persona), Yakima Canutt and many others…

The Back  pp. 19-26

This year’s installment of The Low and the High of It, now including portable systems, with a mere 551 to 1 ratio between the cheapest and most expensive CD-only stereo systems consisting entirely of Stereophile-recommended components (only 37 to 1 for all-Class-A components) and, wait for it, 1,224 to 1 between the cheapest and most expensive CD-and-LP stereo systems. Also a baker’s dozen of other items.

So: how many people downloaded this issue between its actual upload (at around 3 p.m. Tuesday) and this post, and how many will download it between this post and social media publicity? I’ll have an idea of the first number (if I had to guess, I’d guess 10 or fewer) but not the second…

Why you should buy The Gold OA Landscape, for various values of “you.”

December 1st, 2015

The PDF ebook version of The Gold OA Landscape 2011-2014 appeared on September 10, 2015. To date (nine days short of three months), it has sold three copies.

The paperback version of The Gold OA Landscape 2011-2014 appeared on September 11, 2015. To date (eight days short of three months), it has apparently sold nine copies (but it’s possible there are November sales on Amazon, Ingram and Barnes & Noble that haven’t yet been reported).

My September 10, 2015 post offered seven good reasons why libraries, OA advocates and OA publishers might want to buy the book. Those reasons are still a good overall set, so I’ll repeat them here, followed by a little comment on “various values of ‘you’.”

Overall reasons “you” should buy this book

  1. It’s the first comprehensive study of actual publishing patterns in gold OA journals (as defined by inclusion in the Directory of Open Access Journals as of June 15, 2015).
  2. I attempted to analyze all 10,603 journals (that began in 2014 or earlier), and managed to fully analyze 9,824 of them (and I’d say a fully multilingual group would only get 20 more: that’s how many journals I just couldn’t cope with because Chrome/Google didn’t overcome language barriers).
  3. The book offers considerable detail on 9,512 journals (that appear not to be questionable or nonexistent) and what they’ve published from 2011 through 2014, including APC levels, country of publication, and other factors.
  4. It spells out the differences among 28 subject groups (in three major segments) in what’s clearly an extremely heterogeneous field. The 28 pictures of smaller groups of journals are probably more meaningful than the vast picture of the whole field.
  5. If enough people buy this (either edition), an anonymized version of the source spreadsheet will be made available on figshare.
  6. If enough people buy this (either edition), it will encourage continuation of the study for 2015.
  7. Mostly, it’s good to have real data about OA. Do most OA articles involve fees? It depends: in the humanities and social sciences, mostly not; in STEM and biomed, mostly yes. Do most OA journals charge fees? It depends–in biology, yes, but in almost all other fields, no.

Other stuff

Since those first posts, I’ve offered a number of specifics from some chapters (and published an excerpted version of the book–about one-third of it, with none of the graphs–as the October 2015 Cites & Insights). Through yesterday (November 30, 2015), that issue has been downloaded 2,686 times: 1,992 in the single-column format (decidedly preferable in this case), 694 in the traditional print-oriented two-column format.

If one of every ten downloads resulted in a purchased copy (through Lulu), the continuation of this project would be assured for the next two years (assuming I’m still around and healthy). That is:

  • An anonymized version of the current spreadsheet would be up on figshare, available for anybody to use.
  • I would carry out a full 2015 study (and update of the existing study) based on DOAJ as of early January 2016.
  • The PDF version of the results would be available for free and the anonymized spreadsheet would be on figshare.
  • The paperback version would be available at a modest price, probably under $25.
  • For 2016 data (DOAJ as of early 2017), the same thing would happen.

Heck, if one out of every fifty downloads resulted in a copy purchased through Lulu, an anonymized version of the current spreadsheet would be up on figshare. (If one out of every ten downloads resulted in an Ingram/B&N/Amazon sale, the spreadsheet would be up and I’d certainly carry out the 2015 study and make the spreadsheet available, but perhaps not the free PDF or minimally-priced paperback.)

Where we are, though, is at a dozen: twelve copies to date. Now, maybe all the advocates and publishers are at the seemingly endless series of open access conferences (or maybe it just seems that way from OATP and twitter coverage) and haven’t gotten around to ordering copies.

It’s interesting (or not) to note that, at last check, 1,230 libraries own copies of Open Access: What You Need to Know Now. Which is still, to be sure, a relevant and worthwhile quick summary of OA.

“It’s early yet,” I continue saying, albeit more softly each time. I don’t want to believe that there’s simply no real support for this kind of real-world detailed measurement of serious Gold OA in action (where “support” has to be measured by willingness to contribute, not just willingness to download freebies), but it’s not looking real promising at the moment. I’ve already seen that a tiny sampling regarding an aspect of OA done by Respectable Scholars will get a lot more coverage and apparent interest than a complete survey, to the extent that disputing the results of that sampling begins to seem useless.

Various values of “you”

What do I believe the book has to offer “you”? A few possibilities:

You, the academic library

If your institution includes a library school (or an i-school), it almost seems like a no-brainer: $55 buys you campuswide electronic access to an in-depth study of an important part of scholarly publishing’s present and future–showing how big a part it already is, its extent in various fields, how much is or isn’t being spent on it, what countries are most involved in each subject, and on and on…

For the rest of you, it seems like you’d also want to have some detailed knowledge of the state of serious gold OA, since that has the best chance of increasing access to scholarly publications and maybe, perhaps, either slowing down the rate of increase in serials costs or even saving some money.

For that matter, if your library is either starting to publish open access journals or administering an APC support fund, shouldn’t you know more about the state of the field? If, for example, you plan a journal in language and linguistics, it should be useful to know that there are more than 500 of them out there; that almost none of them charge APCs; that of those that do, only six charge more than $353; that the vast majority (350) published no more than 18 articles in 2014; and that Brazil is the hotbed of gold OA publishing in these areas. (Those are just examples.)

You, the open access advocate

You really should have this book at hand when you’re reading various commentaries with dubious “facts” about the extent of OA publishing and charges for that publishing.

Too bad there’s no open access activity in the humanities and social sciences? Nonsense! While most serious gold OA journals in this field are relatively small, there are a lot of them–more than 4,000 in all–and they’ve accounted for more than 95,000 articles in each year 2012-2014, just under 100,000 in 2014. More than three-quarters of those articles didn’t involve APCs, and total potential revenues for the segment didn’t reach $10 million in 2014, but there’s a load of activity–with the biggest chunks in Brazil, the United States, Spain, Romania and Canada, but with 22 nations publishing at least 1,000 articles each in 2014 (Singapore is the 22nd).

Those are just a few data points. This book offers a coherent, detailed overview, and I believe it would make you a more effective advocate. And if you deeply believe that readers should never have to pay for anything involved with open access, well, I invite you to help find me grant or institutional funding, so that can happen.

You, the open access publisher

Surely you should know where your journal(s) stand in comparison to the overall shape of OA and of specific fields? Just as surely, you should want this research to continue–and buying the book (or contributing directly) is the way that will happen. (On the other hand, if you publish one of the 65 journals that appear to have malware, you really, truly need to take care of that–and I’ve already published that list for free.)

You, none of the above

If you’re a library person who cares about OA or about the health of your libraries, but you’re not really an advocate, chances are you stopped reading long ago. If not, well, you should also find the book worthwhile.

Otherwise? I suspect that at this point I’m speaking to an empty room, so I’ll stop.

The next update will probably appear when Amazon/B&N/Ingram figures for November appear in my Lulu stream, some time in the next week or two.

Oh: one side note: I mentioned elsewhere that the back cover of the book is just “OA gold” with the ISBN. What I mean by “OA gold” is the precise shade of gold used in the OA open-lock logo as it appears on Wikimedia. I downloaded the logo and used my graphics editor’s color chooser to make that the background color for the entire cover. (I never was able to get a suitable shade of gold/orange using other techniques.)
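
If you ever wanted to grab that exact shade programmatically rather than with a color picker, a small sketch along these lines would do it (Python with Pillow; the file name is hypothetical, and it assumes the downloaded logo ends up on a white background once converted to RGB):

```python
from PIL import Image

# Hypothetical file: the open-lock logo downloaded from Wikimedia.
logo = Image.open("open_access_logo.png").convert("RGB")

# Tally every distinct color and take the most common one that isn't
# the (assumed) white background.
colors = logo.getcolors(maxcolors=logo.width * logo.height)
colors.sort(reverse=True)                      # most frequent first
gold = next(rgb for _, rgb in colors if rgb != (255, 255, 255))
print("OA gold, as (R, G, B):", gold)
```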

Here’s the book cover, in case you weren’t aware of it:



One-third of the way there!

November 22nd, 2015

With today’s French purchase of a PDF copy of The Gold OA Landscape 2011-2014, and including Cites & Insights Annual purchases, we’re now one-third of the way to the first milestone, at which I’ll upload an anonymized version of the master spreadsheet to figshare. (As with a previous German purchase, I can only assume the country based on Lulu country codes…)

Now an even dozen copies sold.

One sale gone, another started: 25%

November 20th, 2015

When you go to buy my books, always check the Lulu home page for discounts. Just a reminder…

I’m guessing there will be a series of brief sales for a while, but can’t be sure. In the meantime:

SHOP25 as a coupon code gets you 25% off print books (and calendars, if you’re so inclined) from now through November 23, 2015.

Coupon codes are case sensitive.

Another reminder: you’re not decreasing my net revenue (counted toward future research) by using these sale codes–I get the same net revenue.

For various reasons, I took a look yesterday at all-time Lulu sales (it takes me one minute to generate that spreadsheet and not much longer to go through it). I noticed something that, because it’s at such a low level, had slipped my attention.

To wit: yes, occasionally somebody does buy a Cites & Insights Annual edition. Excluding my own copies, there have been sixteen such sales over the years, with the most being 2007 (4 copies) and 2008 (3 copies); the only one with no outside sales to date is the latest, 2015. Since I produce these so I’ll have my own copy (if I include cost of paper and inkjet ink, it’s actually cheaper for me to buy one at my author’s price than it is to print out a new copy of each issue and have FedEx Kinko’s bind it in an ugly Velobind binding–and the result is both more handsome and more usable), this is a nice extra. Of course, it’s also a great way to have past issues on hand…

Five thousand pages!

November 19th, 2015

I maintain a little spreadsheet to track word and page counts for Cites & Insights [with the slightly-out-of-date name “first10 length”]. I print it out every month or two but I don’t look at it very often.

And I missed a milestone of sorts: through the December 2015 issue (not including phantom issues that are only in the annual paperbacks), C&I has passed the 5,000-page mark: in all, 5,002 pages. (If you’re wondering, the longest volume was volume 9, 2009, with 418 pages; the shortest were volume 1 [252 pages including the preview issue], volume 2 [262 pages], and volume 11 [274 pages: the year C&I almost shut down for good].)

Word count’s not at a milestone; it should hit four million words in two to four months.

No deeper meaning; just marking a wordy milestone. It’s a handsome set of paperbacks on one of my bookshelves–although the first five volumes are sort of ugly, being Velobound things produced at Kinko’s. In case you weren’t aware, volumes 6 through 15 are all available, $45 each [with occasional Lulu discounts: check the front page], with roughly half the proceeds going to continue C&I and my OA research. Oh, and on most of them you get a huge photo from our travels–all of them have such photos, but in all but two the photo’s a wraparound, 11″ high and close to 18″ wide. More information here.

Cites & Insights Annual, Two-Day Sale and a Non-Update

November 18th, 2015

I have it down to do another teaser post to help convince folks that there’s loads of great stuff in The Gold OA Landscape 2011-2014, either paperback or site-licensed PDF ebook–but given that there’s only been one copy sold in November to date, and indeed only one since October 22, maybe that’s a waste of my energy.

That’s the non-update: the total continues to be nine paperback copies and two PDF ebooks, with five copies showing up in Special arrangements (grants, donations, consulting, etc.) unchanged.

Meanwhile: if you do want the paperback–or any or all of my other self-published books–you can buy them today and tomorrow (November 19, 2015) for 20% off using the coupon code PRESALE20

[Any time you do buy stuff at Lulu, check the home page: it should show current offers.]

And then there’s the Cites & Insights Annual edition for 2015; I’ve now received my copy (and modified the cover, since the title was a little too far down the page).

Here’s the skinny:

Volume 15 is 354 pages long (including table of contents and indices) and, as usual, $45 (or $36 today and tomorrow).

Highlights of this 11-issue volume include:

  • Three full-issue essays related to Open Access: Economics, The Gold OA Landscape 2011-2014, and Ethics
  • A fair use trilogy: Google Books, HathiTrust and miscellaneous topics
  • More pieces of the OA puzzle, mostly leading up to The Gold OA Landscape
  • The usual: Deathwatch, Ebooks & Pbooks; a eulogy to FriendFeed and some notes on Twitter; and more

And the indices that aren’t otherwise available.
The photo: the library at Ephesus–a familiar view if you own Public Library Blogs: 252 Examples, but this is a slightly different photo and a considerably larger view.

Oops: while Public Library Blogs: 252 Examples used a different picture of The Library At Ephesus, The Liblog Landscape 2007-2010 used the same picture–but much larger, with a little more touchup, and using the graphics editor’s auto-equalization, which yielded a slightly different color range.

Lagniappe: The Rationales, Once Over Easy

November 13th, 2015

[This is the unexpected fourth part of PPPPredatory Article Counts: An Investigation. Before you read this, you should read the earlier posts—Part 1, Part 2 and Part 3—and, of course, the December 2014 Cites & Insights.]

Yes, I know, it’s hard to call it lagniappe when it’s free in any case. Still, I did spend some time doing a first-cut version of the third bullet just above: that is, did I find clear, cogent, convincing explanations as to why publishers were questionable?

I looked only at the 223 multijournal publishers responsible for 6,429 journals and “journals” (3,529 of them actual gold OA journals publishing articles at some point in 2011-2014) from my trimmed dataset (excluding DOAJ journals and some others). I did not look at the singleton journals; that would have more than doubled the time spent on this.

Basically, I searched Scholarly Open Access for each publisher’s name and read the commentary carefully—if there was a commentary. If there was one, I gauged whether it constituted a reasonable case for considering all of that publisher’s journals sketchy at the time the commentary was written, or if it fell short of being conclusive but made a semi-plausible case. (Note the second italicized clause above: journals and publishers do change, but they’re only removed from the list after a mysterious appeals process.)

But I also looked at my own annotations for publishers—did I flag them as definitely sketchy or somewhat questionable, independently of Beall’s comments? I’m fairly tough: if a publisher doesn’t state its APCs or its policy or makes clearly-false statements or promises absurdly short peer review turnaround, those are all red flags.

Beall Results

For an astonishing 65% of the publishers checked, there was no commentary. The only occurrences of the publishers’ names were in the lists themselves.

The reason for this is fairly clear. Beall’s blog changed platforms in January 2012, and Beall did not choose to migrate earlier posts. These publishers—which account for 41% of the journals and “journals” in my analysis and 38% of the active Gold OA journals—were presumably earlier additions to the list.

This puts the lie to the claims of some Beall fans that he clearly explains why each publisher or journal is on the list, including comments from those who might disagree. That claim is simply not true for most of the publishers I looked at, representing 38% of the active journals, 23% of the 2014 articles, and 20% of the projected 2014 revenues.

My guess is that it’s worse than this. I didn’t attempt to find rationales for the standalone journals, but although those journals represent only 5% of the active journals I studied, they’re extremely prolific, accounting for 38% of 2014 articles (and 13% of 2014 potential revenue).

If Beall were serious about his list being a legitimate tool rather than a personal hobbyhorse, of course, there would be two ongoing lists (one for publishers, one for standalone journals) rather than an annual compilation—and each entry would have two portions: the publisher or journal name (with hyperlink), and a “Rationale” tab linking to Beall’s explanation of why the publisher or journal is there. (Those lists should be pages on the blog, not posts, and I think the latest ones are.) Adding such links, pointing to posts, would be relatively trivial compared to the overall effort of evaluating publishers, and it would add considerable accountability.

In another 7% of cases, I couldn’t locate the rationale but can’t be sure there isn’t one: some publishers have names composed of such generic words that I could never be quite sure whether I’d missed a post. (The search box doesn’t appear to support phrase searches.) That 7% represents 4% of active journals in the Beall survey, 4% of 2014 articles, but only 1.7% of potential 2014 revenue.

Then there are the others—cases where Beall’s rationale is available. As I read the rationales, I conclude that Beall made a sufficiently strong case for 9% of the publishers, a questionable but plausible case for 11%, and, in my opinion, no real case for 9% of the publishers.

Those figures break out to active journals, articles and revenues as follows:

  • Case made—definitely questionable publishers: 22% of active journals, 11% of 2014 articles, 41% of 2014 potential revenues. (That final figure is particularly interesting.)
  • Questionable—possibly questionable publishers: 16% of active journals, 16% of 2014 articles, 18% of 2014 potential revenues.
  • No case: 14% of active journals, 7% of 2014 articles, 6% of 2014 potential revenues.
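
Those percentages are just a grouped sum converted to shares. Here’s a minimal sketch of that computation in Python with pandas; the file name, column names and category labels are hypothetical stand-ins for whatever the working spreadsheet actually uses.

```python
import pandas as pd

publishers = pd.read_csv("beall_publishers.csv")  # stand-in file name
# Assumed columns: 'rationale' (e.g. 'case made', 'questionable', 'no case',
# 'no commentary', 'not findable') plus per-publisher totals.

sums = publishers.groupby("rationale")[
    ["active_journals", "articles_2014", "revenue_2014"]
].sum()

# Each category's sum divided by the column total, as a percentage.
shares = (sums / sums.sum() * 100).round(1)
print(shares)
```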

If I wanted to suggest an extreme version, I could say that I was able to establish a strong case for definitely questionable publishing for fewer than 12,000 published articles in 2014—in other words, less than 3% of the activity in DOAJ-listed journals.

But that’s an extreme version and, in my opinion, dead wrong, even without noting that it doesn’t allow for any of the independent journals (which accounted for nearly 40,000 articles in 2014) being demonstrably sketchy.

Combined Results

Here’s what I find when I combine Beall’s rationales with my own findings when looking at publishers, ignoring independent journals:

  • Definitely questionable publishers: Roughly 19% of 2014 articles, or about 19,000 within the subset studied, and 44% of potential 2014 revenue, or about $11.4 million. (Note that the article count is still only about 4% of serious OA activity—but if you add in all independent journals, that could go as high as 59,000, or 12%.) Putting it another way, about 31% of articles from multijournal publishers in Beall’s list were in questionable journals.
  • Possibly questionable publishers: Roughly 21% of 2014 articles (34% excluding independent journals) and 21% of 2014 potential revenues.
  • Case not made: Roughly 22% of 2014 articles (36% excluding independent journals) and 22% of 2014 potential revenues.

It’s possible that some portion of that 22% is sketchy but in ways that I didn’t catch—but note that the combined score is the worst of Beall’s rationale or my independent observations.

So What?

I’ve said before that the worst thing about the Shen/Björk study is that it’s based on a fatally flawed foundation, a junk list of one man’s opinions—a man who, it’s increasingly clear, dislikes all open access.

My attempts to determine Beall’s cases confirmed that opinion. In far too many cases, the only available case is “trust me: I’m Jeffrey Beall and I say this is ppppredatory.” Now, of course, I’ve agreed that every journal is ppppredatory, so it’s hard to argue with that—but easy to argue with his advice to avoid all such journals, except as a call to abandon journal publishing entirely.

Which, if you look at it that way, makes Jeffrey Beall a compatriot to Björn Brembs. Well, why not? In his opposition to all Gold OA, he’s already a compatriot to Stevan Harnad: the politics of access makes strange alliances.

Otherwise, I think I’d conclude that perhaps a quarter of articles in non-DOAJ journals are from publishers that are just…not in DOAJ. The journals may be serious OA, but the publishers haven’t taken the necessary steps to validate that seriousness. They’re in a gray area.

Monitoring the Field

Maybe this also says something about the desirability of ongoing independent monitoring of the state of gold OA publishing. When it comes to DOAJ-listed journals, my approach has been “trust but verify”: I checked to make sure the journals actually did make APC policies and levels clear, for example, and that they really were gold OA journals. When it comes to Beall’s lists, my approach was “doubt but verify”: I didn’t automatically assume the worst, but I’ll admit that I started out with a somewhat jaundiced eye when looking at these publishers and journals.

I also think this exercise says something about the need for full monitoring, rather than sampling. The difference between even well-done sampling (and I believe Shen/Björk did a proper job) and full monitoring, in a field as wildly heterogeneous as scholarly journals, is just too large: about three to one, as far as I can tell.

As I’ve made clear, I’d be delighted to continue such monitoring of serious gold OA (as represented by DOAJ), but only if there’s at least a modest level of fiscal support. The door’s still open, either for hired consultation, part-time employment, direct grants or indirect support through buying my books (at this writing, sales are still in single digits) or contributing to Cites & Insights. But I won’t begin another cycle on spec: that single-digit figure [barely two-digit figure, namely 10 copies] after two full months, with no apparent likelihood of any other support, makes it foolhardy to do so.

As for the rest of gold OA, the gray area and the questionable publishers, this might be worth monitoring, but I’ve said above that I’m not willing to sign up for another round based on Beall’s lists, and I don’t know of any other good way to do this.

PPPPredatory Article Counts: An Investigation Part 3

November 11th, 2015

If you haven’t read Part 1 and Part 2—and, to be sure, Cites & Insights December 2015—none of this will make much sense.

What would happen if I replicated the sampling techniques actually used in the study (to the extent that I understand the article)?

I couldn’t precisely replicate the sampling. My working dataset had already been stripped of several thousand “journals” and quite a few “publishers,” and I took Beall’s lists a few months before Shen/Björk did. (In the end, the number of journals and “journals” in their study was less than 20% larger than in my earlier analysis, although there’s no way of knowing how many of those journals and “journals” actually published anything. In any case, if the Shen/Björk numbers had been 20% or 25% larger than mine, I would have said “sounds reasonable” and let it go at that.)

For each tier in the Shen/Björk article, I took two samples, both using random techniques, and for all but Tier 4, I used two projection techniques—one based on the number of active true gold OA journals in the tier, one based on all journals in the tier. (For Tier 4, singleton journals, there’s not enough difference between the two to matter much.) In each tier, I used a sample size and technique that followed the description in the Shen/Björk article.
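
For anyone curious about the mechanics, here’s a minimal sketch of that sample-and-project step in Python with pandas. It is not the spreadsheet I actually used: the file name, the column names (tier, active, articles_2014) and the per-tier sample sizes are hypothetical stand-ins, and a real run needs more care with empty cells and the other year columns.

```python
import pandas as pd

# Hypothetical layout: one row per journal, with a publisher-size tier,
# an 'active' flag (1 = true gold OA journal that published articles)
# and per-year article counts such as 'articles_2014'.
journals = pd.read_csv("beall_subset.csv")        # stand-in file name

SAMPLE_SIZES = {1: 100, 2: 250, 3: 150, 4: 113}   # illustrative, per tier

def project_articles(tier_df, n, basis):
    """Sample one tier at random and scale the sampled total up.

    basis="all"    -- multiplier uses every journal row in the tier
    basis="active" -- multiplier uses only active gold OA journals
    """
    sample = tier_df.sample(n=n)
    if basis == "active":
        multiplier = tier_df["active"].sum() / max(sample["active"].sum(), 1)
    else:
        multiplier = len(tier_df) / n
    return sample["articles_2014"].sum() * multiplier

projections = {"all": 0, "active": 0}
for tier, tier_df in journals.groupby("tier"):
    n = min(SAMPLE_SIZES[tier], len(tier_df))
    for basis in projections:
        projections[basis] += project_articles(tier_df, n, basis)

print({basis: round(total) for basis, total in projections.items()})
```

Each pass through that loop yields one projection per basis; repeating it with fresh random draws gives the kind of sample-to-sample spread discussed next.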

The results were interesting. Extreme differences between the lowest sample and the highest sample include 2014 article counts for Tier 2 (publishers with 10 to 99 journals), the largest group of journals and articles, where the high sample was 97,856 and the low—actually, in this case, the actual counted figure—was 46,770: that’s a 2.09 to 1 range. There’s also maximum revenue, where the high sample for Tier 2 was $30,327,882 while the low sample (once again the counted figure) was $9,574,648: a 3.17 to 1 range—in other words, a range wide enough to explain the difference between my figures and the Shen/Björk figures purely on the basis of sample deviation. (It could be worse: the 2013 projected revenue figures for Tier 2 range from a high of $41,630,771 to a low of $8,644,820, a range of 4.82 to 1! In this case, the actual sum was just a bit higher than the low sample, at $8,797,861.)

Once you add the tiers together, the extremes narrow somewhat. Table 7 shows the low, actual, and high total article projections, noting that the 2013, 2012, and 2011 low and high might not be the actual extremes (I took the lowest and highest 2014 figures for each tier, using the other figures from that sample.) It’s still a broad range for each year, but not quite as broad. (The actual numbers are higher than in earlier tables largely because journals in DOAJ had not been excluded at the time this dataset was captured.)

2014 2013 2012 2011
Low 134,980 130,931 92,020 45,605
Actual 135,294 115,698 85,601 54,545
High 208,325 172,371 136,256 84,282

Table 7. Article projections by year, stratified sample

The range for 2014 is 1.54 to 1: broad, but narrower than in the first two attempts. On the other hand, the range for maximum revenues is larger than in the first two attempts: 2.18 to 1 for 2014 and a very broad 2.46 to 1 for 2013, as in Table 8.

2014 2013
Low $30,651,963 $29,145,954
Actual $37,375,352 $34,460,968
High $66,945,855 $71,589,249

Table 8. Maximum revenue projections, stratified sample

Note that the high figures here are pretty close to those offered by Shen/Björk, whereas the high mark for projected article count is still less than half that suggested by Shen/Björk. (Note also that in Table 7, the actual counts for 2013 and 2012 are actually lower than the lowest combined samples!)

For the graphically inclined, Figure 4 shows the low, actual and high projections for the third sample. This graph is not comparable to the earlier ones, since the horizontal axis is years rather than samples.

Figure 4. Estimated article counts by year, stratified

It’s probably worth noting that, even after removing thousands of “journals” and quite a few publishers in earlier steps, it’s still the case that only 57% of the apparent journals were actual, active gold OA journals—a percentage ranging from 55% for Tier 1 publishers to 61% for Tier 3.


It does appear that, for projected articles, the stratified sampling methodology used by Shen/Björk may work better than using a pure random sample across all journals—but for projected revenues, it’s considerably worse.

This attempt could account for the revenue discrepancy, which in any case is a much smaller discrepancy (as noted, my average APC per article is considerably higher than Shen/Björk’s)—but it doesn’t fully explain the huge difference in article counts.

Overall Conclusions

I do not doubt that Shen/Björk followed sound statistical methodologies, which is quite different than agreeing that the Beall lists make a proper subject for study. The article didn’t identify the number of worthless articles or the amount spent on them; it attempted to identify the number of articles published by publishers Beall disapproved of in late summer 2014, which is an entirely different matter.

That set aside, how did the Shen/Björk sampling and my nearly-complete survey wind up so far apart? I see four likely reasons:

  • While Shen/Björk accounted for empty journals (but didn’t encounter as many as I did), they did not control for journals that have articles but are not gold OA journals. That makes a significant difference.
  • Sampling is not the same as counting, and the more heterogeneous the universe, the more that’s true. That explains most of the differences, I believe (on the revenue side, it can explain all of them).
  • The first two reasons, enhanced by two or three months’ of additional listings, combined to yield a much higher estimate of active journals than my survey: more than twice as many.
  • The second reason resulted in a much higher average number of articles per journal than in my survey (53 as compared to 36), which, combined with the doubled number of journals, neatly explains the huge difference in article counts.

The net result is that, while Shen/Björk carried out a plausible sampling project, the final numbers raise needless alarm about the extent of “bad” articles. Even if we accept that all articles in these projections are somehow defective, which I do not, the total of such articles in 2014 appears to be considerably less than one-third of the number of articles published in serious gold OA journals (that is, those in DOAJ)—not the “nearly as many” the study might lead one to assume.

No, I do not plan to do a followup survey of publishers and journals in the Beall lists. It’s tempting in some ways, but it’s not a good use of my time (or anybody else’s time, I suggest). A much better investigation of the lists would focus on three more fundamental issues:

  • Is each publisher on the primary list so fundamentally flawed that every journal in its list should be regarded as ppppredatory?
  • Is each journal on the standalone-journal list actually ppppredatory?
  • In both cases, has Beall made a clear and cogent case for such labeling?

The first two issues are far beyond my ken; as to the first, there’s a huge difference between a publisher having some bad journals and it making sense to dismiss all of that publisher’s journals. (See my longer PPPPredatory piece for a discussion of that.)

Then there’s that final bullet…

[In closing: for this and the last three posts—yes, including the Gunslingers one—may I once again say how nice Word’s post-to-blog feature is? It’s a template in Word 2013, but it works the same way, and works very well.]

PPPPredatory Article Counts: An Investigation Part 2

November 9th, 2015

If you haven’t already done so, please read Part 1—otherwise, this second part of an eventual C&I article may not make much sense.

Second Attempt: Untrimmed List

The first five samples in Part 1 showed that even a 20% sample could yield extreme results over a heterogeneous universe, especially if the randomization was less than ideal.

Given that the most obvious explanation for the data discrepancies is sampling, I thought it might be worth doing a second set of samples, this time with each one being a considerably smaller portion of the universe. I decided to use the same sample size as in the Shen/Björk study, 613 journals—and this time the universe was the full figshare dataset “Open Access Journals 2014, Beall-list (not in DOAJ) subset” (Crawford, Walt, 2015). I assigned RAND() to each row, froze the results, then sorted by that column. Each sample was 613 journals; I took 11 samples (leaving 205 journals unsampled but included in the total figures). I adjusted the multipliers accordingly.
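
As a rough code equivalent of that spreadsheet procedure (fill a RAND() column, freeze it, sort, slice into blocks), here’s a sketch in Python. The file and column names are hypothetical, and the multiplier shown (universe size over sample size) is one simple choice; the actual adjusted multipliers may have differed.

```python
import numpy as np
import pandas as pd

journals = pd.read_csv("beall_full_subset.csv")   # stand-in for the figshare dataset
SAMPLE_SIZE, N_SAMPLES = 613, 11

# Assign a frozen random key to every row and sort by it -- the same
# effect as filling a RAND() column, freezing it and sorting.
rng = np.random.default_rng()
journals["rand_key"] = rng.random(len(journals))
shuffled = journals.sort_values("rand_key").reset_index(drop=True)

# Each consecutive block of 613 rows is one sample; rows beyond the
# eleventh block stay unsampled but still count in the full totals.
multiplier = len(journals) / SAMPLE_SIZE
projections = []
for i in range(N_SAMPLES):
    block = shuffled.iloc[i * SAMPLE_SIZE:(i + 1) * SAMPLE_SIZE]
    projections.append(block["articles_2014"].sum() * multiplier)

low, high = min(projections), max(projections)
print(f"2014 projections: {low:,.0f} to {high:,.0f} ({high / low:.2f} to 1)")
```

Reassigning the random column and rerunning those last few lines is essentially what the “One More Try” section below describes.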

More than half of the rows in the full dataset have no articles (and no revenue). You could reasonably expect varied results—e.g., it wouldn’t be surprising for a sample to lean heavily toward no-article journals or toward journals with articles (the latter yielding numbers well above what one might expect).

In this case, the results have a “dog that did not bark in the night” feel to them. Table 3 shows the 11 sample projections and the total article counts.

Sample 2014 2013 2012 2011
6 88,165 72,034 40,801 20,473
10 91,186 75,025 50,820 31,523
5 95,338 93,886 56,047 27,893
4 97,313 80,978 51,343 36,039
1 99,956 97,153 83,606 52,983
2 105,967 87,468 50,617 20,880
7 106,693 72,658 40,119 29,055
Total 121,311 99,994 64,325 34,543
9 127,747 100,653 73,326 32,075
3 140,292 122,128 77,958 36,634
8 154,754 114,360 79,323 35,632
11 160,591 143,312 91,011 53,579

Table 3. Article projections by year, 9% samples

Although these are much smaller samples (percentagewise) over a much more heterogeneous dataset, the range of results, while certainly wider than for samples 6-10 in the first attempt, is not dramatically so. Figure 3 shows the same data in graphic form (using the same formatting as Figure 1 for easy comparison).

Figure 3. Estimated article counts by year, 9% sample

The maximum revenue samples show a slightly wider range than the article count projections: 2.01 to 1, as compared to 1.82 to 1. That’s still a fairly narrow range. Table 4 shows the figures, with samples in the same order as for article projections (Table 3).

Sample 2014 2013
6 $27,904,972 $24,277,062
10 $32,666,922 $27,451,802
5 $19,479,393 $20,980,689
4 $24,975,329 $25,507,720
1 $30,434,762 $30,221,463
2 $30,793,406 $25,461,851
7 $30,725,482 $21,497,760
Total $31,863,087 $28,537,554
9 $29,642,696 $24,386,137
3 $39,104,335 $41,415,454
8 $36,654,201 $29,382,149
11 $35,420,001 $34,710,583

Table 4. Estimated Maximum Revenue, 9% samples

As with maximum revenue, so with cost per article: a broader range than for the last five samples (and total) in the first attempt, but a fairly narrow range, at 1.75 to 1, as shown in Table 5.

Sample 2014 2013
6 $316.51 $337.02
10 $358.25 $365.90
5 $204.32 $223.47
4 $256.65 $315.00
1 $304.48 $311.07
2 $290.59 $291.10
7 $287.98 $295.88
Total $262.66 $285.39
9 $232.04 $242.28
3 $278.73 $339.12
8 $236.85 $256.93
11 $220.56 $242.20

Table 5. APC per article, 9% samples and total

Rather than providing redundant graphs, I’ll provide one more table: the average (mean) articles per journal (ignoring empty journals), in Table 6.

Sample 2014 2013 2012 2011
6 27.85 20.59 20.66 16.79
10 29.35 20.75 22.73 23.10
1 30.06 25.54 38.13 38.41
5 30.26 27.63 27.18 20.88
4 31.46 22.86 23.42 29.90
2 33.94 24.79 25.08 15.14
7 34.66 20.68 20.17 22.48
Total 36.80 27.47 30.08 25.51
3 42.01 34.90 38.63 27.13
9 42.10 29.75 35.82 26.30
8 43.86 31.25 38.20 26.39
11 47.88 40.12 47.13 38.04

Table 6. Average articles per journal, 9% samples

Note that Table 6 is arranged from lowest average in 2014 to highest average; the rows are not (quite) in the same order as in Tables 3-5. The range here is 1.72 to 1, an even narrower range. On the other hand, sample 11 does show an average articles per journal figure that’s not much below the Shen/Björk estimate.

One More Try

What would happen if I assigned a new random number (again using RAND()) in each row and reran the eleven samples?

The results do begin to suggest that the difference between my nearly-full survey and the Shen/Björk study could be due to sample variation. To wit, this time the article totals range from 64,933 to 169,739, a range of 2.61 to 1. The lowest figure is less than half the actual figure, so it’s not entirely implausible that a sample could yield a number three times as high.

The total revenue range is also wider, from $22.7 million to $41.3 million, a range of 1.82 to 1. It’s still a stretch to get to $74 million, but not as much of a stretch. And in this set of samples, the cost per article ranges from $169.22 to $402.89, a range of 2.38 to 1. I should also note that at least one sample shows a mean articles-per-journal figure of 51.5, essentially identical to the Shen/Björk figure, and that $169.22 is similar to the Shen/Björk figure.


Sampling variation with 9% samples could yield numbers as far from the full-survey numbers as those in the Shen/Björk article, although for total article count it’s still a pretty big stretch.

But that article was using closer to 5% samples—and they weren’t actually random samples. Could that explain the differences?

[More to come? Maybe, maybe not.]