Here’s the tl;dr version: Go buy The Gold OA Landscape 2011-2014, either the $60 paperback or the $55 site-licensed PDF ebook (the contents are identical other than the copyright page/ISBN). I try to be wholly transparent about my investigations, and I’m confident that TGOAL represents the most accurate available count for serious gold OA publishing (excluding non-DOAJ members, “hybrids” and other stuff). Oh, and if enough copies are sold, I’ll keep doing this research…which I don’t think anybody else is going to do and which, as far as I can tell, can’t really be automated.
Running the Numbers
Now that I’ve said that, I won’t repeat the sales pitch. You presumably already know that you can get a hefty sampling of the story in Cites & Insights 15:9–but the book itself is much more complete and much more interesting.
Meanwhile, I’ve gotten involved or failed to get involved in a number of discussions about numbers attached to OA.
On September 30, I posted “How many articles, how many journals?,” raising questions about statistics published in MDPI’s Sciforum asserting the number of OA journals and articles–numbers much lower than the ones I’ve derived by actual counting. I received email today regarding the issues I raised:
Thank you for passing this on. I think it’s quite difficult to pin down exactly how many papers are published, never mind adding in vagueries about the definition of ‘predatory’ or ‘questionable’ publishers. The data on Sciforum are taken from Crossref and, on http://sciforum.net/statistics/open-access-papers-published-per-year, shows about 300,000 OA articles published in 2014. The difference may depend on correct deposition (including late or not at all), article types or publishers just not registered with Crossref. I think ball-park figures are about the closest we can get as things stand.
Well…yes and no. I think it’s highly likely that many smaller OA journals aren’t Crossref members or likely to become Crossref members: for little journals done out of a department’s back pocket, even $275/year plus $1/article is a not insignificant sum.
What bothers me here is not that the numbers are different, but that there seems to be no admission that a full manual survey is likely to produce more accurate numbers, not just a different “ball-park figure.” And that “pinning down” accurate numbers is aided by, you know, actually counting them. The Sciforum numbers are based on automated techniques: that’s presumably easy and fast, but that doesn’t make it likely to be right.
Then there’s the Shen/Björk article…which, as I might have expected, has been publicized all over the place, always with the twin effects of (a) making OA look bad and (b) providing further credibility to the one-man OA wrecking crew who shall go nameless here. The Retraction Watch article seems to be the only place there’s been much discussion of what may be wrong with the original article. Unfortunately, here is apparently the totality of what Björk chooses to say about my criticisms and others’:
“Our research has been carefully done using standard scientific techniques and has been peer reviewed by three substance editors and a statistical editor. We have no wish to engage in a possibly heated discussion within the OA community, particularly around the controversial subject of Beall’s list. Others are free to comment on our article and publish alternative results, we have explained our methods and reasoning quite carefully in the article itself and leave it there.”
Whew. No willingness to admit that their small sample could easily have resulted in estimates that are nearly three times too high. No willingness to admit that the author-nationality portion, based on fewer than 300 articles, is even more prone to sampling error. They used “standard scientific techniques,” so the results must be accurate.
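The scale of that sampling problem is easy to sketch. Here’s a minimal back-of-the-envelope check in Python, using the standard normal approximation for a proportion estimated from a simple random sample of 300 (the numbers are illustrative, not Shen/Björk’s actual data):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% confidence half-width for a proportion p estimated
    from a simple random sample of n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst case (p = 0.5) for a sample of 300 articles:
moe = margin_of_error(0.5, 300)
print(f"roughly ±{moe * 100:.1f} percentage points")
```

That comes out to roughly ±5.7 percentage points–and for a nationality that supposedly accounts for, say, 10% of authors, a swing of nearly six points either way is enormous relative to the estimate itself. And that’s before you account for the heterogeneity of the underlying population, which simple sampling formulas assume away.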
No, I’m not going around to all the places that have touted the Shen/Björk article to add comments. Not only is life too short, I don’t believe it will do much good.
The best I can do is transparent research that relies less on statistical inference and more on dealing with heterogeneity by examining the whole population rather than a sample–and hope that it will be useful. A hope that’s sometimes hard to keep going.
Meanwhile: I continue to believe that a whitelist approach–DOAJ’s tougher standards–is far superior to a blacklist approach, especially given the historical record of blacklists.