The Size of the Open Access Market (and an admission)

On October 29, 2014, Joseph Esposito posted “The Size of the Open Access Market” at the scholarly kitchen. In it, he discusses a Simba Information report, “Open Access Journal Publishing 2014-2017.” (I’m not copying the link because it’s just to the blurb page, not to any of the info that Esposito provides.) The 61-page Simba report costs a cool $2,500 (and up), so I can’t give you any detail on the report itself other than what Esposito passes along.

The key portion of what he passes along, quoting Esposito directly:

Simba notes that the primary form of monetization for OA journals is the article processing charge or APC. In 2013 these fees came to about $242.2 million out of a total STM journals market of $10.5 billion. I thought that latter figure was a bit high, and I’m never sure when people are quoting figures for STM alone or for all journals; but even so, if the number for the total market is high, it’s not far off.  That means that OA is approximately 2.3% of the total journals market (or is that just STM . . . ?)….

And, quoting from one of the comments (it’s a fascinating comment stream, including some comments that made me want to scream, but…):

If those numbers are roughly right, then 2.3% of the scholarly publishing revenue equates to something like 22% of all published papers.

That comment is by Mike Taylor, who’s active in this comment stream.

I had no idea whether the Simba numbers made any sense or what magic Simba performed to get numbers from the more than two thousand Gold OA publishers (my own casual estimate based on DOAJ publisher names), but hey, that’s why Simba can get $2,500 for 61 pages…

The admission

There turned out to be a mistake or, if you will, a lie in the December 2014 Cites & Insights, on the very last page, top of the second column, the parenthetical comment. When I wrote that, I fully intended to sample perhaps 10%-20% of the 1,200+ bio/biomed/medical DOAJ journals not in the OASPA or Beall sets to get a sense of what they were like…

…and in the process realized what I should already have known: the journals are far too heterogeneous for sampling to mean much of anything. And, once I’d whittled things down, 1,200+ wasn’t all that bad. Long story short: I just finished looking at those journals (in the end, 1,211 of them–of the original 1,222, a few disappeared either because they turned out to be ones already studied or, more frequently, because there was not enough English in the interface for me to look at them sensibly).

Which means that I’ve now checked–as in visited and recorded key figures from–essentially all of the DOAJ journals (as of May 7, 2014) that have English as the first language code, in addition to some thousands of Beall-set journals and hundreds of OASPA journals that weren’t in DOAJ at that point.

Which means that I could do some very rough estimates of what a very large portion of the Gold OA journal field actually looks like.

Which means I could, gasp, second-guess Simba. Sort of. For $0 rather than $2,500.


The numbers I’m about to provide are based on my own checking of some absurdly large number of supposed Gold OA journals, yielding 9,026 journals that actually published articles between January 1, 2011 and June 30, 2014. The following caveats (and maybe more) apply:

  • A few thousand Gold OA journals in DOAJ that did not have English as the first language code in the downloaded database aren’t here. Neither are some number that did have English as the first language code but did not, in fact, have enough English in the interface for me to check them properly.
  • So-called “hybrid” OA journals aren’t here. Period.
  • Journals that appeared to be conference proceedings were omitted, as were journals that require readers to register in order to read papers, journals that impose embargoes, journals that don’t appear to have scholarly research papers and a few similar categories.
  • Some number of journals aren’t included because I was unable or unwilling to jump through enough hoops to actually count the number of articles. (See the October/November and December issues for more details; including the additional DOAJ bio/biomed/medical set, it comes to about 560 journals in all, most of them in the Beall set.)
  • I used a variety of shortcuts for some of the article counts, as discussed in the earlier essays.
  • Maximum potential revenue numbers are based on the assumptions that (a) all counted articles are in the original-article category, (b) there were no waivers of any sort, (c) the APC stated in the summer of 2014 is the APC in use at all times.

All of which means: these numbers are approximate–the potential revenue figures more so than the article-count figures, I think, since quite a few fee-charging journals automatically reduce APCs for developing nations (as one example). Some of the caveats mean that I’m likely to be undercounting (the first four bullets), while the last bullet certainly means I’m overstating revenue. Do they balance out? Who knows?
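For what it’s worth, the maximum-potential-revenue arithmetic boils down to something like the following sketch. The journal names, article counts and APCs are made up purely for illustration (they aren’t from my spreadsheet); the only thing taken from the caveats above is the assumption that every counted article pays the full stated APC.

```python
# Hypothetical journals: (name, articles counted in 2013, stated APC in USD).
# None of these figures are real; they only illustrate the arithmetic.
journals = [
    ("Journal A", 120, 1350.0),
    ("Journal B", 45, 0.0),    # a journal that charges no fee
    ("Journal C", 800, 500.0),
]


def max_potential_revenue(rows):
    """Upper bound: every counted article pays the full stated APC, no waivers."""
    return sum(articles * apc for _name, articles, apc in rows)


total_articles = sum(articles for _name, articles, _apc in journals)
total_revenue = max_potential_revenue(journals)
print(f"{total_articles} articles, maximum potential revenue ${total_revenue:,.0f}")
```

Any waiver or developing-nation discount would pull the real figure below this ceiling, which is why I keep calling it a maximum.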

Second-guessing Simba

OK, here it goes:

Given all those caveats, I come up with the following for 2013:

  • Maximum revenue for Gold OA journals with no waivers: $249.9* million
  • Approximate number of articles published: 403* thousand

And, just for fun, here’s what I show for 2012:

  • Maximum revenue for Gold OA journals with no waivers: $200.2 million
  • Approximate number of articles published: 331 thousand

Here’s what’s remarkable: that maximum revenue of $249.9 million, which is almost certainly too high but which also leaves out “hybrid” journals and a bunch of others, is, well, all of 3.2% higher than Simba’s number.

Which I find astonishingly close, especially given the factors and number of players involved (and Simba’s presumed access to inside information, which I wholly lack).
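For anyone who wants to check my arithmetic, here’s a quick sketch. The variable names are mine; the figures are the ones quoted above from Esposito’s post and from my own count.

```python
# All amounts in millions of US dollars, for 2013.
simba_apc_revenue = 242.2   # Simba's APC figure, as quoted by Esposito
stm_market = 10_500.0       # Simba's total STM journals market ($10.5 billion)
my_max_revenue = 249.9      # my maximum potential revenue figure

# OA share of the total journals market, per Simba's numbers.
oa_share = simba_apc_revenue / stm_market * 100

# How much higher my maximum is than Simba's figure.
gap = (my_max_revenue - simba_apc_revenue) / simba_apc_revenue * 100

print(f"OA share of journals market: {oa_share:.1f}%")
print(f"My maximum vs. Simba: {gap:.1f}% higher")
```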

(The 22% of all published papers? Close enough…although it should be noted that 403 thousand includes humanities and social sciences.)

Incidentally, 33 journals account for the first $100 million of that 2013 figure, including one that’s in the social sciences if you consider psychology to be a social science. Not to take away too much from what will appear elsewhere eventually, but if you sort by three major lumps, you get this:

  • Science, technology, engineering and mathematics (excluding bio/biomed/medicine): $66.0 million maximum potential revenue in 2013 for 170 thousand articles; $54.3 million maximum in 2012 for 138 thousand articles. Around 3,500 journals.
  • Biology and medicine: $174.5 million maximum potential revenue in 2013 for 180 thousand articles; $139.0 million maximum in 2012 for 150 thousand articles. Around 3,100 journals.
  • Humanities and social sciences (including psychology): $9.4 million maximum potential revenue in 2013 for 55 thousand articles; $6.9 million maximum in 2012 for 45 thousand articles. Around 2,400 journals.

Those are very raw approximate numbers, but I’d guess the overall ratios are about right. The gold rush is in bio/biomed/medicine: is anybody surprised?
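To make the “gold rush” point a little more concrete, here’s a rough sketch that divides each lump’s maximum potential revenue by its article count. The article counts are the rounded thousands from the list above (they add up to 405 thousand rather than 403), so treat the per-article results as ballpark figures only.

```python
# 2013 figures from the list above:
# (maximum potential revenue in $ millions, articles in thousands)
lumps = {
    "STEM (excluding bio/med)": (66.0, 170),
    "Biology and medicine": (174.5, 180),
    "Humanities and social sciences": (9.4, 55),
}

total_revenue = sum(rev for rev, _articles in lumps.values())

# Maximum potential revenue per article, in dollars, for each lump.
per_article = {
    name: rev * 1_000_000 / (articles * 1_000)
    for name, (rev, articles) in lumps.items()
}

for name, (rev, _articles) in lumps.items():
    share = rev / total_revenue * 100
    print(f"{name}: about ${per_article[name]:,.0f} per article, "
          f"{share:.0f}% of maximum revenue")
```

The three revenue figures sum to the same $249.9 million given earlier, which is at least a sanity check on the breakdown.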

What’s coming

I probably shouldn’t post this at all, since it weakens the January 2015 Cites & Insights, but what the heck…

In any case, now that I’ve looked at the 1,200+ additional journals, I will, of course, discuss those numbers.

The third half (credit to the late great Tom Magliozzi) of the Journals and “Journals” deeper look will appear in part in the January 2015 Cites & Insights, out some time in December 2014 (Gaia willing and the creeks don’t rise).

That third half will be part of a multipart Intersections essay that also offers a few comments on the current DOAJ criteria (a handful of nits with a whole lot of praise) and considers the possibility that there’s a (dis)economy of scale in Gold OA publishing.

“In part”? Well, yes. I’ll do a discussion of the bio/med DOAJ subset that’s comparable to what I did for the other three sets of Gold OA journals, and I might include a few overall numbers. [See second postscript]

But there may be some more extended discussion of the overall numbers and how they break down (and maybe what they mean?), and that discussion might appear as a special section in the 2014 Cites & Insights Annual paperback, offering added value for the many (OK, maybe one so far) who purchase these paperbacks. It’s also possible that a complete retelling of this story will come out as a print on demand book, one that most definitely won’t be free, if I think there’s enough to add value. [See second postscript]

(Projections? I don’t do projections. I can say that, if the second half of 2014 equals the first half, there would be about 12% more Gold OA articles this year than last. I believe the Great OA Gold Rush of 2011-2013 is settling down…and that’s probably a good thing.)

Postscript, noon PST: I’ve enabled comments. I post so rarely these days that I’d forgotten that they’re now off by default.

Postscript, November 20, 2014:
After writing the abbreviated discussion (not that abbreviated: 14.5 C&I pages) and the full version, and letting it sit for a day or two, I’ve concluded that the full version doesn’t really add enough value for me to make a serious case that people should spend $45 for the paperback C&I Annual if they wouldn’t buy it otherwise. I think the Annuals are great and worth the money, but it’s pretty clear nobody else does.

So the full version–19 pages in the two-column format–will be the primary essay (or set of related essays) in the January 2015 volume, and the 2014 Annual will only add a wraparound cover and an index to the contents of the eleven 2014 issues. I’ve added strikeouts to the text above as appropriate.

As for a possible PoD book on Journals and “Journals”: still thinking about it.

*Additional postscript, December 27, 2014:

I’ve now gone through the rest of the DOAJ entries that offer English as one language possibility–another 2,200-odd, of which around 1,500 actually offered enough English for me to make sense of them. I’ve also gone through DOAJ itself for journals where I found it difficult to count articles directly (e.g., undated archives or archives consisting of whole-issue PDFs).

The bottom-line counts for articles and possible revenue for 2013 now come out to around 448,000 articles and around $261 million. Of that, around 366,000 and $231 million are from journals in DOAJ; Beall journals that aren’t in DOAJ–theoretically a larger number of journals, actually not–account for another 76,000 articles in 2013 (around 21% of DOAJ’s numbers) and around $22 million in potential revenue (around 9% of DOAJ numbers). The few hundred OASPA journals that aren’t in DOAJ account for fewer than 6,000 articles (less than 2% of DOAJ) and around $9 million (4% of DOAJ).
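A quick sketch of how those pieces fit together, using only the rounded figures in the paragraph above (the names are mine, and “fewer than 6,000” is treated as 6,000). Because the inputs are rounded, the results land close to the quoted totals and percentages rather than exactly on them.

```python
# 2013 figures: (articles, potential revenue in $ millions), rounded as quoted above.
sources = {
    "DOAJ": (366_000, 231.0),
    "Beall, not in DOAJ": (76_000, 22.0),
    "OASPA, not in DOAJ": (6_000, 9.0),   # "fewer than 6,000": treated as an upper bound
}

total_articles = sum(articles for articles, _rev in sources.values())
total_revenue = sum(rev for _articles, rev in sources.values())

doaj_articles, doaj_revenue = sources["DOAJ"]
for name, (articles, rev) in sources.items():
    if name == "DOAJ":
        continue
    print(f"{name}: {articles / doaj_articles * 100:.1f}% of DOAJ articles, "
          f"{rev / doaj_revenue * 100:.1f}% of DOAJ revenue")

print(f"Total: {total_articles:,} articles, about ${total_revenue:.0f} million")
```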

Some additional figures may appear in the March 2015 Cites & Insights; a coherent writeup of the whole OA journal scene (2011 through the first half of 2014)–or at least the very large portion of it I could investigate, essentially everything except 2,000-odd DOAJ journals that do not provide any form of English access–will appear next summer. More details later.
