Changes for the Better?

Do you have suggestions that will help make Gold Open Access Journals 2011-2015 even better than The Gold OA Landscape 2011-2014?

If so, now’s the time to suggest them—any time between now and May 1, 2016 (the earliest date I’m likely to start working on data analysis and the book manuscript). Suggestions should go to me at

You say you haven’t purchased the book yet, either in paperback or PDF ebook form? You still can, and it will still be worthwhile when the new book comes out.

Alternatively, you can get a good idea of the general approach and the tables used from the excerpt published as the October 2015 Cites & Insights, although that version lacks any graphs.

I’ve appended pages 39 through 73 of The Gold OA Landscape 2011-2014 to the end of the next Cites & Insights, probably out in late March 2016. That segment includes almost all varieties of tables and graphs used in the book. The online version is an exact replica of the print book; the print (two-column) version is just slightly smaller, so that four pages of the 6×9″ book fit on each 8.5×11″ sheet rather than having loads of waste space.

The Basics

Basically, the data used for analysis includes, for each journal: the starting year reported to DOAJ (which is not always the start of publication); the country of publication (again as reported to DOAJ); one of 28 subjects, plus one of three broad areas, that I’ve derived from the subjects, keywords and journal/article titles; and the data I went looking for. That last group covers whether there’s an author-side fee (usually called an APC or Article Processing Charge, although fees aren’t always that straightforward), how much it is, and the number of published articles (and similar items) for each year 2011 through 2015. There’s also a two-letter code (or “grade and subgrade”) for special cases, but most journals don’t have special codes. I also derive some measures: the peak article number during the five years and, if there are APCs, the maximum revenue for 2014 (2015 this time around).
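For readers who think in terms of records, here is a minimal Python sketch of one row of that data and the two derived measures. The field names are hypothetical (they are not the actual spreadsheet column names), and the revenue formula is an assumption: the fee multiplied by the article count for the chosen year, which is an upper bound because it ignores waivers and discounts.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class JournalRecord:
    # Hypothetical field names, chosen only for illustration.
    title: str
    start_year: int            # year reported to DOAJ
    country: str               # country of publication, as reported to DOAJ
    subject: str               # one of the 28 subjects
    area: str                  # one of the three broad areas
    apc: Optional[float]       # author-side fee; None means no fee
    articles: Dict[int, int]   # year -> number of articles
    code: str = ""             # two-letter special-case code, usually empty

    def peak_articles(self) -> int:
        """Largest annual article count across the recorded years."""
        return max(self.articles.values(), default=0)

    def max_revenue(self, year: int) -> float:
        """Assumed formula: fee times articles published in that year."""
        if self.apc is None:
            return 0.0
        return self.apc * self.articles.get(year, 0)


example = JournalRecord(
    title="Example Journal",
    start_year=2009,
    country="US",
    subject="Medicine",
    area="biomed",
    apc=500.0,
    articles={2011: 80, 2012: 95, 2013: 140, 2014: 110, 2015: 120},
)
print(example.peak_articles(), example.max_revenue(2015))
```

Storing a missing fee as None rather than 0 keeps "no fee" distinct from "fee not yet determined" if that distinction ever matters; that is a design choice in this sketch, not something taken from the book.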

Last year, after an overall discussion of maximum revenues, overall article counts, and special cases, I looked at journals by annual article volume for each of the three major areas (which have very different characteristics), fee and revenue levels, starting dates for free and APC-charging journals, and a number of measures by country of publication. I also provided one set of pie charts breaking down free and pay journals by major area.

For each of the three major areas (biomed, STEM, and humanities and social sciences) I looked at cost per article by year, journal and article volume by year (and free percentage of each), revenue brackets for journals, article volume brackets, and APC level brackets. A bar graph showed free and pay articles for each year.

For each subject within an area—using the revenue and article volume brackets appropriate for that area—I showed journals and articles for each year (and free percentage), the free/pay article bar graph, journals by article volume (and percent free), journals and articles by APC range, a line graph showing free and pay journals by starting date, and a table showing the countries with the most published 2014 articles for that subject.

At the end of the book, I provided a few subject summaries—percentage of free journals, percentage of articles in no-fee journals, change in article volume, change in free article volume, journals changing article volume by 10% or more from 2013 to 2014, average APC per paid article and for all articles, median APC per paid article and all articles, and the median, first quartile, and third quartile articles per journal for 2014.
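Several of those summary figures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used for the book: each journal is represented as a hypothetical (fee, articles-by-year) pair with a fee of 0 meaning no fee, "average APC" is taken to be article-weighted, and quartiles come from the standard library's inclusive method, which may differ slightly from a spreadsheet's quartile function.

```python
from statistics import median, quantiles


def subject_summary(journals, year):
    """Summarize one subject for one year.

    journals: list of (apc, articles_by_year) pairs; apc == 0 means no fee.
    Needs at least two journals, because quantiles() requires two data points.
    """
    counts = [arts.get(year, 0) for _, arts in journals]
    free_counts = [arts.get(year, 0) for apc, arts in journals if apc == 0]
    # Assumed: potential revenue is fee times article count, summed over journals.
    revenue = sum(apc * arts.get(year, 0) for apc, arts in journals)
    total = sum(counts)
    paid = total - sum(free_counts)
    q1, _, q3 = quantiles(counts, n=4, method="inclusive")
    return {
        "pct_free_journals": 100 * len(free_counts) / len(journals),
        "pct_free_articles": 100 * sum(free_counts) / total if total else 0.0,
        "avg_apc_paid_articles": revenue / paid if paid else 0.0,
        "avg_apc_all_articles": revenue / total if total else 0.0,
        "median_articles": median(counts),
        "q1_articles": q1,
        "q3_articles": q3,
    }


sample = [
    (0, {2014: 10}),
    (0, {2014: 20}),
    (100, {2014: 30}),
    (200, {2014: 40}),
]
print(subject_summary(sample, 2014))
```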

Data Changes for 2015

There’s another year of data—more journals and more data for existing journals. I’m taking some pains to include more journals (and defining “articles” somewhat more inclusively and, I believe, consistently).

Beyond that, there may be one new category of derived data: a publisher category—breaking journals down into what seem to be five reasonable groups based on what’s in the DOAJ publisher field:

  • Academic, published by universities and colleges, including university presses.
  • Society, published by societies and associations.
  • Traditional*, published by publishers that also publish subscription journals.
  • OA publisher*, published by groups that don’t appear to publish subscription journals (and that publish at least a handful of journals; see the notes on the “*” below).
  • Miscellany, everybody else.

About the asterisk on Traditional and OA publisher: there are 5,983 different “publisher names” (that is, distinct character strings in the DOAJ publisher field). That’s more than one “publisher” for every two journals. The vast majority of those, all but 919, publish a single DOAJ-listed journal.

I think it’s reasonable to limit the two “publisher” categories (Traditional and OA) to firms that publish at least a handful of journals, and lump the others in as Miscellany. (If nothing else, it makes this added data feasible.)

What’s a handful? If the cutoff is “five or more,” it involves only 221 publishers in all, accounting for 4,128 journals. If the cutoff is “four or more,” it involves 316 publishers and, naturally, adds 380 journals for a total of 4,508. Dropping it to “three or more journals” brings us up to 486 publishers and 5,018 journals. I suspect the final cutoff will be either four or five.
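The cutoff counts can be reproduced from the publisher field alone. The following Python sketch assumes the input is simply a list with one publisher string per journal; the function name is hypothetical, and the sketch treats each distinct character string as its own publisher, as the counts above do.

```python
from collections import Counter


def cutoff_totals(publisher_field_values, cutoff):
    """Return (publishers, journals) for publishers with at least `cutoff` journals."""
    per_publisher = Counter(publisher_field_values)
    qualifying = [n for n in per_publisher.values() if n >= cutoff]
    return len(qualifying), sum(qualifying)


# Tiny made-up sample: one publisher string per journal.
sample = ["A"] * 5 + ["B"] * 4 + ["C"] * 3 + ["D"]
for cutoff in (5, 4, 3):
    print(cutoff, cutoff_totals(sample, cutoff))
```

The figures above are internally consistent: lowering the cutoff from five to four adds only publishers with exactly four journals, and (316 - 221) × 4 = 380; lowering it from four to three adds (486 - 316) × 3 = 510 journals, and 4,508 + 510 = 5,018.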

Incidentally, if I add that column, it will be in the anonymized spreadsheet made publicly available at the end of this project. Other than the list of journal titles apparently containing malware, it will be possible for anybody else to replicate any or all of the graphs and numbers in the book.

Probable Changes

I believe it will make sense to devote a chapter to publisher categories: whether there are major differences in article volume, APCs (existence and amount) and, possibly, dominance in some countries.

I’m fairly certain the pie charts will go away: I don’t believe they add enough information to justify the space. I could be convinced otherwise. (Note that the print paperback will, of necessity, be black and white to keep production costs down, so really attractive pie charts aren’t feasible.)

Possible Changes

What else should I consider? Which existing tables and graphs don’t seem especially valuable—and what would work better? (Assume that this year’s book can be larger than last, but not enormously larger.)

I’m open to suggestions, which I’ll discuss with my contacts at SPARC (and I anticipate suggestions from SPARC as well).

I would offer a free PDF version of this year’s book as a reward for good suggestions—but since this year’s PDF version will be free in any case, that’s

