Ideas for Gold Open Access Journals 2011-2015: Second Call

April 21st, 2016

If you have opinions on what was great or not so great in The Gold OA Landscape 2011-2014, or ideas on how the book-length analysis and presentation could be better for the new, much more complete Gold Open Access Journals 2011-2015, I’d like to hear from you–ideally before May 1, 2016 (I’ll start working on the book right around then). (Note that the PDF ebook version will be free and freely available with a CC BY license; the paperback will be priced at roughly production cost.)

Which tables and graphs seem especially worthwhile? Which writeups were more or less informative?

Since most of you haven’t seen the full book, there are two resources to base your feedback on:

  • The October 2015 Cites & Insights includes about half the text and around half the tables from the book, but none of the graphs.
  • The April 2016 Cites & Insights includes an introductory essay but mostly consists of pages 39 through 73 of the book, chapters 5 through 9, showing exactly what’s in the book.

(Note: you can reasonably ignore the “Why Anonymize?” section of the introductory essay in that issue: in consultation with SPARC, I’ve decided to make the non-anonymized spreadsheet openly available when the analysis is complete. One very minor consequence of non-anonymity: seven small journals that I’d flagged as questionable for judgmental reasons are no longer flagged. That doesn’t affect the analysis at even 0.1% levels.)

Both links are to the 6×9″ “online” versions, which better reflect the book pages.

You can comment directly on this post (for a week or two) or, better yet, send me email. I don’t promise to use your suggestions; I do promise to think about them seriously.

(I’ll be asking for feedback on one very new and fairly distinctive aspect of the 2011-2015 survey, which arose from a decision to look at countries by region–but I’ll have more to say about that next week, I think, in a blog post and as part of a short Cites & Insights.)


Minor updates:

  • If you’re following my recovery from surgery (excision of a Schwannoma, a benign nerve sheath tumor): No, I’m not back to full touch typing; have begun hand therapy and ordered Dragon NaturallySpeaking. Posting and C&I still much reduced and the textual portions of the book may be more concise than otherwise–which could be a good thing.
  • I’ve completed the second data-gathering pass for the 2011-2015 project. The number of fully-analyzed “good” journals is up from 9,512 to 10,324, and the rough estimate of total articles from those journals for 2015 is around 566,000.
  • Yes, there will be a Cites & Insights soon, probably before May 1; no, it probably won’t be very long, given the difficulties of six-fingered typing…

Warriors Classic 50 Movies, Disc 1

April 5th, 2016

Fifty movies about an Oakland basketball team: who woulda thunk it? OK, so they’re really “sword and sandals” movies—all those Hercules, Son of Hercules, Colossus, Ursus and similar pictures, strong on Legendary Heroes, usually strong on magic and gods/goddesses, with lots of wholly innocent beefcake and (usually) cheesecake, usually some humor along with lots of fighting, loads of scenery, surprisingly good production values and plots that don’t always make much sense. Oh, and really bad dubbing, except sometimes for the one or two American actors. These are fun movies, mostly Italian, and I grade them within their own realm: a really great sword-and-sandals flick might not be a classic in traditional Hollywood terms. It’s a thirteen-disc set (there aren’t many hour-long sword-and-sandals flicks); Part 1 covers discs 1-6.

Hercules and the Masked Rider (orig. Golia e il cavaliere mascherato), 1963, color. Piero Pierotti (dir.), Alan Steel (that is, Sergio Ciani), Mimmo Palmara, José Greci, Pilar Cansino, Arturo Dominici. 1:26 [1:23]

Who knew that Hercules (“Alan Steel”) was not only a demigod but a time traveler? In this flick (clearly shot in widescreen and panned-and-scanned, more’s the pity), he’s jumped from the second century BC to the 16th century CE, since there are at least two handguns along with the many swords—and he’s somehow riding with a band of gypsies in Spain. (According to the source of all knowledge, this character was Goliath in the Italian original, but that still involves time travel, albeit only 16 rather than 18+ centuries—and Goliath wasn’t an immortal demigod. Hey, it’s sword-and-sandals magic!)

This means that—other than Hercules, who seems allergic to shirts, and a few of the evil Don’s soldiers who wind up naked after being humiliated by the gypsies and Hercules—everybody’s fully clothed, from head to toe. (Even Hercules has a shirt on for maybe three minutes total.) It also means that there are no gods & goddesses, no magic (although the Evil Don would happily burn the head gypsy as a witch), just lots of plot.

Plot. Hard to say whether it’s ever worth describing the plot in these spectaculars, but here it’s two Dons with their lands on either side of a river—and the Don on one side is pure evil, just loving to hunt down innocent peasants trying to escape from forced labor and really loving the occasional torture opportunity. The other Don is aging, has a beautiful daughter, and is unwilling to risk war with the evil Don—to the extent that he’s willing to marry his daughter off to the evil Don in the thought that this might prevent war. Foolish (and soon dead) man! Meanwhile, the aged Don’s nephew, the actual love of the daughter (well, why not? they’re first cousins, but it’s 16th century Spain), has returned from battle (after meeting up with the gypsies, fighting Hercules to a draw in a one-hour contest that earns him not only his life but the welcome of the gypsies), and thinks this is all a terrible idea. He becomes the Masked Rider and…

Lots’o’plot ensues, and of course things all work out in the end. (Hercules isn’t really the primary character, but he’s there now and then. Some reviewers compared the real protagonist, the cousin, to Zorro: that’s not too far off.) And, you know, even though the premise is even more bizarre than usual, it’s fun. Good score, pretty good print. I’ll give it $1.50.

Spartacus and the Ten Gladiators (orig. Gli invincibili dieci gladiatori), 1964, color. Nick Nostro (dir.), Dan Vadis, Helga Line, Ivano Staccioli/John Heston, Alfredo Varelli/John Warrell, Ursula Davis, Giuliano Dell’Ovo/Julian Dower. 1:39

What this movie has in common with the previous one: in both cases, the titular character is not the major protagonist—Spartacus is there for maybe a third of the picture, and the biggest of the ten gladiators (who in this case aren’t slaves but entertainer/warriors) is the protagonist (and, in the end, rides away with The Girl).

Otherwise: set in Roman times, with the Ten Gladiators blackballed by the primary entrepreneur (because the big one almost spears a Roman senator instead of killing the winner of a 12-person to-the-death battle who refused to kill his father, one of the others), saving a senator’s daughter from Bad Thieves and being recruited by the senator to find and kill (they prefer capture) Spartacus, who is supposedly thieving. They find and meet Spartacus (involving an apparently hours-long battle between the big guy and Spartacus, ending with both of them collapsed and laughing) and join his cause—which is, mostly, to take his group back to Thrace and freedom.

The gladiators say they’ll go back and try to sell that to the senator (with the promise that he’ll be sent ransom money for the group later)…who says “sure, why not?” and drugs them over dinner, putting them in the dungeon.

There’s more plot—and, other than the sheer stupidity of the gladiators and the apparent deal that knocking an enemy out means he’s out of the action forever, it’s not as implausible as you might expect—ending with a reasonably satisfactory conclusion. The overall lesson: if the venal, vicious Senator Varro had let a hundred or so slaves escape, he would have avoided destroying a major part of the Roman army—and dying in the process. But, you know, power demands respect, especially wholly corrupt power.

Lots of fights, of course, with swords but the good guys prefer punching the other guys out; very little blood shown; some humor; the gladiators almost never wear anything above the waist or more than a foot or so below, if that matters; and the kind of production values (thousands of extras, huge battle scenes) you expect from these movies. I was particularly taken with one plot point: the gladiators, trying to figure out how to free the slaves held in a compound that combines mining with aqueduct-building, capture a blacksmith and convert him to the cause by noting that, if they free the slaves, there will be thousands of chains and handcuffs that he can melt down and make into shields and the like. He winds up being one of the foremost warriors in the grand battle.

Excellent print, great production values, but a narrow view of a wide-screen movie. Still, another $1.50.

The Conqueror of the Orient (orig. Il conquistatore dell’Oriente), 1960, color. Tanio Boccia (dir.), Rik Battaglia, Irene Tunc, Paul Muller. 1:26 [1:14]

The story of Dakar, an Evil Usurper who’s murdered the king (or sultan) and seized the throne, with an army that seems to go around burning villages for fun (which makes it difficult to provide the required tributes), and along the way found a beautiful young woman, Fatima, whom Dakar would make the first of his many wives. We’re also introduced to a young fisherman, Nadir (trawling in the river), and his elder. A bit later, Fatima escapes and is next found floating in a little boat about to hit rapids—and, of course, Nadir rescues her. (Perhaps the name “Nadir” is a clue as to the quality of this flick.)

One thing leads to another, Fatima is recaptured, the fisherman vows vengeance, and of course we learn that he’s the legitimate heir to the throne—and after lots of talk, more talk, some really bad scimitar-fights, and the like, he slays the usurper and brings eternal peace to his kingdom.

Pretty bad. The English-language scriptwriter appears to have had English as a third language (at one point, having been captured, our hero is left behind bars “until thirst and famine shall end his life.” Famine? Really?) The production values are at best OK, and the plot makes little sense. Maybe the missing 12 minutes would help; probably not. Charitably, $0.75.

The Last of the Vikings, 1961, color. Giacomo Gentilomo (dir.), Cameron Mitchell, Edmund Purdom, Isabelle Corey. 1:43.

“Prince Harald needs more wood!” That cry as hundreds of trees are being felled by wholly inept axe-wielders is probably the best dialogue in this mess. We also learn that Vikings fight by waving axes around a lot, that axes defeat bows and arrows even at long range, that some kings are hand-rubbing gibbering incarnations while princes just laugh a lot…and that perfidy runs deep in Norway.

As to the plot and acting and scenery…well, this was the first old flick I’d watched in almost three months (the DOAJ project was more fun); I was watching it the day after surgery; I was on low-dose opioids…without all of which I might not have made it all the way through. Maybe, charitably, $0.75.

Recovery: a short, slow post

March 31st, 2016

Since I’ve left notes elsewhere saying I’m mostly offline for the next [1:n] days [where n is indeterminate], I thought a little more detail might be in order:

  • The surgery: removing a Schwannoma (a benign nerve sheath tumor) from my right forearm–a visible bump perhaps 1.2″ long and 1.3″(?) high, determined to be benign by a January needle biopsy–which also irritated the lump and caused it to grow.
  • When? Tuesday, March 29, around 3:30 pm, Stanford Hospital, Dr. David G. Mohler (who did a great job).
  • Pain? Not bad: of the allowed 2 pills every 6 hours, I needed 1 pill Tuesday afternoon, 1 at bedtime, 1 Wednesday a.m. (10 hrs later) and, since then, 1/2 pill every eight hours. Good chance I’ll stop altogether tomorrow. (OTOH, my metabolism appears to be tough on drugs: the whole-arm nerve block, intended to last 8-12 hours, lasted about 3.5 hours. General anesthesia not wanted or needed.)
  • Problems? Maybe just reality: after trauma to the tendons and muscles and nerves in the arm, my fingers aren’t back to normal. (But gripping, etc. is pretty much OK.)

So I mostly need to let my right arm rest until the swelling goes down. I’ve seen how hard it is to work online without instinctively using both hands. So I’m mostly staying off. Two fingers are starting to come back to semi-normal; the rest could take a day, or three, or a week.

Otherwise? There’s leeway enough in The Big Project; I’m feeling good enough that I went for the daily walk around the 1.3-mile block with my wife today.

Thanks for the expressions of concern.

The Great Paskins Mystery

March 26th, 2016

And now for something completely different (and this post won’t be publicized–it’s for people who subscribe or otherwise come here on their own, what others call “both of my readers”).

Who is Paskins?

More to the point, why do hundreds of spammers think that my name is Paskins–even though, to the best of my knowledge, “” doesn’t really sound much like Paskins and I’m the only one with this email address?

It’s a curiosity. Fortunately, Gmail traps almost all of it as spam. But it’s a, well, curious curiosity: why would all these people be trying to contact Paskins at my email address?

Life is full of curiosities, I guess.

Making the case (a follow-up post)

March 26th, 2016

A while back, I wrote a post explaining why the dataset for Gold Open Access Journals 2011-2015 will not include journal names and publishers, and invited people to send me email explaining possible positive use cases if that decision was changed.

I’ve received one such email so far, resulting in an exchange of email; I’ve saved it for later consideration.

Meanwhile, a tweetstorm has erupted that seems to say that my work is useless if I don’t provide the full data. Apparently the other post is too long to read (or didn’t get read), so here’s a slightly different and shorter version–but you still need to read the other post before you respond.

  • If somebody attempted to replicate the research starting in, say, July 2016, the results will be different for some significant number of journals, for several reasons (some of them having to do with what gets counted, some of them having to do with delays in posting, some because journals that yield 404s in March may not in July or vice-versa).
  • Somebody out to snipe or discredit will also look at individual journals and disagree with my choice of which of 28 broad subjects to assign it to; in quite a few cases, more than one choice is reasonable.
  • I’m very interested in use cases–cases where useful additional research would be possible based on a non-anonymized spreadsheet. (In some such cases, the dataset will be made available to the group or person–I’ve already done that for the previous dataset.) If there are convincing cases, I’d talk to SPARC about whether it makes sense to open up the data completely. And hope that I don’t spend the rest of the year dealing with a stream of “But THIS NUMBER’S WRONG, so your whole study’s worthless” or “But THIS JOURNAL’S REALLY ABOUT X, so your whole study’s worthless” or variants of that.
  • Email calmly suggesting positive use cases will be dealt with politely and taken into account. Head-on attacks 140 characters at a time are, shall we say, less likely to persuade me. (Well, they might persuade me never to get involved in this kind of project again, so if that’s your motive…)
  • Oh, and by the way: This isn’t about hiding methodology. I’ve never done so, and don’t plan to start now.

I’ll be off the air entirely for several days beginning the evening of March 28, so email may not receive quick responses at that point. Meanwhile, I’d like to get back to getting something done.

In partial defense of Jeffrey Beall

March 25th, 2016

Not in defense of his lists, which I regard as a bad idea in theory and fatally flawed in practice, for reasons I’ve documented (most recently here but elsewhere over time).

But…I’ve seen some stuff on another blog lately that bothers me.

  • I do not for a minute believe that Jeffrey Beall wrote the supposed email I’ve seen that suggests a listed publisher would be re-evaluated for $5,000. That email was written using English-as-a-third-language grammar; it’s just not plausible as coming from Beall.
  • I truly dislike the notion that a doctorate is the minimum qualification for scholarship. But then, I would, wouldn’t I (since my pinnacle of academic achievement is a BA and a handful of credits toward an MA).
  • I also dislike the notion that state colleges are somehow disreputable. My own degree comes from a state institution, and I’ll match its credentials with anybody.

The same blog had an interesting fisking of one of Beall’s sillier anti-OA papers. I had tagged it toward a future Cites & Insights essay on access and ethics. But after seeing this other stuff…I won’t link to or source from this particular blog.  Heck, I’ve been the subject of Beall’s ad hominem attacks; doesn’t mean I have to support that sort of thing.

Cites & Insights 16:3 (April 2016) available

March 23rd, 2016

The April 2016 Cites & Insights (16:3) is now available for downloading at

That print-oriented version is 30 pages long. If you’re planning to read online or on an ereader, you may prefer the single-column 6″ x 9″ version, 59 pages long, available at

While much of this issue has appeared as a series of posts in this blog, the final section of the lead essay is new, as is the fourth essay; the final section reprints 35 pages of The Gold OA Landscape 2011-2014 to serve as context for a portion of the first essay.

This issue includes:

The Front: Gold Open Access Journals 2011-2015: A SPARC Project pp. 1-8

Remember the “watch this space” note in the February-March “The Front”? This is what it was about. This essay includes the key announcement, a partial list of changes from the 2011-2014 project, a partial checkpoint prepared when I was halfway through the first pass, a section asking for possible “changes for the better” in the analysis and writeup (note that this year’s PDF ebook will be free and OA, since it’s a SPARC-sponsored project), another section discussing the planned anonymization of the (free) spreadsheet when analysis is done–and, new to this issue, a second checkpoint prepared at the end of the first journal pass.

The Front (also): Readership Notes  pp. 8-9

Notes on the most frequently downloaded issues in Volume 15 and the most frequently downloaded issues overall.

Intersections: “Trust Me”: The Other Problem with Beall’s Lists  pp. 9-11

As far as I can tell, Jeffrey Beall provides no evidence whatsoever–not even his classic “this publisher has a funny name”–for seven out of eight journals and publishers on his 2016 lists. This piece, which has a little additional material beyond the original post, goes into some detail.

The Back  pp. 11-12

Not precisely filler to get an even number of pages, but…OK, so these three mini-rants are mostly filler to get an even number of pages.

The Gold OA Landscape 2011-2014, pp. 39-73   following page 12

I’m including chapters 5 (starting dates), 6 (country of publication), 7 (segments and subjects), 8 (biology and medicine) and 9 (biology) to provide more context for my invitation to suggest better ways to analyze and present the 2011-2015 data. Please note that these pages appear precisely as they would in the PDF ebook if you’re looking at the online 6″ x 9″ version (since the book’s 6″x9″), but are reduced very slightly for the print-oriented version (to 5.5″x8.5″) so that two book pages will fit on one printed page.

Next issue?

I did not label this the April-May 2016 issue. Whether there’s a May issue in late April or early May, or a May-June issue later in May, depends on a number of factors having mostly to do with Gold Open Access Journals 2011-2015.

Why Anonymize?

March 14th, 2016

The project plan for Gold Open Access Journals 2011-2015 calls for me to make an anonymized version of the master spreadsheet freely available—and as soon as the project was approved, I made an anonymized version of the 2014 spreadsheet available.

Two people raised the question “Why anonymized?”—why don’t I just post the spreadsheet including all data, instead of removing journal names, publishers and URLs and adding a simple numeric key to make rows unique?

The short answer is that doing so would shift the focus of the project from patterns and the overall state of gold OA to specifics, and lead to arguments as to whether the data was any good.

Maybe that’s all the answer that’s needed. Although I counted very little use of the 2014 spreadsheet in January and February 2016, it’s been used more than 900 times in the first half of March 2016—but I have received no more queries as to why it’s anonymized. For any analysis of patterns, of course, journal names don’t matter. But maybe a slightly longer answer is useful.

That longer answer begins with the likelihood that some folks would try to undermine the report’s findings by claiming that the data is full of errors—and the certainty that such folks could find “errors” in the data.

Am I being paranoid in suggesting that this would happen? Thanks to Kent Anderson, I can safely say that I’m not, since within a day or two of my posting the spreadsheet, he tweeted this:

Anderson didn’t say “Am I misunderstanding?” or “Clarification needed” or any alternative suggesting that more information was needed. No: he went directly on the attack with “Errors exist” (by completely misreading the dataset, as it happens: around 500 gold OA journals began publication, usually not as OA, between 1853 and 1994).

It’s not wrong, it’s just different

To paraphrase Ed and Patsy Bruce (they wrote the song, even though Willie Nelson and Waylon Jennings had the big hit with it)…

If somebody else—especially someone looking to “invalidate” this research—goes back to do new counts on some number of journals, they will probably get different numbers in a fair number of cases.

Why? Several reasons:

  • Inclusiveness: Which items in journals—and which journals—do you include? The 2014 count tended to be more exclusive when I had to count each article individually; the 2015 count tends to include all items subject to some form of review, including book reviews and case reports. Similarly, the 2015 report includes journals that consist of (reviewed) conference reports (although I’ll note the subset of such journals).
  • Shortcuts: I did not in fact look at each and every item in each and every issue of each and every journal, compare it to that journal’s own criteria for reviewed or peer-reviewed, and determine whether to include it. To do that, I’d estimate that a single year’s count would require at least 2,000 hours exclusive of determining APC existence and levels and all other overhead—and, of course, a five-year study would require four times that amount (fewer journals and articles in earlier years). That’s not plausible under any circumstances. Instead, I used every shortcut that I could: publication-date indexes or equivalent for SciELO, J-Stage, MDPI, Dove and several others; DOI numbers when it’s clear they’re assigned sequentially; numbered tables of contents; Find (Ctrl-F) counts for distinctive strings (e.g., “doi:” or “HTML”) after quick scans of the contents tables. For the latter, I did make rough adjustments for clear editorials and other overhead.
  • Estimates: In some cases—fewer in 2015 than in 2014, but still some—I had to estimate, as for instance when a journal with no other way of counting publishes hundreds of articles each year and maintains page numbering throughout a dozen issues. I might count the articles in one or two issues, determine an average article length, and estimate the year’s total count based on that length. I also used counts from DOAJ in many cases, when those counts were plausible based on manual sampling.
  • Errors: I’m certain that my counts are off by one or two in some cases; that happens.
  • Late additions: Some journals, especially those that are issue-oriented and still include print versions, post online articles very late. Even though I’m retesting all cases where the “final issue” of 2015 seemed to be missing when checked in January-March 2016, it’s nearly certain that somebody looking at some journals in, say, August 2016 will find more 2015 articles than I did.
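The page-based estimate described under “Estimates” can be sketched roughly as follows. This is only an illustrative Python sketch, not the procedure or tooling actually used for the project: the function name, its arguments and the sample numbers are all hypothetical, and it assumes the year’s total page count can be read from the continuous page numbering.

```python
def estimate_annual_articles(sampled_issues, total_pages_for_year):
    """Estimate a year's article count from a sample of issues.

    sampled_issues: list of (pages_in_issue, articles_in_issue) tuples,
        one tuple per issue that was counted by hand (hypothetical input).
    total_pages_for_year: last page number of the year's continuous
        pagination (an assumption about how the total is obtained).
    """
    sampled_pages = sum(pages for pages, _ in sampled_issues)
    sampled_articles = sum(articles for _, articles in sampled_issues)
    if sampled_pages <= 0 or sampled_articles <= 0:
        raise ValueError("need at least one sampled issue with pages and articles")

    # Average article length in pages, based on the hand-counted sample.
    avg_pages_per_article = sampled_pages / sampled_articles

    # Scale up to the whole year and round to a whole number of articles.
    return round(total_pages_for_year / avg_pages_per_article)


# Two hand-counted issues, then a year that runs to 1,536 pages.
estimate = estimate_annual_articles([(120, 15), (136, 17)], 1536)
print(estimate)
```

An estimate like this is only as good as the sampled issues are typical, which is one reason two careful counts of the same journal can differ.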

In practice, I doubt that any two counts of a thousand or more OA journals will yield precisely the same totals. I’d guess that I’m very slightly overcounting articles in some journals that provide convenient annual totals—and undercounting articles in some journals that don’t.

For the analysis I’m doing, and for any analysis others are likely to do, these “errors” shouldn’t matter. If somebody claimed that overall numbers were 5% lower or 5% higher, my response would be that this is quite possible. I doubt that the differences in counts would be greater than that, at least for any aggregated data.

Making the case

If you believe I’m wrong—that there are real, serious, worthwhile research cases where only the unanonymized version will do—let me know.

Obviously, anonymized datasets aren’t unusual; I don’t know of any open science advocate who would seriously argue that medical data should be posted with patient names or that libraries should keep enough data to be able to do analysis such as “people who borrowed X also borrowed Y.” In practice, there may be special use cases for an open copy of the master spreadsheet. On the other hand, except for the list of journals flagged as having malware on their sites, I’ll be doing my analysis with the anonymized spreadsheet—it’s what’s needed for this work, and won’t distract me with individual journal titles and how I might feel about their publishers.

Changes for the Better?

March 11th, 2016

Do you have suggestions that will help make Gold Open Access Journals 2011-2015 even better than The Gold OA Landscape 2011-2014?

If so, now’s the time to suggest them—any time between now and May 1, 2016 (the earliest date I’m likely to start working on data analysis and the book manuscript). Suggestions should go to me at

You say you haven’t purchased the book yet, either in paperback or PDF ebook form? You still can, and it will still be worthwhile when the new book comes out.

Alternatively, you can get a good idea of the general approach and tables used in the excerpt published as the October 2015 Cites & Insights, although that version lacks any graphs.

I’ve appended pages 39 through 73 of The Gold OA Landscape 2011-2014 to the end of the next Cites & Insights, probably out in late March 2016. That segment includes almost all varieties of tables and graphs used in the book. The online version is an exact replica of the print book; the print (two-column) version is just slightly smaller, so that four pages of the 6×9″ book fit on each 8.5×11″ sheet rather than having loads of waste space.

The Basics

Basically, the data used for analysis includes for each journal the year reported to DOAJ (which is not always the start of publication), the country of publication (again as reported to DOAJ), one of 28 subjects and three broad areas that I’ve derived from the subjects, keywords and journal/article titles for the journals, and the data I went looking for: whether there’s an author-side fee (usually called an APC or Article Processing Charge but they’re not all that straightforward) and how much it is, and the number of published articles (and similar items) for each year 2011 through 2015. There’s also a two-letter code (or “grade and subgrade”) for special cases, but most journals don’t have special codes. I also derive some measures: the peak article number during the five years and, if there are APCs, the maximum revenue for 2014 (2015 this time around).
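For readers who think in code, the two derived measures might be computed along these lines. This is a minimal Python sketch under stated assumptions: it assumes “maximum revenue” means the APC multiplied by the article count for the revenue year, and the function name, argument names and sample figures are hypothetical rather than taken from the actual spreadsheet.

```python
def derived_measures(articles_by_year, apc=0, revenue_year=2015):
    """Return (peak_articles, max_revenue) for one journal.

    articles_by_year: dict mapping year -> article count, 2011 through 2015.
    apc: author-side fee in dollars; 0 for a no-fee journal.
    revenue_year: 2015 for the new study (2014 in the previous one).
    """
    # Peak article number during the five years.
    peak_articles = max(articles_by_year.values())

    # ASSUMPTION: maximum revenue = fee x articles in the revenue year,
    # i.e. what the journal would take in if every article paid full price.
    max_revenue = apc * articles_by_year.get(revenue_year, 0)
    return peak_articles, max_revenue


counts = {2011: 40, 2012: 55, 2013: 62, 2014: 58, 2015: 50}
print(derived_measures(counts, apc=1200))
```

The figure is a maximum because waivers and discounts would only reduce what a journal actually collects.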

Last year, after an overall discussion of maximum revenues, overall article counts, and special cases, I looked at journals by annual article volume for each of the three major areas (which have very different characteristics), fee and revenue levels, starting dates for free and APC-charging journals, and a number of measures by country of publication. I also provided one set of pie charts breaking down free and pay journals by major area.

For each of the three major areas (biomed, STEM, and humanities and social sciences) I looked at cost per article by year, journal and article volume by year (and free percentage of each), revenue brackets for journals, article volume brackets, and APC level brackets. A bar graph showed free and pay articles for each year.

For each subject within an area—using the revenue and article volume brackets appropriate for that area—I showed journals and articles for each year (and free percentage), the free/pay article bar graph, journals by article volume (and percent free), journals and articles by APC range, a line graph showing free and pay journals by starting date, and a table showing the countries with the most published 2014 articles for that subject.

At the end of the book, I provided a few subject summaries—percentage of free journals, percentage of articles in no-fee journals, change in article volume, change in free article volume, journals changing article volume by 10% or more from 2013 to 2014, average APC per paid article and for all articles, median APC per paid article and all articles, and the median, first quartile, and third quartile articles per journal for 2014.

Data Changes for 2015

There’s another year of data—more journals and more data for existing journals. I’m taking some pains to include more journals (and defining “articles” somewhat more inclusively and, I believe, consistently).

Beyond that, there may be one new category of derived data: a publisher category—breaking journals down into what seem to be five reasonable groups based on what’s in the DOAJ publisher field:

  • Academic, published by universities and colleges, including university presses.
  • Society, published by societies and associations.
  • Traditional*, published by publishers that also publish subscription journals.
  • OA publisher*, published by groups that don’t appear to publish subscription journals (and that publish at least a handful of journals—see notes on the “*” below)
  • Miscellany, everybody else.

About the asterisk on Traditional and OA publisher: there are 5,983 different “publisher names” (that is, distinct character strings in the DOAJ publisher field). That’s more than one “publisher” for every two journals. The vast majority of those, all but 919, publish a single DOAJ-listed journal.

I think it’s reasonable to limit the two “publisher” categories (Traditional and OA) to firms that publish at least a handful of journals, and lump the others in as Miscellany. (If nothing else, it makes this added data feasible.)

What’s a handful? If the cutoff is “five or more,” it involves only 221 publishers in all, accounting for 4,128 journals. If the cutoff is “four or more,” it involves 316 publishers—and, naturally, adds 380 journals for a total of 4,508. Dropping it to “three or more journals” brings us up to 486 publishers and 5,018 journals. I suspect the final cutoff will be either four or five.
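The cutoff counts above are the kind of thing that can be tallied from the publisher field alone. Here is a small Python sketch of that tally; it treats each distinct character string as a “publisher,” as the text does, but the function name and the toy data are hypothetical and this is not the actual workflow.

```python
from collections import Counter


def cutoff_summary(publisher_fields, cutoff):
    """Count publishers with at least `cutoff` journals, and their journals.

    publisher_fields: one publisher-field string per journal.
    Returns (number_of_publishers, number_of_journals).
    """
    journals_per_publisher = Counter(publisher_fields)
    qualifying = [n for n in journals_per_publisher.values() if n >= cutoff]
    return len(qualifying), sum(qualifying)


# Toy data: one publisher each with 5, 4, 3 and 1 journals.
fields = ["A"] * 5 + ["B"] * 4 + ["C"] * 3 + ["D"]
for cutoff in (5, 4, 3):
    print(cutoff, cutoff_summary(fields, cutoff))

# The quoted figures are consistent with each other: lowering the cutoff
# from five to four adds 316 - 221 = 95 publishers with exactly four
# journals each, and from four to three adds 170 with exactly three each.
assert (316 - 221) * 4 == 4508 - 4128 == 380
assert (486 - 316) * 3 == 5018 - 4508
```

Exact string matching means small spelling variants of one publisher count separately, which is part of why there are so many single-journal “publishers.”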

Incidentally, if I add that column, it will be in the anonymized spreadsheet made publicly available at the end of this project. Other than the list of journal titles apparently containing malware, it will be possible for anybody else to replicate any or all of the graphs and numbers in the book.

Probable Changes

I believe it will make sense to devote a chapter to publisher categories—whether there are major differences in article volume, APC charges (existence and amount) and, possibly, domination in some countries.

I’m fairly certain the pie charts will go away: I don’t believe they add enough information to justify the space. I could be convinced otherwise. (Note that the print paperback will, of necessity, be black and white to keep production costs down, so really attractive pie charts aren’t feasible.)

Possible Changes

What else should I consider? Which existing tables and graphs don’t seem especially valuable—and what would work better? (Assume that this year’s book can be larger than last, but not enormously larger.)

I’m open to suggestions, which I’ll discuss with my contacts at SPARC (and I anticipate suggestions from SPARC as well).

I would offer a free PDF version of this year’s book as a reward for good suggestions—but since this year’s PDF version will be free in any case, that’s

Gold OA Journals 2011-2015: Grade Changes and an Update

February 10th, 2016

After reviewing the numbers in The Gold OA Landscape 2011-2014 and considering what I can and, more significantly, cannot reasonably ascertain and judge in non-English journals and in short visits to websites, and in consultation with SPARC contacts, I made a number of changes in grades and, as a result, in exclusions.

I did not change the list of subjects and areas, although a few journals may have been assigned new subjects—and, as in the previous study, PLOS One is omitted from subject and area figures but included in overall discussions.

The fundamental meaning of Grade B has changed from “deserves attention” to “might be excluded from DOAJ or in some versions of Open Access.”

Changes in Grade A Subgrades

All subgrades for Grade A have been eliminated. Subgrade C (ceased) is now a subgrade of Grade B. Subgrades D, E, H, O and S (all cases where some year other than the first had fewer than five articles) have been collapsed: a journal goes to Grade B, Subgrade F (few or no 2015 articles) if its 2015 article count is less than five, and is simply Grade A otherwise.

Changes in Grade B Subgrades

Grade B consists of journals that may or may not belong, either in DOAJ or in a study of open access, depending on your definitions. The old subgrades all have to do with mild visual or editorial issues that now seem as though they’re imposing my own values inappropriately.

There are four new subgrades—two from Grade A and two from Grade X, albeit with different letters.

  • C: Ceased—journals that published at least one article later than 2010 but explicitly ceased during or before 2015, have merged with other journals, or show no articles more recent than 2012.
  • F: Few or no 2015 articles—journals that published at least one article later than 2012 and published fewer than five articles in 2015. (By current DOAJ rules, these are subject to delisting.)
  • R: Conference and other reports—journals consisting entirely or primarily of conference papers and other reports. These were previously excluded, in subgrade XN, as not OA.
  • S: Sign-in or registration required—journals that require some form of registration before reading articles. These were previously excluded, also in subgrade XN, as not OA.
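The two count-based subgrades (C and F) can be sketched as a small rule, shown below. This is only an illustration under assumptions: the function name, the dictionary of article counts per year, and the single `ceased_or_merged` flag are hypothetical, and checking C before F is my assumption about precedence rather than something the definitions spell out. Subgrades R and S depend on looking at the journal’s site, so they are left out.

```python
def count_based_subgrade(articles_by_year, ceased_or_merged=False):
    """Return "BC", "BF", or None for a journal, using article counts only.

    `articles_by_year` maps a year (int) to the number of articles found.
    `ceased_or_merged` is True if the journal explicitly ceased during or
    before 2015 or has merged with another journal.
    """
    years_with_articles = [y for y, n in articles_by_year.items() if n > 0]
    if not years_with_articles:
        return None  # no articles at all: not covered by these definitions
    last_year = max(years_with_articles)
    if last_year <= 2010:
        return None  # nothing later than 2010: not covered here

    # C: ceased or merged, or no articles more recent than 2012.
    if ceased_or_merged or last_year <= 2012:
        return "BC"

    # F: at least one article later than 2012, fewer than five in 2015.
    if articles_by_year.get(2015, 0) < 5:
        return "BF"

    return None  # neither count-based subgrade applies


print(count_based_subgrade({2011: 10, 2012: 8}))
print(count_based_subgrade({2013: 12, 2014: 9, 2015: 3}))
print(count_based_subgrade({2014: 20, 2015: 25}))
```

A result of None here only means that neither C nor F applies; the journal could still be Grade A or fall into some other grade.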

Changes in Grade C Subgrades

Grade C, “avoid this journal,” has been narrowed somewhat, specifically to eliminate subgrades that involve personal judgment or have so few journals that they’re hardly worth noting. Specifically, subgrades E (very bad English), S (incoherent site) and T (absurd article titles—there were almost none of these) have been eliminated, leaving subgrades A (APC missing), F (clear falsehoods), O (mix of problems) and P (implausible peer review turnaround). Briefly, clear falsehoods are statements such as “the leading journal in this field” for a brand-new journal; implausible peer-review turnaround involves promises to complete all peer reviews in a couple of days.

Changes in Grade X Subgrades

Grade X, excluded journals, retains the same subgrades—but the two largest categories within subgrade N (not OA) have been moved to subgrades BR and BS.

A Partial Checkpoint

What are the consequences of these changes? In general, and combined with more exhaustive checking of some difficult situations, they should mean that more journals will be included in the full analysis. As for specific results, those won’t be clear until the project is complete.

I thought it would be worth offering some glimpses into what might be happening at a natural breakpoint: essentially halfway through the first pass of data gathering (actually 5,500 of 10,948).

First pass? Yes indeed. There will be a second pass, beginning no earlier than April 1, 2016, for quite a few of the journals, for various reasons:

  • Many smaller journals, especially in the humanities and social sciences, post online articles and issues with significant delays. In practice, even waiting a year won’t get them all. I’m rechecking all journals that appear to be missing final issues for 2015; this gives them at least three months to get the articles posted.
  • I’m rechecking all journals that couldn’t be reached or that showed signs of malware, as well as those that showed as parking or ad pages or were unworkable.
  • I’ll take a second look at journals excluded for various reasons, trying harder to make sense of opaque cases and translation difficulties, looking more closely for apparently-missing APCs, rechecking whether certain journals are OA or not.

So far, it looks as though I’ll need to recheck about one-fifth of the journals: 1,047 of the first 5,500. I’d be delighted if that percentage goes down in the second half—but I’d also be surprised.
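The arithmetic behind “about one-fifth” is simple enough to check directly. The sketch below uses only the three counts given above; the projection for the second half assumes the recheck rate stays the same, which is an assumption for illustration and not a prediction.

```python
rechecks_so_far = 1_047   # journals flagged for a second pass
checked_so_far = 5_500    # journals covered in the first half
total_journals = 10_948   # all journals in the first pass

recheck_share = rechecks_so_far / checked_so_far
progress = checked_so_far / total_journals

# Assumption: the second half needs rechecks at the same rate as the first.
remaining = total_journals - checked_so_far
projected_second_half = round(remaining * recheck_share)

print(f"Recheck share so far: {recheck_share:.1%}")
print(f"First-pass progress: {progress:.1%}")
print(f"Projected rechecks in second half: {projected_second_half}")
```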

All the rest of these numbers are truly tentative, since review of the journals may change their categorization.

Free and Pay

Some journals that didn’t previously have APCs started imposing them (one large publisher dropped all of its free introductory periods); some (fewer) dropped APCs; and some clarified the nature of their charges.

Overall, the percentage of no-APC journals (among journals where it’s clear) among the first half dropped from 64.9% to 59.8%: there are more no-fee journals than in the previous study, but there are a lot more APC-charging journals. (There are also, to be sure, more journals in general: about 412 so far.) There are fewer journals (so far) where there is an APC but it’s hidden.

The Newbies

Most journals that weren’t in the 2014 study are simply A (that is, “nothing special here one way or the other”), but 30 have fewer than five articles in 2015, a few couldn’t be contacted or were unworkable, a handful fall into various other categories—and, unfortunately, nine showed signs of malware.

Neutral Changes

Some changes in grade and subgrade are neutral: they’re just redefinitions. That’s true for the journals that changed from various A grades to BC (ceased explicitly or with no articles later than 2012): there are some 218 BC so far. It’s also true for the various A subgrades that are now simply A (around 230 of them) and for a number of other changes including quite a few moving from B subgrades to A.

Some 300 journals had five or more articles in 2014 but not in 2015, moving them all to BF: some of those will add articles in a recheck.

Changes for the Good

Some 27 journals previously graded CA (APC missing or hidden) now have more clarity (and four changed to various X subgrades).

Quite a few journals with explicit falsehoods on their homepages have been cleaned up—at least 80 of them.

Half a dozen journals flagged for malware no longer seem to have that problem (but see later!).

Most “not OA” entries in the first half have moved elsewhere on re-examination or redefinition, including 35 journals oriented to conference programs (another seven that had been “A” appear to be predominantly conferences and have been moved here) and ten that require registration to read articles. Some two dozen moved elsewhere, including 17 that now appear to be proper OA journals.

Most journals that I previously found too difficult to count (XO) are now handled. The number has gone from 70 for this half in the previous study to 28 currently, and I hope to reduce it even further.

Roughly half of the XT (couldn’t understand the site well enough to measure it) cases have been cleared up: so far, there are only three such journals in the first half, and I’ll try all of them again.

Changes for the Bad

A few journals have changed home pages such that I can no longer find an APC (but am sure they have one), but it’s a tiny number.

Some 70 journals that were reachable the last time around were either unreachable or unworkable when I checked this time; they’ll all be rechecked, but it’s unfortunate that there are so many.

Finally there’s the most unfortunate group, in my opinion: journals that now show signs of malware—frequently, I suspect, because they include ad networks that don’t have proper standards. A journal gets flagged for malware if Malwarebytes or McAfee Site Advisor or Windows Defender flags it or some of its components as malware; cases include phishing attempts and deliberate malware downloads. There are now twice as many of these as there were (for this subset of journals) in the previous study, and that’s about 72 too many.

Summing Up

Hundreds of new journals; a much shorter and simpler set of grades; adding literally thousands of peer-reviewed articles that were given as conference papers.

Far fewer journals falling by the wayside because I only read English (thanks, Google!) or because I can’t or am unwilling to count them (with true broadband, I’m willing to open up a dozen PDFs a year to see how many articles there are).

There will still be some approximate counts, but fewer (and better approximations) than last time around.

And, of course, the results will be freely available to everybody. In a few months.