The Gold OA Landscape 2011-2014: Medicine

October 12th, 2015

Another in an intermittent series of posts encouraging folks to buy The Gold OA Landscape 2011-2014, in part by noting what’s not in the excerpted Cites & Insights version.

Chapter 10 is Medicine–which probably should be broken into, say, half a dozen subsets, but I don’t know enough to make that breakdown. It’s by far the largest subject, as noted in the excerpted version.

A few items from the book’s coverage:

  • While a majority of articles published in serious gold OA journals in 2013 and 2014 involve APCs, a majority of those in 2011 and 2012 were in no-fee journals.
  • Fee-based articles have more than doubled since 2011.
  • 44% of articles involving APCs appeared in journals within the most expensive segment, $1,960 and up–and the average for articles involving APCs was $1,446 per article ($854 per article overall).
  • There does seem to be a gold rush of APC-charging journals starting in 2007 and peaking in 2009-2010.
  • You can probably guess the two countries publishing the most medicine articles in 2014, but maybe not the order: UK first, US second. Iran is sixth. For the rest of the 22 countries with at least 1,000 articles, see the book.
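As a quick sanity check on the averages in the third bullet (a back-of-the-envelope sketch, not a figure from the book): if fee-based articles average $1,446 and no-fee articles contribute $0, the $854 overall average implies roughly what share of articles involve APCs.

```python
# Illustrative consistency check on the quoted averages (assumes the
# overall average counts no-fee articles as $0; not data from the book).

avg_fee_based = 1446   # average APC per fee-based article (USD)
avg_overall = 854      # average across all articles (USD)

implied_fee_share = avg_overall / avg_fee_based
print(f"Implied share of articles involving APCs: {implied_fee_share:.0%}")
```

That works out to roughly 59%, consistent with the first bullet's claim that a majority of recent articles involve APCs.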

Much, much more in the book. Worthwhile for your library or if you’re seriously interested in OA. If enough copies sell (no change in the last week), the anonymized spreadsheet will go up on figshare; if enough more copies sell (or some other form of funding comes through), the study will be continued in 2016 for 2015 publications.

And you can buy the book through Amazon (and possibly Ingram), although it counts three times as much toward sales goals if you buy through Lulu.

This should not be my fight

October 7th, 2015

I’ve probably said this before, but thinking about yesterday’s post reminded me of it once again.

That is:

This should not be my fight.

No, I haven’t gone to each site that wrote a story touting the Shen/Björk article to point out the problems with the data–especially now that it’s clear what the response will be. Somebody should. They have the actual data.

But it shouldn’t be me. It’s really not my fight.

I didn’t even start out to discredit Beall’s lists. I did cross swords with him on his absurd notion that the Big Deal had solved the serials crisis, but I did a real-world study of the journals and publishers on his lists to get a reality check. I was fully ready to believe that the picture was as bleak as he painted it–and if that’s how the data had come out, that’s how I would have published it.

After all: I don’t publish any OA journals. I’m not on the editorial board of any OA journals. I don’t need publications for tenure (I’m retired and was never in a tenure-track position). I don’t make big bucks from speaking fees (haven’t done many appearances lately, and that’s OK). I sure as heck don’t make big bucks from the data gathering and analysis, although ALA Editions has published some of my work in the area (not big bucks, but some bucks and a venue I regard highly).

For that matter, I’ve been the subject of ad hominem attacks from Stevan Harnad as well as Jeffrey Beall, so I’m not even well-liked among all OA folks.

What I’ve been trying to do is see what’s actually happening, bringing my 26 years of off-and-on experience with OA to bear on what’s going on now and what’s being said. My mildly obsessive personality, retired status, and reasonably well-organized techniques have allowed me to do some large-scale studies that wouldn’t have been done otherwise. (With modest funding, I’d keep on doing them.)

It’s painful to see questionable results spread far and wide: it hurts good OA (the bulk of it) and probably doesn’t do much to questionable OA. It’s painful to see librarians and others take the easy way out, relying on a seriously defective set of blacklists rather than starting with an increasingly good whitelist (DOAJ) and working from there.

I’ll continue to provide facts and perspectives. (I’ve just subdivided a bunch of tagged items into a baker’s dozen subtopics within the overall “Ethics and Access” topic. That’s probably the December Cites & Insights; it might also be the January one, depending on how it goes.) I’ll continue to post here occasionally. I’m hoping some libraries, librarians, OA folks and others will eventually buy the book (which is apparently now available on Amazon as well as via Lulu; it may also be on Ingram, but I have no way of testing that). It’s always a pleasure to see my work being cited or used where it’s appropriate.

I’m not going away just yet…but as for coping with all the misrepresentations, well, it’s not (or at least not entirely) my fight.

For those of you who need a Respectable Published Source:

I refer you to Open-Access Journals: Idealism and Opportunism, published by the American Library Association. That link gets you to the $43 40-page monograph (published as the August/September 2015 issue of Library Technology Reports). You can also go here to read the first chapter or order the ebook version (I believe you can also order individual chapters). If you’re in one of the several hundred libraries that subscribe to Library Technology Reports, it should already be available to you. (The link here is to one of two records for the series.)

Open-Access Journals: Idealism and Opportunism was professionally copy-edited, edited, and typeset. It was also reviewed by three professionals (two librarians, one other), although that wasn’t formal peer review. It’s concise, and includes not only real-world figures for 6,490 gold OA journals (in DOAJ) publishing 366,210 articles in 2013, but also chapters on the “sideshow” of Beall’s lists, dealing with OA journals (including spotting questionable journals), and libraries and OA journals.

(It’s not a complete survey of DOAJ, because it doesn’t include journals that lack an English-language interface option. It also goes through June 30, 2014 rather than the end of 2014–thus, the 366,210 count is for 2013.) It’s also, of course, far less detailed than The Gold OA Landscape 2011-2014.

But it’s concise, well-edited, based on an actual survey rather than sampling, and published by what I consider to be the premier publisher in librarianship, part of the world’s largest library association. So it has that level of authority that my self-pubbed works may not have.

The author? Walt Crawford. (No, I’m not angling for extra money here: the fee for preparing the issue was a one-time fee, with no royalties. But the final chapters make it a great resource, and for those who require Reputable Publishers, you don’t get more reputable than ALA.)



The Gold OA Landscape 2011-2014: a brief note on numbers

October 6th, 2015

Here’s the tl;dr version: Go buy The Gold OA Landscape 2011-2014, either the $60 paperback or the $55 site-licensed PDF ebook (the contents are identical other than the copyright page/ISBN). I try to be wholly transparent about my investigations, and I’m confident that TGOAL represents the most accurate available count for serious gold OA publishing (excluding non-DOAJ members, “hybrids” and other stuff). Oh, and if enough copies are sold, I’ll keep doing this research…which I don’t think anybody else is going to do and which, as far as I can tell, can’t really be automated.

Running the Numbers

Now that I’ve said that, I won’t repeat the sales pitch. You presumably already know that you can get a hefty sampling of the story in Cites & Insights 15:9–but the full story is much more complete and much more interesting.

Meanwhile, I’ve gotten involved or failed to get involved in a number of discussions about numbers attached to OA.

On September 30, I posted “How many articles, how many journals?,” raising questions about statistics published in MDPI’s Sciforum asserting the number of OA journals and articles–numbers much lower than the ones I’ve derived by actual counting. I received email today regarding the issues I raised:

Thank you for passing this on. I think it’s quite difficult to pin down exactly how many papers are published, never mind adding in vagaries about the definition of ‘predatory’ or ‘questionable’ publishers. The data on Sciforum are taken from Crossref and show about 300,000 OA articles published in 2014. The difference may depend on correct deposition (including late or not at all), article types or publishers just not registered with Crossref. I think ball-park figures are about the closest we can get as things stand.

Well…yes and no. I think it’s highly likely that many smaller OA journals aren’t Crossref members or likely to become Crossref members: for little journals done out of a department’s back pocket, even $275/year plus $1/article is a not insignificant sum.

What bothers me here is not that the numbers are different, but that there seems to be no admission that a full manual survey is likely to produce more accurate numbers, not just a different “ball-park figure.” And that “pinning down” accurate numbers is aided by, you know, actually counting them. The Sciforum numbers are based on automated techniques: that’s presumably easy and fast, but that doesn’t make it likely to be right.

Then there’s the Shen/Björk article…which, as I might have expected, has been publicized all over the place, always with the twin effects of (a) making OA look bad and (b) providing further credibility to the one-man OA wrecking crew who shall go nameless here. The Retraction Watch article seems to be the only place there’s been much discussion of what may be wrong with the original article. Unfortunately, here is apparently the totality of what Björk chooses to say about my criticisms and others’:

“Our research has been carefully done using standard scientific techniques and has been peer reviewed by three substance editors and a statistical editor. We have no wish to engage in a possibly heated discussion within the OA community, particularly around the controversial subject of Beall’s list. Others are free to comment on our article and publish alternative results, we have explained our methods and reasoning quite carefully in the article itself and leave it there.”

Whew. No willingness to admit that their small sample could easily have resulted in estimates that are nearly three times too high. No willingness to admit that the author-nationality portion, based on fewer than 300 articles, is even more prone to sampling error. They used “standard scientific techniques” so the results must be accurate.
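To see why a sample of fewer than 300 articles is a shaky basis for country-level breakdowns, here’s a minimal sketch of the normal-approximation 95% margin of error for an estimated proportion. The sample sizes come from the discussion here; the 25% proportion is purely hypothetical.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation 95% CI for a proportion
    p estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical example: a country credited with 25% of sampled articles.
for n in (205, 300):
    moe = margin_of_error(0.25, n)
    print(f"n={n}: 25% +/- {moe:.1%}")
```

Even before considering the heterogeneity of the underlying universe (which simple random-sample formulas assume away), the uncertainty on each country’s share is substantial at these sample sizes.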

No, I’m not going around to all the places that have touted the Shen/Björk article to add comments. Not only is life too short, I don’t believe it will do much good.

The best I can do is transparent research with less statistical inference and more reliance on dealing with heterogeneity by full-scale testing, and hope that it will be useful. A hope that’s sometimes hard to keep going.

Meanwhile: I continue to believe that a whitelist approach–DOAJ‘s tougher standards–is far superior to a blacklist approach, especially given the historical record of blacklists.



Cites & Insights 15:10 (November 2015) available

October 5th, 2015

Cites & Insights 15:10 (November 2015) is now available for downloading at

This print-oriented two-column version is 38 pages long. If you plan to read the issue on a tablet or computer, you may prefer the 6″x9″ single column version, 74 pages long, which is available at

Unlike the book-excerpt October 2015 issue, there’s no advantage to the single-column version (other than its being single-column), and copyfitting has only been done on the two-column version. (As has been true for a couple of months, both versions do include links, bookmarks and visible bolding.)

This issue includes the following essays, stepping away from open access for a bit:

The Front: A Fair Use Trilogy   p. 1

A few notes about the rest of the issue–and a status report on The Gold OA Landscape 2011-2014.

Policy: Google Books: The Neverending Story?  pp. 1-18

Three years of updates on the seemingly endless Google Books story, which has now become almost entirely about fair use.

Policy: Catching Up on Fair Use  pp. 18-24

A handful of items regarding fair use that don’t hinge on Google Books or HathiTrust.

Intersections: Tracking the Elephant: Notes on HathiTrust  pp. 24-38

Pretty much what the title says, and again the main thrust appears to be fair use. (The elephant? Read the essay, including a little bit of Unicode.)


Careful reading and questionable extrapolation

October 2nd, 2015

On October 1, 2015 (yesterday, that is), I posted “The Gold OA Landscape 2011-2014: malware and some side notes,” including this paragraph:

Second, a sad note. An article–which I’d seen from two sources before publication–that starts by apparently assuming Beall’s lists are something other than junk, then bases an investigation on sampling from the lists, has appeared in a reputable OA journal and, of course, is being picked up all over the place…with Beall being quoted, naturally, thus making the situation worse. I was asked for comments by another reporter (haven’t seen whether the piece has appeared and whether I’m quoted), and the core of my comments was that it’s hard to build good research based on junk, and I regard Beall’s lists as junk, especially given his repeated condemnation of all OA–and, curiously, his apparent continuing belief that author-side charges, which in the Bealliverse automatically corrupt scholarship, only happen in OA (page charges are apparently mythical creatures in the Bealliverse). So, Beall gains even more credibility; challenging him becomes even more hopeless.

When I’d looked at the article, twice, I’d had lots of questions about the usefulness of extrapolating article volumes and, indeed, active journal numbers from a rather small sampling of journals within an extremely heterogeneous space–but, glancing back at my own detailed analysis of journals in those lists (which, unlike the article, was a full survey, not a sampling), I was coming up with article volumes that, while lower, were somewhere within the same ballpark (although the number of active journals was less than half that estimated in the article). (The article is “‘Predatory’ open access: a longitudinal study of article volumes and market characteristics” by Cenyu Shen and Bo-Christer Björk; it’s just been published.)

Basically, the article extrapolated 8,000 active “predatory” journals publishing around 420,000 articles in 2014, based on a sampling of fewer than 700 journals. And, while I showed only 3,876 journals (I won’t call them “predatory” but they were in the junk lists) active at some point between 2011 and June 2014, I did come up with a total volume of 323,491 articles–so I was focusing my criticism of the article on the impossibility of basing good science on junk foundations.

Now, go back and note the italicized word two paragraphs above: “glancing.” Thanks to an email exchange with Lars Bjørnshauge at DOAJ, I went back and read my own article more carefully–that is, actually reading the text, not just glancing at the figures. Turns out 323,491 is the total volume of articles for 3.5 years (2011 through June 30, 2014). The annual total for 2013 was 115,698; the total for the first half of 2014 was 67,647, so it’s fair to extrapolate that the 2014 annual total would be under 150,000.
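The corrected arithmetic is easy to verify. All figures are from this post; doubling the half-year count ignores any growth or seasonality, so treat it as a rough extrapolation, not a precise annual total.

```python
# Figures from the post: totals for journals on the lists, from my full survey.
total_2011_to_mid2014 = 323_491  # articles, 2011 through June 30, 2014 (3.5 years)
total_2013 = 115_698             # full-year 2013
first_half_2014 = 67_647         # January through June 2014

# Straight doubling of the first half as a rough 2014 annual extrapolation.
extrapolated_2014 = first_half_2014 * 2
print(f"Rough 2014 annual total: {extrapolated_2014:,}")

# Comparison with the Shen/Björk extrapolated estimate of ~420,000 for 2014.
shen_bjork_2014 = 420_000
print(f"Ratio of their estimate to mine: {shen_bjork_2014 / extrapolated_2014:.1f}x")
```

The doubled figure is well under 150,000, and the Shen/Björk estimate comes out to roughly three times that.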

That’s a huge difference: not only is the article’s active-journal total more than twice as high as my own (non-extrapolated, based on a full survey) number, its article total is nearly three times as high as mine. That shouldn’t be surprising: the article is based on extrapolations from a small number of journals in an extremely heterogeneous universe, and all the statistical formulae in the world don’t make that level of extrapolation reliable.

Shen and Björk ignored my work, either because it’s not Properly Published or because they weren’t aware of it (although I’m pretty sure Björk knows of my work). They say “It would have taken a lot of effort to manually collect publication volumes” for all the journals on the list. That’s true: it was a lot of effort. Effort which I carried out. Effort which results in dramatically lower counts for the number of active journals and articles.

(As to the article’s “geographical spread of articles,” that’s based on a sample of 205 articles out of what they seem to think are about 420,000. But I didn’t look at authors so I won’t comment on this aspect.)

I should note that “active” journals includes those that published at least one article any time during the period. Since I did my analysis in late 2014 and cut off article data at June 30, 2014, it’s not surprising that the “active this year” count is lower for 2014 (3,014 journals) than for 2013 (3,282)–and I’ll agree with the article that recent growth in these journals has been aggressive: the count of active journals was 2,084 for 2012 and 1,450 for 2011.

I could speculate as to whether what I regard as seriously faulty extrapolations based on a junk foundation will get more or less publicity, citations, and credibility than counts based on a full survey–but carried out by an independent researcher using wholly transparent methodology and not published in a peer-reviewed journal. I know how I’d bet. I’d like to hope I’m wrong. (If not being peer-reviewed is a fatal problem, then a big issue in the study goes away: the junk lists are, of course, not at all peer reviewed.)



Mystery Collection Disc 45

October 2nd, 2015

The Manipulator, 1971, color. Yabo Yablonsky (dir & screenplay), Mickey Rooney, Luana Anders, Keenan Wynn. 1:25 [1:31]

No. No no no. It’s been almost six months since I watched one of these, and more like this could make me give up entirely. The plot, to the extent that I saw it: Mickey Rooney as a crazed old Hollywood person who carries on all parts of a movie-making set of conversations as he bumps into things in an old prop warehouse…but he’s got an actress tied up as well (kidnapped and being slowly starved), and I guess that their interactions are the heart of the movie. But after 20 minutes, I just couldn’t—and wish I’d given up after ten.

I didn’t see Keenan Wynn during the chunk I watched. Looking at the IMDB reviews, I see one that values it as an experimental film and, well, I guess you can make the worst shit look like roses if you try hard enough. Another praises it for Rooney’s “extraordinarily uninhibited performance,” but several say things like “endurance test for the viewer” and “nearly unwatchable.” I’m with them: not only no redeeming value, but really nasty. No rating.

Death in the Shadows (orig. De prooi), 1985, color. Vivian Peters (dir.), Maayke Bouten, Erik de Vries, Johan Leysen, Marlous Fluitsma. 1:37.

This one’s pretty good—with plenty of mystery, although the metamystery’s easy enough to resolve. (The metamystery: why is a 1985 color film available in a Mill Creek Entertainment set? The answer: it’s from the Netherlands, has no stars known in America, and wouldn’t have done well as a U.S. release.)

In brief: an almost-18-year-old young woman finds that her mother was killed—and that her mother didn’t have any children. The young woman now lives alone (and her boyfriend/lover is leaving for a big vacation as it’s the end of the school year), and—sometimes working with a police detective, sometimes ignoring his advice—wants to know what happened. In the process, she almost gets run down (which is what happened to her mother), her mother’s brother gets murdered, and she avoids death. We find out what happened.

Moody, frequently dark, fairly well done. Maayke Bouten is quite effective as the young woman, Valerie Jaspers, but this is apparently her only actual film credit (she was 21 at the time, so 18 isn’t much of a stretch: she also did one TV movie and appeared as herself on a TV show). Not fast-moving and no flashy special effects, but a pretty good film. $1.50.

Born to Win, 1971, color. Ivan Passer (dir.), George Segal, Paula Prentiss, Karen Black, Jay Fletcher, Hector Elizondo, Robert De Niro. 1:28 [1:24]

The disc sleeve identifies Robert De Niro as the star here, but this is very much a George Segal flick, with Karen Black and others—although De Niro’s in it (for some reason feeling to me like Billy Crystal playing Robert De Niro). The movie’s about a junkie (Segal) and…well, it’s about an hour and 24 minutes long.

Beyond that: poor editing, worse scriptwriting, continuity that deserves a “dis” in front of it. I got a hint in the first five minutes that this was going to have what you might call an “experimental” narrative arc, and so it was. Pretty dreary, all in all. Yes, it’s a low-budget indie with a great cast, but… (I will say: most IMDB reviews seem very positive. Good for them.) Charitably, for George Segal or Karen Black fans, maybe $0.75.

A Killing Affair, 1986, color. David Saperstein (dir.), Peter Weller, Kathy Baker, John Glover. 1:40.

A juicy chunk of Southern Gothic—set in West Virginia in 1943, starring Kathy Baker as the wife (or, really, property) of a mill foreman who’s ripping off the employees, openly sleeping with other women, and generally a piece of work. A stranger comes to…well, not so much town as the house across the lake from town where Baker lives (with her children on weekends—during the week, they stay in town with her brother, the preacher who clearly believes that women are to Obey their husbands).

Ah, but shortly before the stranger (Peter Weller) shows up, she discovers that her rotten husband is now hanging in the shed, very much dead. She makes some efforts to get help but isn’t quite willing to walk two miles to town (the boat’s gone), so… Anyway, the stranger shows up and Plot happens. Part of it: he admits to killing her husband, but claims her husband killed his wife and children and was about to shoot him. And there are all sorts of family secrets involved in her past. A pack of wild dogs also plays a role throughout the flick, especially in the climax.

Languid most of the time, with an unsurprising ending. Not terrible, not great; Weller’s a pretty convincing mentally unstable (but smooth!) killer, and Baker’s pretty much always good, and certainly is here. (How does a movie this recent and plausibly good wind up in a cheap collection? I have no idea.) I’ll give it $1.25.

The Gold OA Landscape 2011-2014: malware and some side notes

October 1st, 2015

First, a very brief status report. As of this morning, the book has sold five copies (four paperback, one ebook)–exactly the same numbers as a week ago, September 24, 2015. This is, how you say, not especially rapid progress toward the twin goals of making the data available and carrying forward the research into 2016. (Meanwhile, the October 2015 Cites & Insights has been downloaded at least 1,300 times so far–about 85% of those downloads being the more-readable single-column version of this excerpt from The Gold OA Landscape 2011-2014.) (If one out of every 20 downloads yielded a sale of the book, that would meet the data-availability goal and probably the next-year’s-research goal…)

Second, a sad note. An article–which I’d seen from two sources before publication–that starts by apparently assuming Beall’s lists are something other than junk, then bases an investigation on sampling from the lists, has appeared in a reputable OA journal and, of course, is being picked up all over the place…with Beall being quoted, naturally, thus making the situation worse. I was asked for comments by another reporter (haven’t seen whether the piece has appeared and whether I’m quoted), and the core of my comments was that it’s hard to build good research based on junk, and I regard Beall’s lists as junk, especially given his repeated condemnation of all OA–and, curiously, his apparent continuing belief that author-side charges, which in the Bealliverse automatically corrupt scholarship, only happen in OA (page charges are apparently mythical creatures in the Bealliverse). So, Beall gains even more credibility; challenging him becomes even more hopeless. [See this followup post]

Third, a somewhat better note: Cheryl LaGuardia has published “An Interview with Peter Suber” in her “Not Dead Yet” column at Library Journal. If you haven’t already read it, you should. A couple of key quotes (in my opinion):

Not all librarians are well-informed about OA, but as a class they’re much better informed than faculty.

First, scam OA journals do exist, just as scam subscription journals exist. On the other side, first-rate OA journals also exist, just as first-rate subscription journals also exist. There’s a full range of quality on both sides of the line. Authors often need help identifying the first-rate OA journals, or at least steering clear of the frauds, and librarians can help with that. The Directory of Open Access Journals (DOAJ) is a “white list” of trustworthy OA journals…

I used to think [“hybrid” OA] was good, since at least it gave publishers first-hand experience with the economics of fee-based OA journals. But I changed my mind about that years ago. Because these journals still have subscriptions, they have no incentive to make the OA option attractive. The economics are artificial. Moreover, as I mentioned, most hybrid OA journals double-dip, which is dishonest. But even when it’s honest, it’s still a small OA step that’s often mistaken for a big step.

Finally, the direct tie-in to the book…and to the second quote from the Suber interview.


The excerpted version omits the whole section on exclusions–DOAJ-listed journals that weren’t included in the study for a variety of reasons. In most cases, it’s not necessarily that these journals are scam journals (the term “predatory” has been rendered meaningless in this context) but that, for one reason or another, they either don’t fit my definition of a gold OA journal devoted to peer-reviewed articles or I was simply unable to analyze them properly.

One unfortunate subcategory includes 65 journals, which is 65 more than should appear in this category: journals with malware issues. My best guess is that some of these will disappear from DOAJ and that others either try too hard for ad revenue (accepting ads that incorporate malware) or have been badly designed, or for that matter use some convenient add-in for the website that just happens to carry malware. I don’t believe there’s any excuse for a journal to raise malware cautions–even if some of the defense tools I use might be overly cautious. (I added Malwarebytes after an OA journal infected my PC with a particularly nasty bit of malware, and at least two others attempted to load the same malware. It took me two days to get rid of the crap, and I have no interest in repeating that process. McAfee Site Adviser seems to be omnipresent in browsers and new computers, and since it’s now part of Intel I see no reason to distrust it.)

In any case: since it doesn’t look like OA publishers are rushing to buy the book and dig through it (I know, it’s early days yet), I’ll include that section here–the single case in which I actually list journal titles other than PLOS One (which I mention by name in the book because I excluded it from subject and segment discussions in order to avoid wrecking averages and distributions, since it is more than six times as large as any other OA journal).

Here’s the excerpt:

M: Malware

When attempting to reach these journals’ webpages, either Microsoft Office, McAfee Site Advisor, Windows Defender or Malwarebytes Anti-Malware threw up a caution screen indicating that the site had malware of some sort. (Actually, in one case the website got past all four—and showed an overlay that was a clear phishing attempt.)

In some few cases, the warning was a McAfee “yellow flag”; in most, it was either a McAfee red flag or Malwarebytes blocked the site.

Given that I encountered a serious virus with at least three different journals in a previous pass (getting rid of the virus is one reason I now run Malwarebytes as well as Windows Defender; note that I do not run McAfee’s general suite, but only the free Site Advisor that flags suspicious websites on the fly), I was not about to ignore the warnings and go look at the journals. I’d guess that, in some cases, the malware is in an ad on the journal page. In any case, it’s simply not acceptable for an OA journal to have malware or even possible malware.

I find it sad that there are 65 of these. They are not dominated by any one country of publication: 27 countries are represented among the 65 offending sites, although only a dozen have more than one each. The countries with more than three possible-malware journals include Germany and India (seven each), Brazil (six), Romania and the Russian Federation (five each), and the United States (four).

Malware Possibilities

While this report generally avoids naming individual journal titles or publishers, since it’s intended as an overall study, I think it’s worth making an exception for these 65 cases. These journals may have fixed their problems, but I’d approach with caution:

Acta Medica Transilvanica

Algoritmy, Metody i Sistemy Obrabotki Dannyh

Analele Universitatii din Oradea, Fascicula Biologie

Andhra Pradesh Journal of Psychological Medicine

Annals and Essences of Dentistry

Applied Mathematics in Engineering, Management and Technology

Avances en Ciencias e Ingeniería

Breviário de Filosofia Pública

Chinese Journal of Plant Ecology

Communications in Numerical Analysis

Confines de Relaciones Internacionales y Ciencia Política

Contemporary Materials

Data Envelopment Analysis and Decision Science

Economic Sociology

Education Research Frontier

European Journal of Environmental Sciences

Exatas Online

Filosofiâ i Kosmologiâ

Forum for Inter-American Research (Fiar)

Global Engineers and Technologists Review

Health Sciences and Disease

Impossibilia : Revista Internacional de Estudios Literarios

International Journal of Academic Research in Business and Social Sciences

International Journal of Ayurvedic Medicine

International Journal of Educational Research and Technology

International Journal of Information and Communication Technology Research

International Journal of Pharmaceutical Frontier Research

İşletme Araştırmaları Dergisi

Journal of Behavioral Science for Development

Journal of Community Nutrition & Health

Journal of Interpolation and Approximation in Scientific Computing

Journal of Management and Science

Journal of Nonlinear Analysis and Application

Journal of Numerical Mathematics and Stochastics

Journal of Soft Computing and Applications

Journal of Wetlands Environmental Management

Kritikos. Journal of postmodern cultural sound, text and image

Latin American Journal of Conservation

Mathematics Education Trends and Research

Nesne Psikoloji Dergisi

Networks and Neighbours

Potravinarstvo : Scientific Journal for Food Industry

Proceedings of the International Conference Nanomaterials : Applications and Properties

Psihologičeskaâ Nauka i Obrazovanie

Psikiyatride Guncel Yaklasimlar

Regionalʹnaâ Èkonomika i Upravlenie: Elektronnyi Nauchnyi Zhurnal

Revista Caribeña de Ciencias Sociales

Revista de Biologia Marina y Oceanografia

Revista de Educación en Biología

Revista de Engenharia e Tecnologia

Revista de Estudos AntiUtilitaristas e PosColoniais

Revista Pădurilor

Romanian Journal of Regional Science

Studii de gramatică contrastivă

Tecnoscienza : Italian Journal of Science & Technology Studies

Tekhnologiya i Konstruirovanie v Elektronnoi Apparature

Vestnik Volgogradskogo Gosudarstvennogo Universiteta. Seriâ 4. Istoriâ, Regionovedenie, Meždunarodnye Otnošeniâ


How many articles, how many journals?

September 30th, 2015

Thanks to Dietrich Rordorf’s comment on a post at the Scholarly Kitchen, I am now aware of MDPI’s sciforum (the link is to the journal reviews/statistics section), which among other things “aims at publishing statistics and rankings of scientific and scholarly journals and their Publishers.” Quoting from the disclaimer:

Statistics are automatically computed from available data, and are not manually curated. While we have made every possible effort to provide meaningful statistics, we can not guarantee the correctness or accuracy of any of the statistics. Statistics might be recomputed anytime without notice. Access to statistics might be disabled anytime without notice.

Quoting the section on Open Access:

Data about papers published under open access licenses are currently collected from two providers: DOAJ (licensing information available on journal-level but only for journals that publish exclusively under open access licenses), and Publisher website metadata (licensing information available on a per paper-level for some Publishers). We will include PubMed Central article-level licensing data in a future update. Because many hybrid journals do not offer metadata or licensing information which are easily machine-readable, the statistics about open access content are likely too low. E.g. JR reports 270’000 open access papers for a total market size of roughly 2.4 million papers for 2013 (which is about 11% open access papers). In reality the share of open access papers might be much higher if all papers published under open licenses in hybrid journals could be easily and properly counted). Green open access, i.e. self-archived pre- or post-prints are currently not included.

The section also specifies where data comes from.

I’m delighted there is such a source. Think of the rest of this as additional data (yes, I’ll be emailing a note to Rordorf, but I’m not sure how he can blend manually-counted and automatically-gathered data).

Note that I don’t include “hybrid” OA in my counts at all, partly because there’s no good way to count it, partly because it’s consistently the most expensive form of OA and, I believe, the wrong way to go about OA. Neither does sciforum, because there’s no good way to count it.

Number of Journals

Sciforum shows 25,064 journals publishing at least one article in 2014, of which 3,693 journals are Gold OA—and the chart shows that as being down from 3,990 with at least one article in 2013.

My study, excluding questionable journals and those not in DOAJ, shows 8,760 gold OA journals publishing at least one article in 2014, but that number is down slightly, from 8,960 in 2013.

That’s an enormous difference, one that I believe speaks to the limitations (at this point) of automated data gathering for OA. Even leaving out the global south can’t really account for omitting more than half of the active journals. (Sciforum does not indicate that it’s limited to STM, so I’m assuming it’s not.)

Number of Articles

Sciforum shows 2,423,122 articles in 2014, up from 2,248,966 in 2013. So “roughly 2.4 million” seems like a plausible estimate for the total article production.

But: Sciforum shows 302,339 OA articles in 2014, 279,967 in 2013, 250,237 in 2012 and 196,508 in 2011.

The Gold OA Landscape 2011-2014 shows 482,361 articles in 2014, 440,843 in 2013, 394,374 in 2012 and 321,312 in 2011.

Those are also enormous differences, although slightly smaller percentage-wise. To wit, my actual count (again omitting hybrids, questionable journals and journals not in DOAJ) is 60% higher for 2014, 57% higher for 2013, 58% higher for 2012 and 64% higher for 2011.
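To make the comparison concrete, here is a minimal sketch (in Python, using only the figures quoted above) of how those percentages fall out:

```python
# Gold OA articles per year: manual count (The Gold OA Landscape) vs.
# sciforum's automated count
manual = {2011: 321_312, 2012: 394_374, 2013: 440_843, 2014: 482_361}
sciforum = {2011: 196_508, 2012: 250_237, 2013: 279_967, 2014: 302_339}

for year in sorted(manual):
    pct_higher = (manual[year] / sciforum[year] - 1) * 100
    # 2011: 64%, 2012: 58%, 2013: 57%, 2014: 60%
    print(f"{year}: manual count is {pct_higher:.0f}% higher")
```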

If we assume that subscription journals can be measured more accurately through automated means (an assumption I’m a little wary of making), then actual article totals for 2014 are around 2.6 million, of which around 18% are in gold OA journals.
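The arithmetic behind that estimate, sketched under the stated assumption (sciforum's subscription-side count is roughly right, so only its OA figure gets swapped out):

```python
# Replace sciforum's undercounted 2014 OA figure with the manual count,
# leaving the subscription-side remainder untouched
sciforum_total = 2_423_122
sciforum_oa = 302_339
manual_oa = 482_361

adjusted_total = sciforum_total - sciforum_oa + manual_oa
oa_share = manual_oa / adjusted_total
print(f"{adjusted_total:,} total articles")  # 2,603,144: around 2.6 million
print(f"{oa_share:.1%} gold OA")             # 18.5%: around 18%
```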

My main takeaway: at this point, automated data gathering severely undercounts the OA field—which, given the ludicrous amount of time spent gathering data manually (but hey! book sales are already up to…well, five copies so far), is at least a trifle reassuring.

The Gold OA Landscape and Outsell’s Open Access 2015

September 29th, 2015

I thought it might be interesting to look at Outsell’s Open Access 2015: Market Size, Share, Forecast and Trends (which the CCC seems to have made openly available, or at least it was for a while—if it’s back to $2,500, my apologies) in light of The Gold OA Landscape 2011-2014. Is there anything useful or mildly controversial to say?

Definitional Differences

Outsell estimates OA as “about 1.1% of the total 2014 STM market and 4.3% of the STM journals market”—but remember that this is a share of dollars, not of articles. Also, it’s STM, while a sizable chunk of the OA universe as I measured it is humanities and social sciences.

The methodology is entirely different, of course: Outsell develops estimates where I tried to do a universal count. Outsell also has industry contacts, which I entirely lack.

A third key definitional difference, especially given Outsell’s finding: I explicitly exclude “hybrid” publishing—not only because I think it’s a trap but because it’s essentially impossible to count.

Also, Outsell seems to define “megajournals” based on crossing subject boundaries, which strikes me as odd; I’d define them as journals with very large numbers of articles. There are hundreds of interdisciplinary journals; there are only a few very large OA journals.

Questionable Statements

Outsell says (p. 5) “Hybrid currently prevails as the Gold model.” If “prevails” means there are more hybrid journals than there are Gold OA journals (by my definition and that of DOAJ, a hybrid journal cannot be gold OA; otherwise it wouldn’t be hybrid), then that may be true, as so many big publishers will happily take big bucks to make an article OA (sort of) in any of their journals. If “prevails” means that there are more OA articles published in hybrid journals than in Gold OA journals, that would be astonishing: it would mean there were close to a million OA articles published in 2014 (not including green OA). I find that hard to believe, and I don’t know how Outsell could determine this number, so I’ll have to assume “prevails” has to do with number of journals.

On pages 5-6, Outsell offers a proliferation of terms and models including “Platinum” OA, “Gold for Gold” and more. As you read this section, it’s ever more clear that the point of Outsell’s report is to inform publishers how they can best maintain and increase revenues—so, for example, institutionally-sponsored OA is not included in the list of Gold models because “it does not produce revenues.” If it doesn’t generate $Gold, it isn’t worth discussing.

Market Size and Forecast

Outsell estimates the 2014 total STM market at a breathtaking $26.2 billion but the serials market at “only” $6.8 billion. Given other estimates of $10 billion, I can only wonder whether this means that the non-STM journals market is $3.2 billion (which seems unlikely) or whether something else is going on.

Outsell’s numbers for APCs are $290.4 million in 2014, $252.3 million in 2013, and $171.9 million in 2012. But those numbers include articles in “hybrid” journals, which Outsell says are the more prevalent model.

My own figures—not allowing for discounts and waivers, but also not including hybrid publications—are $305.4 million in 2014, $241.9 million in 2013, and $195.5 million in 2012. Those numbers do include the humanities and social sciences, but HSS only accounts for $9.5 million in 2014, $7.7 million in 2013 and $6.6 million in 2012.

What’s interesting (to me) is that, if you subtract HSS from my figures, they’re not wildly different from Outsell’s numbers (Outsell’s numbers are about 2% lower in 2014, 8% higher in 2013, and 9% lower in 2012). I could conclude that hybrid APCs don’t really amount to much, or that Outsell defines STM more narrowly than my STEM+Biomed, or…
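A quick check of those percentages (a sketch; dollar figures in millions, all taken from the totals above):

```python
# My APC totals with HSS subtracted, versus Outsell's estimates ($ millions)
mine_minus_hss = {2012: 195.5 - 6.6, 2013: 241.9 - 7.7, 2014: 305.4 - 9.5}
outsell = {2012: 171.9, 2013: 252.3, 2014: 290.4}

for year in sorted(outsell):
    diff_pct = (outsell[year] / mine_minus_hss[year] - 1) * 100
    # 2012: -9%, 2013: +8%, 2014: -2%
    print(f"{year}: Outsell {diff_pct:+.0f}%")
```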

Megajournals
Outsell says that there are approximately 20 megajournals—but also that they published fewer articles in 2014 than in 2013. Looking at the graph, it appears that Outsell is saying that all megajournals put together published about 40,000 articles in 2014, as compared to 43,000 or so in 2013 and 25,000 or so (?) in 2012.

But it’s hard to tell what Outsell considers to be a megajournal. Let’s look at some subsets within the DOAJ landscape (among the 9,512 I report on fully):

  • Journals with at least 1,000 articles in some year 2011-2014: There are 40 of these. I count the article totals as 38,830 in 2011; 56,827 in 2012; 70,986 in 2013; and 88,315 in 2014. That’s a much smaller percentage increase from 2012 to 2013 than Outsell’s graph seems to show—but my figures show between 24% and 25% increase in articles from 2012 to 2013 and again from 2013 to 2014. So: I’m showing a healthy increase from 2013 to 2014, not a decrease.
  • Cut that down to journals with at least 1,500 articles in one of those years (14 of them), and you still get a healthy increase each year: 27,052 in 2011; 39,683 in 2012; 49,131 in 2013; and 61,844 in 2014.
  • Trim it to the eight journals with at least 2,000 articles in one year, and there’s still an increase in each year: 23,461 in 2011; 35,145 in 2012; 43,667 in 2013; and 51,646 in 2014.
  • Include only “other sciences” (my term for interdisciplinary STM journals) and PLOS One, and use 1,000 as the cutoff, and I get 17,167 articles in 2011; 28,168 in 2012; 38,604 in 2013; and 43,443 in 2014.
  • If I include PLOS One and all “Other Sciences” journals (171 journals in all), I get 23,041 articles in 2011; 35,502 in 2012; 47,607 in 2013; and 54,210 in 2014.
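The growth rates for the first subset (the 40 journals with at least 1,000 articles in some year) can be checked directly from the totals given above:

```python
# Article totals 2011-2014 for the 40 journals with 1,000+ articles
# in at least one year
totals = [38_830, 56_827, 70_986, 88_315]

for prev, cur in zip(totals, totals[1:]):
    growth = (cur / prev - 1) * 100
    # 46.3%, then 24.9%, then 24.4%
    print(f"{growth:.1f}% growth")
```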

I have no doubt that Outsell is able to define some set of “megajournals” that declined in article count from 2013 to 2014. Since PLOS One alone increased (very slightly, from 31,509 to 31,882), that presumably means that all the other megajournals combined went from around 11,500 articles in 2013 to around 8,000 in 2014. That’s a surprisingly large drop (although one very big and very badly run “megajournal” could account for all of it). Certainly possible—again, depending on how you define megajournals.
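The subtraction behind that inference, assuming the 2013 and 2014 megajournal totals read off Outsell's graph are about right (they are approximations, as noted above):

```python
# Non-PLOS-One share of Outsell's ~20 megajournals
# (graph-derived totals, so approximate)
outsell_megajournals = {2013: 43_000, 2014: 40_000}
plos_one = {2013: 31_509, 2014: 31_882}

others = {year: outsell_megajournals[year] - plos_one[year]
          for year in outsell_megajournals}
# roughly 11,500 in 2013 down to roughly 8,100 in 2014
print(others)
```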

Competitive Landscape

Table 4 in the Outsell report is interesting if only because of the numbers: the 14 publishers listed publish a total of 11,740 journals in 2014 (which would appear to be about half of the total) but only 1,505 Gold OA journals (and that includes Hindawi’s 438 and the whole set of BioMed Central journals), fewer than one-sixth of the Gold OA total. (Take away Hindawi, which only publishes Gold OA journals, and you’re down to less than one-eighth of the gold OA total). But note that those 14 publishers claim to have 8,404 hybrid journals.
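Those fractions, assuming the 9,512 fully-reported journals as the denominator (a sketch of my reading, not Outsell's calculation):

```python
# Share of gold OA journals held by the 14 publishers in Outsell's Table 4
big14_gold_oa = 1_505
hindawi = 438             # Hindawi publishes only gold OA journals
gold_oa_total = 9_512     # journals reported fully in the study

print(big14_gold_oa / gold_oa_total)              # ~0.158: under one-sixth
print((big14_gold_oa - hindawi) / gold_oa_total)  # ~0.112: under one-eighth
```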

Ah, but Outsell’s only looking at STM. That does make a difference, because there are so many HSS gold OA journals (more than 4,000 in my study): If you remove those from my count, I’m left with just under 5,500 journals. The 14 biggies still account for less than one-third of the STM Gold OA total (around one-fifth without Hindawi), but that’s an improvement.


I question the supposed decline in megajournal publishing activity, but that’s a matter of definition.

I don’t particularly question the estimated APC totals—unless Outsell is seriously claiming that hybrid publications account for most APCs, in which case I’d question them a lot: I’d believe that waivers and discounts might reduce my numbers by, say, 15%-20%, but that would still leave a huge gap.

Since Outsell doesn’t estimate overall article counts at all (my primary focus), I have no comments.

In general, given differences in definition, the only fault I might find with Outsell (I can’t fault them for a laser focus on $$$) is the decline in megajournal publishing from 2013 to 2014–and, again, I’m sure they managed to define a group of journals that comes out that way.

pubs_since_1994.htm finally up

September 28th, 2015

For the thousands (well, hundreds (well, tens (well…anybody?))) of avid readers of Cites & Insights August-September 2015 who, wondering about the quotes in “A Few Words, Part 2,” clicked through to find the bibliography…

It’s (finally) there, such as it is:

My apologies for the slight delay in getting it ready. The fact that I’ve seen zero instances of anybody looking at the first part of the bibliography may have influenced the priority with which this part was prepared…

(But hey, there are lots of 404s on, as usual: I could convince myself that those were all folks looking for the second part of the bibliography. I could convince myself that I look like a slightly older George Clooney, too, but it would be equally absurd.)