Archive for October, 2015

The Gold OA Landscape 2011-2014: a brief note on numbers

Tuesday, October 6th, 2015

Here’s the tl;dr version: Go buy The Gold OA Landscape 2011-2014, either the $60 paperback or the $55 site-licensed PDF ebook (the contents are identical other than the copyright page/ISBN). I try to be wholly transparent about my investigations, and I’m confident that TGOAL represents the most accurate available count for serious gold OA publishing (excluding non-DOAJ members, “hybrids” and other stuff). Oh, and if enough copies are sold, I’ll keep doing this research…which I don’t think anybody else is going to do and which, as far as I can tell, can’t really be automated.

Running the Numbers

Now that I’ve said that, I won’t repeat the sales pitch. You presumably already know that you can get a hefty sampling of the story in Cites & Insights 15:9–but the full story is much more complete and much more interesting.

Meanwhile, I’ve gotten involved or failed to get involved in a number of discussions about numbers attached to OA.

On September 30, I posted “How many articles, how many journals?,” raising questions about statistics published in MDPI’s Sciforum asserting the number of OA journals and articles–numbers much lower than the ones I’ve derived by actual counting. I received email today regarding the issues I raised:

Thank you for passing this on. I think it’s quite difficult to pin down exactly how many papers are published, never mind adding in vagaries about the definition of ‘predatory’ or ‘questionable’ publishers. The data on Sciforum are taken from Crossref and show about 300,000 OA articles published in 2014. The difference may depend on correct deposition (including late or not at all), article types or publishers just not registered with Crossref. I think ball-park figures are about the closest we can get as things stand.

Well…yes and no. I think it’s highly likely that many smaller OA journals aren’t Crossref members or likely to become Crossref members: for little journals done out of a department’s back pocket, even $275/year plus $1/article is not an insignificant sum.

What bothers me here is not that the numbers are different, but that there seems to be no admission that a full manual survey is likely to produce more accurate numbers, not just a different “ball-park figure.” And that “pinning down” accurate numbers is aided by, you know, actually counting them. The Sciforum numbers are based on automated techniques: that’s presumably easy and fast, but that doesn’t make it likely to be right.

Then there’s the Shen/Björk article…which, as I might have expected, has been publicized all over the place, always with the twin effects of (a) making OA look bad and (b) providing further credibility to the one-man OA wrecking crew who shall go nameless here. The Retraction Watch article seems to be the only place there’s been much discussion of what may be wrong with the original article. Unfortunately, here is apparently the totality of what Björk chooses to say about my criticisms and others’:

“Our research has been carefully done using standard scientific techniques and has been peer reviewed by three substance editors and a statistical editor. We have no wish to engage in a possibly heated discussion within the OA community, particularly around the controversial subject of Beall’s list. Others are free to comment on our article and publish alternative results, we have explained our methods and reasoning quite carefully in the article itself and leave it there.”

Whew. No willingness to admit that their small sample could easily have resulted in estimates that are nearly three times too high. No willingness to admit that the author-nationality portion, based on fewer than 300 articles, is even more prone to sampling error. They used “standard scientific techniques” so the results must be accurate.

No, I’m not going around to all the places that have touted the Shen/Björk article to add comments. Not only is life too short, I don’t believe it will do much good.

The best I can do is transparent research with less statistical inference and more reliance on full-scale counting to deal with heterogeneity, and hope that it will be useful. A hope that’s sometimes hard to keep going.

Meanwhile: I continue to believe that a whitelist approach–exemplified by DOAJ‘s tougher standards–is far superior to a blacklist approach, especially given the historical record of blacklists.



Cites & Insights 15:10 (November 2015) available

Monday, October 5th, 2015

Cites & Insights 15:10 (November 2015) is now available for downloading at

This print-oriented two-column version is 38 pages long. If you plan to read the issue on a tablet or computer, you may prefer the 6″x9″ single column version, 74 pages long, which is available at

Unlike the book-excerpt October 2015 issue, there’s no advantage to the single-column version (other than its being single-column), and copyfitting has only been done on the two-column version. (As has been true for a couple of months, both versions do include links, bookmarks and visible bolding.)

This issue includes the following essays, stepping away from open access for a bit:

The Front: A Fair Use Trilogy   p. 1

A few notes about the rest of the issue–and a status report on The Gold OA Landscape 2011-2014.

Policy: Google Books: The Neverending Story?  pp. 1-18

Three years of updates on the seemingly endless Google Books story, which has now become almost entirely about fair use.

Policy: Catching Up on Fair Use  pp. 18-24

A handful of items regarding fair use that don’t hinge on Google Books or HathiTrust.

Intersections: Tracking the Elephant: Notes on HathiTrust  pp. 24-38

Pretty much what the title says, and again the main thrust appears to be fair use. (The elephant? Read the essay, including a little bit of Unicode.)


Careful reading and questionable extrapolation

Friday, October 2nd, 2015

On October 1, 2015 (yesterday, that is), I posted “The Gold OA Landscape 2011-2014: malware and some side notes,” including this paragraph:

Second, a sad note. An article–which I’d seen from two sources before publication–that starts by apparently assuming Beall’s lists are something other than junk, then bases an investigation on sampling from the lists, has appeared in a reputable OA journal and, of course, is being picked up all over the place…with Beall being quoted, naturally, thus making the situation worse. I was asked for comments by another reporter (haven’t seen whether the piece has appeared and whether I’m quoted), and the core of my comments was that it’s hard to build good research based on junk, and I regard Beall’s lists as junk, especially given his repeated condemnation of all OA–and, curiously, his apparent continuing belief that author-side charges, which in the Bealliverse automatically corrupt scholarship, only happen in OA (page charges are apparently mythical creatures in the Bealliverse). So, Beall gains even more credibility; challenging him becomes even more hopeless.

When I’d looked at the article, twice, I’d had lots of questions about the usefulness of extrapolating article volumes and, indeed, active journal numbers from a rather small sampling of journals within an extremely heterogeneous space–but, glancing back at my own detailed analysis of journals in those lists (which, unlike the article, was a full survey, not a sampling), I was coming up with article volumes that, while lower, were somewhere within the same ballpark (although the number of active journals was less than half that estimated in the article). (The article is “‘Predatory’ open access: a longitudinal study of article volumes and market characteristics” by Cenyu Shen and Bo-Christer Björk; it’s just been published.)

Basically, the article extrapolated 8,000 active “predatory” journals publishing around 420,000 articles in 2014, based on a sampling of fewer than 700 journals. And, while I showed only 3,876 journals (I won’t call them “predatory” but they were in the junk lists) active at some point between 2011 and June 2014, I did come up with a total volume of 323,491 articles–so I was focusing my criticism of the article on the impossibility of basing good science on junk foundations.

Now, go back and note the italicized word two paragraphs above: “glancing.” Thanks to an email exchange with Lars Bjørnshauge at DOAJ, I went back and read my own article more carefully–that is, actually reading the text, not just glancing at the figures. Turns out 323,491 is the total volume of articles for 3.5 years (2011 through June 30, 2014). The annual total for 2013 was 115,698; the total for the first half of 2014 was 67,647, so it’s fair to extrapolate that the 2014 annual total would be under 150,000.
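The annualization and the resulting comparison are simple arithmetic; as a quick sketch (figures as quoted above; doubling the half-year count is a naive assumption that ignores any second-half growth):

```python
# Figures quoted in the post; the doubling is a naive annualization
# (assumes the second half of 2014 resembles the first half).
first_half_2014 = 67_647        # articles, January-June 2014
est_2014 = first_half_2014 * 2  # 135294: comfortably under 150,000

article_estimate = 420_000      # Shen/Björk extrapolation for 2014
print(est_2014, round(article_estimate / est_2014, 1))  # 135294 3.1
```

Even this crude doubling puts the full-survey 2014 count at roughly a third of the article’s extrapolated figure.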

That’s a huge difference: not only is the article’s active-journal estimate more than twice as high as my own (non-extrapolated, based on a full survey) count, its article estimate is nearly three times as high as mine. That shouldn’t be surprising: the article is based on extrapolations from a small number of journals in an extremely heterogeneous universe, and all the statistical formulae in the world don’t make that level of extrapolation reliable.

Shen and Björk ignored my work, either because it’s not Properly Published or because they weren’t aware of it (although I’m pretty sure Björk knows of my work). They say “It would have taken a lot of effort to manually collect publication volumes” for all the journals on the list. That’s true: it was a lot of effort. Effort which I carried out. Effort which resulted in dramatically lower counts for the number of active journals and articles.

(As to the article’s “geographical spread of articles,” that’s based on a sample of 205 articles out of what they seem to think are about 420,000. But I didn’t look at authors so I won’t comment on this aspect.)

I should note that “active” journals include any that published at least one article at any time during the period. Since I did my analysis in late 2014 and cut off article data at June 30, 2014, it’s not surprising that the “active this year” count is lower for 2014 (3,014 journals) than for 2013 (3,282)–and I’ll agree with the article that recent growth in these journals has been aggressive: the count of active journals was 2,084 for 2012 and 1,450 for 2011.
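The growth claim is easy to check from the counts just given; a small sketch (2014 is omitted because its count stops at June 30):

```python
# Active-journal counts per calendar year, from the full survey above.
active = {2011: 1_450, 2012: 2_084, 2013: 3_282}

# Year-over-year growth in the number of active journals.
for year in (2012, 2013):
    growth = active[year] / active[year - 1] - 1
    print(f"{year}: {growth:.0%}")   # 2012: 44%, 2013: 57%
```

Growth north of 40% a year in both full-year comparisons is aggressive by any measure.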

I could speculate as to whether what I regard as seriously faulty extrapolations based on a junk foundation will get more or less publicity, citations, and credibility than counts based on a full survey–but carried out by an independent researcher using wholly transparent methodology and not published in a peer-reviewed journal. I know how I’d bet. I’d like to hope I’m wrong. (If not being peer-reviewed is a fatal problem, then a big issue in the study goes away: the junk lists are, of course, not at all peer reviewed.)



Mystery Collection Disc 45

Friday, October 2nd, 2015

The Manipulator, 1971, color. Yabo Yablonsky (dir & screenplay), Mickey Rooney, Luana Anders, Keenan Wynn. 1:25 [1:31]

No. No no no. It’s been almost six months since I watched one of these, and more like this could make me give up entirely. The plot, to the extent that I saw it: Mickey Rooney as a crazed old Hollywood person who carries on all parts of a movie-making set of conversations as he bumps into things in an old prop warehouse…but he’s got an actress tied up as well (kidnapped and being slowly starved), and I guess that their interactions are the heart of the movie. But after 20 minutes, I just couldn’t—and wish I’d given up after ten.

I didn’t see Keenan Wynn during the chunk I watched. Looking at the IMDB reviews, I see one that values it as an experimental film and, well, I guess you can make the worst shit look like roses if you try hard enough. Another praises it for Rooney’s “extraordinarily uninhibited performance,” but several say things like “endurance test for the viewer” and “nearly unwatchable.” I’m with them: not only no redeeming value, but really nasty. No rating.

Death in the Shadows (orig. De prooi), 1985, color. Vivian Peters (dir.), Maayke Bouten, Erik de Vries, Johan Leysen, Marlous Fluitsma. 1:37.

This one’s pretty good—with plenty of mystery, although the metamystery’s easy enough to resolve. (The metamystery: why is a 1985 color film available in a Mill Creek Entertainment set? The answer: it’s from the Netherlands, has no stars known in America, and wouldn’t have done well as a U.S. release.)

In brief: an almost-18-year-old young woman finds that her mother was killed—and that her mother didn’t have any children. The young woman now lives alone (and her boyfriend/lover is leaving for a big vacation as it’s the end of the school year), and—sometimes working with a police detective, sometimes ignoring his advice—wants to know what happened. In the process, she almost gets run down (which is what happened to her mother), her mother’s brother gets murdered, and she avoids death. We find out what happened.

Moody, frequently dark, fairly well done. Maayke Bouten is quite effective as the young woman, Valerie Jaspers, but this is apparently her only actual film credit (she was 21 at the time, so 18 isn’t much of a stretch: she also did one TV movie and appeared as herself on a TV show). Not fast-moving and no flashy special effects, but a pretty good film. $1.50.

Born to Win, 1971, color. Ivan Passer (dir.), George Segal, Paula Prentiss, Karen Black, Jay Fletcher, Hector Elizondo, Robert De Niro. 1:28 [1:24]

The disc sleeve identifies Robert De Niro as the star here, but this is very much a George Segal flick, with Karen Black and others—although De Niro’s in it (for some reason feeling to me like Billy Crystal playing Robert De Niro). The movie’s about a junkie (Segal) and…well, it’s about an hour and 24 minutes long.

Beyond that: poor editing, worse scriptwriting, continuity that deserves a “dis” in front of it. I got a hint in the first five minutes that this was going to have what you might call an “experimental” narrative arc, and so it was. Pretty dreary, all in all. Yes, it’s a low-budget indie with a great cast, but… (I will say: most IMDB reviews seem very positive. Good for them.) Charitably, for George Segal or Karen Black fans, maybe $0.75.

A Killing Affair, 1986, color. David Saperstein (dir.), Peter Weller, Kathy Baker, John Glover. 1:40.

A juicy chunk of Southern Gothic—set in West Virginia in 1943, starring Kathy Baker as the wife (or, really, property) of a mill foreman who’s ripping off the employees, openly sleeping with other women, and generally a piece of work. A stranger comes to…well, not so much town as the house across the lake from town where Baker lives (with her children on weekends—during the week, they stay in town with her brother, the preacher who clearly believes that women are to Obey their husbands).

Ah, but shortly before the stranger (Peter Weller) shows up, she discovers that her rotten husband is now hanging in the shed, very much dead. She makes some efforts to get help but isn’t quite willing to walk two miles to town (the boat’s gone), so… Anyway, the stranger shows up and Plot happens. Part of it: he admits to killing her husband, but claims her husband killed his wife and children and was about to shoot him. And there are all sorts of family secrets involved in her past. A pack of wild dogs also plays a role throughout the flick, especially in the climax.

Languid most of the time, with an unsurprising ending. Not terrible, not great; Weller’s a pretty convincing mentally unstable (but smooth!) killer, and Baker’s pretty much always good, and certainly is here. (How does a movie this recent and plausibly good wind up in a cheap collection? I have no idea.) I’ll give it $1.25.

The Gold OA Landscape 2011-2014: malware and some side notes

Thursday, October 1st, 2015

First, a very brief status report. As of this morning, the book has sold five copies (four paperback, one ebook)–exactly the same numbers as a week ago, September 24, 2015. This is, how you say, not especially rapid progress toward the twin goals of making the data available and carrying forward the research into 2016. (Meanwhile, the October 2015 Cites & Insights has been downloaded at least 1,300 times so far–about 85% of those downloads being the more-readable single-column version of this excerpted version of The Gold OA Landscape 2011-2014. If one out of every 20 downloads yielded a sale of the book, that would meet the data-availability goal and probably the next-year’s-research goal…)

Second, a sad note. An article–which I’d seen from two sources before publication–that starts by apparently assuming Beall’s lists are something other than junk, then bases an investigation on sampling from the lists, has appeared in a reputable OA journal and, of course, is being picked up all over the place…with Beall being quoted, naturally, thus making the situation worse. I was asked for comments by another reporter (haven’t seen whether the piece has appeared and whether I’m quoted), and the core of my comments was that it’s hard to build good research based on junk, and I regard Beall’s lists as junk, especially given his repeated condemnation of all OA–and, curiously, his apparent continuing belief that author-side charges, which in the Bealliverse automatically corrupt scholarship, only happen in OA (page charges are apparently mythical creatures in the Bealliverse). So, Beall gains even more credibility; challenging him becomes even more hopeless. [See this followup post]

Third, a somewhat better note: Cheryl LaGuardia has published “An Interview with Peter Suber” in her “Not Dead Yet” column at Library Journal. If you haven’t already read it, you should. A couple of key quotes (in my opinion):

Not all librarians are well-informed about OA, but as a class they’re much better informed than faculty.

First, scam OA journals do exist, just as scam subscription journals exist. On the other side, first-rate OA journals also exist, just as first-rate subscription journals also exist. There’s a full range of quality on both sides of the line. Authors often need help identifying the first-rate OA journals, or at least steering clear of the frauds, and librarians can help with that. The Directory of Open Access Journals (DOAJ) is a “white list” of trustworthy OA journals…

I used to think [“hybrid” OA] was good, since at least it gave publishers first-hand experience with the economics of fee-based OA journals. But I changed my mind about that years ago. Because these journals still have subscriptions, they have no incentive to make the OA option attractive. The economics are artificial. Moreover, as I mentioned, most hybrid OA journals double-dip, which is dishonest. But even when it’s honest, it’s still a small OA step that’s often mistaken for a big step.

Finally, the direct tie-in to the book…and to the second quote from the Suber interview.


The excerpted version omits the whole section on exclusions–DOAJ-listed journals that weren’t included in the study for a variety of reasons. In most cases, it’s not necessarily that these journals are scam journals (the term “predatory” has been rendered meaningless in this context) but that, for one reason or another, they either don’t fit my definition of a gold OA journal devoted to peer-reviewed articles or that I was simply unable to analyze them properly.

One unfortunate subcategory includes 65 journals, which is 65 more than should appear in this category: journals with malware issues. My best guess is that some of these will disappear from DOAJ and that others either try too hard for ad revenue (accepting ads that incorporate malware) or have been badly designed, or for that matter use some convenient add-in for the website that just happens to carry malware. I don’t believe there’s any excuse for a journal to raise malware cautions–even if some of the defense tools I use might be overly cautious. (I added Malwarebytes after an OA journal infected my PC with a particularly nasty bit of malware, and at least two others attempted to load the same malware. It took me two days to get rid of the crap, and I have no interest in repeating that process. McAfee Site Advisor seems to be omnipresent in browsers and new computers, and since it’s now part of Intel I see no reason to distrust it.)

In any case: since it doesn’t look like OA publishers are rushing to buy the book and dig through it (I know, it’s early days yet), I’ll include that section here–the single case in which I actually list journal titles other than PLOS One (which I mention by name in the book because I excluded it from subject and segment discussions in order to avoid wrecking averages and distributions, since it is more than six times as large as any other OA journal).

Here’s the excerpt:

M: Malware

When attempting to reach these journals’ webpages, either Microsoft Office, McAfee Site Advisor, Windows Defender or Malwarebytes Anti-Malware threw up a caution screen indicating that the site had malware of some sort. (Actually, in one case the website got past all four—and showed an overlay that was a clear phishing attempt.)

In some few cases, the warning was a McAfee “yellow flag”; in most, it was either a McAfee red flag or Malwarebytes blocked the site.

Given that I encountered a serious virus with at least three different journals in a previous pass (getting rid of the virus is one reason I now run Malwarebytes as well as Windows Defender; note that I do not run McAfee’s general suite, but only the free Site Advisor that flags suspicious websites on the fly), I was not about to ignore the warnings and go look at the journals. I’d guess that, in some cases, the malware is in an ad on the journal page. In any case, it’s simply not acceptable for an OA journal to have malware or even possible malware.

I find it sad that there are 65 of these. They are not dominated by any one country of publication: 27 countries are represented among the 65 offending sites, although only a dozen have more than one each. The countries with more than three possible-malware journals include Germany and India (seven each), Brazil (six), Romania and the Russian Federation (five each), and the United States (four).

Malware Possibilities

While this report generally avoids naming individual journal titles or publishers, since it’s intended as an overall study, I think it’s worth making an exception for these 65 cases. These journals may have fixed their problems, but I’d approach with caution:

Acta Medica Transilvanica

Algoritmy, Metody i Sistemy Obrabotki Dannyh

Analele Universitatii din Oradea, Fascicula Biologie

Andhra Pradesh Journal of Psychological Medicine

Annals and Essences of Dentistry

Applied Mathematics in Engineering, Management and Technology

Avances en Ciencias e Ingeniería


Breviário de Filosofia Pública

Chinese Journal of Plant Ecology

Communications in Numerical Analysis

Confines de Relaciones Internacionales y Ciencia Política

Contemporary Materials

Data Envelopment Analysis and Decision Science



Economic Sociology

Education Research Frontier


European Journal of Environmental Sciences

Exatas Online

Filosofiâ i Kosmologiâ

Forum for Inter-American Research (Fiar)


Global Engineers and Technologists Review

Health Sciences and Disease

Impossibilia : Revista Internacional de Estudios Literarios

International Journal of Academic Research in Business and Social Sciences

International Journal of Ayurvedic Medicine

International Journal of Educational Research and Technology

International Journal of Information and Communication Technology Research

International Journal of Pharmaceutical Frontier Research

İşletme Araştırmaları Dergisi

Journal of Behavioral Science for Development

Journal of Community Nutrition & Health

Journal of Interpolation and Approximation in Scientific Computing

Journal of Management and Science

Journal of Nonlinear Analysis and Application

Journal of Numerical Mathematics and Stochastics

Journal of Soft Computing and Applications

Journal of Wetlands Environmental Management

Kritikos. Journal of postmodern cultural sound, text and image

Latin American Journal of Conservation

Mathematics Education Trends and Research

Nesne Psikoloji Dergisi

Networks and Neighbours


Potravinarstvo : Scientific Journal for Food Industry

Proceedings of the International Conference Nanomaterials : Applications and Properties

Psihologičeskaâ Nauka i Obrazovanie

Psikiyatride Guncel Yaklasimlar

Regionalʹnaâ Èkonomika i Upravlenie: Elektronnyi Nauchnyi Zhurnal

Revista Caribeña de Ciencias Sociales

Revista de Biologia Marina y Oceanografia

Revista de Educación en Biología

Revista de Engenharia e Tecnologia

Revista de Estudos AntiUtilitaristas e PosColoniais

Revista Pădurilor

Romanian Journal of Regional Science

Studii de gramatică contrastivă

Tecnoscienza : Italian Journal of Science & Technology Studies

Tekhnologiya i Konstruirovanie v Elektronnoi Apparature

Vestnik Volgogradskogo Gosudarstvennogo Universiteta. Seriâ 4. Istoriâ, Regionovedenie, Meždunarodnye Otnošeniâ