Cites & Insights 15:10 (November 2015) available

October 5th, 2015

Cites & Insights 15:10 (November 2015) is now available for downloading at

This print-oriented two-column version is 38 pages long. If you plan to read the issue on a tablet or computer, you may prefer the 6″x9″ single column version, 74 pages long, which is available at

Unlike the book-excerpt October 2015 issue, there’s no advantage to the single-column version (other than its being single-column), and copyfitting has only been done on the two-column version. (As has been true for a couple of months, both versions do include links, bookmarks and visible bolding.)

This issue includes the following essays, stepping away from open access for a bit:

The Front: A Fair Use Trilogy   p. 1

A few notes about the rest of the issue–and a status report on The Gold OA Landscape 2011-2014.

Policy: Google Books: The Neverending Story?  pp. 1-18

Three years of updates on the seemingly endless Google Books story, which has now become almost entirely about fair use.

Policy: Catching Up on Fair Use  pp. 18-24

A handful of items regarding fair use that don’t hinge on Google Books or HathiTrust.

Intersections: Tracking the Elephant: Notes on HathiTrust  pp. 24-38

Pretty much what the title says, and again the main thrust appears to be fair use. (The elephant? Read the essay, including a little bit of Unicode.)


Careful reading and questionable extrapolation

October 2nd, 2015

On October 1, 2015 (yesterday, that is), I posted “The Gold OA Landscape 2011-2014: malware and some side notes,” including this paragraph:

Second, a sad note. An article–which I’d seen from two sources before publication–that starts by apparently assuming Beall’s lists are something other than junk, then bases an investigation on sampling from the lists, has appeared in a reputable OA journal and, of course, is being picked up all over the place…with Beall being quoted, naturally, thus making the situation worse. I was asked for comments by another reporter (haven’t seen whether the piece has appeared and whether I’m quoted), and the core of my comments was that it’s hard to build good research based on junk, and I regard Beall’s lists as junk, especially given his repeated condemnation of all OA–and, curiously, his apparent continuing belief that author-side charges, which in the Bealliverse automatically corrupt scholarship, only happen in OA (page charges are apparently mythical creatures in the Bealliverse). So, Beall gains even more credibility; challenging him becomes even more hopeless.

When I’d looked at the article, twice, I’d had lots of questions about the usefulness of extrapolating article volumes and, indeed, active journal numbers from a rather small sampling of journals within an extremely heterogeneous space–but, glancing back at my own detailed analysis of journals in those lists (which, unlike the article, was a full survey, not a sampling), I was coming up with article volumes that, while lower, were somewhere within the same ballpark (although the number of active journals was less than half that estimated in the article). (The article is “‘Predatory’ open access: a longitudinal study of article volumes and market characteristics” by Cenyu Shen and Bo-Christer Björk; it’s just been published.)

Basically, the article extrapolated 8,000 active “predatory” journals publishing around 420,000 articles in 2014, based on a sampling of fewer than 700 journals. And, while I showed only 3,876 journals (I won’t call them “predatory” but they were in the junk lists) active at some point between 2011 and June 2014, I did come up with a total volume of 323,491 articles–so I was focusing my criticism of the article on the impossibility of basing good science on junk foundations.

Now, go back and note the italicized word two paragraphs above: “glancing.” Thanks to an email exchange with Lars Bjørnshauge at DOAJ, I went back and read my own article more carefully–that is, actually reading the text, not just glancing at the figures. Turns out 323,491 is the total volume of articles for 3.5 years (2011 through June 30, 2014). The annual total for 2013 was 115,698; the total for the first half of 2014 was 67,647, so it’s fair to extrapolate that the 2014 annual total would be under 150,000.

That’s a huge difference: not only is the article’s active-journal total more than twice as high as my own (non-extrapolated, based on a full survey) number, the article total is nearly three times as high. That shouldn’t be surprising: the article is based on extrapolations from a small number of journals in an extremely heterogeneous universe, and all the statistical formulae in the world don’t make that level of extrapolation reliable.
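For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the figures quoted in this post. The variable names are illustrative, and doubling the first-half 2014 count is an assumption (it treats the second half of 2014 as a repeat of the first); the 150,000 ceiling is the more generous figure used above.

```python
# Figures quoted in the post; nothing here comes from outside it.
survey_h1_2014 = 67_647       # articles, January through June 2014 (full survey)
survey_active = 3_876         # journals active at some point, 2011 to June 2014
est_articles_2014 = 420_000   # Shen and Björk estimate for 2014
est_active = 8_000            # Shen and Björk estimate of active journals

# Assumption: the second half of 2014 matches the first half.
doubled_2014 = 2 * survey_h1_2014
ceiling_2014 = 150_000        # the "under 150,000" ceiling from the text

journal_ratio = est_active / survey_active
article_ratio_vs_ceiling = est_articles_2014 / ceiling_2014
article_ratio_vs_doubled = est_articles_2014 / doubled_2014

print(f"Doubled first-half 2014 count: {doubled_2014:,}")
print(f"Active journals, estimate vs. survey: {journal_ratio:.2f}x")
print(f"Articles, estimate vs. 150,000 ceiling: {article_ratio_vs_ceiling:.2f}x")
print(f"Articles, estimate vs. doubled count: {article_ratio_vs_doubled:.2f}x")
```

Either way the comparison is run, the article estimate lands at roughly three times the survey-based figure.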

Shen and Björk ignored my work, either because it’s not Properly Published or because they weren’t aware of it (although I’m pretty sure Björk knows of my work). They say “It would have taken a lot of effort to manually collect publication volumes” for all the journals on the list. That’s true: it was a lot of effort. Effort which I carried out. Effort which results in dramatically lower counts for the number of active journals and articles.

(As to the article’s “geographical spread of articles,” that’s based on a sample of 205 articles out of what they seem to think are about 420,000. But I didn’t look at authors so I won’t comment on this aspect.)

I should note that “active” journals includes those that published at least one article any time during the period. Since I did my analysis in late 2014 and cut off article data at June 30, 2014, it’s not surprising that the “active this year” count is lower for 2014 (3,014 journals) than for 2013 (3,282)–and I’ll agree with the article that recent growth in these journals has been aggressive: the count of active journals was 2,084 for 2012 and 1,450 for 2011.

I could speculate as to whether what I regard as seriously faulty extrapolations based on a junk foundation will get more or less publicity, citations, and credibility than counts based on a full survey–but carried out by an independent researcher using wholly transparent methodology and not published in a peer-reviewed journal. I know how I’d bet. I’d like to hope I’m wrong. (If not being peer-reviewed is a fatal problem, then a big issue in the study goes away: the junk lists are, of course, not at all peer reviewed.)



Mystery Collection Disc 45

October 2nd, 2015

The Manipulator, 1971, color. Yabo Yablonsky (dir & screenplay), Mickey Rooney, Luana Anders, Keenan Wynn. 1:25 [1:31]

No. No no no. It’s been almost six months since I watched one of these, and more like this could make me give up entirely. The plot, to the extent that I saw it: Mickey Rooney as a crazed old Hollywood person who carries all parts of a movie-making set of conversations as he bumps into things in an old prop warehouse…but he’s got an actress tied up as well (kidnapped and being slowly starved), and I guess that their interactions are the heart of the movie. But after 20 minutes, I just couldn’t–and wish I’d given up after ten.

I didn’t see Keenan Wynn during the chunk I watched. Looking at the IMDB reviews, I see one that values it as an experimental film and, well, I guess you can make the worst shit look like roses if you try hard enough. Another praises it for Rooney’s “extraordinarily uninhibited performance,” but several say things like “endurance test for the viewer” and “nearly unwatchable.” I’m with them: not only no redeeming value, but really nasty. No rating.

Death in the Shadows (orig. De prooi), 1985, color. Vivian Peters (dir.), Maayke Bouten, Erik de Vries, Johan Leysen, Marlous Fluitsma. 1:37.

This one’s pretty good—with plenty of mystery, although the metamystery’s easy enough to resolve. (The metamystery: why is a 1985 color film available in a Mill Creek Entertainment set? The answer: it’s from the Netherlands, has no stars known in America, and wouldn’t have done well as a U.S. release.)

In brief: an almost-18-year-old young woman finds that her mother was killed—and that her mother didn’t have any children. The young woman now lives alone (and her boyfriend/lover is leaving for a big vacation as it’s the end of the school year), and—sometimes working with a police detective, sometimes ignoring his advice—wants to know what happened. In the process, she almost gets run down (which is what happened to her mother), her mother’s brother gets murdered, and she avoids death. We find out what happened.

Moody, frequently dark, fairly well done. Maayke Bouten is quite effective as the young woman, Valerie Jaspers, but this is apparently her only actual film credit (she was 21 at the time, so 18 isn’t much of a stretch: she also did one TV movie and appeared as herself on a TV show). Not fast-moving and no flashy special effects, but a pretty good film. $1.50.

Born to Win, 1971, color. Ivan Passer (dir.), George Segal, Paula Prentiss, Karen Black, Jay Fletcher, Hector Elizondo, Robert De Niro. 1:28 [1:24]

The disc sleeve identifies Robert De Niro as the star here, but this is very much a George Segal flick, with Karen Black and others—although De Niro’s in it (for some reason feeling to me like Billy Crystal playing Robert De Niro). The movie’s about a junkie (Segal) and…well, it’s about an hour and 24 minutes long.

Beyond that: poor editing, worse scriptwriting, continuity that deserves a “dis” in front of it. I got a hint in the first five minutes that this was going to have what you might call an “experimental” narrative arc, and so it was. Pretty dreary, all in all. Yes, it’s a low-budget indie with a great cast, but… (I will say: most IMDB reviews seem very positive. Good for them.) Charitably, for George Segal or Karen Black fans, maybe $0.75.

A Killing Affair, 1986, color. David Saperstein (dir.), Peter Weller, Kathy Baker, John Glover. 1:40.

A juicy chunk of Southern Gothic–set in West Virginia in 1943, starring Kathy Baker as the wife (or, really, property) of a mill foreman who’s ripping off the employees, openly sleeping with other women, and generally a piece of work. A stranger comes to…well, not so much town as the house across the lake from town where Baker lives (with her children on weekends–during the week, they stay in town with her brother, the preacher who clearly believes that women are to Obey their husbands).

Ah, but shortly before the stranger (Peter Weller) shows up, she discovers that her rotten husband is now hanging in the shed, very much dead. She makes some efforts to get help but isn’t quite willing to walk two miles to town (the boat’s gone), so… Anyway, the stranger shows up and Plot happens. Part of it: he admits to killing her husband, but claims her husband killed his wife and children and was about to shoot him. And there are all sorts of family secrets involved in her past. A pack of wild dogs also plays a role throughout the flick, especially in the climax.

Languid most of the time, with an unsurprising ending. Not terrible, not great; Weller’s a pretty convincing mentally unstable (but smooth!) killer, and Baker’s pretty much always good, and certainly is here. (How does a movie this recent and plausibly good wind up in a cheap collection? I have no idea.) I’ll give it $1.25.

The Gold OA Landscape 2011-2014: malware and some side notes

October 1st, 2015

First, a very brief status report. As of this morning, the book has sold five copies (four paperback, one ebook)–exactly the same numbers as a week ago, September 24, 2015. This is, how you say, not especially rapid progress toward the twin goals of making the data available and carrying forward the research into 2016. (Meanwhile, the October 2015 Cites & Insights has been downloaded at least 1,300 times so far–about 85% of those downloads being the more-readable single-column version of this excerpted version of The Gold OA Landscape 2011-2014. If one out of every 20 downloads yielded a sale of the book, that would meet the data-availability goal and probably the next-year’s-research goal…)

Second, a sad note. An article–which I’d seen from two sources before publication–that starts by apparently assuming Beall’s lists are something other than junk, then bases an investigation on sampling from the lists, has appeared in a reputable OA journal and, of course, is being picked up all over the place…with Beall being quoted, naturally, thus making the situation worse. I was asked for comments by another reporter (haven’t seen whether the piece has appeared and whether I’m quoted), and the core of my comments was that it’s hard to build good research based on junk, and I regard Beall’s lists as junk, especially given his repeated condemnation of all OA–and, curiously, his apparent continuing belief that author-side charges, which in the Bealliverse automatically corrupt scholarship, only happen in OA (page charges are apparently mythical creatures in the Bealliverse). So, Beall gains even more credibility; challenging him becomes even more hopeless. [See this followup post]

Third, a somewhat better note: Cheryl LaGuardia has published “An Interview with Peter Suber” in her “Not Dead Yet” column at Library Journal. If you haven’t already read it, you should. A couple of key quotes (in my opinion):

Not all librarians are well-informed about OA, but as a class they’re much better informed than faculty.

First, scam OA journals do exist, just as scam subscription journals exist. On the other side, first-rate OA journals also exist, just as first-rate subscription journals also exist. There’s a full range of quality on both sides of the line. Authors often need help identifying the first-rate OA journals, or at least steering clear of the frauds, and librarians can help with that. The Directory of Open Access Journals (DOAJ) is a “white list” of trustworthy OA journals…

I used to think [“hybrid” OA] was good, since at least it gave publishers first-hand experience with the economics of fee-based OA journals. But I changed my mind about that years ago. Because these journals still have subscriptions, they have no incentive to make the OA option attractive. The economics are artificial. Moreover, as I mentioned, most hybrid OA journals double-dip, which is dishonest. But even when it’s honest, it’s still a small OA step that’s often mistaken for a big step.

Finally, the direct tie-in to the book…and to the second quote from the Suber interview.


The excerpted version omits the whole section on exclusions–DOAJ-listed journals that weren’t included in the study for a variety of reasons. In most cases, it’s not necessarily that these journals are scam journals (the term “predatory” has been rendered meaningless in this context) but that, for one reason or another, they either don’t fit my definition of a gold OA journal devoted to peer-reviewed articles or that I was simply unable to analyze them properly.

One unfortunate subcategory includes 65 journals, which is 65 more than should appear in this category: journals with malware issues. My best guess is that some of these will disappear from DOAJ and that others either try too hard for ad revenue (accepting ads that incorporate malware) or have been badly designed, or for that matter use some convenient add-in for the website that just happens to carry malware. I don’t believe there’s any excuse for a journal to raise malware cautions–even if some of the defense tools I use might be overly cautious. (I added Malwarebytes after an OA journal infected my PC with a particularly nasty bit of malware, and at least two others attempted to load the same malware. It took me two days to get rid of the crap, and I have no interest in repeating that process. McAfee Site Advisor seems to be omnipresent in browsers and new computers, and since it’s now part of Intel I see no reason to distrust it.)

In any case: since it doesn’t look like OA publishers are rushing to buy the book and dig through it (I know, it’s early days yet), I’ll include that section here–the single case in which I actually list journal titles other than PLOS One (which I mention by name in the book because I excluded it from subject and segment discussions in order to avoid wrecking averages and distributions, since it is more than six times as large as any other OA journal).

Here’s the excerpt:

M: Malware

When attempting to reach these journals’ webpages, either Microsoft Office, McAfee Site Advisor, Windows Defender or Malwarebytes Anti-Malware threw up a caution screen indicating that the site had malware of some sort. (Actually, in one case the website got past all four—and showed an overlay that was a clear phishing attempt.)

In some few cases, the warning was a McAfee “yellow flag”; in most, it was either a McAfee red flag or Malwarebytes blocked the site.

Given that I encountered a serious virus with at least three different journals in a previous pass (getting rid of the virus is one reason I now run Malwarebytes as well as Windows Defender; note that I do not run McAfee’s general suite, but only the free Site Advisor that flags suspicious websites on the fly), I was not about to ignore the warnings and go look at the journals. I’d guess that, in some cases, the malware is in an ad on the journal page. In any case, it’s simply not acceptable for an OA journal to have malware or even possible malware.

I find it sad that there are 65 of these. They are not dominated by any one country of publication: 27 countries are represented among the 65 offending sites, although only a dozen have more than one each. The countries with more than three possible-malware journals include Germany and India (seven each), Brazil (six), Romania and the Russian Federation (five each), and the United States (four).

Malware Possibilities

While this report generally avoids naming individual journal titles or publishers, since it’s intended as an overall study, I think it’s worth making an exception for these 65 cases. These journals may have fixed their problems, but I’d approach with caution:

Acta Medica Transilvanica

Algoritmy, Metody i Sistemy Obrabotki Dannyh

Analele Universitatii din Oradea, Fascicula Biologie

Andhra Pradesh Journal of Psychological Medicine

Annals and Essences of Dentistry

Applied Mathematics in Engineering, Management and Technology

Avances en Ciencias e Ingeniería


Breviário de Filosofia Pública

Chinese Journal of Plant Ecology

Communications in Numerical Analysis

Confines de Relaciones Internacionales y Ciencia Política

Contemporary Materials

Data Envelopment Analysis and Decision Science



Economic Sociology

Education Research Frontier


European Journal of Environmental Sciences

Exatas Online

Filosofiâ i Kosmologiâ

Forum for Inter-American Research (Fiar)


Global Engineers and Technologists Review

Health Sciences and Disease

Impossibilia : Revista Internacional de Estudios Literarios

International Journal of Academic Research in Business and Social Sciences

International Journal of Ayurvedic Medicine

International Journal of Educational Research and Technology

International Journal of Information and Communication Technology Research

International Journal of Pharmaceutical Frontier Research

İşletme Araştırmaları Dergisi

Journal of Behavioral Science for Development

Journal of Community Nutrition & Health

Journal of Interpolation and Approximation in Scientific Computing

Journal of Management and Science

Journal of Nonlinear Analysis and Application

Journal of Numerical Mathematics and Stochastics

Journal of Soft Computing and Applications

Journal of Wetlands Environmental Management

Kritikos. Journal of postmodern cultural sound, text and image

Latin American Journal of Conservation

Mathematics Education Trends and Research

Nesne Psikoloji Dergisi

Networks and Neighbours


Potravinarstvo : Scientific Journal for Food Industry

Proceedings of the International Conference Nanomaterials : Applications and Properties

Psihologičeskaâ Nauka i Obrazovanie

Psikiyatride Guncel Yaklasimlar

Regionalʹnaâ Èkonomika i Upravlenie: Elektronnyi Nauchnyi Zhurnal

Revista Caribeña de Ciencias Sociales

Revista de Biologia Marina y Oceanografia

Revista de Educación en Biología

Revista de Engenharia e Tecnologia

Revista de Estudos AntiUtilitaristas e PosColoniais

Revista Pădurilor

Romanian Journal of Regional Science

Studii de gramatică contrastivă

Tecnoscienza : Italian Journal of Science & Technology Studies

Tekhnologiya i Konstruirovanie v Elektronnoi Apparature

Vestnik Volgogradskogo Gosudarstvennogo Universiteta. Seriâ 4. Istoriâ, Regionovedenie, Meždunarodnye Otnošeniâ


How many articles, how many journals?

September 30th, 2015

Thanks to Dietrich Rordorf’s comment on a post at The Scholarly Kitchen, I am now aware of MDPI’s sciforum (the link is to the journal reviews/statistics section), which among other things “aims at publishing statistics and rankings of scientific and scholarly journals and their Publishers.” Quoting from the disclaimer:

Statistics are automatically computed from available data, and are not manually curated. While we have made every possible effort to provide meaningful statistics, we can not guarantee the correctness or accuracy of any of the statistics. Statistics might be recomputed anytime without notice. Access to statistics might be disabled anytime without notice.

Quoting the section on Open Access:

Data about papers published under open access licenses are currently collected from two providers: DOAJ (licensing information available on journal-level but only for journals that publish exclusively under open access licenses), and Publisher website metadata (licensing information available on a per paper-level for some Publishers). We will include PubMed Central article-level licensing data in a future update. Because many hybrid journals do not offer metadata or licensing information which are easily machine-readable, the statistics about open access content are likely too low. E.g. JR reports 270’000 open access papers for a total market size of roughly 2.4 million papers for 2013 (which is about 11% open access papers). In reality the share of open access papers might be much higher if all papers published under open licenses in hybrid journals could be easily and properly counted). Green open access, i.e. self-archived pre- or post-prints are currently not included.

The section also specifies where data comes from.

I’m delighted there is such a source. Think of the rest of this as additional data (yes, I’ll be emailing a note to Rordorf, but I’m not sure how he can blend manually-counted and automatically-gathered data).

Note that I don’t include “hybrid” OA in my counts at all, partly because there’s no good way to count it, partly because it’s consistently the most expensive form of OA and, I believe, the wrong way to go about OA. Neither does this site because there’s no good way to count it.

Number of Journals

Sciforum shows 25,064 journals publishing at least one article in 2014, of which 3,693 journals are Gold OA—and the chart shows that as being down from 3,990 with at least one article in 2013.

My study, excluding questionable journals and those not in DOAJ, shows 8,760 gold OA journals publishing at least one article in 2014, but that number is down slightly, from 8,960 in 2013.

That’s an enormous difference, one that I believe speaks to the limitations (at this point) of automated data gathering for OA. Even leaving out the global south can’t really account for omitting more than half of the active journals. (Sciforum does not indicate that it’s limited to STM, so I’m assuming it’s not.)

Number of Articles

Sciforum shows 2,423,122 articles in 2014, up from 2,248,966 in 2013. So “roughly 2.4 million” seems like a plausible estimate for the total article production.

But: Sciforum shows 302,339 OA articles in 2014, 279,967 in 2013, 250,237 in 2012 and 196,508 in 2011.

The Gold OA Landscape 2011-2014 shows 482,361 articles in 2014, 440,843 in 2013, 394,374 in 2012 and 321,312 in 2011.

Those are also enormous differences, although slightly smaller percentage-wise. To wit, my actual count (again omitting hybrids, questionable journals and journals not in DOAJ) is 60% higher for 2014, 57% higher for 2013, 58% higher for 2012 and 64% higher for 2011.

If we assume that subscription journals can be measured more accurately through automated means (an assumption I’m a little wary of making), then actual article totals for 2014 are around 2.6 million, of which around 18% are in gold OA journals.
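The percentages and the adjusted 2014 total above can be reproduced with a short Python sketch. It uses only the counts quoted in this post; the variable names are illustrative, and the adjustment simply swaps Sciforum’s gold OA count for the manual count, which assumes the two counts cover comparable journals.

```python
# Gold OA article counts quoted above, keyed by year.
sciforum_oa = {2011: 196_508, 2012: 250_237, 2013: 279_967, 2014: 302_339}
landscape_oa = {2011: 321_312, 2012: 394_374, 2013: 440_843, 2014: 482_361}
sciforum_all_2014 = 2_423_122  # Sciforum's count of all 2014 articles

# How much higher the manual count is than the automated one, in percent.
pct_higher = {
    year: round(100 * (landscape_oa[year] / sciforum_oa[year] - 1))
    for year in sciforum_oa
}

# Assumption: replace Sciforum's OA count with the manual count and
# leave the subscription-journal portion untouched.
adjusted_total_2014 = sciforum_all_2014 - sciforum_oa[2014] + landscape_oa[2014]
gold_share_2014 = landscape_oa[2014] / adjusted_total_2014

for year in sorted(pct_higher):
    print(f"{year}: manual count is {pct_higher[year]}% higher")
print(f"Adjusted 2014 total: {adjusted_total_2014:,}")
print(f"Gold OA share of adjusted total: {gold_share_2014:.1%}")
```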

My main takeaway: at this point, automated data gathering severely undercounts the OA field—which, given the ludicrous amount of time spent gathering data manually (but hey! book sales are already up to…well, five copies so far), is at least a trifle reassuring.

The Gold OA Landscape and Outsell’s Open Access 2015

September 29th, 2015

I thought it might be interesting to look at Outsell’s Open Access 2015: Market Size, Share, Forecast and Trends (which the CCC seems to have made openly available, or at least it was for a while–if it’s back to $2,500, my apologies) in light of The Gold OA Landscape 2011-2014. Is there anything useful or mildly controversial to say?

Definitional Differences

Outsell estimates OA as “about 1.1% of the total 2014 STM market and 4.3% of the STM journals market”—but remember that this is a dollar share, not magnitude. Also, it’s STM, while a sizable chunk of the OA universe as I measured it is humanities and social sciences.

The methodology is entirely different, of course: Outsell develops estimates where I tried to do a universal count. Outsell also has industry contacts, which I entirely lack.

A third key definitional difference, especially given Outsell’s finding: I explicitly exclude “hybrid” publishing—not only because I think it’s a trap but because it’s essentially impossible to count.

Also, Outsell seems to define “megajournals” based on crossing subject boundaries, which strikes me as odd; I’d define them as journals with very large numbers of articles. There are hundreds of interdisciplinary journals; there are only a few very large OA journals.

Questionable Statements

Outsell says (p. 5) “Hybrid currently prevails as the Gold model.” If “prevails” means there are more hybrid journals than there are Gold OA journals (by my definition and that of DOAJ, a hybrid journal cannot be gold OA; otherwise it wouldn’t be hybrid), then that may be true, as so many big publishers will happily take big bucks to make an article OA (sort of) in any of their journals. If “prevails” means that there are more OA articles published in hybrid journals than in Gold OA journals, that would be astonishing: it would mean there were close to a million OA articles published in 2014 (not including green OA). I find that hard to believe, and I don’t know how Outsell could determine this number, so I’ll have to assume “prevails” has to do with number of journals.

On pages 5-6, Outsell offers a proliferation of terms and models including “Platinum” OA, “Gold for Gold” and more. As you read this section, it’s ever more clear that the point of Outsell’s report is to inform publishers how they can best maintain and increase revenues—so, for example, institutionally-sponsored OA is not included in the list of Gold models because “it does not produce revenues.” If it doesn’t generate $Gold, it isn’t worth discussing.

Market Size and Forecast

Outsell estimates the 2014 total STM market at a breathtaking $26.2 billion but the serials market at “only” $6.8 billion. Given other estimates of $10 billion, I can only wonder whether this means that the non-STM journals market is $3.2 billion (which seems unlikely) or whether something else is going on.

Outsell’s numbers for APCs are $290.4 million in 2014, $252.3 million in 2013, and $171.9 million in 2012. But those numbers include articles in “hybrid” journals, which Outsell says are the more prevalent model.

My own figures–not allowing for discounts and waivers, but also not including hybrid publications–are $305.4 million in 2014, $241.9 million in 2013, and $195.5 million in 2012. Those numbers do include the humanities and social sciences, but HSS only accounts for $9.5 million in 2014, $7.7 million in 2013 and $6.6 million in 2012.

What’s interesting (to me) is that, if you subtract HSS from my figures, they’re not wildly different from Outsell’s numbers (Outsell’s numbers are about 2% lower in 2014, 8% higher in 2013, and 9% lower in 2012). I could conclude that hybrid APCs don’t really amount to much, or that Outsell defines STM more narrowly than my STEM+Biomed, or…
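The year-by-year comparison can be checked with a small Python sketch. All dollar figures (in millions) are the ones quoted above; the variable names are illustrative only.

```python
# APC revenue estimates in millions of dollars, as quoted above.
outsell_apc = {2012: 171.9, 2013: 252.3, 2014: 290.4}
landscape_apc = {2012: 195.5, 2013: 241.9, 2014: 305.4}
landscape_hss = {2012: 6.6, 2013: 7.7, 2014: 9.5}

# Outsell's figure relative to my figure with HSS removed, in percent.
# Negative means Outsell is lower.
outsell_vs_stm = {}
for year in sorted(outsell_apc):
    stm_only = landscape_apc[year] - landscape_hss[year]
    outsell_vs_stm[year] = 100 * (outsell_apc[year] / stm_only - 1)
    print(f"{year}: STM-only ${stm_only:.1f}M, Outsell {outsell_vs_stm[year]:+.1f}%")
```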


Outsell says that there are approximately 20 megajournals—but also that they published fewer articles in 2014 than in 2013. Looking at the graph, it appears that Outsell is saying that all megajournals put together published about 40,000 articles in 2014, as compared to 43,000 or so in 2013 and 25,000 or so (?) in 2012.

But it’s hard to tell what Outsell considers to be a megajournal. Let’s look at some subsets within the DOAJ landscape (among the 9,512 I report on fully):

  • Journals with at least 1,000 articles in some year 2011-2014: There are 40 of these. I count the article totals as 38,830 in 2011; 56,827 in 2012; 70,986 in 2013; and 88,315 in 2014. That’s a much smaller percentage increase from 2012 to 2013 than Outsell’s graph seems to show—but my figures show between 24% and 25% increase in articles from 2012 to 2013 and again from 2013 to 2014. So: I’m showing a healthy increase from 2013 to 2014, not a decrease.
  • Cut that down to journals with at least 1,500 articles in one of those years (14 of them), and you still get a healthy increase each year: 27,052 in 2011; 39,683 in 2012; 49,131 in 2013; and 61,844 in 2014.
  • Trim it to the eight journals with at least 2,000 articles in one year, and there’s still an increase in each year: 23,461 in 2011; 35,145 in 2012; 43,667 in 2013; and 51,646 in 2014.
  • Include only “other sciences” (my term for interdisciplinary STM journals) and PLOS One, and use 1,000 as the cutoff, and I get 17,167 articles in 2011; 28,168 in 2012; 38,604 in 2013; and 43,443 in 2014.
  • If I include PLOS One and all “Other Sciences” journals (171 journals in all), I get 23,041 articles in 2011; 35,502 in 2012; 47,607 in 2013; and 54,210 in 2014.
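To see the year-over-year growth in each of these subsets at a glance, here is a minimal Python sketch built from the counts in the list above. The subset labels and the helper name `yearly_growth` are illustrative, not anything from the book.

```python
# Article counts for 2011, 2012, 2013 and 2014, taken from the list above.
subsets = {
    "1,000+ articles (40 journals)": [38_830, 56_827, 70_986, 88_315],
    "1,500+ articles (14 journals)": [27_052, 39_683, 49_131, 61_844],
    "2,000+ articles (8 journals)": [23_461, 35_145, 43_667, 51_646],
    "Other sciences + PLOS One, 1,000 cutoff": [17_167, 28_168, 38_604, 43_443],
    "PLOS One + all other sciences (171 journals)": [23_041, 35_502, 47_607, 54_210],
}


def yearly_growth(counts):
    """Return the percentage change between each pair of consecutive years."""
    return [100 * (later / earlier - 1) for earlier, later in zip(counts, counts[1:])]


growth = {label: yearly_growth(counts) for label, counts in subsets.items()}

for label, changes in growth.items():
    formatted = ", ".join(f"{change:+.1f}%" for change in changes)
    print(f"{label}: {formatted}")
```

Every subset grows in every year, which is the point of the list: none of these definitions produces a 2013-to-2014 decline.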

I have no doubt that Outsell is able to define some set of “megajournals” that declined in article count from 2013 to 2014. Since PLOS One alone increased (very slightly, from 31,509 to 31,882), that presumably means that all the other megajournals combined went from around 11,500 articles in 2013 to around 8,000 in 2014. That’s a surprisingly large drop (although one very big and very badly run “megajournal” could account for all of it). Certainly possible—again, depending on how you define megajournals.

Competitive Landscape

Table 4 in the Outsell report is interesting if only because of the numbers: the 14 publishers listed publish a total of 11,740 journals in 2014 (which would appear to be about half of the total) but only 1,505 Gold OA journals (and that includes Hindawi’s 438 and the whole set of BioMed Central journals), fewer than one-sixth of the Gold OA total. (Take away Hindawi, which only publishes Gold OA journals, and you’re down to less than one-eighth of the gold OA total). But note that those 14 publishers claim to have 8,404 hybrid journals.

Ah, but Outsell’s only looking at STM. That does make a difference, because there are so many HSS gold OA journals (more than 4,000 in my study): If you remove those from my count, I’m left with just under 5,500 journals. The 14 biggies still account for less than one-third of the STM Gold OA total (around one-fifth without Hindawi), but that’s an improvement.
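A rough Python check of those fractions follows. Two denominators are assumptions, not figures stated for this purpose: the overall gold OA total is taken to be the 9,512 fully reported journals mentioned earlier in this post, and the STM total is rounded up to 5,500 (the text says "just under"). Variable names are illustrative.

```python
big14_gold = 1_505     # Gold OA journals from the 14 publishers in Outsell's Table 4
hindawi_gold = 438     # Hindawi's share of those

gold_total_assumed = 9_512  # assumption: the fully reported DOAJ journals
stm_total_assumed = 5_500   # assumption: upper bound for "just under 5,500"

big14_without_hindawi = big14_gold - hindawi_gold

share_all = big14_gold / gold_total_assumed
share_all_no_hindawi = big14_without_hindawi / gold_total_assumed
share_stm = big14_gold / stm_total_assumed
share_stm_no_hindawi = big14_without_hindawi / stm_total_assumed

print(f"All gold OA: {share_all:.1%} ({share_all_no_hindawi:.1%} without Hindawi)")
print(f"STM only:    {share_stm:.1%} ({share_stm_no_hindawi:.1%} without Hindawi)")
```

Because the assumed STM total is an upper bound, the real STM shares would be slightly higher than printed, but still under one-third.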


I question the supposed decline in megajournal publishing activity, but that’s a matter of definition.

I don’t particularly question the estimated APC totals—unless Outsell is seriously claiming that hybrid publications account for most APCs, in which case I’d question them a lot: I’d believe that waivers and discounts might reduce my numbers by, say, 15%-20%, but that would still leave a huge gap.

Since Outsell doesn’t estimate overall article counts at all, my primary focus, I have no comments.

In general, given differences in definition, the only fault I might find with Outsell (I can’t fault them for a laser focus on $$$) is the decline in megajournal publishing from 2013 to 2014–and, again, I’m sure they managed to define a group of journals that comes out that way.

pubs_since_1994.htm finally up

September 28th, 2015

For the thousands (well, hundreds (well, tens (well…anybody?))) of avid readers of Cites & Insights August-September 2015 who, wondering about the quotes in “A Few Words, Part 2,” clicked through to find the bibliography…

It’s (finally) there, such as it is:

My apologies for the slight delay in getting it ready. The fact that I’ve seen zero instances of anybody looking at the first part of the bibliography may have influenced the priority with which this part was prepared…

(But hey, there are lots of 404s on, as usual: I could convince myself that those were all folks looking for the second part of the bibliography. I could convince myself that I look like a slightly older George Clooney, too, but it would be equally absurd.)



Reading the way you prefer

September 28th, 2015

I ran into an odd blog post (on an ALA divisional blog) this morning–and didn’t comment directly for two reasons:

  1. I’m not a member of the division
  2. I’m hoping that I simply misread or misunderstood the post.

The post seemed to be saying that libraries/library groups should be helping to persuade younger people to do all their book reading in ebook form. (I believe it springs from the New York Times piece regarding a slowdown in ebook sales.)

Again, I’m probably misunderstanding what was being said–but I have certainly seen in the past discussions that seemed to say that the “digital shift” was not only inevitable but desirable, and that good librarians should be backing it.

And I just don’t get it.

I’ve suggested for some time that there is no such thing as an inevitable digital shift when it comes to books: that there’s no reason to believe, based on precedent or history, that ebooks would sweep away print books entirely–or that this was even a desirable thing.

I’ve tried to be consistent in saying what the title of this post suggests. Expanding:

  • It seems likely that some people will prefer to do all or most of their extended-narrative reading on digital devices, either because they like them better, they’re more convenient, they believe they should do so…or for whatever reasons.
  • It seems likely that some people will prefer to do all or most of their extended-narrative (that is, “book”) reading from print books, either because they like them better or for whatever reasons.
  • It seems likely that some people will prefer to read some books in print form, some in digital form–and that the variety and distribution of preference will be different for different people.
  • Public libraries should not be “out ahead of the users” on such matters unless there’s a clear and consistent shift in preferences–and even then, maybe not. (Which is not to say public libraries shouldn’t provide ebook services, but maybe that they shouldn’t screw up their budgets or priorities to emphasize ebook services.)

I’ve said for some time that I expect book publishing and print book publishing to be a healthy business throughout my lifetime, with total print book revenues certainly in the billions and probably in the tens of billions of dollars per year. But I’ve tried to avoid nonsensical prophecies about the long-term balance between print and e.

Maybe ebooks will stabilize at 20% of the total book market. Maybe they’ll wind up being 25%, or 30%, or even 80% (achieving a majority is beginning to seem less likely, but I’m no prophet). Maybe there is no equilibrium level, with percentages shifting back and forth.

In any case, books should be available in the form readers prefer, public libraries should support those preferences to the best of their abilities, and it should never be a matter of shoving one medium down people’s throats (that is, preferring one medium at the expense of another despite apparent use patterns).*

Of course, I’m ancient enough to go back to all those predictions that all books would become movies (although never stated that way), because of course everybody really wants their books to be singing and dancing. It always struck me that those making such predictions weren’t really book readers, and it turns out most book readers aren’t especially interested in “enhanced books.”

Those of you who read my stuff in another area may note that I also don’t foresee OA sweeping away traditional journal publishing in any great hurry, or even in my lifetime. I’m just not much of a triumphalist or a single-path advocate. Such is life.

*I do believe a case can be made that public libraries should resist aggressively bad ebook contracts, to the extent that they effectively privilege ebooks over print books if there’s not clear evidence of similar patron preferences–but that’s part of what I’m saying.

TGOL approved for global distribution

September 25th, 2015

I’m delighted to note that The Gold OA Landscape 2011-2014 should be on its way to being available through outlets such as Amazon, Barnes & Noble and Ingram.

It could still be rejected by those channels, but it’s on its way (which means that I have now–finally–received my own copy, verified that the ISBN on the back cover and copyright page match, and determined that I’m happy with the way it looks).

So if you’re at a library that finds it much easier to purchase through Ingram or somebody that Ingram supplies, or has an account with Amazon, but can’t cope with Lulu…well, in six to eight weeks (maybe sooner) the paperback should be available. (If you can deal with Lulu, I much prefer that, since I get three times as much net revenue for each copy. And I’m not sure whether other agencies produce copies using the same great cream/60lb. paper Lulu uses or not…although I’ll assume they do.)

[Yes, the cover could stand a little tweaking, as my wife informs me–but whether or when that will happen is another thing. After all, several people already have or have ordered the current version, slightly low author’s name and all. Incidentally: the gold background is precisely the color of the OA open lock; I downloaded a version of that icon from Wikimedia and used’s color selector to choose that color. That Excel’s orange/gold in the blue/orange graph scheme is very close to that color, is a happy accident. I think.]


The Gold OA Landscape 2011-2014: Agriculture and Update 2

September 25th, 2015

First, the update: As of early this morning (September 25, 2015), there have been 1,198 downloads of Cites & Insights 15:9 (1,061 of them the single-column version). Also as of now, sales of the book have more than doubled, to a total of four paperback copies and one PDF ebook. For those of you hoping to see an anonymized version of the data available on figshare, we’re now one-seventh of the way there; one more sale would put us one-sixth of the way there.

Now, a quick note about agriculture (which, for this study, includes aquaculture, fisheries and other aspects of raising and processing plants and animals, including food and some aspects of nutrition). It’s one field that gets only a stub chapter in the excerpted version—but it’s also a field where including the whole world of DOAJ-listed journals makes a big difference.

To wit: where the interim report and the Walt at Random post (which omits one graph) show 286 journals (excluding C, as almost all of the new book does) publishing 14,879 articles in 2014, the complete study includes 418 journals (excluding C) publishing 19,861 articles in 2014. That’s a full third more articles from 46% more journals (as expected, the added journals tend to be smaller). There’s also a shift toward no-fee journals: of those actually publishing articles in 2014, the no-fee percentage was 62% for the smaller group and 66% for the more complete set—and while it’s still true that (as with most STEM fields) a majority of articles are in APC-charging journals, the percentage of articles in no-fee journals went from 44% for the smaller group to 48% in the larger (and for that larger group, articles in no-fee journals were at least half of all articles in 2013, 2012 and 2011).
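
The "a full third more articles from 46% more journals" comparison can be checked with a few lines of arithmetic. This is only a sketch using the four figures quoted above; the function and variable names are illustrative.

```python
# Agriculture figures for 2014, as quoted in the paragraph above.
smaller = {"journals": 286, "articles": 14879}   # interim report
complete = {"journals": 418, "articles": 19861}  # complete study


def percent_increase(old, new):
    """Percentage by which new exceeds old."""
    return (new - old) / old * 100


journal_growth = percent_increase(smaller["journals"], complete["journals"])
article_growth = percent_increase(smaller["articles"], complete["articles"])

print(f"Journals: {journal_growth:.1f}% more")
print(f"Articles: {article_growth:.1f}% more")

# Average articles per journal, to show the added journals are smaller.
added_journals = complete["journals"] - smaller["journals"]
added_articles = complete["articles"] - smaller["articles"]
print(f"Smaller group: {smaller['articles'] / smaller['journals']:.0f} "
      f"articles per journal")
print(f"Added journals: {added_articles / added_journals:.0f} "
      f"articles per journal")
```

The per-journal averages line up with the remark that the added journals tend to be smaller: the newly included journals average fewer articles each than the original group.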

In the smaller report, the average cost per article for articles involving APCs was $734.07, or $397.05 for all articles. Those figures don’t change enormously: for the fuller report, the average is now $729.95 for articles in APC-charging journals or $349.89 across all articles.

Agriculture OA journals in 55 countries published articles in 2014; 19 countries published 300 articles or more. It’s an interesting list from top (Brazil, with more than the next three countries put together) to bottom (Egypt—among the 19 top countries, that is, a list that also includes Bangladesh, Romania and the Czech Republic).

For more details on this area and on 27 others (together with lots of other stuff), buy the book.

Oh, and for graphics fans, here are the two graphs from Chapter 12, Agriculture:

Figure 12.1. Agriculture articles per year

Figure 12.2. Agriculture OA journals by starting date