How many articles, how many journals?

September 30th, 2015

Thanks to Dietrich Rordorf’s comment on a post at The Scholarly Kitchen, I am now aware of MDPI’s sciforum (the link is to the journal reviews/statistics section), which among other things “aims at publishing statistics and rankings of scientific and scholarly journals and their Publishers.” Quoting from the disclaimer:

Statistics are automatically computed from available data, and are not manually curated. While we have made every possible effort to provide meaningful statistics, we can not guarantee the correctness or accuracy of any of the statistics. Statistics might be recomputed anytime without notice. Access to statistics might be disabled anytime without notice.

Quoting the section on Open Access:

Data about papers published under open access licenses are currently collected from two providers: DOAJ (licensing information available on journal-level but only for journals that publish exclusively under open access licenses), and Publisher website metadata (licensing information available on a per paper-level for some Publishers). We will include PubMed Central article-level licensing data in a future update. Because many hybrid journals do not offer metadata or licensing information which are easily machine-readable, the statistics about open access content are likely too low. E.g. JR reports 270’000 open access papers for a total market size of roughly 2.4 million papers for 2013 (which is about 11% open access papers). In reality the share of open access papers might be much higher if all papers published under open licenses in hybrid journals could be easily and properly counted). Green open access, i.e. self-archived pre- or post-prints are currently not included.

The section also specifies where data comes from.

I’m delighted there is such a source. Think of the rest of this as additional data (yes, I’ll be emailing a note to Rordorf, but I’m not sure how he can blend manually-counted and automatically-gathered data).

Note that I don’t include “hybrid” OA in my counts at all, partly because there’s no good way to count it, partly because it’s consistently the most expensive form of OA and, I believe, the wrong way to go about OA. Neither does this site because there’s no good way to count it.

Number of Journals

Sciforum shows 25,064 journals publishing at least one article in 2014, of which 3,693 journals are Gold OA—and the chart shows that as being down from 3,990 with at least one article in 2013.

My study, excluding questionable journals and those not in DOAJ, shows 8,760 gold OA journals publishing at least one article in 2014, but that number is down slightly, from 8,960 in 2013.

That’s an enormous difference, one that I believe speaks to the limitations (at this point) of automated data gathering for OA. Even leaving out the global south can’t really account for omitting more than half of the active journals. (Sciforum does not indicate that it’s limited to STM, so I’m assuming it’s not.)

Number of Articles

Sciforum shows 2,423,122 articles in 2014, up from 2,248,966 in 2013. So “roughly 2.4 million” seems like a plausible estimate for the total article production.

But: Sciforum shows 302,339 OA articles in 2014, 279,967 in 2013, 250,237 in 2012 and 196,508 in 2011.

The Gold OA Landscape 2011-2014 shows 482,361 articles in 2014, 440,843 in 2013, 394,374 in 2012 and 321,312 in 2011.

Those are also enormous differences, although slightly smaller percentage-wise. To wit, my actual count (again omitting hybrids, questionable journals and journals not in DOAJ) is 60% higher for 2014, 57% higher for 2013, 58% higher for 2012 and 64% higher for 2011.
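For anyone who wants to check those percentages, here is a minimal Python sketch. It uses only the counts quoted above; the dictionary and function names are illustrative and do not come from Sciforum or the book.

```python
# Counts quoted in this post: Sciforum's automated OA article counts
# versus the manual counts in The Gold OA Landscape 2011-2014.
sciforum = {2014: 302_339, 2013: 279_967, 2012: 250_237, 2011: 196_508}
landscape = {2014: 482_361, 2013: 440_843, 2012: 394_374, 2011: 321_312}


def percent_higher(mine: int, theirs: int) -> float:
    """Return how much larger `mine` is than `theirs`, in percent."""
    return (mine / theirs - 1) * 100


# Article gap per year.
gaps = {year: percent_higher(landscape[year], sciforum[year]) for year in sciforum}

# Journal gap for 2014: 8,760 manually counted gold OA journals vs. 3,693 at Sciforum.
journal_gap = percent_higher(8_760, 3_693)

for year, gap in gaps.items():
    print(f"{year}: manual article count is {gap:.0f}% higher")
print(f"2014 journals: manual count is {journal_gap:.0f}% higher")
```

The journal gap comes out well above the article gaps, which is what "slightly smaller percentage-wise" refers to.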

If we assume that subscription journals can be measured more accurately through automated means (an assumption I’m a little wary of making), then actual article totals for 2014 are around 2.6 million, of which around 18% are in gold OA journals.
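The adjusted total can be reconstructed roughly as follows. This is a sketch under one stated assumption: everything Sciforum does not flag as OA is treated as subscription content, and the manual gold OA count replaces Sciforum's OA count. The variable names are illustrative.

```python
sciforum_total_2014 = 2_423_122  # all articles, per Sciforum
sciforum_oa_2014 = 302_339       # OA articles, per Sciforum
landscape_gold_2014 = 482_361    # gold OA articles, manual count

# Assumption: whatever Sciforum does not count as OA is subscription content.
subscription_2014 = sciforum_total_2014 - sciforum_oa_2014

# Swap the automated OA count for the manual one.
adjusted_total = subscription_2014 + landscape_gold_2014
gold_share = landscape_gold_2014 / adjusted_total

print(f"adjusted total: {adjusted_total:,}")
print(f"gold OA share:  {gold_share:.1%}")
```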

My main takeaway: at this point, automated data gathering severely undercounts the OA field—which, given the ludicrous amount of time spent gathering data manually (but hey! book sales are already up to…well, five copies so far), is at least a trifle reassuring.

The Gold OA Landscape and Outsell’s Open Access 2015

September 29th, 2015

I thought it might be interesting to look at Outsell’s Open Access 2015: Market Size, Share, Forecast and Trends (which the CCC seems to have made openly available, or at least it was for a while; if it’s back to $2,500, my apologies) in light of The Gold OA Landscape 2011-2014. Is there anything useful or mildly controversial to say?

Definitional Differences

Outsell estimates OA as “about 1.1% of the total 2014 STM market and 4.3% of the STM journals market”—but remember that this is a dollar share, not magnitude. Also, it’s STM, while a sizable chunk of the OA universe as I measured it is humanities and social sciences.

The methodology is entirely different, of course: Outsell develops estimates where I tried to do a universal count. Outsell also has industry contacts, which I entirely lack.

A third key definitional difference, especially given Outsell’s finding: I explicitly exclude “hybrid” publishing—not only because I think it’s a trap but because it’s essentially impossible to count.

Also, Outsell seems to define “megajournals” based on crossing subject boundaries, which strikes me as odd; I’d define them as journals with very large numbers of articles. There are hundreds of interdisciplinary journals; there are only a few very large OA journals.

Questionable Statements

Outsell says (p. 5) “Hybrid currently prevails as the Gold model.” If “prevails” means there are more hybrid journals than there are Gold OA journals (by my definition and that of DOAJ, a hybrid journal cannot be gold OA; otherwise it wouldn’t be hybrid), then that may be true, as so many big publishers will happily take big bucks to make an article OA (sort of) in any of their journals. If “prevails” means that there are more OA articles published in hybrid journals than in Gold OA journals, that would be astonishing: it would mean there were close to a million OA articles published in 2014 (not including green OA). I find that hard to believe, and I don’t know how Outsell could determine this number, so I’ll have to assume “prevails” has to do with number of journals.

On pages 5-6, Outsell offers a proliferation of terms and models including “Platinum” OA, “Gold for Gold” and more. As you read this section, it’s ever more clear that the point of Outsell’s report is to inform publishers how they can best maintain and increase revenues—so, for example, institutionally-sponsored OA is not included in the list of Gold models because “it does not produce revenues.” If it doesn’t generate $Gold, it isn’t worth discussing.

Market Size and Forecast

Outsell estimates the 2014 total STM market at a breathtaking $26.2 billion but the serials market at “only” $6.8 billion. Given other estimates of $10 billion, I can only wonder whether this means that the non-STM journals market is $3.2 billion (which seems unlikely) or whether something else is going on.

Outsell’s numbers for APCs are $290.4 million in 2014, $252.3 million in 2013, and $171.9 million in 2012. But those numbers include articles in “hybrid” journals, which Outsell says are the more prevalent model.

My own figures (not allowing for discounts and waivers, but also not including hybrid publications) are $305.4 million in 2014, $241.9 million in 2013, and $195.5 million in 2012. Those numbers do include the humanities and social sciences, but HSS only accounts for $9.5 million in 2014, $7.7 million in 2013 and $6.6 million in 2012.

What’s interesting (to me) is that, if you subtract HSS from my figures, they’re not wildly different from Outsell’s numbers (Outsell’s numbers are about 2% lower in 2014, 8% higher in 2013, and 9% lower in 2012). I could conclude that hybrid APCs don’t really amount to much, or that Outsell defines STM more narrowly than my STEM+Biomed, or…
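The comparison can be reproduced with a few lines of Python. This is only a sketch using the dollar figures quoted above (in millions); the names are illustrative.

```python
# All figures in millions of US dollars, as quoted in this post.
outsell = {2014: 290.4, 2013: 252.3, 2012: 171.9}   # Outsell APC estimates (includes hybrid)
gold_all = {2014: 305.4, 2013: 241.9, 2012: 195.5}  # gold OA potential APC revenue
gold_hss = {2014: 9.5, 2013: 7.7, 2012: 6.6}        # humanities and social sciences portion

diffs = {}
for year in outsell:
    non_hss = gold_all[year] - gold_hss[year]        # gold OA figure with HSS removed
    diffs[year] = (outsell[year] / non_hss - 1) * 100
    print(f"{year}: Outsell is {diffs[year]:+.0f}% relative to the non-HSS figure")
```

A negative value means Outsell's number is lower than the non-HSS gold OA figure for that year.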


Outsell says that there are approximately 20 megajournals—but also that they published fewer articles in 2014 than in 2013. Looking at the graph, it appears that Outsell is saying that all megajournals put together published about 40,000 articles in 2014, as compared to 43,000 or so in 2013 and 25,000 or so (?) in 2012.

But it’s hard to tell what Outsell considers to be a megajournal. Let’s look at some subsets within the DOAJ landscape (among the 9,512 I report on fully):

  • Journals with at least 1,000 articles in some year 2011-2014: There are 40 of these. I count the article totals as 38,830 in 2011; 56,827 in 2012; 70,986 in 2013; and 88,315 in 2014. That’s a much smaller percentage increase from 2012 to 2013 than Outsell’s graph seems to show—but my figures show between 24% and 25% increase in articles from 2012 to 2013 and again from 2013 to 2014. So: I’m showing a healthy increase from 2013 to 2014, not a decrease.
  • Cut that down to journals with at least 1,500 articles in one of those years (14 of them), and you still get a healthy increase each year: 27,052 in 2011; 39,683 in 2012; 49,131 in 2013; and 61,844 in 2014.
  • Trim it to the eight journals with at least 2,000 articles in one year, and there’s still an increase in each year: 23,461 in 2011; 35,145 in 2012; 43,667 in 2013; and 51,646 in 2014.
  • Include only “other sciences” (my term for interdisciplinary STM journals) and PLOS One, and use 1,000 as the cutoff, and I get 17,167 articles in 2011; 28,168 in 2012; 38,604 in 2013; and 43,443 in 2014.
  • If I include PLOS One and all “Other Sciences” journals (171 journals in all), I get 23,041 articles in 2011; 35,502 in 2012; 47,607 in 2013; and 54,210 in 2014.
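To see the year-over-year growth for each of these subsets at a glance, here is a small Python sketch. The counts are the ones listed in the bullets above; the labels and function name are illustrative.

```python
# Article counts for 2011, 2012, 2013, 2014, copied from the list above.
subsets = {
    "1,000+ articles (40 journals)": [38_830, 56_827, 70_986, 88_315],
    "1,500+ articles (14 journals)": [27_052, 39_683, 49_131, 61_844],
    "2,000+ articles (8 journals)": [23_461, 35_145, 43_667, 51_646],
    "PLOS One + other sciences, 1,000+": [17_167, 28_168, 38_604, 43_443],
    "PLOS One + all other sciences (171)": [23_041, 35_502, 47_607, 54_210],
}


def yearly_growth(counts):
    """Percentage change between each pair of consecutive years."""
    return [(later / earlier - 1) * 100 for earlier, later in zip(counts, counts[1:])]


growth = {label: yearly_growth(counts) for label, counts in subsets.items()}

for label, rates in growth.items():
    formatted = ", ".join(f"{rate:+.1f}%" for rate in rates)
    print(f"{label}: {formatted}")
```

Each list of rates covers 2011 to 2012, 2012 to 2013 and 2013 to 2014, in that order. None of the subsets shows a decline in any year.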

I have no doubt that Outsell is able to define some set of “megajournals” that declined in article count from 2013 to 2014. Since PLOS One alone increased (very slightly, from 31,509 to 31,882), that presumably means that all the other megajournals combined went from around 11,500 articles in 2013 to around 8,000 in 2014. That’s a surprisingly large drop (although one very big and very badly run “megajournal” could account for all of it). Certainly possible—again, depending on how you define megajournals.
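The subtraction behind those two figures, as a rough Python sketch. The Outsell totals are approximate values read off a graph, so the results are approximate too; the names are illustrative.

```python
# Approximate megajournal totals as read from Outsell's graph (not exact data).
outsell_mega = {2013: 43_000, 2014: 40_000}
# PLOS One article counts as given in this post.
plos_one = {2013: 31_509, 2014: 31_882}

# What is left for all other megajournals once PLOS One is removed.
others = {year: outsell_mega[year] - plos_one[year] for year in outsell_mega}
drop = (others[2014] / others[2013] - 1) * 100

print(f"other megajournals: {others[2013]:,} in 2013, {others[2014]:,} in 2014")
print(f"change: {drop:.0f}%")
```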

Competitive Landscape

Table 4 in the Outsell report is interesting if only because of the numbers: the 14 publishers listed publish a total of 11,740 journals in 2014 (which would appear to be about half of the total) but only 1,505 Gold OA journals (and that includes Hindawi’s 438 and the whole set of BioMed Central journals), fewer than one-sixth of the Gold OA total. (Take away Hindawi, which only publishes Gold OA journals, and you’re down to less than one-eighth of the gold OA total.) But note that those 14 publishers claim to have 8,404 hybrid journals.

Ah, but Outsell’s only looking at STM. That does make a difference, because there are so many HSS gold OA journals (more than 4,000 in my study): If you remove those from my count, I’m left with just under 5,500 journals. The 14 biggies still account for less than one-third of the STM Gold OA total (around one-fifth without Hindawi), but that’s an improvement.
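Those fractions can be checked with a short Python sketch. Two denominators here are assumptions: the overall gold OA total is taken to be the 9,512 fully analyzed journals mentioned earlier, and the STM total is rounded to 5,500 ("just under 5,500"). The names are illustrative.

```python
big14_gold = 1_505   # Gold OA journals from the 14 publishers in Outsell's Table 4
hindawi = 438        # Hindawi's share of those

# Assumed denominators (see the note above): neither is stated as the exact base.
gold_total = 9_512   # fully analyzed gold OA journals
stm_total = 5_500    # STM-only gold OA journals, rounded up from "just under 5,500"

shares = {
    "all gold OA": big14_gold / gold_total,
    "all gold OA, no Hindawi": (big14_gold - hindawi) / gold_total,
    "STM gold OA": big14_gold / stm_total,
    "STM gold OA, no Hindawi": (big14_gold - hindawi) / stm_total,
}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```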


I question the supposed decline in megajournal publishing activity, but that’s a matter of definition.

I don’t particularly question the estimated APC totals—unless Outsell is seriously claiming that hybrid publications account for most APCs, in which case I’d question them a lot: I’d believe that waivers and discounts might reduce my numbers by, say, 15%-20%, but that would still leave a huge gap.

Since Outsell doesn’t estimate overall article counts (my primary focus) at all, I have no comments on that front.

In general, given differences in definition, the only fault I might find with Outsell (I can’t fault them for a laser focus on $$$) is the decline in megajournal publishing from 2013 to 2014–and, again, I’m sure they managed to define a group of journals that comes out that way.

pubs_since_1994.htm finally up

September 28th, 2015

For the thousands (well, hundreds (well, tens (well…anybody?))) of avid readers of Cites & Insights August-September 2015 who, wondering about the quotes in “A Few Words, Part 2,” clicked through to find the bibliography…

It’s (finally) there, such as it is:

My apologies for the slight delay in getting it ready. The fact that I’ve seen zero instances of anybody looking at the first part of the bibliography may have influenced the priority with which this part was prepared…

(But hey, there are lots of 404s, as usual: I could convince myself that those were all folks looking for the second part of the bibliography. I could convince myself that I look like a slightly older George Clooney, too, but it would be equally absurd.)



Reading the way you prefer

September 28th, 2015

I ran into an odd blog post (on an ALA divisional blog) this morning–and didn’t comment directly for two reasons:

  1. I’m not a member of the division
  2. I’m hoping that I simply misread or misunderstood the post.

The post seemed to be saying that libraries/library groups should be helping to persuade younger people to do all their book reading in ebook form. (I believe it springs from the New York Times piece regarding a slowdown in ebook sales.)

Again, I’m probably misunderstanding what was being said–but I have certainly seen in the past discussions that seemed to say that the “digital shift” was not only inevitable but desirable, and that good librarians should be backing it.

And I just don’t get it.

I’ve suggested for some time that there is no such thing as an inevitable digital shift when it comes to books: that there’s no reason to believe, based on precedent or history, that ebooks would sweep away print books entirely–or that this was even a desirable thing.

I’ve tried to be consistent in saying what the title of this post suggests. Expanding:

  • It seems likely that some people will prefer to do all or most of their extended-narrative reading on digital devices, either because they like them better, they’re more convenient, they believe they should do so…or for whatever reasons.
  • It seems likely that some people will prefer to do all or most of their extended-narrative (that is, “book”) reading from print books, either because they like them better or for whatever reasons.
  • It seems likely that some people will prefer to read some books in print form, some in digital form–and that the variety and distribution of preference will be different for different people.
  • Public libraries should not be “out ahead of the users” on such matters unless there’s a clear and consistent shift in preferences–and even then, maybe not. (Which is not to say public libraries shouldn’t provide ebook services, but maybe that they shouldn’t screw up their budgets or priorities to emphasize ebook services.)

I’ve said for some time that I expect book publishing and print book publishing to be a healthy business throughout my lifetime, with total print book revenues certainly in the billions and probably in the tens of billions of dollars per year. But I’ve tried to avoid nonsensical prophecies about the long-term balance between print and e.

Maybe ebooks will stabilize at 20% of the total book market. Maybe they’ll wind up being 25%, or 30%, or even 80% (achieving a majority is beginning to seem less likely, but I’m no prophet). Maybe there is no equilibrium level, with percentages shifting back and forth.

In any case, books should be available in the form readers prefer, public libraries should support those preferences to the best of their abilities, and it should never be a matter of shoving one medium down people’s throats: preferring one medium at the expense of another despite apparent use patterns.*

Of course, I’m ancient enough to go back to all those predictions that all books would become movies (although never stated that way), because of course everybody really wants their books to be singing and dancing. It always struck me that those making such predictions weren’t really book readers, and it turns out most book readers aren’t especially interested in “enhanced books.”

Those of you who read my stuff in another area may note that I also don’t foresee OA sweeping away traditional journal publishing in any great hurry, or even in my lifetime. I’m just not much of a triumphalist or a single-path advocate. Such is life.

*I do believe a case can be made that public libraries should resist aggressively bad ebook contracts, to the extent that they effectively privilege ebooks over print books if there’s not clear evidence of similar patron preferences–but that’s part of what I’m saying.

TGOL approved for global distribution

September 25th, 2015

I’m delighted to note that The Gold OA Landscape 2011-2014 should be on its way to being available through outlets such as Amazon, Barnes & Noble and Ingram.

It could still be rejected by those channels, but it’s on its way (which means that I have now–finally–received my own copy, verified that the ISBN on the back cover and copyright page match, and determined that I’m happy with the way it looks).

So if you’re at a library that finds it much easier to purchase through Ingram or somebody that Ingram supplies, or has an account with Amazon, but can’t cope with Lulu…well, in six to eight weeks (maybe sooner) the paperback should be available. (If you can deal with Lulu, I much prefer that, since I get three times as much net revenue for each copy. And I’m not sure whether other agencies produce copies using the same great cream/60lb. paper Lulu uses or not…although I’ll assume they do.)

[Yes, the cover could stand a little tweaking, as my wife informs me–but whether or when that will happen is another thing. After all, several people already have or have ordered the current version, slightly low author’s name and all. Incidentally: the gold background is precisely the color of the OA open lock; I downloaded a version of that icon from Wikimedia and used a color selector to choose that color. That Excel’s orange/gold in the blue/orange graph scheme is very close to that color is a happy accident. I think.]


The Gold OA Landscape 2011-2014: Agriculture and Update 2

September 25th, 2015

First, the update: As of early this morning (September 25, 2015), there have been 1,198 downloads of Cites & Insights 15:9 (1,061 of them the single-column version). Also as of now, sales of the book have more than doubled, to a total of four paperback copies and one PDF ebook. For those of you hoping to see an anonymized version of the data available on figshare, we’re now one-seventh of the way there; one more sale would put us one-sixth of the way there.

Now, a quick note about agriculture (which, for this study, includes aquaculture, fisheries and other aspects of raising and processing plants and animals, including food and some aspects of nutrition). It’s one field that gets only a stub chapter in the excerpted version—but it’s also a field where including the whole world of DOAJ-listed journals makes a big difference.

To wit: where the interim report and the Walt at Random post (which omits one graph) show 286 journals (excluding C, as almost all of the new book does) publishing 14,879 articles in 2014, the complete study includes 418 journals (excluding C) publishing 19,861 articles in 2014. That’s a full third more articles from 46% more journals (as expected, the added journals tend to be smaller). There’s also a shift toward no-fee journals: of those actually publishing articles in 2014, the no-fee percentage was 62% for the smaller group and 66% for the more complete set—and while it’s still true that (as with most STEM fields) a majority of articles are in APC-charging journals, the percentage of articles in no-fee journals went from 44% for the smaller group to 48% in the larger (and for that larger group, articles in no-fee journals were at least half of all articles in 2013, 2012 and 2011).
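The "full third more articles from 46% more journals" arithmetic, as a minimal Python sketch using the 2014 counts quoted above (the names are illustrative):

```python
# Agriculture, 2014, excluding C: interim report vs. complete study.
interim = {"journals": 286, "articles": 14_879}
complete = {"journals": 418, "articles": 19_861}

increase = {key: (complete[key] / interim[key] - 1) * 100 for key in interim}

# Average articles per journal among the journals added by the complete study.
added_journals = complete["journals"] - interim["journals"]
added_articles = complete["articles"] - interim["articles"]
added_average = added_articles / added_journals
interim_average = interim["articles"] / interim["journals"]

print(f"journals: +{increase['journals']:.0f}%, articles: +{increase['articles']:.0f}%")
print(f"articles per journal: {interim_average:.0f} (interim) vs. {added_average:.0f} (added)")
```

The last line is the "added journals tend to be smaller" observation in numeric form.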

In the smaller report, the average cost per article for articles involving APCs was $734.07, or $397.05 for all articles. Those figures don’t change enormously: for the fuller report, the average is now $729.95 for articles in APC-charging journals or $349.89 across all articles.

Agriculture OA journals in 55 countries published articles in 2014; 19 countries published 300 articles or more. It’s an interesting list from top (Brazil, with more than the next three countries put together) to bottom (Egypt—among the 19 top countries, that is, a list that also includes Bangladesh, Romania and the Czech Republic).

For more details on this area and on 27 others (together with lots of other stuff), buy the book.

Oh, and for graphics fans, here are the two graphs from Chapter 12, Agriculture:

Figure 12.1. Agriculture articles per year

Figure 12.2. Agriculture OA journals by starting date

The Lords of the Universe Don’t Worry About Carbon Footprints…

September 24th, 2015

…Especially if it’s under the hallowed sanction of the National Geographic Society.

Today’s mail brought another slick brochure (28 pages this time) for National Geographic Expeditions “Around the World by Private Jet.” We’ve received them in the past, along with thick catalogs of overpriced travel offerings from NatGeo (which, I guess, is now deeply in bed with Rupert Murdoch, but I’m not going to get into that here…)

Here’s what it is: 24 days. Looks like 10 or 11 stops. First-rate hotels along the way (no problem there).

Oh, and you’re flying on a Boeing 757 that seats 75 people rather than the usual 233.

So even if the 757 was one of Boeing’s most fuel-efficient planes (I can’t tell offhand; the 233-seat configuration isn’t typical), you can triple the fuel consumption per passenger for these flights. And the plane’s going around the world to visit 10 or 11 places.

Text in the brochure about mitigation for this humongous carbon footprint? None that I could find.

But hey, we’re talking masters of the universe here: the price for this 3.5-week adventure is just under $86,000 for one person or $154,000 for two people.

That does include booze, tips, hotels, etc. It does not include the travel insurance they strongly recommend, which (if I remember correctly) would add around 6%. But hey, if you have to ask, you shouldn’t be despoiling the environment on this trip anyway.

There are a lot of masters of the universe out there, I guess: the brochure is for four different tours between October 2016 and February 2017.

(And now I’ll go recycle the brochure.)

The Gold OA Landscape 2011-2014: This one’s not in the book

September 21st, 2015

The first in an occasional series noting interesting items in The Gold OA Landscape 2011-2014 that aren’t in the excerpted version in Cites & Insights 15:9—or observations drawn from the book.

Chapter 6, Country of Publication, is mostly tables showing countries in which gold OA journals are published arranged in different ways with additional facts. The C&I version includes Table 6.2, countries ranged by percentage of OA journals that don’t charge APCs (from the 100% free of Cuba, Venezuela, Denmark, Costa Rica, Estonia, Philippines, Sri Lanka and Ecuador down to the only four where at least 70% of gold OA journals do charge author-side fees: the United Kingdom, United Arab Emirates, New Zealand and Nigeria). It also includes all or part of Table 6.3, articles by country in 2014 for countries with at least 1,000 (and percentage in no-fee journals) from 89,485 in the U.S. (17% in no-fee journals, although 62% of U.S. gold OA journals don’t charge fees) down to 1,001 in Peru (94% in no-fee journals: 96% of the OA journals in Peru don’t charge fees).

The issue doesn’t include Tables 6.1, 6.4, 6.5 or 6.6, illustrating different aspects. For those, you need to buy the book.

Here’s one I could have included in the book but didn’t, thrown in as an online bonus: potential revenue in 2014 by country. There are 75 countries where at least one APC-charging journal published articles in 2014, so it’s a long table. Here goes:

Country | Max Revenue 2014
[Table of 75 countries; the revenue figures and most country names did not survive. Entries still legible, in order: United Kingdom; United States; South Korea; New Zealand; Iran, Islamic Republic of; Czech Republic; South Africa; United Arab Emirates; Hong Kong; Russian Federation; Macedonia, the Former Yugoslav Republic of; Taiwan, Province of China; Bosnia and Herzegovina; Saudi Arabia; Palestine, State of.]
If you haven’t purchased the book (paperback or site-licensed PDF) yet, please do. (Yes, it will eventually be available—paperback only—from Ingram, Amazon or Barnes & Noble, but that will be something like two months from now.)

What the heck: as more copies sell, I may try to find more interesting new stuff that’s not in the book (but might be next year, if the project continues).

Sometimes there is a little progress

September 21st, 2015

Sometimes. Shonda Rhimes (who must be the most powerful black woman in TV today, I’d guess) puts together shows that always feature strong women who aren’t just appendages of men, and sometimes they’re black–so that Viola Davis was able to win an Emmy. As she said, it’s tough to win an Emmy for parts that don’t exist.

So that’s progress, a little of it.

And in language: if I was writing about either of these people at length, I’d probably use Ms. Rhimes and Ms. Davis, because I neither know their marital status nor believe that’s a defining characteristic for a woman.

Which is, I think, progress, given that I’ve been reading portions of a William Safire language-column collection from 1986, including a discursion on the use of Ms. (Safire was in favor), including this gem:

Most of the mail ran the other way. “A woman who wants to be addressed as ‘Ms.,’” wrote Mrs. Havens Grant of Greenwich, Connecticut, “is either ashamed of not being married or ashamed of being married.”

And at the time, that supposed newspaper of record in New York City would not allow Ms. (have they finally stopped that nonsense?). And, sure enough, the longest response to Safire’s follow-up column attacks Ms. as feminism run amok.

I’d like to think that people like Mrs. Grant (I assume her husband’s first name is or was Havens, since The Traditional And Proper Means of Naming Women makes it clear that they’re essentially property by not even retaining their first names) have come around to the belief that a woman is something more than her marital status. I could be wrong.

Hey, I’m an optimist (my wife, Ms. Driver, sometimes has stronger terms); I’ll take progress where I can find it. Even if it is slow.

By the way: if you’re one of those who still believes it is Right and Proper for a woman to be either Miss or Mrs.: Show me the commonly-used male equivalents. If you can’t, well…

Personalized ads: An odd incident

September 20th, 2015

Yes, I know most sidebar ads on websites are affected somehow by what you’ve searched or what sites you’ve gone to before. No big surprise, that, although it’s always amusing to see all the ads for competitors to something you just purchased.


I don’t remember ever seeing Lulu running these sidebar ads; since Lulu’s a service company for self-publishers more than it’s really an online bookstore, that was OK with me.

Somehow, though, for the past three or four days, I’ve been getting loads of Lulu sidebar ads, usually scrolling through three to six different items on order.

One of which is almost always the paperback version of The Gold OA Landscape 2011-2014.

Which is odd on a couple of counts:

  • I’ve already purchased a copy–not surprisingly, since it’s my book, and especially since I can’t approve it for global distribution (Ingram, Amazon, B&N) until I receive my copy and “approve” it.
  • For that matter, if I do order a copy, it won’t cost the $60 shown in the ads: as the author, I pay only production costs, with no real profit for Lulu.
  • At least the last time I checked, searching for “the gold OA landscape” at Lulu yields the PDF ebook but not the paperback (Lulu’s book search is sometimes a little strange). But, of course, the ad takes me right to the product page that should show up on a search.

Is anybody else seeing this book advertised in sidebars? I’d love to think so, but I’m not going to assume it’s true.

By the way, another book that seems to show up for me all the time is Ann Dodds Costello’s Smart Women: The Search for America’s Historic All-Women Study Clubs. Which actually looks pretty interesting; I might yet buy a copy. (The link here is for the currently-$32 hardback; there’s also a $24 paperback and $8.99 ebook. It’s a 426-page book.)

Hmm. If I do buy that book, then Lulu’s ads are working…even if they’re also advertising my own stuff to me.