GOA6: August 2021 report

September 3rd, 2021


As of September 2, 2021, as far as I can tell:

GOA6:

  • Overall report: 1,117 PDF copies (no books other than my copy)
  • Countries: 67 PDF (no books)
  • Dataset: 137 views, 23 downloads

GOA5:

  • Overall report: 763 copies (two books)
  • Countries: 175 copies (no books)
  • Dataset: 715 views, 124 downloads



GOA6: July 2021 report

August 2nd, 2021


As of August 2, 2021, as far as I can tell:

GOA6:

  • Overall report: 1,073 PDF copies (no books other than my copy)
  • Countries: 46 PDF (no books)
  • Dataset: 105 views, 25 downloads

GOA5:

  • Overall report: 737 copies (two books)
  • Countries: 162 copies (no books)
  • Dataset: 671 views, 114 downloads



Country supplement for GOA6 is out

July 12th, 2021

Gold Open Access by Country 2015-2020 is now available for free download or nominal ($7.50 in this case) purchase as a 284-page paperback.

This year, both the front and back cover feature OA heatmaps–but different ones. The front cover shows no-fee or “diamond” OA by 2020 articles per 100,000 people, with minimum values green and maximum deep gold or red. (The software used, GunnMap, is less sophisticated than GunnMap2, which requires the no-longer-supported Adobe Flash, and partly as a result does not have a legend.) The rear cover shows fee-based 2020 articles–but logarithmically, and “reversed” (lowest is yellow, highest is deep green for money).

The PDF ebook (free) is available on my website: https://waltcrawford.name/g6cntry.pdf.

The trade paperback is available from Lulu at https://www.lulu.com/en/us/shop/walt-crawford/gold-open-access-by-country-2015-2020/paperback/product-eqzgjg.html?page=1&pageSize=4

As always, all links are available at the project page, https://waltcrawford.name/goaj.html

GOA6: June 2021 status report

July 2nd, 2021

As of July 1, 2021, as far as I can tell:

GOA6:

  • Overall report: 212 PDF copies (no books other than my copy)
  • Dataset: 62 views, nine downloads

GOA5:

  • Overall report: 717 copies (two books)
  • Countries: 129 copies (no books)
  • Dataset: 635 views, 115 downloads

Gold Open Access 2015-2020 (GOA6) is out

June 16th, 2021

Gold Open Access 2015-2020: Articles in Journals (GOA6) is out now.

Key figures: a million and a billion–more than one million articles in 2020, and considerably more than $1 billion in possible fees.

The book, a 243-page trade paperback with color figures, is available for $10.50 (plus shipping) from Lulu at https://www.lulu.com/en/us/shop/walt-crawford/gold-open-access-2015-2020-articles-in-journals-goa6/paperback/product-2ndqwe.html?page=1&pageSize=4. (The price is $10.50 in the US; Lulu requires that I set prices in Euros, Pounds, Australian Dollars and Canadian Dollars as well, so I’d guess Lulu prints and ships locally in many countries–and I set the price rounded up to the nearest .5 or 1 from printing costs in each case. I clear anywhere from $0.08 to $0.40 depending on currency.)

I mention the printed book first because I really believe it’s the best way to browse through my analysis and occasional comments–and having a few book sales, while obviously irrelevant for income, would encourage me to keep doing these exhausting exhaustive studies (if SPARC continues to support them).

But most of you will prefer the free PDF, available at https://waltcrawford.name/goa6.pdf. It’s precisely the same content as the printed book: the same PDF used for the book, with the front and back covers added.

The dataset is also available at Figshare, at https://figshare.com/articles/dataset/Gold_Open_Access_6_2015-2020/14787888. Since it’s more than 15,000 rows (plus additional worksheets showing currency conversions, codes, and excluded journals), you’ll want to download it.

Everything has Creative Commons BY (attribution) licenses, so you can use them as desired. But if you send other people links rather than redistributing the book or the dataset directly, I can track usage, which is nice.

The country-of-publication book will be ready in a few weeks.

As always, all links are available at the Gold Open Access page at https://www.waltcrawford.name/goaj.html.

GOA6: Data gathering complete

May 20th, 2021

I just finished the final data-gathering pass–rechecking all problematic journals. I was able to clear 105 additional journals and found 8 more that either had no post-2014 activity or were no longer in DOAJ. That left 639 xm journals and 91 xx journals. I’ve concluded that defective journals that are still defective after six checks over two years, or that had no post-2018 article counts, should not be included in the overall analysis. In all, 492 journals were excluded, leaving 260 malware and 74 unavailable/unworkable journals, all with article counts from DOAJ, retained for the analysis.

The big numbers: 15,130 fully analyzed journals, 69.7% of them without fees (“diamond” if you like). 14,175 of those showed 2020 articles (68.8% no-fee), for a total of 1,061,256 articles. The bad news: while the percentage of no-fee journals has stayed about constant at nearly 70%, the percentage of no-fee articles has fallen significantly, from 39% for 2019 articles (in GOA5) to 35.5% for 2020: in essence, nearly all the 2020 growth was in fee-charging journals. NOTE: These numbers may change slightly during additional checking–e.g., two journals have moved to the Excluded category, because neither had any post-2014 articles.

If you’re wondering: excluding all xm and xx journals would reduce the 2020 article count by 5,268 articles (but, of course, I only have article counts where journals were reporting them to DOAJ)–and including all xm and xx journals would increase the article count by 4,315 (same caveat applies). In other words, these decisions have almost no impact overall.

I finished the data analysis on the same day as I did last year, despite having more than 1,000 additional journals: that speaks to fewer health and other interruptions, perhaps cleverer counting techniques, and perhaps fewer journals making it really hard to count articles. (Although some do try–including one where literally the only way to find dates is to read the articles and look for the recommended citation form!)

Next: add derived data, move columns around, and start the data processing and book writing. Anticipated completion date: somewhere around June 24-July 4, perhaps 2-3 weeks later for the country book.

I haven’t done a usage report for GOA4 and GOA5 for a while, and now that there’s a hosted copy of the GOA5 dataset with a dashboard elsewhere, I’m not sure how useful such reports are. I do know that book copies have declined considerably, from over 4,000 for GOA4 to around 740 for GOA5 (including two printed books). I find that discouraging, especially since the book includes caveats that aren’t in the dataset.

Meantime, on with the show. I surely hope we don’t hit “double 70” next year–with 70% of serious OA journals “diamond” but 70% of the articles appearing in the fee-charging journals….

GOA6: Second part of second pass

May 10th, 2021


I just finished the second part of the second data-gathering pass–rechecking 732 journals that had problems other than malware. Most of those problems were resolved, either because the journal’s host fixed a problem or because I could reach the journal at a different URL by searching on title and ISSN. The pass found 21 no longer in DOAJ and left 126 to be rechecked one last time. [The check of journals against those removed from DOAJ since January 1, 2021 yielded 40 cases–and this part of the scan yielded another 21 no longer in DOAJ. Since many journals have two ISSNs, I only save one, and the removal list only provides one, this oddity is not surprising.]

At this point, there are 1,049,954 articles from 2020 and 879,377 from 2019, from 14,635 fully-analyzed journals–numbers that probably won’t grow a lot.

Next step: a quick check of xm/malware journals for resolutions–then, beginning May 15, the final check.



GOA6: First part of second pass

May 2nd, 2021

I just finished the first part of the second data-gathering pass–rechecking slightly more than 2,200 journals that (a) seemed likely to have another 2020 issue appear in early 2021 or (b) had less than 2/3 as many 2020 articles as 2019.
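Criterion (b) is simple enough to sketch in code. This is only an illustration, not the spreadsheet logic actually used: the function name and the sample journals are hypothetical, and criterion (a) is a judgment call made during the first pass, so it isn't modeled here.

```python
def needs_recheck(articles_2019: int, articles_2020: int) -> bool:
    """True when the 2020 count is less than 2/3 of the 2019 count.

    Integer arithmetic (3 * a2020 < 2 * a2019) avoids floating-point
    surprises right at the two-thirds boundary.
    """
    return 3 * articles_2020 < 2 * articles_2019


# Hypothetical journals (name, 2019 count, 2020 count); not from the dataset.
sample = [
    ("Journal A", 30, 19),  # 19 is below 2/3 of 30, so flagged
    ("Journal B", 30, 20),  # exactly 2/3, so not flagged
    ("Journal C", 12, 0),   # 2019 articles but none for 2020, so flagged
    ("Journal D", 0, 0),    # nothing in either year, so not flagged
]

flagged = [name for name, y2019, y2020 in sample if needs_recheck(y2019, y2020)]
print(flagged)
```

Note that any journal with 2019 articles and no 2020 articles is always caught by this test.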

In all, counts were changed in 690 of the journals–mostly added 2020 counts but with some changes for 2019 (and, rarely, earlier years) as well. At the end of the process, the 2,200-odd journals show 56,268 articles for 2020 (but 92,225 for 2019, since (b) above meant that every journal with 2019 articles and no 2020 articles was rechecked).

Naturally, the success rate for additional articles declined as the scan progressed, since the time lapse between scans shrank. Counts changed for 273 of the first 550 journals; 207 of the second 550; 106 of the third 550 and 104 of the last 560+.
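For anyone who wants the decline as percentages, here is a small sketch using only the counts above. The last batch is left out of the rate calculation because its exact size ("560+") isn't stated; the variable names are just for illustration.

```python
# (label, journals with changed counts, journals in batch), from the figures above
batches = [
    ("first 550", 273, 550),
    ("second 550", 207, 550),
    ("third 550", 106, 550),
]
last_batch_changed = 104  # batch size is "560+", so no rate is computed for it

# Percentage of journals in each batch whose counts changed, to one decimal
rates = {label: round(100 * changed / size, 1) for label, changed, size in batches}

# All four batches together should account for every changed journal
total_changed = sum(changed for _, changed, _ in batches) + last_batch_changed

for label, rate in rates.items():
    print(f"{label}: {rate}% changed")
print(f"total changed: {total_changed}")
```

The total comes to 690, matching the overall figure for changed journals.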

At this point, 14,054 journals are in place for full analysis, with 1,028,737 articles for 2020 and 860,799 for 2019.

Next step: check the remaining 1,612 journals against the list of journals removed from DOAJ between January 1 and May 2. Then recheck all the journals that were problematic for some reason other than malware, since most of those should be transitory problems.

GOA6: First pass completed

April 21st, 2021

First things first: If you’re in a position to help resolve some of the very large number of journals with malware (787) or ones that were unreachable or unworkable (752), there’s a spreadsheet with the key information for all of them here:

https://docs.google.com/spreadsheets/d/19gXpn3kVn-R33uDdOUSHgssPaLO5CEPRWUGPvDB9az0/edit?usp=sharing

And I won’t do the final piece of the multistep “second pass” until at least May 15. Help from folks with colleagues in Indonesian or Brazilian academia would be most helpful. (The spreadsheet, g6x, is sorted first by code, then by country, then by publisher, then by journal. The second page lists the codes and notes.)

Here’s where things stand. 15,666 journals had 1,018,364 articles, up from 890,069 2019 articles. The 2020 number will rise somewhat, both because some journals are late to publish issues and because the numbers don’t include *any* of the malware and unreachable journals (but 2019 numbers do). For GOA5, the 2019 total was 854,018 articles.

The 15,666 (yes, I know, I say 15,667 sometimes–it’s hard to remember to subtract one for the row of labels) include:

  • 13,391 “a”–regular–journals
  • 317 “bi”–no articles in 2019 or 2020, mostly ceased, renamed, changed publishers or otherwise disappeared
  • 3 bm–early cases of journals with malware that could be reached through other addresses
  • 343 bx–journals available at a different URL than the one in DOAJ. There will probably be quite a few more of these; nearly all at present are either Sciendo (from DeGruyter) or dergipark, moved from .gov to .org without generally changing DOAJ records.
  • 58 xd: journals with no articles later than 2014, most of them “duplicates” that have been superseded.
  • 787 xm: Malware
  • 14 xn: Apparently not OA.
  • 1 xt: A website I couldn’t translate or make enough sense of to count
  • 752 xx: Unreachable (404, etc.) or unworkable (db errors, etc.)
  • So far, I see 4,371 journals with fees, 9,706 with no fees, and a few hundred needing rechecking (mostly newly-added journals that are xm or xx).
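A quick tally of the codes above, as a sanity check rather than anything official. The counts come straight from the list; grouping the codes by first letter ("a"/"b" as countable, "x" as excluded or problematic for now) is an assumption made for this sketch.

```python
# Journal counts by code, copied from the list above
counts = {
    "a": 13391,
    "bi": 317,
    "bm": 3,
    "bx": 343,
    "xd": 58,
    "xm": 787,
    "xn": 14,
    "xt": 1,
    "xx": 752,
}

total = sum(counts.values())

# Assumed grouping: codes starting with "x" are excluded or still problematic
x_total = sum(n for code, n in counts.items() if code.startswith("x"))
countable = total - x_total

print(f"total: {total:,}")
print(f"a and b codes: {countable:,}")
print(f"x codes: {x_total:,}")
```

The codes sum to 15,666, with 14,054 in the a and b groups and 1,612 in the x groups.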

Now, after ignoring journals for a day or two, I’ll recheck 2,211 journals for added issues/articles and 1,613 to try to clear malware and unreachable cases. (The 2,211 includes 946 cases marked along the way and 1,265 where there were at least 1.5 times as many articles in 2019 as in 2020–the original version of this paragraph had incorrect numbers here; fortunately, the correction means fewer to check.)

As already noted, the final malware pass will start no earlier than May 15. If all goes well, the primary book and spreadsheet should be ready in very late June or early July.

GOA6: Ninth Update

April 10th, 2021


Time for another GOA6 checkpoint, at 14,400 of 15,676.

Note that, as always, I sort journals by publisher before checking–because many multijournal publishers use the same templates for all journals, making it easier for me to find fee data and do article counts.

For GOA6, that means I’ve now checked partway through the University of Isfahan. So far, the 2020 article count is 942,685, and that will go up. The 2019 total for this set of journals is 830,018 articles.

Last year, that range of publishers included 12,835 journals, which published 792,068 articles in 2019. So there’s a net gain of 1,562 added journals so far. A million overall articles still seems likely, but not certain.

For this group of 1,600 journals–ignoring the first 12,600–problematic journals include 308 malware cases and 75 or so unreachable/unworkable. Yes, that’s a terribly high malware ratio.

Looking more closely at the malware cases for these 1,600 journals, there are ten security-certificate problems, seven ransomware, ten malware, 24 phishing and 256 Trojans.

The problem is mostly Indonesia: 842 of the 1,600 journals in this group are from Indonesia, and 281 of those have malware, mostly at the root URL for a university’s set of journals.

I checked all 14,400 journals scanned so far. Of 764 total malware cases, 481 are from Indonesia. Brazil is a distant second at 121, with smaller clusters from Romania and Spain (and a few cases elsewhere). Yes, Indonesia has more DOAJ-listed journals than any other country, but 481 of Indonesia’s 1,745 (so far) are problematic; Brazil has the second-most journals, and 121 of 1,578 are problematic. (All these figures exclude the remaining 1,276 journals–but only 26 of those are from Indonesia.)
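To put those counts in proportion, here is a short sketch that turns them into percentages. It uses only the figures in the paragraph above (so far, for the 14,400 journals scanned); the dictionary layout and key names are illustrative.

```python
total_malware = 764  # malware cases among the 14,400 journals scanned so far

# country: (malware cases, DOAJ-listed journals from that country so far)
by_country = {
    "Indonesia": (481, 1745),
    "Brazil": (121, 1578),
}

shares = {}
for country, (malware, journals) in by_country.items():
    shares[country] = {
        # share of all malware cases that come from this country
        "of_all_malware": round(100 * malware / total_malware, 1),
        # share of this country's journals that have malware
        "of_country_journals": round(100 * malware / journals, 1),
    }

for country, s in shares.items():
    print(
        f"{country}: {s['of_all_malware']}% of malware cases, "
        f"{s['of_country_journals']}% of its journals"
    )
```

In other words, more than a quarter of Indonesia's journals are affected, against fewer than one in twelve of Brazil's.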

I believe attempts have been made to alert publishers to malware problems. Some may be made again this year. This is a continuing problem.

I’d say it’s now nearly certain that the first scan will be done in late April, barring illness or other unexpected events. That would leave some checking and the long rescans. (So far, about 2,200 journals need rechecking; the final number will probably exceed 2,300. Rechecking can be a slow process.)

So no overall target date yet…