Archive for May, 2021

GOA6: Data gathering complete

Thursday, May 20th, 2021

I just finished the final data-gathering pass–rechecking all problematic journals. I was able to clear 105 additional journals and found 8 more that either had no post-2014 activity or were no longer in DOAJ. That left 639 xm journals and 91 xx journals. I’ve concluded that journals still defective after six checks over two years, or lacking post-2018 article counts, should not be included in the overall analysis. In all, 492 journals were excluded; 260 malware and 74 unavailable/unworkable journals, all with article counts from DOAJ, were retained for the analysis.

The big numbers: 15,130 fully analyzed journals, 69.7% of them without fees (“diamond” if you like). 14,175 of those showed 2020 articles (68.8% no-fee), for a total of 1,061,256 articles. The bad news: while the percentage of no-fee journals has stayed about constant at nearly 70%, the percentage of no-fee articles has fallen significantly, from 39% for 2019 articles (in GOA5) to 35.5% for 2020: in essence, nearly all the 2020 growth was in fee-charging journals. NOTE: These numbers may change slightly during additional checking–e.g., two journals have moved to the Excluded category, because neither had any post-2014 articles.

If you’re wondering: excluding all xm and xx journals would reduce the 2020 article count by 5,268 articles (but, of course, I only have article counts where journals were reporting them to DOAJ)–and including all xm and xx journals would increase the article count by 4,315 (same caveat applies). In other words, these decisions have almost no impact overall.
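For a sense of scale, the two adjustments can be expressed as shares of the 2020 article total reported above. This is a minimal sketch that uses only figures from this post; the constant and function names are illustrative assumptions, not part of the GOA6 workflow.

```python
# Figures taken from the post; all names here are illustrative only.
TOTAL_2020_ARTICLES = 1_061_256
EXCLUDE_ALL_DELTA = 5_268  # articles lost if every xm/xx journal were excluded
INCLUDE_ALL_DELTA = 4_315  # articles gained if every xm/xx journal were included


def share_of_total(delta: int, total: int = TOTAL_2020_ARTICLES) -> float:
    """Return delta as a percentage of total."""
    return 100.0 * delta / total


if __name__ == "__main__":
    print(f"Excluding all xm/xx: -{share_of_total(EXCLUDE_ALL_DELTA):.2f}%")
    print(f"Including all xm/xx: +{share_of_total(INCLUDE_ALL_DELTA):.2f}%")
```

Either way the change is about half a percent of the 2020 total or less.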

I finished the data analysis on the same day as I did last year, despite having more than 1,000 additional journals: that speaks to fewer health and other interruptions, perhaps cleverer counting techniques, and perhaps fewer journals making it really hard to count articles. (Although some do try–including one where literally the only way to find dates is to read the articles and look for the recommended citation form!)

Next: add derived data, move columns around, and start the data processing and book writing. Anticipated completion date: somewhere around June 24-July 4, perhaps 2-3 weeks later for the country book.

I haven’t done a usage report for GOA4 and GOA5 for a while, and now that there’s a hosted copy of the GOA5 dataset with a dashboard elsewhere, I’m not sure how useful such reports are. I do know that book copies have declined considerably, from over 4,000 for GOA4 to around 740 for GOA5 (including two printed books). I find that discouraging, especially since the book includes caveats that aren’t in the dataset.

Meantime, on with the show. I surely hope we don’t hit “double 70” next year, with 70% of serious OA journals “diamond” but 70% of the articles appearing in the fee-charging journals….

GOA6: Second part of second pass

Monday, May 10th, 2021


I just finished the second part of the second data-gathering pass–rechecking 732 journals that had problems other than malware. Most of those problems were resolved, either because the journal’s host fixed a problem or because I could reach the journal at a different URL by searching on title and ISSN. The pass found 21 no longer in DOAJ and left 126 to be rechecked one last time. [The check of journals against those removed from DOAJ since January 1, 2021 yielded 40 cases–and this part of the scan yielded another 21 no longer in DOAJ. Many journals have two ISSNs, but I save only one and the removal list provides only one, so this discrepancy is not surprising.]

At this point, there are 1,049,954 articles from 2020 and 879,377 from 2019, from 14,635 fully-analyzed journals–numbers that probably won’t grow a lot.

Next step: a quick check of xm/malware journals for resolutions–then, beginning May 15, the final check.



GOA6: First part of second pass

Sunday, May 2nd, 2021

I just finished the first part of the second data-gathering pass–rechecking slightly more than 2,200 journals that (a) seemed likely to have another 2020 issue appear in early 2021 or (b) had fewer than two-thirds as many 2020 articles as 2019 articles.

In all, counts were changed in 690 of the journals–mostly additions to 2020 counts, but with some changes for 2019 (and, rarely, earlier years) as well. At the end of the process, the 2,200-odd journals show 56,268 articles for 2020 but 92,225 for 2019. The gap is expected, since (b) above meant that every journal with 2019 articles and no 2020 articles was rechecked.

Naturally, the success rate for additional articles declined as the scan progressed, since the time lapse between scans shrank. Counts changed for 273 of the first 550 journals; 207 of the second 550; 106 of the third 550; and 104 of the last 560+.
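The decline is easier to see as percentages. This is a rough sketch using the batch counts given above; because the final batch is reported only as "560+", the 560 used here is an assumed lower bound, so the last rate is a slight overestimate. All names are illustrative.

```python
# (journals with changed counts, journals rechecked) per batch, from the post.
# The final batch size is reported as "560+"; 560 is an assumed lower bound.
BATCHES = [(273, 550), (207, 550), (106, 550), (104, 560)]


def change_rates(batches):
    """Return the percentage of journals whose counts changed in each batch."""
    return [100.0 * changed / checked for changed, checked in batches]


# Sum of per-batch changes; should agree with the overall figure of 690.
total_changed = sum(changed for changed, _ in BATCHES)

if __name__ == "__main__":
    for number, rate in enumerate(change_rates(BATCHES), start=1):
        print(f"Batch {number}: {rate:.1f}% of journals changed")
    print(f"Total journals changed: {total_changed}")
```

The per-batch figures sum to the 690 changed journals reported earlier, and the rate falls from roughly half in the first batch to under a fifth in the last two.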

At this point, 14,054 journals are in place for full analysis, with 1,028,737 articles for 2020 and 860,799 for 2019.

Next step: check the remaining 1,612 journals against the list of journals removed from DOAJ between January 1 and May 2. Then recheck all the journals that were problematic for some reason other than malware, since most of those should be transitory problems.