Archive for 2023

Gold Open Access 8 is now available

Friday, May 26th, 2023

Gold Open Access 2017-2022: Articles in Journals (GOA8) is now available in print book, PDF ebook, and dataset forms. The print book–a 6×9 trade paperback with color graphs–is $11.50 (or the nearest equivalent in other currencies supported by Lulu), of which I receive a stunning $0.24. The PDF ebook and dataset are both free, and all versions are CC-BY.

As usual, all links are available at https://waltcrawford.name/goaj.html, or you can use these direct links:

The Lulu trade paperback

The free PDF ebook

Figshare dataset

Same dataset on my own site.

Next, some time in July, Diamond OA 2023: The World of No-Fee Open Access Publishing, covering the same years and the no-fee portion of the dataset–but with a little lightweight new research added to give some sense of how such journals are funded (when they’re not directly from universities or societies).

GOA8: Progress Report and Prediction

Saturday, May 20th, 2023

Short form: The book and shared data will probably be available some time between Wednesday, May 24 and Wednesday, May 31 (2023).

That’s also the long form. I’m doing several proofing passes of the manuscript, about to start preparing the covers, and taking the time to do it right. There may be some additional checking on data, but not much–there will doubtless be a few duplicates and other minor slipups, as there have been every year.

As to the next step: Yes, I’m doing Diamond OA 2023: The World of No-Fee Open Access Journals–and I will be including some brief notes on apparent funding for journals not obviously published by universities and societies. (Guess what? They’re mostly supported by…universities and societies.) I’ve done quick checks on 1,800 of these and have about 360 more to check (interleaved with other work). Then it’s a matter of preparing an appropriate matrix and putting it all together. I’m guessing about a month’s work after GOA8 is complete, give or take a week or so. [For this first attempt, I’m including journals that appear to be temporarily diamond and the tiny number that seem to rely on Subscribe-to-Open. If there are later editions, I’ll reconsider those.]

A reminder: The spreadsheet is not intended to be an authoritative dataset; it should never be used in place of DOAJ, for example. On large-scale groupings, I’m satisfied that it’s at least 98%-99% right, but things can happen at the single-journal scale that may get missed.

GOA8: Last report, phase 1

Friday, May 5th, 2023

The data’s as gathered as it’s going to be. 18,765 journals; 18,253 fully analyzed.

I’ve firmed up dates and ISSNs (using online ISSNs where available).

I’ve checked codes for consistency, etc.

And I’ve simplified and clarified some codes:

  • Fee code is gone; just not readily available or ever called meaningful by anybody.
  • Count code is gone. That was really for my own use. Of the 17,900 cases where I entered such a code, 10,081 came from DOAJ–with 2022 figures checked and frequently revised. 6,159 came from journals and were counted by using repeating metadata such as “PDF.” 1,083 came with help–e.g., article counts offered either by year or at least by issue. And 592 were damn nuisances, requiring manual counting. [These figures don’t count exclusions.]
  • Code “bi” is now just “i” for “inactive” (and “a”=active.)
  • Code “bx” is gone (I could probably resurrect it if needed), as are the notes.
  • So the codes now are, for the main sheet, a (17,585); i (423); xm (143, up from 140 last year); and xx (102, up from 90 last year). For the exclusions sheet: xd (defunct/duplicates, 100, up from 92 last year); xm (47, up from 12 last year); xm2 (260, DOWN from 383 last year); xn (4, down from 10 last year); xo (41, down from 119 last year); xx (40, up from 27 last year); and xx2 (20, same as last year). Total identifiable articles excluded: 3,994, almost all xm2.
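
As a quick sanity check on the tallies above, the per-code counts can be summed in a few lines of Python. This is only an illustrative sketch: the dictionary names are hypothetical (they are not columns in the spreadsheet), and the numbers are simply copied from the list above.

```python
# Journal counts per status code, copied from the list above.
# The variable names are illustrative, not taken from the actual spreadsheet.
main_sheet = {"a": 17_585, "i": 423, "xm": 143, "xx": 102}
exclusions = {
    "xd": 100, "xm": 47, "xm2": 260, "xn": 4,
    "xo": 41, "xx": 40, "xx2": 20,
}

main_total = sum(main_sheet.values())
excluded_total = sum(exclusions.values())

print(f"Main sheet:   {main_total:,}")
print(f"Exclusions:   {excluded_total:,}")
print(f"All journals: {main_total + excluded_total:,}")
```

Keeping the two sheets in separate dictionaries matters because codes such as xm and xx appear on both.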

As far as I can tell, of journals in the main sheet, 5,818 have fees and 12,435 don’t–but, as usual, most articles involve fees: 997,913 in 2022, compared to 440,229 without fees. Those figures could change slightly, but probably not by much. And the total is 1,438,142 2022 articles (from 16,984 journals with 2022 articles) and 1,322,021 2021 articles (from 17,344 journals with 2021 articles).
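
The contrast between journal counts and article counts is easier to see as percentages. A minimal Python sketch, using only the figures from the paragraph above (the variable names are made up for illustration):

```python
# 2022 article counts from the paragraph above.
fee_articles = 997_913
no_fee_articles = 440_229
# Main-sheet journal counts from the same paragraph.
fee_journals = 5_818
no_fee_journals = 12_435

total_articles = fee_articles + no_fee_articles
total_journals = fee_journals + no_fee_journals

# Share of articles that appeared in fee-charging journals.
fee_article_share = fee_articles / total_articles
# Share of journals that charge no fees.
no_fee_journal_share = no_fee_journals / total_journals

print(f"Total 2022 articles: {total_articles:,}")
print(f"Articles in fee-charging journals: {fee_article_share:.1%}")
print(f"Journals without fees: {no_fee_journal_share:.1%}")
```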

That’s it. Now I’ll go silent while I:

  • Add derived data and save off the Figshare version
  • Write the appendix–mostly from these reports.
  • Start the book itself.

Help?

If you have OA-related contacts, let them know this is coming.

And if you know people who want to see the 12,435 “diamond” journals treated properly, let them know–and let them know I could still use feedback (it’s still an optional project, with no monetary gain). Before, say, the end of May.

GOA8: Brief progress report

Monday, May 1st, 2023

I just finished the scan of xx2/xm2 journals, and–deviating from my original plan–actually did full retests in most cases.

This turned out well. At the end of the scan, I have 260 remaining xm2 cases (down from 383 last year, and that doesn’t include cases that became xm2 this year) and only 20 xx2 cases–same as last year, which means new long-term problems were offset by fixed cases. Those journals account for at most a few thousand articles–perhaps 4,000.

What stands out is that Indonesia is problematic. Last year, it had a majority of xm2 journals. This year, it has 71% of all excluded xm/xm2 journals and 83% of all xm2 journals: 218 of 260.

Right now, about 1,420,000 2022 articles are accounted for (not counting excluded journals). That figure will rise after I check 864 journals that seem reasonably likely to have more articles now than they did when first checked. These are small journals: they added up to about 37,000 articles in 2021 and, so far, 15,013 in 2022. I figure three or four days to do the retests.

[Then come overall data consistency checks, firming up ISSNs, saving the master lists, adding derived data…and, eventually, the fun part.]

Feedback on planned drop of “bx” still feasible. Feedback on “diamond OA” plan STILL REALLY WANTED. It is extra work, but if it will be useful, I’ll happily do it.

Oh, and sigh, just saw another pontification on the colors of OA that says flatly that all gold OA includes fees. THIS IS NOT HELPFUL! Diamond/platinum is a subset of gold.

GOA8: Week 17

Saturday, April 29th, 2023

I’ve completed the first portion of the second pass, asked some questions and got some answers, and had a new thought. Here goes:

ISSNs

I’m satisfied that ISSNs do serve (some) purpose in the spreadsheet, so I’ll keep them–and, perhaps to make them a bit more useful, when I do final cleanup I’ll see to it that e-ISSNs are used in all cases where available.

Pass 2, Part 1: problematic journals

This involved around 1,200 journals–mostly xx and xm (but not xx2 or xm2). This is a slogging process (with up to four paths to try to find a “good” site), but definitely productive. (Some 20 journals that should have been in pass 2 part 2–now part 3, see below–were accidentally included here, which does no harm.)

At the end of the scan, I had 307 journals that could be excluded (xx, xm, xn, xo) and 926 journals that are good to go. The latter include about 44,000 2022 articles; the former perhaps 3,100. In practice, most of the 307 journals will be included–all except those that aren’t really journals or are both unfindable and no longer in DOAJ.

Given how well that went, I’ll add another partial check before the scan of 864 journals that seem at least plausibly likely to have more 2022 issues added since they were scanned. By adding Part 2 and making this Part 3, they’ve had four full months to do late additions.

The new Part 2 is a quick scan of the 416 xx2 and xm2 journals–ones that have been problematic for more than one year. Basically, I’ll check each URL; any that are actually available (not xx or xm), I’ll scan properly and count as restored. I will be surprised (pleasantly) if there are more than a couple of dozen of these: journals that are bad for two years tend to stay bad (or get removed from DOAJ). UPDATE: see next post. I did a fuller check, and was indeed pleasantly surprised.

Best guess: that quick scan should take two or three days. Part 3 may use the rest of the week, maybe more (there are real-world things that interfere). With a lot of luck, I might be done with data gathering by the end of next week, setting the stage for normalization and adding derived data (e.g., peak articles, revenue, categories of size and price).

New data issues

As already noted, I’ll keep ISSNs.

Having heard no comments to the contrary, I’ll drop fee code from the spreadsheet. (Count code was never in the spreadsheet.)

I’m now looking at code “bx”–available at a different URL. It can happen for any number of reasons. In some previous years, I didn’t actually change the URL in the spreadsheet. I do that now. Last year there were 699 such cases; the year before that, 730. This year there are 438, there for a range of reasons. I don’t believe they add anything to the spreadsheet: they’re part of the data-gathering process. Unless I hear reasons not to, I’ll change them to “a,” which will then be a clean code for “active” in 2021-2022.

GOA8: Some data questions and a progress report

Tuesday, April 25th, 2023

I’ve done as much crosschecking as makes sense at this point, and started on the second pass–around 2,000-2,400 journals to be looked at. That process can be rewarding but slow (an xx/xm journal can be restored in one of four ways, for example, each tried in turn). So I’ll just say “a couple of weeks” where “couple” means 1.5 to 3 or more–plus a week or so for final crosschecks and adding derived data.

The Data Questions

I’m considering some data retention/display changes:

  1. ISSN: I don’t believe this is serving any purpose, especially since a journal can have more than one. Before DOAJ added unique URLs, it was one way of identifying a journal, but has never had any role in calculation or display. Unless I hear a good reason not to, this will disappear from the master & shared datasets. [Some amplification: Every DOAJ/URL in the spreadsheet points directly to the DOAJ page with one or both ISSNs for the journal, so there’s no loss of access whatsoever. And just looking at the Figshare data, you can’t tell whether it’s the “right” ISSN.]
  2. Fc (Fee code): I’m inclined to drop this because, now that I’m starting from DOAJ fee numbers, it’s not very useful or reliable. I’m not sure it ever was very useful.
  3. Count code: This has never appeared on Figshare, and was used for the first time this year to track where I was getting article counts for each journal. It’s interesting in a vague summary way (and has been in the weekly reports), but nothing more. I may or may not use it again in future GOAs, if any, but see no reason to add it to the shared spreadsheet.

That’s it. You know where to comment. If there are any comments I’ll look at them–but I’m not holding my breath.

Meanwhile, the P2 scan has yielded 398 journals that can be fully used and 515 exclusions, including xx2 and xm2 exclusions, with 735 more problematic journals to go and 864 journals that might have picked up more articles. Depending on how that goes, I might do a very fast rescan of the 417 xm2/xx2 journals. Still hoping to finish the prep work and start (but not finish) the book in May 2023. With luck.

GOA8: Week 16

Saturday, April 22nd, 2023

Between the first pass and the second/final pass comes consistency checking, which can take an hour or two to a day or four. That’s still going on, but may be done soon. Meanwhile…

GOA8: Week 15.5, end of first pass

Wednesday, April 19th, 2023

That’s right! The first pass is complete. Now comes a week or two of data checking and rechecking somewhere between 2,000 and 2,450 journals (depending on my xm2/xx2 decision). The numbers below are subject to change in two ways: some journals that have counts will be excluded from the final dataset (last year, that was around 6,000 articles eliminated) and some journals will have more articles added (probably around 3,000 articles based on last year) and, with luck, some unavailable journals that *don’t* have counts will become available.

The numbers

The overall counts at this point:
18,769 journals checked, of which
16,548 published 1,420,735 articles in 2022 and
17,457 published 1,334,553 articles in 2021.

The rest of the numbers:

  • Fee versus diamond/no-fee: 5,838 journals with fees, 12,931 without. Just over two-thirds of journals are fee-free.
  • New vs. continuing: 2,164 newly-added, 16,605 continuing (including all of the “x” status below).
  • Status code:
    16,562 “a”–clean.
    447 “bi”– inactive (no articles since at least 2020).
    75 “bx”–done but at a different URL.
    109 “xd”–defunct, no articles since at least 2016.
    326 “xm”–malware (but not last year).
    57 “xn”–not an OA journal (including those removed this year but before I got to them) and ones suddenly requiring a login.
    776 “xx”–unreachable or unworkable.
    And the two oddities:
    359 “xm2”–malware, also malware last year
    58 “xx2”–unreachable or unworkable, also last year.
  • Ease of article counting:
    “d” 9,410: easiest, taken directly from DOAJ (sometimes with 2022 count modified)
    “w” 1,022: easy, journal website provides direct numbers at either volume or issue level.
    “f”  5,455: middling; numbers calculated using Find function for constants (e.g. “doi.” or “pdf”)
    “c” 566: slowest; articles counted manually.
  • Why the counts of “ease of…” don’t add up to total journals counted: all xd and bi cases, not quite all other non-a cases. If I couldn’t count them at all…
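
The “f” method above amounts to counting how often a constant marker repeats on a table-of-contents page. A rough Python sketch of the idea follows. The function name, the sample text and the choice of marker are all made up for illustration; the actual counts were made with a Find function, not with this code.

```python
def count_marker(page_text: str, marker: str) -> int:
    """Count case-insensitive, non-overlapping occurrences of marker."""
    return page_text.lower().count(marker.lower())


# Hypothetical table-of-contents text for a single issue.
sample_toc = """
Article One ... PDF
Article Two ... PDF
Editorial ... PDF
"""

print(count_marker(sample_toc, "pdf"))
```

A count like this is only as good as the marker: if “pdf” also appears on editorials, front matter or a download-all link, the raw number overstates the article count and needs adjusting by hand.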

GOA8: Week 15

Saturday, April 15th, 2023

Another fairly strong week, and clearly the penultimate week for the first pass. There are 869 journals left to scan, and around 500 of those are either from Wiley or Wolters Kluwer/Wolters Kluwer Medknow. (Just finishing up Vilnius University Press, and if all the rest were as clearly done, this would be even easier.)

I anticipate finishing the first pass. I expect to do a little cleanup/consistency work. I hope to split off those needing further attention (around 2,400, but that will rise a bit) and check those for journals dropped from DOAJ since 1/1/2023–and probably decide whether to skip rechecking for xx2/xm2 cases. The following week or two involves rechecking several hundred journals that might reasonably have added 2022 issues since they were first checked, and then resolving the rest of the problematic cases. (After that? A couple of days to add derivative columns such as revenue, journal size and article price, then start on the book–with luck, in very early May.)

The numbers

1,300 more journals checked.

The overall counts at this point:
17,900 journals checked, of which
15,733 published 1,347,316 articles in 2022 and
16,637 published 1,263,927 articles in 2021.

The rest of the numbers:

  • Fee versus diamond/no-fee: 5,450 journals with fees, 12,450 without.
  • New vs. continuing: 2,057 newly-added, 15,843 continuing (including all of the “x” status below).
  • Status code:
    15,740 “a”–clean.
    422 “bi”– inactive (no articles since at least 2020).
    73 “bx”–done but at a different URL.
    108 “xd”–defunct, no articles since at least 2016.
    321 “xm”–malware (but not last year).
    54 “xn”–not an OA journal (including those removed this year but before I got to them) and ones suddenly requiring a login.
    765 “xx”–unreachable or unworkable.
    And the two oddities:
    359 “xm2”–malware, also malware last year
    58 “xx2”–unreachable or unworkable, also last year.
  • Ease of article counting:
    “d” 9,410: easiest, taken directly from DOAJ (sometimes with 2022 count modified)
    “w” 1,022: easy, journal website provides direct numbers at either volume or issue level.
    “f”  5,455: middling; numbers calculated using Find function for constants (e.g. “doi.” or “pdf”)
    “c” 566: slowest; articles counted manually.
  • Why the counts of “ease of…” don’t add up to total journals counted: all xd and bi cases, not quite all other non-a cases. If I couldn’t count them at all…

And I’d still appreciate feedback on the Diamond OA idea. Anyone out there? [At the moment I’m inclined to do it, but would love a little support…]

GOAJ stats: April 10, slightly incomplete

Monday, April 10th, 2023

I forgot to do another statistics run at the end of 2022, so some of these figures (PDF downloads, not Figshare data use) are missing part of December 2022. Given the low rate of use, I doubt that it makes much difference.

Gold Open Access 7

PDF: 1,023 downloads (no books)

Country book: 195 downloads (no books)

Database: 53 downloads, 272 views.

Gold Open Access 6

PDF: 2,930 downloads (no books)

Country book: 456 downloads

Database: 165 downloads, 1,018 views.