Archive for January, 2023

GOA8: Week 4

Saturday, January 28th, 2023

I now know when the seven-week slowdown will begin: February 6. It will take somewhere between 90 minutes and two hours of each weekday’s schedule–and possible side effects might also slow things down. However, I’m finding that this is both going faster than expected and is more interesting and less frustrating than previous years. I’m pretty sure the two factors are related. Could I have used these more effective/efficient methods in previous years? Not really: the data wasn’t there.

This was another very productive week, and I reached a breakpoint I was hoping to reach by the very end of the month: 5,000 journals done (well, including 684 that will require a revisit). I might do more journals today, but I’ll count them as “tomorrow,” in the fifth week (and last week before schedule changes).

So: 1,300 more journals checked. The overall counts at this point are 5,000 journals checked, of which 4,400 published 376,863 articles in 2022 and 4,635 published 374,548 articles in 2021. Note that the 2022 article count finally exceeds the 2021 count, even disregarding 683 journals, many of which will wind up with more 2022 articles. One reason is the first few dozen “Frontiers in” journals, many of which are growing rapidly. (I also finished Elsevier, to be sure–but some of the Frontiers numbers are pretty dramatic. There are 60-odd more to go…)
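The comparison above can be made concrete with a little arithmetic. This is only a sketch using the cumulative Week 4 figures quoted in the post; the names `counts`, `articles_per_journal`, `averages` and `article_gap` are illustrative and are not part of the GOA8 spreadsheet.

```python
# Cumulative figures quoted in the Week 4 post.
counts = {
    2022: {"journals": 4_400, "articles": 376_863},
    2021: {"journals": 4_635, "articles": 374_548},
}


def articles_per_journal(year_counts):
    """Mean articles per journal among journals that published that year."""
    return year_counts["articles"] / year_counts["journals"]


averages = {year: articles_per_journal(c) for year, c in counts.items()}

# How far the 2022 article count is ahead of the 2021 count.
article_gap = counts[2022]["articles"] - counts[2021]["articles"]

for year in sorted(averages):
    print(f"{year}: {averages[year]:.1f} articles per journal")
print(f"2022 leads 2021 by {article_gap:,} articles")
```

Fewer journals account for the 2022 total, so the per-journal average is higher for 2022 than for 2021 in the data so far; both figures remain subject to the rechecks.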

Some details–as always, about the full dataset to date, not this week’s portion.

  • Fee versus diamond/no-fee: 1,910 journals with fees, 3,090 without.
  • New vs. continuing: 657 newly-added, 4,343 continuing.
  • Need rechecking: 684 will be rechecked (including all of the “x” statuses below).
  • Status code:
    4,488 “a”–clean.
    120 “bi”–inactive (no articles since at least 2020).
    25 “bx”–done but at a different URL.
    25 “xd”–defunct, no articles since at least 2016.
    64 “xm”–malware (but not last year).
    10 “xn”–not an OA journal.
    177 “xx”–unreachable or unworkable. (Last week’s number was a typo.)
    And the two oddities:
    85 “xm2”–malware, also malware last year.
    14 “xx2”–unreachable or unworkable, as was true last year.
  • Ease of counting articles:
    “d” 2,729: easiest, taken directly from DOAJ
    “w” 378: easy, journal website provides direct numbers at either volume or issue number
    “f” 1,346: middling; numbers calculated using Find function for constants (e.g. “doi.” or “pdf”)
    “c” 194: slowest; articles counted manually.

And that’s it for January…

GOA8: Week 3

Saturday, January 21st, 2023

Before providing an updated set of counts, a note about the likely schedule: It’s become obvious that (a) the changes in handling this year are working well, potentially much better than expected, and (b) as a result, I haven’t the foggiest notion how long this is all going to take–almost certainly not as long as my previous pessimistic estimates, fortunately. I now believe “sometime in the spring” is the most useful estimate for completing the first data gathering pass–and, with varying degrees of luck and other stuff, maybe the second pass, data normalizing, and adding derived data columns. It’s even possible that I’ll start on the book and published dataset during very late spring (that is, before July 1), but that’s less likely.

The change in numbers is astonishing, both because things went well this week and because I encountered EDP Sciences and its set of Web of Conferences megajournals and have started in on Elsevier.

Now the numbers:

This was an even more productive week, with 1,400 more journals checked. The overall counts at this point are 3,700 journals checked, of which 3,235 published 240,270 articles in 2022 and 3,431 published 258,091 articles in 2021.

Some details–as always, about the full dataset to date, not this week’s portion.

  • Fee versus diamond/no-fee: 1,359 journals with fees, 2,341 without.
  • New vs. continuing: 458 newly-added, 3,242 continuing.
  • Need rechecking: 538 will be rechecked (including all of the “x” statuses below).
  • Status code:
    3,311 “a”–clean.
    86 “bi”–inactive (no articles since at least 2020).
    20 “bx”–done but at a different URL.
    18 “xd”–defunct, no articles since at least 2016.
    28 “xm”–malware (but not last year).
    8 “xn”–not an OA journal.
    1367 “xx”–unreachable or unworkable.
    And the two oddities:
    75 “xm2”–malware, also malware last year.
    8 “xx2”–unreachable or unworkable, as was true last year.
  • Ease of counting articles:
    “d” 1,925: easiest, taken directly from DOAJ
    “w” 290: easy, journal website provides direct numbers at either volume or issue number
    “f” 1,054: middling; numbers calculated using Find function for constants (e.g. “doi.” or “pdf”)
    “c” 164: slowest; articles counted manually.

GOA8: Week 2

Saturday, January 14th, 2023

Somewhat unfortunately (see below), this was a very productive week, with 1,300 more journals checked. The overall counts at this point are 2,300 journals checked, of which 2,032 published 144,531 articles in 2022 and 2,146 published 147,617 articles in 2021.

Why somewhat unfortunately? Because–in addition to the speed with which BMC journals could be checked–that 1,300 came about partly because we had to skip two of our daily walks and I had to skip the usual Wednesday morning hike: just too wet. [Fortunately, our house is at the top of a rise, at 550 feet above sea level, and while Livermore’s regional parks were closed because of flooding and water hazards, we didn’t have flooding. Looking forward to the weeklong dry spell we’re supposed to get starting Monday–as long as it’s not the beginning of a months-long dry spell, as happened last year.]

Meanwhile, some details–and these will always be about the full dataset to date, not this week’s portion.

  • Fee versus diamond/no-fee: 809 journals with fees, 1,491 without.
  • New vs. continuing: 283 newly-added, 2,017 continuing.
  • Need rechecking: 310 will be rechecked (including all of the “x” statuses below).
  • Status code:
    2,091 “a”–clean and done.
    57 “bi”–inactive (no articles since at least 2020).
    13 “bx”–done but at a different URL.
    12 “xd”–defunct, no articles since at least 2016.
    28 “xm”–malware (but not last year).
    4 “xn”–not an OA journal.
    77 “xx”–unreachable or unworkable.
    And the two oddities:
    11 “xm2”–malware, as was true last year.
    6 “xx2”–unreachable, as was true last year.
  • Ease of article counting (added 1/15):
    “d” 1,099: easiest, taken directly from DOAJ
    “w” 235: easy, journal website provides direct numbers at either volume or issue number
    “f” 716: middling; numbers calculated using Find function for constants (e.g. “doi.” or “pdf”)
    “c” 113: slowest; articles counted manually.

I am seeing suggestions that even the modest $28/year I spend for Malwarebytes Pro (which covers my wife’s notebook as well) is needless, that there’s no need to pay for security beyond Windows’ built-in functions. I’m not willing to take that chance, and can give you about 30 reasons so far. [Some of the 39 were certificate problems that Windows itself absolutely catches.]

Where am I? The 2,300th journal is the Mesopotamia Journal of Agriculture, published in Iraq by the College of Agriculture. It has a $100 fee and published 36 articles in 2022.

GOA8: Week 1

Saturday, January 7th, 2023

Here’s how things stand after the first week of visiting gold OA journals for GOA8 (in alphabetic order by publisher, then by journal):

1,000 journals scanned. 883 of them published 51,008 articles in 2022; 951 of them published 53,891 articles in 2021. (Both numbers subject to change on revisiting.)

At 1,000 journals a week, the first scan will be done in late May. There will be a known seven-week slowdown (which may or may not be major, and I don’t yet know when it will be–but not until at least January 30.) My daily minimum goal is 100 journals–which would take until July 9 to finish the first pass. I’m hoping the final time required will be somewhere in between.

Three details

The above numbers are from a pivot table in the “done” spreadsheet. I added three more tables to track items of interest during the scan–at least one of which might not be in the final report.

Counting codes

Of the 939 journals for which a count was feasible at this point:

  • 424 were code d–the DOAJ figure appears probable. This is the easiest.
  • 131 were code w–the journal web pages offered easy direct numbers for each issue or for the year. Also easy.
  • 328 were code f–I could use Find to determine the count for each issue (e.g., counting “pdf” or “doi.”). Not as easy, for various reasons.
  • 56 were code c–Counting articles by hand. By far the hardest.
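The code f approach, counting a constant string on an issue page, can be sketched in a few lines. This is an illustration only, not the actual GOA8 workflow (which uses the browser’s Find function): the helper `count_marker` and the sample page text are made up for the example.

```python
def count_marker(page_text: str, marker: str) -> int:
    """Count non-overlapping, case-insensitive occurrences of marker."""
    return page_text.lower().count(marker.lower())


# Hypothetical issue listing, invented for illustration only.
sample_issue = """
Article one    [PDF]  https://doi.org/10.0000/example.1
Article two    [PDF]  https://doi.org/10.0000/example.2
Article three  [PDF]  https://doi.org/10.0000/example.3
"""

pdf_hits = count_marker(sample_issue, "pdf")
doi_hits = count_marker(sample_issue, "doi.")
print(pdf_hits, doi_hits)
```

A count like this is only as good as the marker: if the string also shows up in navigation links or appears twice per article, the raw number has to be adjusted by hand.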

Coded status

Of the 1,000 journals:

  • 910 are code a–which is the best code.
  • 21 are bi: inactive, with no articles since 2020
  • 9 are bx: journal is findable but at a different URL
  • 6 are xd: ceased or duplicate, with no articles since 2016
  • 19 are xm: malware or bad certificate (with luck, rechecks will reduce this number)
  • 3 are xn: Not an OA journal (two appear to require registration, one is an encyclopedia)
  • 32 are xx: currently unreachable or unworkable: rechecks should reduce this number
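Both breakdowns above should add up to their stated totals, which is easy to verify. The sketch below uses the Week 1 figures from the post; the dictionary layout and the helper `matches_total` are illustrative assumptions, not the structure of the “done” spreadsheet or its pivot tables.

```python
# Week 1 tallies as quoted in the post.
status_counts = {"a": 910, "bi": 21, "bx": 9, "xd": 6, "xm": 19, "xn": 3, "xx": 32}
counting_counts = {"d": 424, "w": 131, "f": 328, "c": 56}


def matches_total(counts: dict, expected: int) -> bool:
    """Return True when the per-code tallies sum to the expected total."""
    return sum(counts.values()) == expected


status_ok = matches_total(status_counts, 1_000)    # all journals scanned
counting_ok = matches_total(counting_counts, 939)  # journals with a feasible count

print("status codes add up:", status_ok)
print("counting codes add up:", counting_ok)
```

Running the same check on each week’s list is a cheap way to catch a mistyped figure before it is posted.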

Recheck?

141 of the first 1,000 journals need rechecking–either because they are coded xm or xx, or because they appear to be missing some 2022 data.

Off to a good start. Some weeks might show more journals, some may show (a lot) fewer.

GOA8 progress posts mostly on Mastodon

Monday, January 2nd, 2023

I’ve started the scan, and it’s looking reasonably promising. I may do a monthly update here, but more frequent updates will be on Mastodon.