GOA7 Pass 2: Updating

I’ll add to this post as I progress through Pass 2…

April 22: Parts 1 and 2

Part 1 (adding possible 2021 articles to journals that had none) and Part 2 (adding later 2021 issues to journals that seemed as though they should have more) have both been completed, adding around 3,500 2021 articles and increasing the count of journals with 2021 articles by around 100.

The key numbers now, excluding Parts 3 and 4 of Pass 2, are:

Journals that won’t be scanned further: 14,472.

Journals with 2021 articles: 14,675.

2021 articles: 1,233,706.

April 22 (2): Journals removed

I reconsidered when journals no longer in DOAJ should be removed, and have just applied that to Part 3 and Part 4. (I went back to June 2021 for removal dates, but in fact all of the removals turned out to date from 2022.)

In all, 62 journals were removed from Part 3 (xx), leaving 1,044 to be rechecked, and 15 were removed from Part 4 (xm), leaving 659 to be rechecked. These 77 journals–marked “xo”–will not be included in the study, although they may be included in one table in Chapter 2 (Exclusions and Special Cases).

April 30: Part 3 complete

The 1,044 xx journals have been rechecked, with reasonably good success. Although there were 312 more cases than in last year’s scan, the number that couldn’t be resolved only increased from 126 to 147. Of the remainder, 177 were fine when retested (which usually means temporary server problems); 41 were either fine or found on an alternate path but hadn’t published since 2019; 651 were found and counted using alternative routes; 10 were dead/duplicates; three aren’t OA journals (two now require login); and 35 were no longer in DOAJ. This was a “lumpy” pass: 223 xx cases came from Sciendo, 80 arose from the DergiPark move from .gov to .org, and 55 came from SciELO instances where URLs hadn’t been updated.

The current totals for non-problematic journals: 16,374 total; 15,446 with 2021 articles; 1,268,018 2021 articles; 385 with no post-2019 articles; 89 dead/duplicate cases (no articles since 2015). Of all these, 2,162 appear to be new to DOAJ and 14,212 are continuing.

Next step: Part 4 (xm), and a quick recheck on some items. Yes, I’m about 10 days ahead of last year. Cross fingers.

May 3: Part 4(a)

I’ve gone through the 658 xm (malware) journals, with, as expected, modest results: 50 now active, 5 OK but code bi (inactive since 2019), one bx (found through a different URL), one xd (dead/duplicate), and one xo (no longer in DOAJ).

At the moment, 16,431 journals are ready for processing, 15,496 of which have 2021 articles; there are 1,269,933 2021 articles.
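For anyone who wants to check the arithmetic, here is a minimal sketch of how the May 3 totals follow from the April 30 totals. It assumes the Part 4(a) counts are 50, 5, 1 and 1, that only the newly active journals add to the "with 2021 articles" count, and that the one journal no longer in DOAJ is not counted as ready. The variable names are made up for illustration; this is a reading of the numbers, not a description of the actual spreadsheet.

```python
# Totals reported on April 30 (end of Part 3).
journals_before = 16_374
with_2021_before = 15_446

# Part 4(a) outcomes, as read from the May 3 note (assumed values).
now_active = 50     # assumed to be the only group adding 2021 journals
inactive_bi = 5     # OK, but inactive since 2019
alternate_bx = 1    # found through a different URL
dead_xd = 1         # dead/duplicate
# The one journal no longer in DOAJ is assumed to be left out entirely.

journals_after = (
    journals_before + now_active + inactive_bi + alternate_bx + dead_xd
)
with_2021_after = with_2021_before + now_active

print(journals_after)   # 16431
print(with_2021_after)  # 15496
```

Both results match the figures above, which suggests those assumptions are at least consistent with the reported totals.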

The remaining 595 xm (malware) journals will have codes compared with those in last year’s study, as will the xx/xm journals previously double-checked: last year’s codes help inform whether journals are included in the full study or held out as exclusions (which appear on a separate page of the eventual Figshare spreadsheet). Then, 601 xm journals that weren’t also xm/xx last year will be checked for the possibility of an alternate route. I’d be pleasantly surprised to find many, but it’s worth the two or three days required.

A couple of notes about the current malware group (excluding 10 that changed from xx to xm in the last phase):

Big clusters by publisher include 37 from Universitas Udayana; 29 from Conselho Nacional de Pesquisa e Pós-graduação em Direito (CONPEDI); 28 from Universitas Negeri Malang; 19 from Diponegoro University; and 14 from Universitas Pendidikan Indonesia.

You may notice something about all but one of those names–and, indeed, breaking down the malware by country shows 403 from Indonesia–just over two-thirds of the total–plus 69 from Brazil and 28 from Ukraine. No other country has more than eight.
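As a rough check on the "just over two-thirds" figure, here is a small sketch. The size of the malware group isn't stated in this note, so the code tries both 595 and 601, the two xm counts mentioned earlier; treating either as the denominator is an assumption, and `share_of_total` is just an illustrative helper.

```python
def share_of_total(count: int, total: int) -> float:
    """Return count as a fraction of total."""
    return count / total

indonesia = 403

# Candidate group sizes (assumed; taken from the xm counts mentioned above).
for total in (595, 601):
    share = share_of_total(indonesia, total)
    # Print the share as a percentage and whether it exceeds two-thirds.
    print(total, f"{share:.1%}", share > 2 / 3)
```

Either denominator puts Indonesia a little above two-thirds of the group.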

May 5: Completion of online scans

I’ve now rechecked the xm journals for possible alternate URLs and run a few other checks, with some success–and gone through the complete dataset getting rid of pure duplicates (whether from downloading issues or otherwise).

While these numbers may change very slightly as I do consistency checks in the next day or two, I’d guess such changes will be very small–probably less than 1%. I also have a more nuanced understanding of the malware and problematic issues, and it’s encouraging. I’ll lay that out in a separate post–but if you just want the biggest numbers, the final report is likely to include around 16,726 journals (with another 440 on an exclusions page), of which around 15,643 have 2021 articles, with a total of around 1,275,080 2021 articles. All figures subject to change.
