GOA5: Journals 11,001-12,000 brief notes

March 25th, 2020


  • Of the 864 journals for which data has been recorded (136 are either unavailable or have malware or other issues), 15 (2%) have fees. (These are almost all university-based.)
  • Of that 15, I find that two have fees that vary based on article length or author count.
  • For 133 of the no-fee journals, I wasn’t certain of the no-fee status until I checked DOAJ.
  • Problematic cases include 39 malware cases, one that isn’t an OA journal, and 87 that couldn’t be reached or were unworkable. There were also six “xd” (renamed/ceased duplicate).
  • I’ll do a separate post (probably Thursday March 26) with a new spreadsheet of problematic journals (and send DOAJ a separate list for 10,001-12,000, since they already have 1-10,000).



GOA: brief notes on journals 10,001-11,000

March 18th, 2020


  • Of the 898 journals for which data has been recorded (102 are either unavailable or have malware or other issues), 224 (25%) have fees.
  • Of that 224, I find that one has submission fees rather than processing fees–and three others have both submission and processing fees. 18 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count.
  • In 14 of the 224 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. That’s also the case for 116 journals with (apparently) no fees: info is from DOAJ rather than the journal website.
  • Problematic cases include 43 malware cases (most from one publisher) and 58 that couldn’t be reached or were unworkable. There was also one “xd” (renamed/ceased duplicate).
  • In two cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website. I’m now leaving most such cases for the second round so that I can finish the first scan a bit faster and get problem journals out sooner for fixing.
  • Noting the drop in fee-charging percentage: after Taylor & Francis, Thieme, and Ubiquity, most of these are from universities, with relatively few fees.
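
To put a number on that drop, here is a quick sketch comparing this batch with journals 6,001-7,000. The counts are the ones quoted in the notes for each batch; the function name is just mine.

```python
# (batch label, journals with data recorded, journals with fees),
# as quoted in the notes for each batch
batches = [
    ("6,001-7,000", 961, 430),
    ("10,001-11,000", 898, 224),
]

def fee_share(recorded, with_fees):
    """Percentage of recorded journals that charge fees."""
    return 100 * with_fees / recorded

for label, recorded, with_fees in batches:
    print(f"{label}: {fee_share(recorded, with_fees):.0f}% have fees")
```

Unavailable and malware-flagged journals are left out of the denominator, which matches how the percentages in these notes are stated.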



GOA and life: a quick update

March 16th, 2020

The scanning continues…but not as rapidly, for what may be obvious reasons.

To wit: trying to stay on top of The Crisis, our county’s shelter-in-place announcement, and an overload of information and “information.”

That’s slowing me down a lot, both in actual time and in coping. It’s also taking more time just to keep going.

My wife and I are both over 65 and also introverts, so we’re affected differently. (And anyone who says we shouldn’t go for walks in the fresh air will be ignored…)

So: it’s happening. Not rapidly.

Take care.

GOA5: Journals 9,001-10,000 and malware

March 11th, 2020


As noted in the last set of notes, I departed from the usual publisher/journal sorting to test all remaining Indonesian journals–around 880 of them–because nearly half of the malware cases in the first 9,000 were from Indonesia, which had a malware problem in last year’s scan–one that was totally cleared up with the help of DOAJ people. (All I did was send them the list of problems.)

That turned out to be a good thing: these journals, mostly from universities, have a serious malware problem. Maybe there are readers out there who can help correct the problems there and in other countries (yes, I’m keeping DOAJ informed).

How bad? In all, these 1,000 journals had 249 malware cases and 51 other unusable cases.

I should note that 73% of Indonesia’s 1,582 gold OA journals do not have these problems. Unless the UK or US have added a lot of OA journals in 2019, Indonesia publishes more DOAJ-listed journals than anybody. But they have a recurring problem with malware…

If you think you can help…

You should (cross fingers) find a Google Sheet with all problematic journals from the first 10,000 scanned (which includes all Indonesian journals) here: that is, https://docs.google.com/spreadsheets/d/1CL7AY5VuS9KuIYmpfnaTua20pH-GiZLpE6bTwg5_Qhw/edit?usp=sharing

The sheet includes all “xm” journals (malware), 481 of them, but also all “xn” (apparently not OA) [20 so far] and “xx” (unavailable or unworkable) [353 so far]. There should be 854 rows, including (in descending order) 442 from Indonesia, 71 from Brazil, 60 from Iran, 34 from Ukraine, 33 from Romania, 21 from Turkey, 20 from Poland, 17 from Spain, 15 from Russia, 12 from the United States, ten from Colombia, and smaller numbers from 41 other countries for a total of 119 additional.

The code–xm, xn, xx–appears in the “Cod” column. A note may appear in the “Note” column offering a brief comment on why something’s there–e.g., for “xx” journals such notes as “404” (27), “500” (an internal error, 3), “ad” (three), “blog” (four), “dbs” error (29), dns failure (23), “park” (parking page, 21), “to” (timeout, 34) and a few others.

For malware, the common codes include “cert” (security certificate problems, 20), “mal” (just flagged as malware, 33), “mult” (MalwareBytes Pro finds more than one included page and multiple malware categories, six), “phish” (phishing, 69), “ransom” (ransomware, six), and the biggie “troj” (trojan, 308). In some cases, I didn’t jot down MalwareBytes’ code.
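
If you want to reproduce tallies like these from your own copy of the sheet, something along these lines should work. It’s only a sketch: it assumes the sheet has been exported as CSV, that the headers are exactly “Cod” and “Note,” and the sample rows (including the “Journal” and “Country” headers) are made up for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample rows standing in for a CSV export of the sheet.
# "Cod" and "Note" follow the column names described above; the other
# headers and all journal titles are placeholders, not real entries.
SAMPLE_CSV = """Journal,Country,Cod,Note
Example Journal A,Indonesia,xm,troj
Example Journal B,Indonesia,xm,phish
Example Journal C,Brazil,xx,404
Example Journal D,Iran,xx,to
Example Journal E,Indonesia,xm,troj
Example Journal F,Spain,xn,
"""

def tally(csv_text):
    """Count rows by problem code and by (code, note) pair."""
    by_code = Counter()
    by_note = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = row["Cod"].strip()
        note = row["Note"].strip()
        by_code[code] += 1
        if note:  # some rows have no note jotted down
            by_note[(code, note)] += 1
    return by_code, by_note

by_code, by_note = tally(SAMPLE_CSV)
print(dict(by_code))
print(by_note.most_common(3))
```

Swapping `io.StringIO(csv_text)` for an opened file would run the same count against a real export.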

Dates

I’ll post another version when I scan the next 2,000 journals and a final one when I finish the initial scan (4,128 more journals in all).

The final scan for, hopefully, corrected malware and unavailable/unworkable journals will begin either May 15, 2020 or two weeks after I post that final list, whichever comes later.

Last year, DOAJ and others got the total of xm and xx journals down to 117, of which only 17 were malware. Here’s hoping they (and you?) can do even better this year.

A few other notes on journals 6,001-10,000

  • Of the 3,528 journals for which data has been recorded (472 are either unavailable or have malware or other issues), 1,075 (30%) have fees.
  • Of that 1,075, I find that five (still) have submission fees rather than processing fees–and 28 others have both submission and processing fees. 80 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count. Five have membership or similar fee requirements, and seventeen are questionable. (Most of the latter are Indonesian cases where I believe the stated fee is missing three zeroes. There will be rows in the final spreadsheet where the amount shows as $0 but the status is “f” for “fee”–selecting the actual cell will find the unrounded stated fee, sometimes under four cents.) [By the way, the “curr” page on the Google Sheet provides the conversion rates used for this project and whether they’re the median 2019 rate or the actual rate on the day in late December 2019 that I did the checks.]
  • In 115 of the 1,075 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. That’s also the case for 457 journals with (apparently) no fees: info is from DOAJ rather than the journal website.
  • Malware is still with us: see the first part of this message.
  • In 67 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.
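
The “$0 but status f” rows mentioned above come down to simple arithmetic. Here is a sketch, with loud assumptions: the rate below (0.00007 USD per rupiah) is a round illustrative number, not the rate from the “curr” page, and the stated fee of 500 is a made-up example of a fee that may be missing three zeroes.

```python
# ASSUMPTION: illustrative rate only; the real rates are on the "curr" page.
ASSUMED_USD_PER_IDR = 0.00007

def to_usd(stated_fee, usd_per_unit):
    """Convert a stated fee to US dollars without rounding."""
    return stated_fee * usd_per_unit

stated = 500                      # hypothetical fee as stated on a journal site
unrounded = to_usd(stated, ASSUMED_USD_PER_IDR)
displayed = f"${unrounded:,.0f}"  # whole-dollar display, as in a formatted cell

print(displayed)   # the cell shows $0
print(unrounded)   # the underlying value is a few cents

# The same conversion if the stated fee really is missing three zeroes:
print(f"${to_usd(stated * 1000, ASSUMED_USD_PER_IDR):,.0f}")
```

The point is only that a whole-dollar display hides a nonzero amount, which is why the status column still says “f.”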

 



Notes on journals 6,001-9,000

March 3rd, 2020


Followup: some notes on the next 3,000 journals in my scan of DOAJ; compare to the first 6,000… (I sort by publisher, then journal, because that speeds things up). [But note at end: that won’t be true for the next 885 journals…]

A few items do seem interesting.

  • Of the 2,825 journals for which data has been recorded (175 are either unavailable or have malware issues), 758 (27%) have fees.
  • Of that 758, I find that five (still) have submission fees rather than processing fees–and 22 others have both submission and processing fees. 74 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count. Five have membership or similar fee requirements, and three are questionable.
  • In 101 of the 758 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. That’s also the case for 399 journals with (apparently) no fees: info is from DOAJ rather than the journal website.
  • Malware is still with us: 55 of the 175 missing cases have malware; 85 are missing or useless; nine are not OA journals (one needs a login, one is an encyclopedia, and several from SpringerNature self-report as hybrid); and 20 are dead or duplicates (most duplicates are renamed journals, with the old name still appearing).
  • In 60 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.

The malware problem and a departure

Of the 55 malware cases in this group of 3,000, fully 33 are from Indonesia. That was true for 61 of the 99 malware cases in journals 3,001-6,000 and 19 of the 78 in the first 3,000. That low first 3,000 showing led me to believe that Indonesia’s malware problem in last year’s study (which DOAJ and others almost totally solved) was pretty much gone.

I no longer believe that, since 113 of a total 232 cases–just under half–are Indonesian journals. So I’m going to do what I did last year (somewhat earlier in the scan): I’ll scan the 880-odd remaining Indonesian journals before the remainder of the 5,100-odd journals. That’s likely to slow things down…

At the end of that process, in addition to reporting the problematic journals to DOAJ (as I’m doing at 3,000-journal intervals), I’ll mount a copy for anyone to view and publicize it: perhaps others can help.

So don’t expect to see the “next thousand” for a while (I may not scan any journals for the next day or two, given life requirements), and when it appears, it’s likely to be…unusual.

GOA: February 2020 update

March 2nd, 2020


Readership for the new edition and GOAJ3. I changed hosts in January, and in the process lost statistics for January 1-20, 2016–and I’m no longer bothering to report paperback sales (essentially none) or GOAJ3 Cites & Insights numbers. (Figures prior to February 2020 also lack most of the last day of each month; that’s no longer the case.)

All links available from the project home page, as always.

GOA4: 2013-2018

  • The dataset: 481 views, 164 downloads.
  • GOA4: 1,907 PDF ebooks
  • Countries 4: 407 PDF ebooks
  • Subjects and Publishers: 308 PDF ebooks

GOAJ3: 2012-2017

  • The dataset: 1,801 views, 326 downloads
  • GOAJ3: 3,733 PDF ebooks
  • Countries: 1,157 PDF ebooks



Notes on journals 6,001-8,000

February 25th, 2020


Followup: some notes on the next 2,000 journals in my scan of DOAJ; compare to the first 6,000… (I sort by publisher, then journal, because that speeds things up). Since these notes combine 6,001-8,000, they may usefully be compared to the set of notes on journals 6,001–7,000.

A few items do seem interesting.

  • Of the 1,905 journals for which data has been recorded (95 are either unavailable or have malware issues), 707 (37%) have fees.
  • Of that 707, I find that five (still) have submission fees rather than processing fees–and eight others have both submission and processing fees. 59 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count. Four have membership or similar fee requirements, and two are questionable.
  • In 51 of the 707 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. That’s also the case for 210 journals with (apparently) no fees: info is from DOAJ rather than the journal website.
  • Malware is still with us: 32 of the 95 missing cases have malware; 43 are missing or useless; one requires a login, which makes it not an OA journal; and 12 are dead or duplicates (most duplicates are renamed journals, with the old name still appearing).
  • In 43 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.

Another guesstimate on totals

Journals 6,001-8,000 add a lot of articles: the first 6,000 had just under 353,000 articles (that figure will increase on the second round of counting, but probably not by much), while the next 2,000 have over 222,000 (same remark). With more than 6,000 journals left to go, the 2019 article count is already over 597,000.

Comparing where I am in this year’s survey with the comparable point last year (that is, the same point in a publisher/journal sort), a straight projection would yield just under 829,000 articles for 2019. Such a projection is heavily flawed, but I now believe there’s a better than even chance that the figure will be more than 800,000. (I started this year’s count hoping for “14 and 800”: 14,000 fully analyzed journals and 800,000 articles. Hard to say whether that will be the case; to reach 14,000, a fair number of problematic journals need to be fixed. Last year, they were.)



Notes on journals 6,001-7,000

February 18th, 2020

Followup: some notes on the next 1,000 journals in my scan of DOAJ; compare to the first 6,000… (I sort by publisher, then journal, because that speeds things up).

A few items do seem interesting.

  • Of the 961 journals for which data has been recorded (39 are either unavailable or have malware issues), 430 (45%) have fees.
  • Of that 430, I find that five have submission fees rather than processing fees–and six others have both submission and processing fees. 31 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count. Two have membership or similar fee requirements, and one is questionable (it states boldly that there is no fee, then–in the next paragraph–states the mandated fee but says it’s a gift).
  • In 28 of the 430 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website.
  • Malware is still with us: ten of the 39 missing cases have malware (six of the ten from Indonesia); twenty are missing or useless; one requires a login, which makes it not an OA journal; and eight are dead or duplicates (most duplicates are renamed journals, with the old name still appearing).
  • In cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.

Timing

So I’ve done 7,000 in (exactly) seven weeks. That leaves 7,128 to go. Will I be done in seven weeks (and a day)?

Almost certainly not. Other stuff happens–and the huge chunk of university-based journals is likely to be slow going. I’m hoping to finish the first pass by the end of April–and then there’s a second pass plus a final pass for malware (after journals have had some time to clean up those cases). Then there’s the normalization, data manipulation, table creation and writing the book(s).

GOA4 (2013-2018) appeared on May 4, 2019; that was unusual. GOAJ3 (2012-2017) appeared on May 28, 2018, and even that was earlier than I’d expected. I’ll be delighted if this year’s GOA5 is ready in early June; I won’t be surprised if it takes into July…

A few notes on the first 6,000

February 10th, 2020

Followup: some notes on the first 6,000 journals in my scan of DOAJ; compare to the first 5,000… (I sort by publisher, then journal, because that speeds things up).

Just for fun–and NOT MEANINGFUL at least partly because a number of journals will show larger numbers or have problems cleared up in the “recount” segment–I’m also comparing this to the equivalent portion of the 2019 scan (that is, the same breakpoint for publisher and journal).

A few items do seem interesting.

  • Of the 5,555 journals for which data has been recorded (445 are either unavailable or have malware issues), 1,981 (36%) have fees.
  • Of that 1,981, I find that 19 have submission fees rather than processing fees–and 36 others have both submission and processing fees. 178 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count.
  • In 118 of the 1,981 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. (The higher number last time included no-fee cases, where that wasn’t obvious from the site.) This is a much better number!
  • Malware is still with us: 177 of the 445 for which I don’t yet have data recorded were flagged by Malwarebytes–an uncomfortably high figure. 147 others don’t seem to be there or are unworkable…and ten aren’t OA journals, AFAICT. (Yes, I’m sending DOAJ problems in chunks; yes, I hope we/they can reduce the malware count to a trivial amount as they did last year. The big trouble spots so far are Indonesia with 64 cases, Brazil with 46 and Romania with 15.)
  • In 117 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.

One extra note

This will be the last set of trivial notes covering the first chunk of the scan. If I do more notes, they’ll start with the 6,001st journal–that is, any comparisons would leave out the first 6,000. (But I might see whether anybody’s reading these or whether I’m wasting the 15-20 minutes to write them…)

A few (more) notes on the first 5,000

February 3rd, 2020


Followup: some notes on the first 5,000 journals in my scan of DOAJ; compare to the first 4,000… (I sort by publisher, then journal, because that speeds things up).

Just for fun–and NOT MEANINGFUL at least partly because a number of journals will show larger numbers or have problems cleared up in the “recount” segment–I’m also comparing this to the equivalent portion of the 2019 scan (that is, the same breakpoint for publisher and journal).

A few items do seem interesting.

  • Of the 4,629 journals for which data has been recorded (371 are either unavailable or have malware issues), 1,731 (37%) have fees.
  • Of that 1,731, I find that 16 have submission fees–and 30 others have both submission and processing fees. 142 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count.
  • In 98 of the 1,731 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website.
  • Malware is still with us: 144 of the 371 for which I don’t yet have data recorded were flagged by Malwarebytes–an uncomfortably high figure. 147 others don’t seem to be there or are unworkable…and eight aren’t OA journals, AFAICT. (Yes, I’m sending DOAJ problems in chunks; yes, I hope we/they can reduce the malware count to a trivial amount as they did last year. The big trouble spots so far are Indonesia with 64 cases, Brazil with 46 and Romania with 15.)
  • In 100 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.


Comparisons to 2019

I’m just a bit more than 1/3 of the way done, and things will change, but here’s what I see at the moment:

  • At this point last year, I’d done 4,519 journals of which 4,412 were in the analysis (that subgroup included 10 malware cases, three not-OA cases and 43 unavailable/unworkable). That’s almost the same percentage of the whole–35.5% compared to this year’s 35.3%.
  • For this portion, the 2018 article total was 290,982 compared to 302,978 this year (but that number should grow a little). For 2017, the numbers are 251,118 and 260,556 respectively.
  • If articles were evenly spread among journals, I could project more than 900,000 total 2019 articles (since 35.3% yield 321,346)–but that’s obvious nonsense, since that projection technique yields just under 820,000 total 2018 articles for last year’s count, not the 711,670 articles actually counted. And I’d expect to see the 2019 article count for this year’s pass go up by at least 2,000-4,000. The closest thing to a SWAG for possible totals this time around might be around 786,000–but I’d suggest “somewhere between 750,000 and 850,000” is as close as I’d want to come to an actual estimate.
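
For anyone who wants to check the arithmetic, the straight projection above can be written out as below. This is just the naive method I’ve already called nonsense, applied to the figures quoted in these notes. The function name is made up, and deflating by last year’s overshoot is an illustration only, not necessarily how the 786,000 figure was reached.

```python
def straight_projection(partial_count, share_done):
    """Scale a partial article count up as if articles were spread evenly."""
    return partial_count / share_done

# Figures quoted in the notes above.
proj_2019 = straight_projection(321_346, 0.353)   # this year's scan so far
proj_2018 = straight_projection(290_982, 0.355)   # same point in last year's scan
actual_2018 = 711_670                             # articles actually counted for 2018

# How far the naive method overshot last year's real total.
overshoot = proj_2018 / actual_2018

print(f"naive 2019 projection: {proj_2019:,.0f}")
print(f"naive 2018 projection: {proj_2018:,.0f}")
print(f"2018 overshoot factor: {overshoot:.2f}")
print(f"2019 projection deflated by that factor: {proj_2019 / overshoot:,.0f}")
```

The overshoot happens because a publisher/journal sort does not spread large and small journals evenly, so the first third of the list is not a fair sample of the whole.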