Archive for 2020

GOA5: Status and Questions

Monday, May 11th, 2020

As of today, I’ve completed the initial scan of 14,128 DOAJ-listed gold OA journals, the second scan (starting April 14) to pick up additional articles and do some follow-up, and a “pre-third” scan (starting April 29) to clear up as many problematic journals as possible. The final third scan will begin Friday, May 15, a deadline shared earlier, to see whether more malware and problematic cases have been cleared up. (There are 786 journals in that final scan, including around 140 malware cases where I was able to gather the data needed but am hoping that the malware will be cleared up.)

Once that final scan is complete–around a week to ten days, barring other issues–I’ll start adding calculated data (e.g., 2019 journal revenue), then start in on the books. I’m still saying “July or maybe August” for completion, especially given the state of things (and some known causes of likely delay), but it’s possible that it could be done in late June. MODIFIED May 14: Things have opened up enough that I will be having cataract surgery during June, both eyes, two weeks apart. It’s likely that I’ll get either no work or very little work done on the project while resting my eyes…so don’t expect the books and dataset until late July or some time in August.

Oh, and if you’re wondering: almost certainly not more than 14,000 fully-analyzed journals (with articles later than 2013), but also certainly more than 800,000 articles for 2019, probably quite a bit more than 800,000.

Meanwhile, I’m looking at changes in the books/reports and would be delighted to get feedback (definitely before May 22, preferably by May 14) on a couple of issues. To wit:

  • Should I reduce the growth/shrinkage tables from seven rows to three? I did this for the Subject and Publisher book last year (which won’t be repeated due to lack of interest), using Grew 25% +, Even +/- 24.99%, and Shrank 25% + instead of the finer categories in the main book and Country book. My inclination is to make this change, but I’d love feedback.
  • Should I change the format to the Country style? Which is to say: drop the captions for tables and figures but add third-level headings with the same information. The differences are that headings appear above the tables and figures rather than below; that the tables and figures don’t have numbered captions (no “Table 10.43”); that there’s some space savings; and that you don’t get commentary for one table appearing immediately above the next table. (A book designer would say that I’d also be violating a classic tenet, as many heading3 cases would appear without prior heading2 cases.) I can still create an index of the tables and figures, since the only heading3 instances would be these labels. [Page 57 of GOA4, the paragraph beginning “Table 7.11.,” is one case where the paragraph seems “attached” to the next table.] My inclination is also to make this change.
  • Should I move subject coverage to follow region coverage? Here, I don’t think there’s much choice. Since there is no Cites & Insights in which to provide expanded subject coverage, and since I believe it’s not enough to just provide three tables for each subject, and since ginormously long and complicated subject-group chapters seem absurd, I think the solution is to have what was Chapters 12-18 in GOA4 appear as Chapters 9-15 and have 31 subject chapters (three groups and 28 subjects) follow. Objections? Other suggestions?

I believe that’s it. There may be other tweaks, but for consistency the fee price ranges and article count ranges will be the same as in previous years.

Oh: if you’re wondering: no-fee journals are still right around 70% of all the journals, but the percentage of articles with fees seems to have gone up a bit, maybe crossing the 60% mark. In other words, if roughly 30% of the journals account for roughly 60% of the articles, fee-based journals average about three and a half times as many articles as no-fee journals.

Responses welcome as comments here or as email to waltcrawford@gmail.com. Preferably by May 14, absolutely by May 22.

GOA: April 2020 update

Saturday, May 2nd, 2020


Readership for the new edition and GOAJ3.

All links available from the project home page, as always.

GOA4: 2013-2018

  • The dataset: 584 views, 106 downloads.
  • GOA4: 2,599 PDF ebooks
  • Countries 4: 461 PDF ebooks
  • Subjects and Publishers: 349 PDF ebooks [Note: based on an almost total lack of interest–two responses to repeated feedback requests, and only one of the two positive–there won’t be another subjects/publishers book.]

GOAJ3: 2012-2017

  • The dataset: 1,916 views, 358 downloads
  • GOAJ3: 3,843 PDF ebooks
  • Countries: 1,201 PDF ebooks



GOA5: All problematic journals

Monday, April 13th, 2020

I’ve completed the first pass, and posted a Google Sheet with 1,290 problematic journals–22 xn (apparently not gold OA), 622 xm (malware or certificate problems), and 646 xx (unreachable or unworkable). Here’s the link: https://docs.google.com/spreadsheets/d/1qEkowH_-tUkmoeYcwou6AqNZOfG5hMD2PqDtPxUovYY/edit?usp=sharing

If you’re in a position to help get these fixed, with special emphasis on the malware cases, note that the FINAL PASS will begin on May 15, 2020–a pass of those that haven’t been fixed earlier. (At that point, I’ll distinguish between malware inclusions that can be blocked but leave the journals reachable–usually bad “free service” modules like counters or contact mappers–and journals that can’t be reached.)

DOAJ is working on these as well. Last year, their efforts reduced hundreds and hundreds of malware cases to a mere 17 (and three more with malware inclusions). Can we do as well this year?

If you’re wondering where the trouble hotspots are, they’re actually on the “Sheet 1” worksheet. The most difficult cases:

Country            Problematic journals
Indonesia          443
Brazil             215
Iran               72
Poland             55
Spain              51
Ukraine            46
Romania            45
Colombia           33
Turkey             33
Argentina          27
United States      18
Malaysia           17
Russia             15
Cuba               12
India              12
Pakistan           12
Chile              11
Portugal           11
United Kingdom     11
Venezuela          11

43 other countries have 9 or fewer each.

Incidentally, of my two “aspirational goals” for this year’s GOA project, one is a clear success, the other possible but not likely:

  • Clear success: There were definitely more than 800,000 articles in serious gold OA journals (that is, those in DOAJ) in 2019.
  • Possible but not likely: 14,000 fully-analyzed journals. Of the 14,128 scanned, 96 are “xd” journals–ones with no articles more recent than 2013, usually because a renamed or merged journal replaced them. With no articles in the 2014-2019 period, those aren’t fully analyzed–and that leaves only 32 to spare, including 22 that appear not to be OA. I think it unlikely that the xm+xx count can be reduced to nine or fewer, but one can always hope.
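
The headroom arithmetic in that second goal can be checked with a few lines of Python. This is only a sketch: the figures are the ones quoted in this post, the function name is made up for illustration, and it assumes the goal is read as "more than 14,000" fully-analyzed journals.

```python
def max_problem_journals(scanned, xd, xn, target):
    """Largest xm+xx count that still leaves MORE than `target`
    fully-analyzed journals (assumed reading of the goal)."""
    # Journals that could be fully analyzed if nothing were xm or xx.
    candidates = scanned - xd - xn
    # Need candidates - problems > target, so problems <= candidates - target - 1.
    return candidates - target - 1


# Figures from this post: 14,128 scanned, 96 "xd", 22 "xn".
limit = max_problem_journals(scanned=14_128, xd=96, xn=22, target=14_000)
print(f"xm+xx must be {limit} or fewer")
```

If the goal is read as "at least 14,000" instead, the limit is one higher.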

Otherwise? I won’t know the overall fee/nofee percentage until all the retesting is done, but so far the fee percentage seems to be right around 30%, which is what I’d expect: very few existing no-fee journals switch to fees (APCs and otherwise) and most newly-added journals are also no-fee.

GOA: Brief notes on journals 12,001-13,000

Friday, April 3rd, 2020


  • Of the 869 journals for which data has been recorded (131 are either unavailable or have malware or other issues), 58 (7%) have fees. (These are almost all university-based.)
  • Of that 58, I find that one is a membership fee, two are questionable (internal disagreement in the site or some other problem) and 13 consist of both submission and processing/publishing charges.
  • For 115 of the no-fee journals, I wasn’t certain of the no-fee status until I checked DOAJ.
  • Problematic cases include 55 malware cases, one that isn’t an OA journal, and 73 that couldn’t be reached or were unworkable. There was also one “xd” (renamed/ceased duplicate).
  • This is the last “thousand” note. Once I finish the last 1,128 in the first pass, I’ll send DOAJ a final group of problematic journals and post a new Google Sheet with the full set of problems–WAY too many of them.
  • I’ll recheck journals that seemed likely to have additional 2019 articles and most “xx” journals, and do a final pass for problematic journals beginning May 15.
  • Yes, of course I’m slowing down. If you’re at full productivity during the current situation, I’d wonder why.



GOA: March 2020 update

Thursday, April 2nd, 2020


Readership for the new edition and GOAJ3.

All links available from the project home page, as always.

GOA4: 2013-2018

  • The dataset: 528 views, 179 downloads.
  • GOA4: 2,197 PDF ebooks
  • Countries 4: 432 PDF ebooks
  • Subjects and Publishers: 331 PDF ebooks [Note: based on an almost total lack of interest–two responses to repeated feedback requests, and only one of the two positive–there won’t be another subjects/publishers book.]

GOAJ3: 2012-2017

  • The dataset: 1,865 views, 346 downloads
  • GOAJ3: 3,791 PDF ebooks
  • Countries: 1,182 PDF ebooks

Not directly related, but I’m also checking total Cites & Insights visits–since I’ve promised to keep the site, now static, up at least through December 2021 but possibly no longer.

February 2020: 1,219 visits (as reported by AWStats and excluding visits with no actual reads or downloads).

March 2020: 2,493 visits.

GOA: Problematic journals from first 12,000

Thursday, March 26th, 2020

Here’s the link: https://docs.google.com/spreadsheets/d/1vkLNuaVfk6WPqW3ulzu-Azzw0kokPdLqMN6lRZdoKUU/edit?usp=sharing

This spreadsheet includes all problematic journals (XM/malware or XX/unavailable or unworkable) from the first 12,000 journals scanned. Your help in encouraging journal owners to fix these (or for XX, in many cases, update their DOAJ metadata) is appreciated. For more notes, see this earlier post.

A final version will appear after I scan the remaining 2,128 journals.

While I will start a second pass of testing in mid-April, I will not begin a final scan of XX/XM journals until May 15, 2020.

GOA5: Journals 11,001-12,000 brief notes

Wednesday, March 25th, 2020


  • Of the 864 journals for which data has been recorded (136 are either unavailable or have malware or other issues), 15 (2%) have fees. (These are almost all university-based.)
  • Of that 15, I find that two have fees that vary based on article length or author count.
  • For 133 of the no-fee journals, I wasn’t certain of the no-fee status until I checked DOAJ.
  • Problematic cases include 39 malware cases, one that isn’t an OA journal, and 87 that couldn’t be reached or were unworkable. There were also six “xd” (renamed/ceased duplicate).
  • I’ll do a separate post (probably Thursday March 26) with a new spreadsheet of problematic journals (and send DOAJ a separate list for 10,001-12,000, since they already have 1-10,000).



GOA: brief notes on journals 10,001-11,000

Wednesday, March 18th, 2020


  • Of the 898 journals for which data has been recorded (102 are either unavailable or have malware or other issues), 224 (25%) have fees.
  • Of that 224, I find that one has submission fees rather than processing fees–and three others have both submission and processing fees. 18 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count.
  • In 14 of the 224 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. That’s also the case for 116 journals with (apparently) no fees: info is from DOAJ rather than the journal website.
  • Problematic cases include 43 malware cases (most from one publisher) and 58 that couldn’t be reached or were unworkable. There was also one “xd” (renamed/ceased duplicate).
  • In two cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website. I’m now leaving most such cases for the second round so that I can finish the first scan a bit faster and get problem journals out sooner for fixing.
  • Noting the drop in fee-charging percentage: after Taylor & Francis, Thieme, and Ubiquity, most of these are from universities, with relatively few fees.



GOA and life: a quick update

Monday, March 16th, 2020

The scanning continues…but not as rapidly, for what may be obvious reasons.

To wit: trying to stay on top of The Crisis, our county’s shelter-in-place announcement, and an overload of information and “information.”

That’s slowing me down a lot, both in actual time and in coping. It’s also taking more time just to keep going.

My wife and I are both over 65 and also introverts, so we’re affected differently. (And anyone who says we shouldn’t go for walks in the fresh air will be ignored…)

So: it’s happening. Not rapidly.

Take care.

GOA5: Journals 9,001-10,000 and malware

Wednesday, March 11th, 2020


As noted in the last set of notes, I departed from the usual publisher/journal sorting to test all remaining Indonesian journals–around 880 of them–because nearly half of the malware cases in the first 9,000 were from Indonesia, which had a malware problem in last year’s scan–one that was totally cleared up with the help of DOAJ people. (All I did was send them the list of problems.)

That turned out to be a good thing: these journals, mostly from universities, have a serious malware problem. Maybe there are readers out there who can help correct the problems there and in other countries (yes, I’m keeping DOAJ informed).

How bad? In all, these 1,000 journals had 249 malware cases and 51 other unusable cases.

I should note that 73% of Indonesia’s 1,582 gold OA journals do not have these problems. Unless the UK or US have added a lot of OA journals in 2019, Indonesia publishes more DOAJ-listed journals than anybody. But they have a recurring problem with malware…

If you think you can help…

You should (cross fingers) find a Google Sheet with all problematic journals from the first 10,000 scanned (which includes all Indonesian journals) here: https://docs.google.com/spreadsheets/d/1CL7AY5VuS9KuIYmpfnaTua20pH-GiZLpE6bTwg5_Qhw/edit?usp=sharing

The sheet includes all “xm” journals (malware), 481 of them, but also all “xn” (apparently not OA) [20 so far] and “xx” (unavailable or unworkable) [353 so far]. There should be 854 rows, including (in descending order) 442 from Indonesia, 71 from Brazil, 60 from Iran, 34 from Ukraine, 33 from Romania, 21 from Turkey, 20 from Poland, 17 from Spain, 15 from Russia, 12 from the United States, ten from Colombia, and smaller numbers from 41 other countries for a total of 119 additional.

The code–xm, xn, xx–appears in the “Cod” column. A note may appear in the “Note” column offering a brief comment on why something’s there–e.g., for “xx” journals such notes as “404” (27), “500” (an internal error, 3), “ad” (three), “blog” (four), “dbs” error (29), dns failure (23), “park” (parking page, 21), “to” (timeout, 34) and a few others.

For malware, the common codes include “cert” (security certificate problems, 20), “mal” (just flagged as malware, 33), “mult” (MalwareBytes Pro finds more than one included page and multiple malware categories, six), “phish” (phishing, 69), “ransom” (ransomware, six), and the biggie “troj” (trojan, 308). In some cases, I didn’t jot down MalwareBytes’ code.
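
If you want to reproduce tallies like these from your own copy of the sheet, here is a minimal Python sketch. It assumes the sheet has been exported as CSV and that the header row uses the “Cod” and “Note” column names mentioned above; the function name, the other column names, and the sample rows are invented for illustration and are not taken from the actual sheet.

```python
import csv
import io
from collections import Counter


def tally_codes(csv_text, code_column="Cod", note_column="Note"):
    """Count rows per problem code and per (code, note) pair."""
    by_code = Counter()
    by_note = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = (row.get(code_column) or "").strip().lower()
        note = (row.get(note_column) or "").strip().lower()
        if not code:
            continue  # skip rows without a problem code
        by_code[code] += 1
        by_note[(code, note or "(none)")] += 1
    return by_code, by_note


# Hypothetical sample rows, standing in for a CSV export of the sheet.
sample = """Title,Country,Cod,Note
Hypothetical Journal A,Indonesia,xm,troj
Hypothetical Journal B,Brazil,xx,404
Hypothetical Journal C,Indonesia,xm,phish
Hypothetical Journal D,Iran,xn,
"""

by_code, by_note = tally_codes(sample)
for code, count in by_code.most_common():
    print(code, count)
```

Rows with an empty note are grouped under “(none)”, which corresponds to the cases where no MalwareBytes code was jotted down.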

Dates

I’ll post another version when I scan the next 2,000 journals and a final one when I finish the initial scan (4,128 more journals in all).

The final scan for, hopefully, corrected malware and unavailable/unworkable journals will begin either May 15, 2020 or two weeks after I post that final list, whichever comes later.

Last year, DOAJ and others got the total of xm and xx journals down to 117, of which only 17 were malware. Here’s hoping they (and you?) can do even better this year.

A few other notes on journals 6,001-10,000

  • Of the 3,528 journals for which data has been recorded (472 are either unavailable or have malware or other issues), 1,075 (30%) have fees.
  • Of that 1,075, I find that five (still) have submission fees rather than processing fees–and 28 others have both submission and processing fees. 80 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count. Five have membership or similar fee requirements, and seventeen are questionable. (Most of the latter are Indonesian cases where I believe the stated fee is missing three zeroes. There will be rows in the final spreadsheet where the amount shows as $0 but the status is “f” for “fee”–selecting the actual cell will find the unrounded stated fee, sometimes under four cents.) [By the way, the “curr” page on the Google Sheet provides the conversion rates used for this project and whether they’re the median 2019 rate or the actual rate on the day in late December 2019 that I did the checks.]
  • In 115 of the 1,075 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. That’s also the case for 457 journals with (apparently) no fees: info is from DOAJ rather than the journal website.
  • Malware is still with us: see the first part of this message.
  • In 67 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.