Archive for 2020

Notes on journals 6,001-8,000

Tuesday, February 25th, 2020


Followup: some notes on the next 2,000 journals in my scan of DOAJ; compare to the first 6,000… (I sort by publisher, then journal, because that speeds things up). Since these notes combine 6,001-8,000, they may usefully be compared to the set of notes on journals 6,001–7,000.

A few items do seem interesting.

  • Of the 1,905 journals for which data has been recorded (95 are either unavailable or have malware issues), 707 (37%) have fees.
  • Of that 707, I find that five (still) have submission fees rather than processing fees–and eight others have both submission and processing fees. 59 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count. Four have membership or similar fee requirements, and two are questionable.
  • In 51 of the 707 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. That’s also the case for 210 journals with (apparently) no fees: info is from DOAJ rather than the journal website.
  • Malware is still with us: 32 of the 95 missing cases have malware; 43 are missing or useless; one requires a login, which makes it not an OA journal; and 12 are dead or duplicates (most duplicates are renamed journals, with the old name still appearing).
  • In 43 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.
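Since these notes combine two batches, the figures for journals 7,001-8,000 alone can be backed out by subtraction. A rough sketch, assuming the counts reported for 6,001-7,000 (961 recorded, 430 with fees) did not change between the two sets of notes; the dictionary and function names are just for illustration:

```python
# Counts as reported in the two sets of notes.
combined = {"recorded": 1_905, "fees": 707}   # journals 6,001-8,000
first_half = {"recorded": 961, "fees": 430}   # journals 6,001-7,000

# Assumption: the 6,001-7,000 counts were not revised in the meantime,
# so the second thousand is simply the difference.
second_half = {key: combined[key] - first_half[key] for key in combined}


def fee_share(counts):
    """Fraction of recorded journals that charge fees."""
    return counts["fees"] / counts["recorded"]


for label, counts in [
    ("6,001-7,000", first_half),
    ("7,001-8,000", second_half),
    ("6,001-8,000", combined),
]:
    print(f"{label}: {counts['fees']} of {counts['recorded']} "
          f"({fee_share(counts):.0%}) have fees")
```

If that assumption holds, the fee share drops sharply in the second thousand, which is one more sign of how much a publisher-sorted chunk can differ from its neighbors.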

Another guesstimate on totals

Journals 6,001-8,000 add a lot of articles: the first 6,000 had just under 353,000 articles (that figure will increase on the second round of counting, but probably not by much), while the next 2,000 have over 222,000 (same remark). With more than 6,000 journals left to go, the 2019 article count is already over 597,000.

Comparing where I am in this year’s survey with the comparable point last year (that is, the same point in a publisher/journal sort), a straight projection would yield just under 829,000 articles for 2019. Such a projection is heavily flawed, but I now believe there’s a better than even chance that the figure will be more than 800,000. (I started this year’s count hoping for “14 and 800”: 14,000 fully analyzed journals and 800,000 articles. Hard to say whether that will be the case; to reach 14,000, a fair number of problematic journals need to be fixed. Last year, they were.)



Notes on journals 6,001-7,000

Tuesday, February 18th, 2020

Followup: some notes on the next 1,000 journals in my scan of DOAJ; compare to the first 6,000… (I sort by publisher, then journal, because that speeds things up).

A few items do seem interesting.

  • Of the 961 journals for which data has been recorded (39 are either unavailable or have malware issues), 430 (45%) have fees.
  • Of that 430, I find that five have submission fees rather than processing fees–and six others have both submission and processing fees. 31 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count. Two have membership or similar fee requirements, and one is questionable (it states boldly that there is no fee, then–in the next paragraph–states the mandated fee but says it’s a gift).
  • In 28 of the 430 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website.
  • Malware is still with us: ten of the 39 missing cases have malware (six of the ten from Indonesia); twenty are missing or useless; one requires a login, which makes it not an OA journal; and eight are dead or duplicates (most duplicates are renamed journals, with the old name still appearing).
  • In cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.

Timing

So I’ve done 7,000 in (exactly) seven weeks. That leaves 7,128 to go. Will I be done in seven weeks (and a day)?

Almost certainly not. Other stuff happens–and the huge chunk of university-based journals is likely to be slow going. I’m hoping to finish the first pass by the end of April–and then there’s a second pass plus a final pass for malware (after journals have had some time to clean up those cases). Then there’s the normalization, data manipulation, table creation and writing the book(s).
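For what it’s worth, the “seven weeks (and a day)” figure is nothing more than constant-pace arithmetic. A minimal sketch, assuming the pace so far holds (which, as noted, it almost certainly won’t); the variable names are illustrative:

```python
journals_done = 7_000
weeks_elapsed = 7
journals_remaining = 7_128

# Assumption: the remaining journals go at the same average pace.
pace_per_week = journals_done / weeks_elapsed       # journals per week
weeks_remaining = journals_remaining / pace_per_week

whole_weeks = int(weeks_remaining)
extra_days = (weeks_remaining - whole_weeks) * 7    # fractional week, in days

print(f"{whole_weeks} weeks and about {extra_days:.1f} days")
```

The estimate ignores the second pass, the malware pass and everything that comes after the scanning itself.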

GOA4 (2013-2018) appeared on May 4, 2019; that was unusual. GOAJ3 (2012-2017) appeared on May 28, 2018, and even that was earlier than I’d expected. I’ll be delighted if this year’s GOA5 is ready in early June; I won’t be surprised if it takes into July…

A few notes on the first 6,000

Monday, February 10th, 2020

Followup: some notes on the first 6,000 journals in my scan of DOAJ; compare to the first 5,000… (I sort by publisher, then journal, because that speeds things up).

Just for fun–and NOT MEANINGFUL at least partly because a number of journals will show larger numbers or have problems cleared up in the “recount” segment–I’m also comparing this to the equivalent portion of the 2019 scan (that is, the same breakpoint for publisher and journal).

A few items do seem interesting.

  • Of the 5,555 journals for which data has been recorded (445 are either unavailable or have malware issues), 1,981 (36%) have fees.
  • Of that 1,981, I find that 19 have submission fees rather than processing fees–and 36 others have both submission and processing fees. 178 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count.
  • In 118 of the 1,981 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website. (The higher number last time included no-fee cases, where that wasn’t obvious from the site.) This is a much better number!
  • Malware is still with us: 177 of the 445 for which I don’t yet have data recorded were flagged by Malwarebytes–an uncomfortably high figure. 147 others don’t seem to be there or are unworkable…and ten aren’t OA journals, AFAICT. (Yes, I’m sending DOAJ problems in chunks; yes, I hope we/they can reduce the malware count to a trivial amount as they did last year. The big trouble spots so far are Indonesia with 64 cases, Brazil with 46 and Romania with 15.)
  • In 117 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.

One extra note

This will be the last set of trivial notes covering the first chunk of the scan. If I do more notes, they’ll start with the 6,001st journal–that is, any comparisons would leave out the first 6,000. (But I might see whether anybody’s reading these or whether I’m wasting the 15-20 minutes to write them…)

A few (more) notes on the first 5,000

Monday, February 3rd, 2020


Followup: some notes on the first 5,000 journals in my scan of DOAJ; compare to the first 4,000… (I sort by publisher, then journal, because that speeds things up).

Just for fun–and NOT MEANINGFUL at least partly because a number of journals will show larger numbers or have problems cleared up in the “recount” segment–I’m also comparing this to the equivalent portion of the 2019 scan (that is, the same breakpoint for publisher and journal).

A few items do seem interesting.

  • Of the 4,629 journals for which data has been recorded (371 are either unavailable or have malware issues), 1,731 (37%) have fees.
  • Of that 1,731, I find that 16 have submission fees–and 30 others have both submission and processing fees. 142 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher) or author count.
  • In 98 of the 1,731 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website.
  • Malware is still with us: 144 of the 371 for which I don’t yet have data recorded were flagged by Malwarebytes–an uncomfortably high figure. 147 others don’t seem to be there or are unworkable…and eight aren’t OA journals, AFAICT. (Yes, I’m sending DOAJ problems in chunks; yes, I hope we/they can reduce the malware count to a trivial amount as they did last year. The big trouble spots so far are Indonesia with 64 cases, Brazil with 46 and Romania with 15.)
  • In 100 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.


Comparisons to 2019

I’m just a bit more than 1/3 of the way done, and things will change, but here’s what I see at the moment:

  • At this point last year, I’d done 4,519 journals of which 4,412 were in the analysis (that subgroup included 10 malware cases, three not-OA cases and 43 unavailable/unworkable). That’s almost the same percentage of the whole–35.5% compared to this year’s 35.3%.
  • For this portion, the 2018 article total was 290,982 compared to 302,978 this year (but that number should grow a little). For 2017, the numbers are 251,118 and 260,556 respectively.
  • If articles were evenly spread among journals, I could project more than 900,000 total 2019 articles (since 35.3% yield 321,346)–but that’s obvious nonsense, since that projection technique yields just under 820,000 total 2018 articles for last year’s count, not the 711,670 articles actually counted. And I’d expect to see the 2019 article count for this year’s pass go up by at least 2,000-4,000. The closest thing to a SWAG for possible totals this time around might be around 786,000–but I’d suggest “somewhere between 750,000 and 850,000” is as close as I’d want to come to an actual estimate.
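The projection arithmetic above is easy to check. The sketch below assumes articles are spread evenly across journals (the assumption already called nonsense above), using the rounded 35.3% and 35.5% shares. The final “adjusted” figure is my own illustrative correction, scaling this year’s projection by last year’s overshoot; it is not necessarily how the 786,000 SWAG was reached.

```python
def straight_projection(articles_so_far, share_done):
    """Project a full-year total, assuming articles are evenly spread."""
    return articles_so_far / share_done


# This year: 35.3% of journals scanned, 321,346 articles for 2019 so far.
projected_2019 = straight_projection(321_346, 0.353)

# Same point last year: 35.5% of journals, 290,982 articles for 2018.
projected_2018 = straight_projection(290_982, 0.355)
actual_2018 = 711_670

# How badly the same technique overshot last year.
overshoot = projected_2018 / actual_2018

# Illustrative assumption: this year's projection overshoots by the same ratio.
adjusted_2019 = projected_2019 / overshoot

print(f"Straight projection for 2019: {projected_2019:,.0f}")
print(f"Same technique for 2018: {projected_2018:,.0f} (actual {actual_2018:,})")
print(f"Overshoot last year: {overshoot - 1:.0%}")
print(f"2019 projection scaled by that overshoot: {adjusted_2019:,.0f}")
```

The scaled figure falls inside the 750,000 to 850,000 range, but it inherits all the problems of the straight projection.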

GOA4: January 2020 update

Friday, January 31st, 2020

Readership for the new edition and GOAJ3. I changed hosts in January, and in the process lost statistics for January 1-20, 2020–and I’m no longer bothering to report paperback sales (essentially none) or GOAJ3 Cites & Insights numbers. (These figures lack 1/31: I may be able to fix that in February.)

All links available from the project home page, as always.

GOA4: 2013-2018

  • The dataset: 439 views, 137 downloads.
  • GOA4: 1,746 PDF ebooks
  • Countries 4: 389 PDF ebooks
  • Subjects and Publishers: 300 PDF ebooks

GOAJ3: 2012-2017

  • The dataset: 1,735 views, 314 downloads
  • GOAJ3: 3,699 PDF ebooks
  • Countries: 1,141 PDF ebooks

A few notes on the first 4,000

Tuesday, January 28th, 2020


Followup: some notes on the first 4,000 journals–partly to show just how unrepresentative any sample is. Compare this to the first 3,000… (I sort by publisher, then journal, because that speeds things up).

A few items do seem interesting.

  • Of the 3,747 journals for which data has been recorded (253 are either unavailable or have malware issues), 1,363 (36%) have fees.
  • Of that 1,363, I find that 11 have submission fees–and 24 others have both submission and processing fees. 121 others have fees that vary based on article length (I don’t record that if the surcharge begins at 11 pages or higher).
  • In 79 of the 1,363 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website.
  • Malware is still with us: 100 of the 214 for which I don’t yet have data recorded were flagged by Malwarebytes–an uncomfortably high figure. 109 others don’t seem to be there or are unworkable…and two aren’t OA journals, AFAICT.
  • In 78 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.
  • Some of the numbers in the 3,000-journal notes are typos caused by sloppy editing. I’ll go fix those now…



Gold Open Access 2014-2019: Announcement and a Question

Monday, January 27th, 2020

There will be a Gold Open Access 2014-2019: Articles in Journals (GOA5), probably out in summer or very late spring. It will be similar to GOA4, but covering a lot more journals (starting with more than 14,000!). As before, thanks to SPARC for underwriting this project and assuring that all results are freely available.

As usual, there will be a spreadsheet on figshare, a free PDF ebook (also available as a paperback priced at the cost of production), and another free/nominal PDF ebook detailing journals by country.

The question: should I do the final piece, a second version of Gold Open Access 2013-2018: Subject and Publisher Profiles?

To put it another way: do a significant number of you find those profiles valuable–enough to warrant an extra week or two of my time? (That’s partly to do a little publisher normalizing, mostly to generate the tables and add a sentence or two on each publisher.)

Last year’s version saw 279 ebook downloads, about half as many as the previous year’s subject coverage in Cites & Insights. I find the profiles interesting, but I’m not sure they’re worthwhile. If I drop the extra publication, I’ll expand the subject coverage in Gold Open Access 2014-2019, making that book a little longer–and the publisher profiles will just disappear.

If you have an opinion, you can comment here or send me email at waltcrawford@gmail.com. Comments are typically open for two weeks.

Trouble finding Cites & Insights?

Sunday, January 26th, 2020

If you get a 404 when attempting to reach Cites & Insights or any issue of the now-closed magazine, please change “citesandinsights.info” in the URL to “cical.info” and it should work.

I’m attempting to get the aliasing problem solved. (The original domain was always cical.info, with citesandinsights.info as an alias.)

FIXED, and it’s possible it was just my failure to clear a cache. All domains running properly (and the new host, A2Hosting, has fabulous tech support).

FIXED (and maybe never broken) at the domain level but not at the issue level: if you look for an issue and get a 404, make the change above.

Here’s the current status:

  1. The “real” Cites & Insights has always been at cical.info, now hosted by A2Hosting rather than lishost.org. It continues to be available and will be (barring disasters) through at least 2022.
  2. That site is static–other than possible messages on the home page (e.g., changes in paperback pricing), there will be no new or modified content.
  3. I’m working on the citesandinsights.info issues. As part of that work, that domain might disappear entirely for a little while. URLs may yield various error messages (e.g., 404, failure to find domain, security certificate errors). In all cases, if the URL is correct, just change citesandinsights.info to cical.info and it should work just fine. NOTE: As of 4:15 pm Pacific Standard Time on January 28, 2020, I find that URLs are working properly in Edge, Chrome and–once I added a PDF-viewing extension–Firefox. Unless I hear differently, I’m calling this problem resolved.

A few notes on the first 3,000

Tuesday, January 21st, 2020

Followup: some notes on the first 3,000 journals–partly to show just how unrepresentative any sample is. Compare this to the first 2,000… This is not at all a representative sample (I sort by publisher, then journal, because that speeds things up).

Some typos corrected 1/28

A few items do seem interesting.

  • Of the 2,826 journals for which data has been recorded (174 are either unavailable or have malware issues), 1,086 have fees.
  • Of that 1,086, I find that ten have submission fees–and 20 others have both submission and processing fees. 101 others have fees that vary based on article length (I don’t record that if the surcharge begins at 15 pages or higher).
  • In 68 of the 1,086 cases, I gathered the fee status and amount from the DOAJ record because it was not easy to locate within the journal’s website.
  • Malware is still with us: 78 of the 174 for which I don’t yet have data recorded were flagged by Malwarebytes–an uncomfortably high figure. 71 others don’t seem to be there or are unworkable…and two aren’t OA journals, AFAICT.
  • In 56 cases where I do have data, the URL in DOAJ did not yield the website but a journal title search in Chrome did yield the website.
  • At DOAJ’s request, I’ve sent them the spreadsheet segment involving malware and unavailability. If the project continues, I’ll do that for every 3,000 journals.

If you’re looking for C&I…

Tuesday, January 21st, 2020

At this point, cical.info will get you to the version of Cites & Insights that will stick around for at least two more years. (That was the original domain: citesandinsights.info is a pseudonym.) I’m working on getting the pseudonym restored.

Meanwhile, waltcrawford.name is now on its long-term host…

As for Walt at Random: working on it.

A little plug for A2Hosting.com, my new host: VERY service-oriented and fast.