A metrics update

For those who care about the issue of Google Analytics metrics vs. Urchin (5) metrics–which is either “quite a few people” (if you believe Urchin) or “pretty much nobody” (if you believe Google Analytics)–here’s an update:

  • It was pointed out to me that GA won’t track a visit unless the user has both cookies and JavaScript enabled. Nothing I can do about that.
  • Seth Finkelstein thought it might have to do with HTML errors, and noted that the W3C Validator found a bunch of those on the Walt at Random home page.

So I thought I’d see how tough it was to correct those errors–and whether it made a difference. (I also thought I’d see whether the errors were mine or were in the templates & addons I used.)

There were a bunch of errors, but that includes cascading errors (where one apparent error is really the result of another error–boy, do I remember those from programming, especially in PL/I!). It turns out that about 80% of the “errors” were mine, mostly because I’m used to HTML parsing being fairly forgiving–namely:

  • Using all-caps tags where XHTML requires all-lower-case.
  • Using <br> as a standalone, rather than <br />–but that was both in my own code and in a portion of the template.

I managed to fix them all, although in one case that made the right sidebar a bit less attractive (Validator just wouldn’t accept one particular nested list). Took me 2, maybe 2.5 hours. Except for the added infelicity in the right margin, it made no difference to the average viewer, I believe, since the visible results were the same. But, presumably, it would make Google Analytics results a little more plausible. Maybe?

Depends on your definition of “a little.”

The changes have been in place since February 23. I’ve had a chance to look at two full days running on a clean, zero-errors home page vs. the same days on Urchin.

There may have been a little increase in pageviews and visits logged by Google Analytics–but not much of one. Here’s what I see for comparisons on February 22, 23 and 24:

  • Sessions:
    February 22: Google Analytics 58, Urchin 1,492
    February 23: Google Analytics 79, Urchin 1,439
    February 24: Google Analytics 81, Urchin 1,398
  • Pageviews:
    February 22: Google Analytics 77, Urchin 4,455
    February 23: Google Analytics 115, Urchin 3,213
    February 24: Google Analytics 132, Urchin 3,093
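For anyone who wants the gap as a single number per day, here is a minimal Python sketch that divides each Urchin count by the matching Google Analytics count. The figures are copied from the lists above; the function and variable names are just illustrative.

```python
# Figures copied from the lists above, as (Google Analytics, Urchin) pairs.
sessions = {
    "February 22": (58, 1492),
    "February 23": (79, 1439),
    "February 24": (81, 1398),
}
pageviews = {
    "February 22": (77, 4455),
    "February 23": (115, 3213),
    "February 24": (132, 3093),
}


def urchin_to_ga_ratios(counts):
    """Return {day: Urchin count / GA count}, rounded to one decimal place."""
    return {day: round(urchin / ga, 1) for day, (ga, urchin) in counts.items()}


session_ratios = urchin_to_ga_ratios(sessions)
pageview_ratios = urchin_to_ga_ratios(pageviews)

for day in sessions:
    print(f"{day}: sessions {session_ratios[day]}x, pageviews {pageview_ratios[day]}x")
```

On these numbers, Urchin reports roughly 17 to 26 times as many sessions and roughly 23 to 58 times as many pageviews as Google Analytics, depending on the day.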

And, mysteriously, the second-highest post in the full pages report on Google Analytics is a post from the very first year of the blog (on mondegreens), with 34 views…where that post is not even in the top 50 on Urchin.

Possibilities

I do note that none of the GA-reported pages is a /feed/index page, whereas quite a few of the higher ones in Urchin are (these presumably being RSS views of pages?). That could account for some of it–since the GA code is, as recommended, right before </body> in the page, it’s part of the footer, which doesn’t get fed to RSS. Since I regard readers-via-RSS as fully equivalent to readers-“in person,” I’m not thrilled about losing those counts.

But if I filter the Urchin pages report to eliminate everything with “feed” anywhere in it, that eliminates less than one-third of the views, still leaving them way more than 10x as high as GA shows.
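The “feed” filter described above can be sketched in Python as follows. This is only an illustration of the idea, not how Urchin does it: the report rows, paths and view counts are made-up placeholders, and `split_feed_views` is a hypothetical helper name.

```python
# Hypothetical (path, views) rows standing in for a pages report.
# Placeholder values for illustration only, not real figures from the blog.
sample_report = [
    ("/", 120),
    ("/feed/index", 40),
    ("/2005/some-post/", 30),
    ("/2005/some-post/feed/", 10),
]


def split_feed_views(rows, marker="feed"):
    """Return (non_feed_views, feed_views).

    A row counts as a feed hit if `marker` appears anywhere in its path,
    which mirrors filtering out everything with "feed" in it.
    """
    feed = sum(views for path, views in rows if marker in path.lower())
    total = sum(views for _, views in rows)
    return total - feed, feed


non_feed, feed = split_feed_views(sample_report)
feed_share = feed / (non_feed + feed)
print(f"feed share: {feed_share:.0%}, non-feed views: {non_feed}")
```

The non-feed total is the number to hold up against the Google Analytics pageview count, since the GA code never runs for RSS readers.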

I’m not sure what else might be going on. I flat-out don’t believe that 90% of Walt at Random viewers have either cookies or JavaScript disabled. (But I could be wrong.)

Resolution

For me, for now, for my own sites, the solution is simple: I’ll take the Google Analytics tracking code out of the template and rely on Urchin for my statistics, since it’s actually (presumably) looking at logs. The GA code is extra overhead for the internet; why waste it?

For my work? They’re looking into it. (There, I think the “plausible to reported” multiple is nowhere near as high…)

5 Responses to “A metrics update”

  1. Mark says:

    Thanks for these reports, Walt. I’ve been thinking about looking into Google Analytics but perhaps I ought just to a) look at my Urchin stats more often and b) learn to make sense of them.

  2. So all I need to do to read your blog in a stealthy fashion is to use Lynx/Links/Elinks/W3M/LibCurl? Interesting… 🙂

  3. walt says:

    Stephen: Not so. I’m dropping the Google Analytics code and keeping Urchin active–and Urchin analyzes logs directly. Not that I’ve ever had reasons to look at IP addresses…or care.

  4. Thanks for the experiment report. FYI, the caps and “br” errors probably didn’t matter, but the nested-list issue(s) might have been significant to parsing.

    If you’re interested, one other quick test would be to move the Google Analytics code from before the end-body tag to just after the start of the body tag (I don’t believe anything requires it be at the end). This would make it parse sooner, so any problems further down in the document wouldn’t affect it.

  5. walt says:

    Seth: I thought about that, but decided that I’ve got better things to do. Also heard from someone who was using a different tracking mechanism that’s not log analysis, and they have much the same issues: the numbers are MUCH lower than they are in Urchin or other log-analysis programs. It may be the nature of the beast.