Replacing Google Analytics, respecting user privacy, and owning your data

If you just want simple stats about your blog or site (e.g. popular pages and referrals), you don't need Google Analytics - the data is already on your server! You just need a tool that can display it nicely for you, and there are a number of privacy-minded alternatives.

Whenever you visit a web page, your browser includes some basic info about your environment in the request, like your IP address, your browser and operating system (the "user agent"), the page you're requesting (duh), the page you came from, etc. Other details, like screen resolution, aren't in the request itself but can be reported by a script running on the page. You can see it in action here or here. Individual websites can log that information to see which pages (or posts in the case of bloggers) are the most popular over time, which sites are linking to their site, etc. It's pretty useful stuff.

Feeding the Machine

Well, the IP address part is a little creepy. It's an easy way to track someone doing something shady (if they're not using a VPN), but it's also an easy way for an advertiser to track you across the web. Wait, how's that possible? It's not like advertisers have access to all those servers. I can log in to my server and view logs about visitors, but advertisers can't. Enter Google Analytics.

Just plug Google's code into the header of every page on your site, and you get access to things you already had access to, along with questionable stuff like demographics (gender, age, etc). But how?

That code sends all your private visitor data to Google, who slurps it up, feeds it to their big data monster, combines it with all the data coming from millions of other sites running the same code, along with cookies and identifiers, their DoubleClick tracker, etc, and presents a bunch of stats to you.

Like I said, advertisers couldn't have access to all that data unless you voluntarily sent it to them. In essence, you're trading your visitors' privacy for some stats that you might not even need or understand, but Google absolutely understands it all, and happily uses it to help advertisers serve up ads across the web. You can read more about where demographics and interests data comes from as well as a tracking code overview, straight from the horse's mouth.

Starving the Machine

There's another way. For the vast majority of bloggers and small website owners, the data you're interested in is already at your fingertips. Do you really need Google to tell you whether your content is being consumed by men, women, kids, or any other particular group? Do you even care, as long as someone finds it useful?

Personally, I'm only interested in which pages are most popular (so I can invest my time wisely in updating posts), who's referring visitors to my site (it might indicate a site where I should engage more), and how many total visitors I'm getting (if I choose to display an ad for some service I think is useful, it might be helpful to tell them I get xx number of visitors per month). Aaaaand.. that's about it. I don't care about your IP address or anything else.
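To make "the data is already at your fingertips" a bit more concrete, here's a rough Python sketch that pulls those three numbers (popular pages, referrers, total visitors) out of a web server access log. It assumes the common "combined" log format that nginx and Apache can write; the `summarize` function, the `example.com` hostname, and the sample lines are all made up for illustration, and counting a visitor as a unique IP + user agent pair is just one simple approximation (nothing is stored beyond the count).

```python
import re
from collections import Counter
from urllib.parse import urlparse

# One line of the "combined" access log format (an assumption about your setup).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def summarize(lines, own_host="example.com"):
    """Return (page counts, external referrer counts, rough visitor count)."""
    pages = Counter()
    referrers = Counter()
    visitors = set()
    for line in lines:
        m = LOG_LINE.match(line)
        # Only count successful page requests; skip anything unparseable.
        if m is None or m["method"] != "GET" or m["status"] != "200":
            continue
        pages[m["path"]] += 1
        # Rough "visitor" identity: IP + user agent, kept in memory only.
        visitors.add((m["ip"], m["agent"]))
        # A referrer of "-" has no hostname; internal links are not referrals.
        host = urlparse(m["referrer"]).netloc
        if host and host != own_host:
            referrers[host] += 1
    return pages, referrers, len(visitors)

# Made-up sample lines using documentation-only IPs and hostnames.
sample = [
    '203.0.113.5 - - [10/Oct/2019:13:55:36 +0000] "GET /post-a/ HTTP/1.1" 200 2326 "https://news.example.org/item" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2019:13:56:01 +0000] "GET /post-b/ HTTP/1.1" 200 1200 "https://example.com/post-a/" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2019:14:02:10 +0000] "GET /post-a/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
]

pages, referrers, visitor_count = summarize(sample)
print(pages.most_common(1))      # [('/post-a/', 2)]
print(referrers.most_common())   # [('news.example.org', 1)]
print(visitor_count)             # 2
```

A real log would also need bots, images, and other static files filtered out, which is a big part of what the dedicated tools below do for you.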

There are a number of services that are more privacy-minded than Google Analytics, though admittedly that particular bar is pretty low. Just to name a few, Simple Analytics starts at $10/mo, Matomo is $20/mo or free to self-host, and GoAccess is a free self-hosted solution too. Another one is Fathom, which is $12/mo or free to self-host.

Replacing the Machine

I looked at a few, and ultimately settled on Fathom. The interface is clean, they don't use cookies to track visitors, and they even respect the "Do Not Track" setting in your browser. First though, I gotta say that one of my favorite features on DigitalOcean is the ability to take a snapshot of a server before making a change, and if installing some piece of software goes horribly wrong, a restore is only one click away. 😅

The instructions for installing Fathom are pretty straightforward. If you want to deploy a brand new instance, try out DigitalOcean's Fathom Analytics droplet. Since I'm running this site on a pre-existing Ubuntu server that's already running Ghost, I had to take some extra steps. I'll toss them out here - whether or not they're helpful depends on your setup.

  • I selected the latest "fathom_1.2.1_linux_amd64.tar.gz" release.
  • I followed the "Configuring Fathom" section - even though it says it's optional, it seems to be required in the section for setting up an admin user.
  • I already had SSL configured for my blog, so I created an /etc/nginx/sites-enabled/my-fathom-site.conf file and copied the details from my blog's SSL file. After a few changes, I can now use the same Let's Encrypt certificate for my blog and the Fathom dashboard.
  • I configured UFW to allow the secure port for the dashboard.

And here's 10,000 words.. er, 10 images comparing the results of Google to Fathom. In general, the numbers are similar though definitely not identical. General trends and spikes throughout the day seem on par. Popular pages (and referrals, not shown in the Google captures) are similar enough too, as is bounce rate (probably easy to calculate if the "referrer" is the same site).
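Here's a small Python sketch of that bounce rate guess. It is not how Fathom or Google actually calculate it, just the idea from the paragraph above: if any of a visitor's page views has a referrer on your own site, they clicked through to a second page and didn't bounce. The `bounce_rate` function, the `visitor_id` values, and the hostnames are hypothetical.

```python
from urllib.parse import urlparse

def bounce_rate(hits, own_host):
    """hits: iterable of (visitor_id, referrer_url) pairs, one per page view.

    Returns the fraction of visitors with no internal click-through.
    """
    visitors = set()
    clicked_through = set()
    for visitor_id, referrer in hits:
        visitors.add(visitor_id)
        # An internal referrer means this view came from another page on the site.
        if urlparse(referrer).netloc == own_host:
            clicked_through.add(visitor_id)
    if not visitors:
        return 0.0
    return (len(visitors) - len(clicked_through)) / len(visitors)

# Made-up page views: visitor "a" reads two pages, "b" and "c" read one each.
hits = [
    ("a", "https://news.example.org/item"),
    ("a", "https://example.com/post-a/"),
    ("b", ""),
    ("c", "https://search.example.net/"),
]
rate = bounce_rate(hits, own_host="example.com")
print(f"{rate:.0%}")  # 67%
```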

The page views and numbers are fairly close, although they diverge when comparing several days. The numbers look a bit higher with Fathom, but that could be caused by adblockers (and possibly even some browsers) that block Google Analytics.

The average time per page on Fathom seems to consistently be about half (or less) of what Google reports. That's a metric I can't really wrap my head around though. It seems at best a guess, since there's an HTTP request when coming into the site, but there's no indication of when a visitor leaves... or just closes the tab. Maybe if someone hits "back" to try the next hit on Google's search page, Google can make a reasonable guess as to when a visitor left your page.
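To show why this metric is a guess, here's a Python sketch of one common way to estimate it: the time on a page is the gap until the same visitor's next page view, and the last page of a visit has no known duration at all. The function names and timestamps are made up, and I'm not claiming this is what either Fathom or Google does internally.

```python
from datetime import datetime

def time_on_pages(views):
    """views: list of (timestamp, path) for ONE visitor, sorted by time.

    Returns a list of (path, seconds), where seconds is None for the last
    page because nothing tells us when the visitor actually left.
    """
    durations = []
    for (start, path), (next_start, _) in zip(views, views[1:]):
        durations.append((path, (next_start - start).total_seconds()))
    if views:
        durations.append((views[-1][1], None))
    return durations

def average_time(durations, unknown_as_zero=False):
    """Average seconds per page; unknown durations are skipped or counted as 0."""
    if unknown_as_zero:
        values = [s or 0.0 for _, s in durations]
    else:
        values = [s for _, s in durations if s is not None]
    return sum(values) / len(values) if values else None

views = [
    (datetime(2019, 10, 10, 13, 55, 36), "/post-a/"),
    (datetime(2019, 10, 10, 13, 56, 1), "/post-b/"),
    (datetime(2019, 10, 10, 13, 59, 1), "/about/"),
]
durations = time_on_pages(views)
print(durations)  # [('/post-a/', 25.0), ('/post-b/', 180.0), ('/about/', None)]
print(average_time(durations))                        # 102.5
print(average_time(durations, unknown_as_zero=True))  # 68.33333333333333
```

The same three page views give two quite different averages depending on how the unknown last page is handled, so it wouldn't be surprising for two tools to disagree on this number.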

Anyway, I'm happy enough with the results to disable Google Analytics. Thanks Fathom! (disclaimer: IMO, YMMV, BOGO, YOLO, and any other acronyms you like...)