From 2417a6b7603524dc5cd30d29b153f91024b9443d Mon Sep 17 00:00:00 2001 From: Mitja Felicijan Date: Wed, 1 Nov 2023 22:54:27 +0100 Subject: Move to Jekyll --- content/posts/2021-01-25-goaccess.md | 204 ----------------------------------- 1 file changed, 204 deletions(-) delete mode 100644 content/posts/2021-01-25-goaccess.md (limited to 'content/posts/2021-01-25-goaccess.md') diff --git a/content/posts/2021-01-25-goaccess.md b/content/posts/2021-01-25-goaccess.md deleted file mode 100644 index 84ea3cd..0000000 --- a/content/posts/2021-01-25-goaccess.md +++ /dev/null @@ -1,204 +0,0 @@ ---- -title: Using GoAccess with Nginx to replace Google Analytics -url: using-goaccess-with-nginx-to-replace-google-analytics.html -date: 2021-01-25T12:00:00+02:00 -type: post -draft: false ---- - -## Introduction - -I know! You cannot simply replace Google Analytics with parsing access logs and -displaying a couple of charts. But to be honest, I actually never used Google -Analytics to the fullest extent and was usually interested in seeing page hits -and which pages were visited most often. - -I recently moved my blog from Firebase to a VPS and also decided to remove -Google Analytics tracking code from the site since its quite malicious and -tracks users across other pages also and is creating a profile of a user, and -I've had it. But I also need some insight of what is happening on a server and -which content is being read the most etc. - -I have looked at many existing solutions like: - -- [Umami](https://umami.is/) -- [Freshlytics](https://github.com/sheshbabu/freshlytics) -- [Matomo](https://matomo.org/) - -But the more I looked at them the more I noticed that I am replacing one evil -with another one. Don't get me wrong. Some of these solutions are absolutely -fantastic but would require installation of databases and something like PHP or -Node. And I was not ready to put those things on my fresh server. Also having -Docker installed is out of the question. - -## Opting for log parsing - -So, I defaulted to parsing already existing logs and generating HTML reports -from this data. - -I found this amazing software [GoAccess](https://goaccess.io/) which provides -all the functionalities I need, and it's a single binary. Written in Go. - -GoAccess can be used in two different modes. - -![GoAccess Terminal](/posts/goaccess/goaccess-dash-term.png) - -*Running in a terminal* - -![GoAccess HTML](/posts/goaccess/goaccess-dash-html.png) - -*Running in a browser* - -I, however, need this to run in a browser. So, the second option is the way to -go. The Idea is to periodically run cronjob and export this report into a folder -that gets then server by Nginx behind a Basic authentication. - -## Getting Nginx ready - -I choose Ubuntu on [DigitalOcean](https://www.digitalocean.com/). First I -installed [Nginx](https://nginx.org/en/), and -[Letsencrypt](https://letsencrypt.org/getting-started/) certbot and all the -necessary dependencies. - -```sh -# log in as root user -sudo su - - -# first let's update the system -apt update && apt upgrade -y - -# let's install -apt install nginx certbot python3-certbot-nginx apache2-utils -``` - -After all this is installed we can create a new configuration for a statistics. -Stats will be available at `stats.domain.com`. - -```sh -# creates directory where html will be hosted -mkdir -p /var/www/html/stats.domain.com - -cp /etc/nginx/sites-available/default /etc/nginx/sites-available/stats.domain.com -nano /etc/nginx/sites-available/stats.domain.com -``` - -```nginx -server { - root /var/www/html/stats.domain.com; - server_name stats.domain.com; - - index index.html; - location / { - try_files $uri $uri/ =404; - } -} -``` - -Now we check if the configuration is ok. We can do this with `nginx -t`. If all -is ok, we can restart Nginx with `service nginx restart`. - -After all that you should add A record for this domain that points to IP of a -droplet. - -Before enabling SSL you should test if DNS records have propagated with `curl -stats.domain.com`. - -Now, it's time to provision TLS certificate. To achieve this, you execute -command `certbot --nginx`. Follow the wizard and when you are asked about -redirection always choose 2 (always redirect to HTTPS). - -When this is done you can visit https://stats.domain.com and you should get 404 -not found error which is correct. - -## Getting GoAccess ready - -If you are using Debian like system GoAccess should be available in repository. -Otherwise refer to the official website. - -```sh -apt install goaccess -``` - -To enable Geo location we also need one additiona thing. - -```sh -cd /var/www/html/stats.stats.com -wget https://github.com/P3TERX/GeoLite.mmdb/raw/download/GeoLite2-City.mmdb -``` - -Now we create a shell script that will be executed every 10 minutes. - -```sh -nano /var/www/html/stats.domain.com/generate-stats.sh -``` - -Contents of this file should look like this. - -```sh -#!/bin/sh - -zcat -f /var/log/nginx/access.log* > /var/log/nginx/access-all.log - -goaccess \ - --log-file=/var/log/nginx/access-all.log \ - --log-format=COMBINED \ - --exclude-ip=0.0.0.0 \ - --geoip-database=/var/www/html/stats.domain.com/GeoLite2-City.mmdb \ - --ignore-crawlers \ - --real-os \ - --output=/var/www/html/stats.domain.com/index.html - -rm /var/log/nginx/access-all.log -``` - -Because after a while nginx creates multiple files with access logs we use -[`zcat`](https://linux.die.net/man/1/zcat) to extract Gziped contents and create -a file that has all the access logs. After this file is used we delete it. - -If you want to exclude your home IP's result look at the `--exclude-ip` option -in script and instead of `0.0.0.0` add your own home IP address. You can find -your home IP by executing `curl ifconfig.me` from your local machine and NOT -from the droplet. - -Test the script by executing `sh -/var/www/html/stats.domain.com/generate-stats.sh` and then checking -`https://stats.domain.com`. If you can see stats instead of 404 than you are -set. - -It's time to add this script to cron with `cron -e`. - -```go -*/10 * * * * sh /var/www/html/stats.domain.com/generate-stats.sh -``` - -## Securing with Basic authentication - -You probably don't want stats to be publicly available, so we should create a -user and a password for Basic authentication. - -First we create a password for a user `stats` with `htpasswd -c /etc/nginx/.htpasswd stats`. - -Now we update config file with `nano -/etc/nginx/sites-available/stats.domain.com`. You probably noticed that the -file looks a bit different from before. This is because `certbot` added -additional rules for SSL. - -Your location portion the config file should now look like. You should add -`auth_basic` and `auth_basic_user_file` lines to the file. - -```nginx -location / { - try_files $uri $uri/ =404; - auth_basic "Private Property"; - auth_basic_user_file /etc/nginx/.htpasswd; -} -``` - -Test if config is still ok with `nginx -t` and if it is you can restart Nginx -with `service nginx restart`. - -If you now visit `https://stats.domain.com` you should be prompted for username -and password. If not, try reopening your browser. - -That is all. You now have analytics for your server that gets refreshed every 10 -minutes. -- cgit v1.2.3