From 1d9592cf84957bd6dbe98f85a61837a1c773faae Mon Sep 17 00:00:00 2001
From: Mitja Felicijan
Date: Mon, 25 Jan 2021 21:25:07 +0100
Subject: Added post about goaccess

---
 posts/2021-01-24-replacing-dropbox-with-s3.md |  90 +++++++++++++++
 posts/2021-01-25-goaccess.md                  | 156 ++++++++++++++++++++++++++
 posts/2021-01-25-replacing-dropbox-with-s3.md |  90 ---------------
 3 files changed, 246 insertions(+), 90 deletions(-)
 create mode 100644 posts/2021-01-24-replacing-dropbox-with-s3.md
 create mode 100644 posts/2021-01-25-goaccess.md
 delete mode 100644 posts/2021-01-25-replacing-dropbox-with-s3.md

diff --git a/posts/2021-01-24-replacing-dropbox-with-s3.md b/posts/2021-01-24-replacing-dropbox-with-s3.md
new file mode 100644
index 0000000..f3714b6
--- /dev/null
+++ b/posts/2021-01-24-replacing-dropbox-with-s3.md
@@ -0,0 +1,90 @@
---
Title: Replacing Dropbox in favor of DigitalOcean spaces
Description: Replacing Dropbox in favor of DigitalOcean spaces
Slug: replacing-dropbox-in-favor-of-digitalocean-spaces
Listing: true
Created: 2021, January 24
Tags: []
---

A few months ago I experimented with DigitalOcean Spaces as a backup solution that could [eventually replace Dropbox](/digitalocean-spaces-to-sync-between-computers.html). That solution worked quite nicely, and I was amazed at how well smashing together a couple of existing tools turned out.

I have been running that setup in the background for a couple of months now and had kind of forgotten about it. But recent developments around deplatforming, and around big companies holding people hostage to their technology, sped up my goal of becoming less dependent on [Google](https://edition.cnn.com/2020/12/17/tech/google-antitrust-lawsuit/index.html), [Dropbox](https://www.pcworld.com/article/2048680/dropbox-takes-a-peek-at-files.html) and the like, and taking back some control.

I am not a conspiracy theory nut, but to be honest, what these companies have been doing lately is out of control. It is a matter of principle at this point. I have almost completely de-Googled my life, from ditching Gmail and YouTube to dropping most of the services surrounding Google. And I must tell you, I feel great. I haven't felt this way in a long time.

**Anyways. Let's get to the meat of things.**

Before you continue, you should read my earlier post about [syncing between computers with DigitalOcean Spaces](/digitalocean-spaces-to-sync-between-computers.html).

> Also to note, I am using Linux on my machine with the GNOME desktop environment. This should work on macOS too. To use this on Windows I suggest using the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/install-win10) or [Cygwin](https://www.cygwin.com/).

## Folder structure

I liked the structure Dropbox gave me: one folder where everything lives and gets synced. So I adopted the same layout for my sync setup.

```
~/Vault
  ↳ backup
  ↳ bin
  ↳ documents
  ↳ projects
```

All of my code lives in the `~/Vault/projects` folder, and most of the projects are Git repositories. I do not use this sync method as a backup per se, but if I reinstall my machine I can easily recreate the whole folder structure with one quick command (sketched below). No external drives that can fail, etc.
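
Restoring on a fresh machine is then roughly one command in the opposite direction. A minimal sketch, assuming `s3cmd` is already configured for Spaces and using the same placeholder bucket name as the sync script below:

```bash
#!/bin/bash

# Pull everything from the Spaces bucket back into a fresh ~/Vault.
# Note: node_modules, .venv and .git are excluded from the backup,
# so Git repositories have to be cloned again afterwards.
mkdir -p ~/Vault
s3cmd sync s3://bucket-name/backup/ ~/Vault/
```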

## Sync script

My sync script is located in `~/Vault/bin/vault-backup.sh`:

```bash
#!/bin/bash

# dconf load /com/gexperts/Tilix/ < tilix.dconf
# 0 2 * * * sh ~/Vault/bin/vault-backup.sh

cd ~/Vault/backup/dotfiles

MACHINE=$(whoami)@$(hostname)
mkdir -p $MACHINE
cd $MACHINE

# snapshot the dotfiles I care about
cp ~/.config/VSCodium/User/settings.json settings.json
cp ~/.s3cfg s3cfg
cp ~/.bash_extended bash_extended
cp -rf ~/.ssh ssh

codium --list-extensions > vscode-extension.txt
dconf dump /com/gexperts/Tilix/ > tilix.dconf

# sync the whole Vault to the Spaces bucket, skipping dependency and VCS folders
cd ~/Vault
s3cmd sync --delete-removed --exclude 'node_modules/*' --exclude '.git/*' --exclude '.venv/*' ./ s3://bucket-name/backup/

echo `date +"%D %T"` >> ~/.vault.log

notify-send \
    -u normal \
    -i /usr/share/icons/Adwaita/96x96/status/security-medium-symbolic.symbolic.png \
    "Vault sync succeeded at `date +"%D %T"`"
```

This script also backs up some of the dotfiles I use and sends a notification to the GNOME notification center. It is a straightforward solution; nothing special going on.

> One obvious benefit of this is that I can omit syncing Node's `node_modules` or Python's `.venv` and `.git` folders.

You can use this script in combination with [Cron](https://en.wikipedia.org/wiki/Cron).

```
0 2 * * * sh ~/Vault/bin/vault-backup.sh
```

Once you start syncing your local files to the remote bucket, you can review them on DigitalOcean.

![Dropbox Spaces](/assets/dropbox-sync/dropbox-spaces.png)

I have been using this script for quite some time now, and it is working flawlessly. I also uninstalled Dropbox and stopped using it completely.

All that is left is to write a Bash script that does the reverse and downloads everything from the remote bucket to the local folder. That could be another post.

diff --git a/posts/2021-01-25-goaccess.md b/posts/2021-01-25-goaccess.md
new file mode 100644
index 0000000..c72430a
--- /dev/null
+++ b/posts/2021-01-25-goaccess.md
@@ -0,0 +1,156 @@
---
Title: Using GoAccess with Nginx to replace Google Analytics
Description: Using GoAccess with Nginx to replace Google Analytics
Slug: using-goaccess-with-nginx-to-replace-google-analytics
Listing: true
Created: 2021, January 25
Tags: []
---

1. [Opting for log parsing](#opting-for-log-parsing)
2. [Getting Nginx ready](#getting-nginx-ready)
3. [Getting GoAccess ready](#getting-goaccess-ready)
4. [Securing with Basic authentication](#securing-with-basic-authentication)

I know! You cannot simply replace Google Analytics with parsing access logs and displaying a couple of charts. But to be honest, I never used Google Analytics to its full extent anyway; I was usually only interested in page hits and which pages were visited most often.

I recently moved my blog from Firebase to a VPS and decided to remove the Google Analytics tracking code from the site as well, since it is quite invasive: it tracks users across other sites and builds a profile of each user, and I've had it. But I still need some insight into what is happening on the server and which content is being read the most.

I have looked at several existing solutions, like:
- [Umami](https://umami.is/)
- [Freshlytics](https://github.com/sheshbabu/freshlytics)
- [Matomo](https://matomo.org/)

But the more I looked at them, the more it felt like I was replacing one evil with another. Don't get me wrong: some of these solutions are absolutely fantastic, but they would require installing a database and something like PHP or Node, and I was not ready to put those on my fresh server.
Also, having Docker installed is out of the question.

## Opting for log parsing

So I defaulted to parsing the already existing access logs and generating HTML reports from that data.

I found this amazing piece of software, [GoAccess](https://goaccess.io/), which provides all the functionality I need, and it ships as a single binary written in C.

GoAccess can be used in two different modes.

![GoAccess Terminal](/assets/goaccess/goaccess-dash-term.png)
*Running in a terminal*

![GoAccess HTML](/assets/goaccess/goaccess-dash-html.png)
*Running in a browser*
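
Roughly, the two modes correspond to invocations like these. A quick sketch; the log path and the `COMBINED` log format are assumptions about a default Nginx setup:

```sh
# interactive dashboard in the terminal
goaccess /var/log/nginx/access.log --log-format=COMBINED

# static HTML report that can be served by a web server
goaccess /var/log/nginx/access.log --log-format=COMBINED --output=report.html
```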

I, however, need this to run in a browser, so the second option is the way to go. The idea is to run a cron job periodically and export the report into a folder that is then served by Nginx behind Basic authentication.

## Getting Nginx ready

I chose Ubuntu on [DigitalOcean](https://www.digitalocean.com/). First I installed [Nginx](https://nginx.org/en/), the [Let's Encrypt](https://letsencrypt.org/getting-started/) certbot and all the necessary dependencies.

```sh
# log in as the root user
sudo su -

# first, let's update the system
apt update && apt upgrade -y

# install nginx, certbot and htpasswd (part of apache2-utils)
apt install nginx certbot python3-certbot-nginx apache2-utils
```

After all this is installed, we can create a new configuration for the statistics site. Stats will be available at `stats.domain.com`.

```sh
# create the directory where the HTML report will be hosted
mkdir -p /var/www/html/stats.domain.com

# start from the default site configuration and edit it
cp /etc/nginx/sites-available/default /etc/nginx/sites-available/stats.domain.com
nano /etc/nginx/sites-available/stats.domain.com

# enable the site (on Debian-like systems Nginx loads sites from sites-enabled)
ln -s /etc/nginx/sites-available/stats.domain.com /etc/nginx/sites-enabled/
```

```nginx
server {
    root /var/www/html/stats.domain.com;
    server_name stats.domain.com;

    index index.html;
    location / {
        try_files $uri $uri/ =404;
    }
}
```

Now we check whether the configuration is OK with `nginx -t`. If it is, we restart Nginx with `service nginx restart`.

After that you should add an A record for this domain that points to the IP of the droplet.

Before enabling SSL, test whether the DNS records have propagated with `curl stats.domain.com`.

Now it's time to provision a TLS certificate. To do this, run `certbot --nginx`. Follow the wizard, and when asked about redirection choose option 2 (always redirect to HTTPS).

When this is done you can visit https://stats.domain.com and you should get a 404 Not Found error, which is correct, since there is no content yet.

## Getting GoAccess ready

If you are using a Debian-like system, GoAccess should be available in the package repositories. Otherwise refer to the official website.

```sh
apt install goaccess
```

Now we create a shell script that will be executed every 10 minutes.

```sh
nano /var/www/html/stats.domain.com/generate-stats.sh
```

The contents of this file should look like this.

```sh
#!/bin/sh

# concatenate all (rotated, possibly gzipped) access logs into one file
zcat -f /var/log/nginx/access.log* > /var/log/nginx/access-all.log

goaccess \
  --log-file=/var/log/nginx/access-all.log \
  --log-format=COMBINED \
  --exclude-ip=0.0.0.0 \
  --ignore-crawlers \
  --real-os \
  --output=/var/www/html/stats.domain.com/index.html

# remove the temporary combined log
rm /var/log/nginx/access-all.log
```

Because Nginx rotates the access logs into multiple (gzipped) files after a while, we use [`zcat`](https://linux.die.net/man/1/zcat) to decompress them and build a single file containing all the access logs. After the report is generated, we delete that file.

If you want to exclude hits from your home IP, look at the `--exclude-ip` option in the script and replace `0.0.0.0` with your own home IP address. You can find your home IP by executing `curl ifconfig.me` from your local machine and NOT from the droplet.

Test the script by executing `sh /var/www/html/stats.domain.com/generate-stats.sh` and then checking `https://stats.domain.com`. If you see stats instead of a 404, you are set.

It's time to add this script to cron with `crontab -e`.

```
*/10 * * * * sh /var/www/html/stats.domain.com/generate-stats.sh
```

## Securing with Basic authentication

You probably don't want the stats to be publicly available, so we should create a user and a password for Basic authentication.

First we create a password for a user `stats` with `htpasswd -c /etc/nginx/.htpasswd stats`.

Now we update the config file with `nano /etc/nginx/sites-available/stats.domain.com`. You will probably notice that the file looks a bit different than before; that is because `certbot` added additional rules for SSL.

The `location` portion of the config file should now look like this; you add the `auth_basic` and `auth_basic_user_file` lines:

```nginx
location / {
    try_files $uri $uri/ =404;
    auth_basic "Private Property";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
```

Test whether the config is still OK with `nginx -t`, and if it is, restart Nginx with `service nginx restart`.

If you now visit `https://stats.domain.com` you should be prompted for a username and password. If not, try reopening your browser.

That is all. You now have analytics for your server that get refreshed every 10 minutes.

diff --git a/posts/2021-01-25-replacing-dropbox-with-s3.md b/posts/2021-01-25-replacing-dropbox-with-s3.md
deleted file mode 100644
index 7d137bc..0000000
--- a/posts/2021-01-25-replacing-dropbox-with-s3.md
+++ /dev/null
--
cgit v1.2.3