Diffstat (limited to 'content/posts/2021-01-25-goaccess.md')
-rw-r--r-- content/posts/2021-01-25-goaccess.md 109
1 files changed, 55 insertions, 54 deletions
diff --git a/content/posts/2021-01-25-goaccess.md b/content/posts/2021-01-25-goaccess.md
index efbd175..1b6a330 100644
--- a/content/posts/2021-01-25-goaccess.md
+++ b/content/posts/2021-01-25-goaccess.md
@@ -7,16 +7,16 @@ draft: false
 
 ## Introduction
 
 I know! You cannot simply replace Google Analytics with parsing access logs and
 displaying a couple of charts. But to be honest, I actually never used Google
-Analytics to the fullest extent and was usually interested in seeing page
-hits and which pages were visited most often.
+Analytics to the fullest extent and was usually interested in seeing page hits
+and which pages were visited most often.
 
 I recently moved my blog from Firebase to a VPS and also decided to remove
 Google Analytics tracking code from the site since it's quite malicious and
 tracks users across other pages and builds a profile of each user, and
-I've had it. But I also need some insight into what is happening on a server
-and which content is being read the most etc.
+I've had it. But I also need some insight into what is happening on a server and
+which content is being read the most etc.
 
 I have looked at many existing solutions like:
 
@@ -24,18 +24,18 @@ I have looked at many existing solutions like:
 - [Freshlytics](https://github.com/sheshbabu/freshlytics)
 - [Matomo](https://matomo.org/)
 
 But the more I looked at them the more I noticed that I was replacing one evil
 with another one. Don't get me wrong. Some of these solutions are absolutely
-fantastic but would require installation of databases and something like PHP
-or Node. And I was not ready to put those things on my fresh server. Also
-having Docker installed is out of the question.
+fantastic but would require installation of databases and something like PHP or
+Node. And I was not ready to put those things on my fresh server. Also having
+Docker installed is out of the question.
 
 ## Opting for log parsing
 
 So, I defaulted to parsing already existing logs and generating HTML reports
 from this data.
 
 I found this amazing software [GoAccess](https://goaccess.io/) which provides
 all the functionalities I need, and it's a single binary. Written in C.
 
 GoAccess can be used in two different modes.
@@ -46,15 +46,16 @@ GoAccess can be used in two different modes.
 ![GoAccess HTML](/assets/goaccess/goaccess-dash-html.png)
 <center><i>Running in a browser</i></center>
 
-I, however, need this to run in a browser. So, the second option is the way
-to go. The idea is to periodically run a cron job and export this report into a
+I, however, need this to run in a browser. So, the second option is the way to
+go. The idea is to periodically run a cron job and export this report into a
 folder that then gets served by Nginx behind Basic authentication.
 
 ## Getting Nginx ready
 
 I chose Ubuntu on [DigitalOcean](https://www.digitalocean.com/). First I
-installed [Nginx](https://nginx.org/en/), and [Letsencrypt](https://letsencrypt.org/getting-started/)
-certbot and all the necessary dependencies.
+installed [Nginx](https://nginx.org/en/), and
+[Letsencrypt](https://letsencrypt.org/getting-started/) certbot and all the
+necessary dependencies.
 
 ```sh
 # log in as root user
@@ -90,26 +91,25 @@ server {
 }
 ```
 
-Now we check if the configuration is ok. We can do this with `nginx -t`. If
-all is ok, we can restart Nginx with `service nginx restart`.
+Now we check if the configuration is ok. We can do this with `nginx -t`. If all
+is ok, we can restart Nginx with `service nginx restart`.
 
-After all that you should add an A record for this domain that points to the
-IP of the droplet.
+After all that you should add an A record for this domain that points to the IP
+of the droplet.
 
-Before enabling SSL you should test if DNS records have propagated with
-`curl stats.domain.com`.
+Before enabling SSL you should test if DNS records have propagated with `curl
+stats.domain.com`.
 
-Now, it's time to provision a TLS certificate. To achieve this, you execute the command
-`certbot --nginx`. Follow the wizard and when you are asked about redirection
-always choose 2 (always redirect to HTTPS).
+Now, it's time to provision a TLS certificate. To achieve this, you execute the
+command `certbot --nginx`. Follow the wizard and when you are asked about
+redirection always choose 2 (always redirect to HTTPS).
 
 When this is done you can visit https://stats.domain.com and you should get a
 404 Not Found error, which is correct.
 
-
 ## Getting GoAccess ready
 
 If you are using a Debian-like system, GoAccess should be available in the
 repository. Otherwise refer to the official website.
 
 ```sh
@@ -148,19 +148,19 @@ goaccess \
 rm /var/log/nginx/access-all.log
 ```
 
 Because after a while Nginx creates multiple files with access logs, we use
-[`zcat`](https://linux.die.net/man/1/zcat) to extract gzipped content and
-create a file that has all the access logs. After this file is used we
-delete it.
+[`zcat`](https://linux.die.net/man/1/zcat) to extract gzipped content and create
+a file that has all the access logs. After this file is used we delete it.
 
 If you want to exclude your home IP from the results, look at the `--exclude-ip`
 option in the script and replace `0.0.0.0` with your own home IP address. You
 can find your home IP by executing `curl ifconfig.me` from your local machine
 and NOT from the droplet.
 
-Test the script by executing `sh /var/www/html/stats.domain.com/generate-stats.sh`
-and then checking `https://stats.domain.com`. If you can see stats instead of
-a 404, then you are set.
+Test the script by executing `sh
+/var/www/html/stats.domain.com/generate-stats.sh` and then checking
+`https://stats.domain.com`. If you can see stats instead of a 404, then you are
+set.
 
 It's time to add this script to cron with `crontab -e`.
 
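
Review note: the log-merging step described in this hunk can be sketched roughly as follows. This is a minimal sketch, not the post's actual script (most of which is outside this diff). `merge_access_logs` is a hypothetical helper name, and the file layout (`access.log`, `access.log.1`, `access.log.2.gz`, ...) is an assumption about how log rotation names the files on the server.

```sh
#!/bin/sh
# Hypothetical helper: merge the current and rotated Nginx access logs.
# Assumes rotated logs sit next to access.log and share its name prefix.
merge_access_logs() {
    log_dir=$1    # directory holding access.log, access.log.1, access.log.2.gz, ...
    out_file=$2   # combined file that is later passed to goaccess

    # With -f, zcat copies files that are not gzip-compressed through
    # unchanged and decompresses the rest, so one command handles both.
    # The glob expands in alphabetical order, not chronological order.
    zcat -f "$log_dir"/access.log* > "$out_file"
}
```

It could then be called as `merge_access_logs /var/log/nginx /var/log/nginx/access-all.log`. The output name does not match the `access.log*` glob, so the combined file is never read back into itself.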
@@ -170,16 +170,17 @@ It's time to add this script to cron with `crontab -e`.
 
 ## Securing with Basic authentication
 
 You probably don't want stats to be publicly available, so we should create a
 user and a password for Basic authentication.
 
 First we create a password for a user `stats` with `htpasswd -c /etc/nginx/.htpasswd stats`.
 
-Now we update the config file with `nano /etc/nginx/sites-available/stats.domain.com`.
-You probably noticed that the file looks a bit different from before. This is
-because `certbot` added additional rules for SSL.
+Now we update the config file with `nano
+/etc/nginx/sites-available/stats.domain.com`. You probably noticed that the
+file looks a bit different from before. This is because `certbot` added
+additional rules for SSL.
 
 The location portion of the config file should now look like this. You should
 add the `auth_basic` and `auth_basic_user_file` lines to the file.
 
 ```nginx
@@ -190,12 +191,12 @@ location / {
 }
 ```
 
-Test if config is still ok with `nginx -t` and if it is you can restart
-Nginx with `service nginx restart`.
+Test if config is still ok with `nginx -t` and if it is you can restart Nginx
+with `service nginx restart`.
 
 If you now visit `https://stats.domain.com` you should be prompted for username
 and password. If not, try reopening your browser.
 
-That is all. You now have analytics for your server that gets refreshed every
-10 minutes.
+That is all. You now have analytics for your server that gets refreshed every 10
+minutes.
 
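
Review note: a crontab entry matching the 10-minute refresh mentioned in the closing paragraph might look like the fragment below. It is only a sketch. The schedule is inferred from that sentence, the script path is the one the post uses when testing the script by hand, and the entry the post actually uses is not visible in this diff.

```crontab
# minute  hour  day-of-month  month  day-of-week  command
# Assumed schedule: regenerate the GoAccess report every 10 minutes.
*/10 * * * * sh /var/www/html/stats.domain.com/generate-stats.sh
```

The entry would be added to the crontab of a user that can read the Nginx logs and write to the site folder.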