| author | Mitja Felicijan <mitja.felicijan@gmail.com> | 2019-10-22 03:40:14 +0200 |
|---|---|---|
| committer | Mitja Felicijan <mitja.felicijan@gmail.com> | 2019-10-22 03:40:14 +0200 |
| commit | 28dd784a088a35739cdfdc4ce79f8ee6d50bf816 (patch) | |
| tree | c198abb97177f60864530ee46f5cdcf0ae88d2bf /src/blog | |
| parent | 421677613114bb40780d3a5516b6930d386d0b09 (diff) | |
| download | mitjafelicijan.com-28dd784a088a35739cdfdc4ce79f8ee6d50bf816.tar.gz | |
Cleanup of repo and move to gostatic
Diffstat (limited to 'src/blog')
| -rw-r--r-- | src/blog/golang-profiling-simplified.md | 110 | ||||
| -rw-r--r-- | src/blog/profiling-python-web-applications-with-visual-tools.md | 184 | ||||
| -rw-r--r-- | src/blog/simple-iot-application.md | 486 | ||||
| -rw-r--r-- | src/blog/simplifying-and-reducing-clutter.md | 21 | ||||
| -rw-r--r-- | src/blog/what-i-ve-learned-developing-ad-server.md | 133 |
5 files changed, 934 insertions, 0 deletions
diff --git a/src/blog/golang-profiling-simplified.md b/src/blog/golang-profiling-simplified.md
new file mode 100644
index 0000000..a49de67
--- /dev/null
+++ b/src/blog/golang-profiling-simplified.md
| @@ -0,0 +1,110 @@ | |||
| 1 | title: Golang profiling simplified | ||
| 2 | date: 2017-03-07 | ||
| 3 | tags: blog | ||
| 4 | hide: false | ||
| 5 | ---- | ||
| 6 | |||
| 7 | Many posts have been written about profiling in Go, yet I haven't found a proper tutorial on the subject. Almost all of them are missing some important piece of information, and it gets pretty frustrating when you have a deadline and can't find a simple, distilled solution. | ||
| 8 | |||
| 9 | Nevertheless, after some searching and experimenting I have found a solution that works for me and should probably work for you as well. | ||
| 10 | |||
| 11 | ## Where are my pprof files? | ||
| 12 | |||
| 13 | By default, pprof files are generated in the /tmp/ folder. You can override the folder where these files are generated programmatically in your Go code, as the examples below show. | ||
| 14 | |||
| 15 | ## Why is my CPU profile empty? | ||
| 16 | |||
| 17 | I have found that the CPU profile is sometimes empty because the program did not run long enough. The CPU profiler samples at a fixed rate (100 Hz by default), so programs that finish too quickly don't collect enough samples. In my case a file was still generated, but it contained only 4 KB of information. | ||
| 18 | |||
| 19 | ## Profiling | ||
| 20 | |||
| 21 | As you can see from the examples, we execute a dummy_benchmark function to ensure there is some work to measure. Memory profiling can be done without such a "complex" function, but CPU profiling needs it. | ||
| 22 | |||
| 23 | The memory and CPU profiling examples are almost identical; only the parameters passed to profile.Start in the main function differ. Setting profile.ProfilePath(".") tells the profiler to store pprof files in the same folder as our program. | ||
| 24 | |||
| 25 | ### Memory profiling | ||
| 26 | |||
| 27 | ```go | ||
| 28 | package main | ||
| 29 | |||
| 30 | import ( | ||
| 31 | "fmt" | ||
| 32 | "time" | ||
| 33 | "github.com/pkg/profile" | ||
| 34 | ) | ||
| 35 | |||
| 36 | func dummy_benchmark() { | ||
| 37 | |||
| 38 | fmt.Println("first set ...") | ||
| 39 | for i := 0; i < 918231333; i++ { | ||
| 40 | i *= 2 | ||
| 41 | i /= 2 | ||
| 42 | } | ||
| 43 | |||
| 44 | <-time.After(time.Second*3) | ||
| 45 | |||
| 46 | fmt.Println("second set ...") | ||
| 47 | for i := 0; i < 9182312232; i++ { | ||
| 48 | i *= 2 | ||
| 49 | i /= 2 | ||
| 50 | } | ||
| 51 | } | ||
| 52 | |||
| 53 | func main() { | ||
| 54 | defer profile.Start(profile.MemProfile, profile.ProfilePath("."), profile.NoShutdownHook).Stop() | ||
| 55 | dummy_benchmark() | ||
| 56 | } | ||
| 57 | ``` | ||
| 58 | |||
| 59 | ### CPU profiling | ||
| 60 | |||
| 61 | ```go | ||
| 62 | package main | ||
| 63 | |||
| 64 | import ( | ||
| 65 | "fmt" | ||
| 66 | "time" | ||
| 67 | "github.com/pkg/profile" | ||
| 68 | ) | ||
| 69 | |||
| 70 | func dummy_benchmark() { | ||
| 71 | |||
| 72 | fmt.Println("first set ...") | ||
| 73 | for i := 0; i < 918231333; i++ { | ||
| 74 | i *= 2 | ||
| 75 | i /= 2 | ||
| 76 | } | ||
| 77 | |||
| 78 | <-time.After(time.Second*3) | ||
| 79 | |||
| 80 | fmt.Println("second set ...") | ||
| 81 | for i := 0; i < 9182312232; i++ { | ||
| 82 | i *= 2 | ||
| 83 | i /= 2 | ||
| 84 | } | ||
| 85 | } | ||
| 86 | |||
| 87 | func main() { | ||
| 88 | defer profile.Start(profile.CPUProfile, profile.ProfilePath("."), profile.NoShutdownHook).Stop() | ||
| 89 | dummy_benchmark() | ||
| 90 | } | ||
| 91 | ``` | ||
| 92 | |||
| 93 | ### Generating profiling reports | ||
| 94 | |||
| 95 | ```bash | ||
| 96 | # memory profiling | ||
| 97 | go build mem.go | ||
| 98 | ./mem | ||
| 99 | go tool pprof -pdf ./mem mem.pprof > mem.pdf | ||
| 100 | |||
| 101 | # cpu profiling | ||
| 102 | go build cpu.go | ||
| 103 | ./cpu | ||
| 104 | go tool pprof -pdf ./cpu cpu.pprof > cpu.pdf | ||
| 105 | ``` | ||
| 106 | |||
| 107 | This will generate a PDF document with a visualized profile (note that go tool pprof needs Graphviz installed to render PDF output). | ||
| 108 | |||
| 109 | - [Memory PDF profile example](/files/go-profiling/golang-profiling-mem.pdf) | ||
| 110 | - [CPU PDF profile example](/files/go-profiling/golang-profiling-cpu.pdf) | ||
diff --git a/src/blog/profiling-python-web-applications-with-visual-tools.md b/src/blog/profiling-python-web-applications-with-visual-tools.md
new file mode 100644
index 0000000..e99b9ff
--- /dev/null
+++ b/src/blog/profiling-python-web-applications-with-visual-tools.md
| @@ -0,0 +1,184 @@ | |||
| 1 | title: Profiling Python web applications with visual tools | ||
| 2 | date: 2017-04-21 | ||
| 3 | tags: blog | ||
| 4 | hide: false | ||
| 5 | ---- | ||
| 6 | |||
| 7 | I have been profiling my software with KCachegrind for a long time now, and I was missing this option when developing APIs or other web services. I always knew it was possible but never really took the time to dive into it. | ||
| 8 | |||
| 9 | Before we begin there are some requirements. We will need to: | ||
| 10 | |||
| 11 | - implement [cProfile](https://docs.python.org/2/library/profile.html#module-cProfile) into our web app, | ||
| 12 | - convert output to [callgrind](http://valgrind.org/docs/manual/cl-manual.html) format with [pyprof2calltree](https://pypi.python.org/pypi/pyprof2calltree/), | ||
| 13 | - visualize data with [KCachegrind](http://kcachegrind.sourceforge.net/html/Home.html) or [Profiling Viewer](http://www.profilingviewer.com/). | ||
| 14 | |||
| 15 | |||
| 16 | If you are using MacOS you should check out [Profiling Viewer](http://www.profilingviewer.com/) or [MacCallGrind](http://www.maccallgrind.com/). | ||
| 17 | |||
| 18 |  | ||
| 19 | |||
| 20 | We will be dividing this post into two main parts: | ||
| 21 | |||
| 22 | - writing a simple web-service, | ||
| 23 | - visualizing the profile of this web-service. | ||
| 24 | |||
| 25 | ## Simple web-service | ||
| 26 | |||
| 27 | Let's use virtualenv so we don't pollute our base system. If you don't have virtualenv installed, you can install it with pip. | ||
| 28 | |||
| 29 | ```bash | ||
| 30 | # let's install virtualenv globally | ||
| 31 | $ sudo pip install virtualenv | ||
| 32 | |||
| 33 | # let's also install pyprof2calltree globally | ||
| 34 | $ sudo pip install pyprof2calltree | ||
| 35 | |||
| 36 | # now we create project | ||
| 37 | $ mkdir demo-project | ||
| 38 | $ cd demo-project/ | ||
| 39 | |||
| 40 | # now let's create folder where we will store profiles | ||
| 41 | $ mkdir prof | ||
| 42 | |||
| 43 | # now we create empty virtualenv in venv/ folder | ||
| 44 | $ virtualenv --no-site-packages venv | ||
| 45 | |||
| 46 | # we now need to activate virtualenv | ||
| 47 | $ source venv/bin/activate | ||
| 48 | |||
| 49 | # you can check if virtualenv was correctly initialized by | ||
| 50 | # checking where your python interpreter is located | ||
| 51 | # if the command below points to your created directory and not some | ||
| 52 | # system dir like /usr/bin/python then everything is fine | ||
| 53 | $ which python | ||
| 54 | |||
| 55 | # we can check now if all is good ➜ if ok couple of | ||
| 56 | # lines will be displayed | ||
| 57 | $ pip freeze | ||
| 58 | # appdirs==1.4.3 | ||
| 59 | # packaging==16.8 | ||
| 60 | # pyparsing==2.2.0 | ||
| 61 | # six==1.10.0 | ||
| 62 | |||
| 63 | # now we are ready to install bottlepy ➜ web micro-framework | ||
| 64 | $ pip install bottle | ||
| 65 | |||
| 66 | # you can deactivate virtualenv but you will then go | ||
| 67 | # under system domain ➜ for now don't deactivate | ||
| 68 | $ deactivate | ||
| 69 | ``` | ||
| 70 | |||
| 71 | We are now ready to write a simple web service. Create a file named app.py and paste the code below into it. | ||
| 72 | |||
| 73 | ```python | ||
| 74 | # -*- coding: utf-8 -*- | ||
| 75 | |||
| 76 | import bottle | ||
| 77 | import random | ||
| 78 | import cProfile | ||
| 79 | |||
| 80 | app = bottle.Bottle() | ||
| 81 | |||
| 82 | # this decorator wraps a function, profiles it, and | ||
| 83 | # then saves the profile to the subfolder | ||
| 84 | # prof/function-name.prof | ||
| 85 | # in our example only the awesome_random_number function will | ||
| 86 | # be profiled because it is decorated with do_cprofile | ||
| 87 | def do_cprofile(func): | ||
| 88 | def profiled_func(*args, **kwargs): | ||
| 89 | profile = cProfile.Profile() | ||
| 90 | try: | ||
| 91 | profile.enable() | ||
| 92 | result = func(*args, **kwargs) | ||
| 93 | profile.disable() | ||
| 94 | return result | ||
| 95 | finally: | ||
| 96 | profile.dump_stats("prof/" + str(func.__name__) + ".prof") | ||
| 97 | return profiled_func | ||
| 98 | |||
| 99 | |||
| 100 | # we use profiling over specific function with including | ||
| 101 | # @do_cprofile above function declaration | ||
| 102 | @app.route("/") | ||
| 103 | @do_cprofile | ||
| 104 | def awesome_random_number(): | ||
| 105 | awesome_random_number = random.randint(0, 100) | ||
| 106 | return "awesome random number is " + str(awesome_random_number) | ||
| 107 | |||
| 108 | @app.route("/test") | ||
| 109 | def test(): | ||
| 110 | return "dummy test" | ||
| 111 | |||
| 112 | if __name__ == '__main__': | ||
| 113 | bottle.run( | ||
| 114 | app = app, | ||
| 115 | host = "0.0.0.0", | ||
| 116 | port = 4000 | ||
| 117 | ) | ||
| 118 | |||
| 119 | # run with 'python app.py' | ||
| 120 | # open browser 'http://0.0.0.0:4000' | ||
| 121 | ``` | ||
| 122 | |||
| 123 | When the browser hits the awesome\_random\_number() function, a profile is created in the prof/ subfolder. | ||
| 124 | |||
| 125 | ## Visualize profile | ||
| 126 | |||
| 127 | Now let's create callgrind format from this cProfile output. | ||
| 128 | |||
| 129 | ```bash | ||
| 130 | $ cd prof/ | ||
| 131 | $ pyprof2calltree -i awesome_random_number.prof | ||
| 132 | # this creates 'awesome_random_number.prof.log' file in the same folder | ||
| 133 | ``` | ||
| 134 | |||
| 135 | This file can be opened with the visualizing tools listed above. In this case we will use Profiling Viewer under MacOS (you can open the image in a new tab). As you can see from the example, it shows the hierarchy and execution order of your code. | ||
| 136 | |||
| 137 |  | ||
| 138 | |||
| 139 | > Make sure you convert the cProfile output every time you want to refresh and look at possible optimizations, because cProfile updates the .prof file every time the browser hits the function. | ||
| 140 | |||
| 141 | This is just a simple example, but when you are developing real-life applications this can be very illuminating, especially for seeing which parts of your code are bottlenecks and need to be optimized. | ||
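If you just want a quick textual look without a visual tool, the standard-library pstats module can print the most expensive calls straight from the .prof file. A small self-contained sketch (it generates its own sample profile rather than reusing the web app's):

```python
import cProfile
import pstats
import random

# generate a sample profile the same way the decorator above does
profiler = cProfile.Profile()
profiler.enable()
numbers = [random.randint(0, 100) for _ in range(1000)]
profiler.disable()
profiler.dump_stats("awesome_random_number.prof")

# read the dump back and print the ten most expensive calls
stats = pstats.Stats("awesome_random_number.prof")
stats.sort_stats("cumulative").print_stats(10)
```

This is handy on a server where no GUI tools are available.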
| 142 | |||
| 143 | ## Update 2017-04-22 | ||
| 144 | |||
| 145 | Reddit user [mvt](https://www.reddit.com/user/mvt) also recommended an awesome web-based profile visualizer, [SnakeViz](https://jiffyclub.github.io/snakeviz/), which directly takes the output of the [cProfile](https://docs.python.org/2/library/profile.html#module-cProfile) module. | ||
| 146 | |||
| 147 | <div class="reddit-embed" data-embed-media="www.redditmedia.com" data-embed-parent="false" data-embed-live="false" data-embed-uuid="583880c1-002e-41ed-a373-020a0ef2cff9" data-embed-created="2017-04-22T19:46:54.810Z"><a href="https://www.reddit.com/r/Python/comments/66v373/profiling_python_web_applications_with_visual/dgljhsb/">Comment</a> from discussion <a href="https://www.reddit.com/r/Python/comments/66v373/profiling_python_web_applications_with_visual/">Profiling Python web applications with visual tools</a>.</div><script async src="https://www.redditstatic.com/comment-embed.js"></script> | ||
| 148 | |||
| 149 | ```bash | ||
| 150 | # let's install it globally as well | ||
| 151 | $ sudo pip install snakeviz | ||
| 152 | |||
| 153 | # now let's visualize | ||
| 154 | $ cd prof/ | ||
| 155 | $ snakeviz awesome_random_number.prof | ||
| 156 | # this automatically opens browser window and | ||
| 157 | # shows visualized profile | ||
| 158 | ``` | ||
| 159 | |||
| 160 |  | ||
| 161 | |||
| 162 | Reddit user [ccharles](https://www.reddit.com/user/ccharles) suggested a better way of installing pip packages: targeting the user level instead of using sudo. | ||
| 163 | |||
| 164 | <div class="reddit-embed" data-embed-media="www.redditmedia.com" data-embed-parent="false" data-embed-live="false" data-embed-uuid="f4f0459e-684d-441e-bebe-eb49b2f0a31d" data-embed-created="2017-04-22T19:46:10.874Z"><a href="https://www.reddit.com/r/Python/comments/66v373/profiling_python_web_applications_with_visual/dglpzkx/">Comment</a> from discussion <a href="https://www.reddit.com/r/Python/comments/66v373/profiling_python_web_applications_with_visual/">Profiling Python web applications with visual tools</a>.</div><script async src="https://www.redditstatic.com/comment-embed.js"></script> | ||
| 165 | |||
| 166 | ```bash | ||
| 167 | # now we need to add this path to our $PATH variable | ||
| 168 | # we do this by adding this line at the end of your | ||
| 169 | # ~/.bashrc file | ||
| 170 | PATH=$PATH:$HOME/.local/bin/ | ||
| 171 | |||
| 172 | # in order to use this new configuration you can close | ||
| 173 | # and reopen terminal or reload .bashrc file | ||
| 174 | $ source ~/.bashrc | ||
| 175 | |||
| 176 | # now let's test if new directory is present in $PATH | ||
| 177 | $ echo $PATH | ||
| 178 | |||
| 179 | # now we can install on user level by adding --user | ||
| 180 | # without use of sudo | ||
| 181 | $ pip install snakeviz --user | ||
| 182 | ``` | ||
| 183 | |||
| 184 | Or as suggested by [mvt](https://www.reddit.com/user/mvt) you can use [pipsi](https://github.com/mitsuhiko/pipsi). | ||
diff --git a/src/blog/simple-iot-application.md b/src/blog/simple-iot-application.md
new file mode 100644
index 0000000..2b7d67f
--- /dev/null
+++ b/src/blog/simple-iot-application.md
| @@ -0,0 +1,486 @@ | |||
| 1 | title: Simple IOT application supported by real-time monitoring and data history | ||
| 2 | date: 2017-08-11 | ||
| 3 | tags: blog | ||
| 4 | hide: false | ||
| 5 | ---- | ||
| 6 | |||
| 7 | ## Initial thoughts | ||
| 8 | |||
| 9 | I have been developing these kinds of applications for the better part of the last 5 years, and people keep asking me how to approach building one, so I will try to explain it here. | ||
| 10 | |||
| 11 | IOT applications are really no different from any other kind of application. We have data that needs to be collected and visualized in some form of tables or charts. The main difference is that most of the time this data is collected by some kind of device foreign to a developer who mainly operates in the web domain. But fear not, it's not that different from writing some JavaScript. | ||
| 12 | |||
| 13 | There are many devices able to transmit data over a wireless or wired network out of the box, but for the sake of example we will be using the commonly known Arduino with a wireless module already on the board → [Arduino MKR1000](https://store.arduino.cc/arduino-mkr1000). | ||
| 14 | |||
| 15 | In order to make this little project as accessible as possible, I will try to make it inexpensive. By this I mean that I will avoid hosted virtual servers and will use my own laptop as the server, although you do need to buy an Arduino MKR1000 to follow the steps below. If you want to deploy this software, I would suggest [DigitalOcean](https://www.digitalocean.com) → their smallest VPS makes this one of the most affordable options out there. Please note that this software will not run on stock web hosting that only supports LAMP (Linux, Apache, MySQL, and PHP). | ||
| 16 | |||
| 17 | _But before we begin, please note that this is strictly experimental code. It is not well optimized, and there are better ways of handling some aspects of the application, but those require a much deeper knowledge of the technology than an example like this needs._ | ||
| 18 | |||
| 19 | **Development steps** | ||
| 20 | |||
| 21 | 1. Simple Python API that will receive and store incoming data. | ||
| 22 | 2. Prototype C++ code that will read "sensor data" and transmit it to the API. | ||
| 23 | 3. Data visualization with charts → extends the Python web application. | ||
| 24 | |||
| 25 | Steps 1 and 3 will share the same web application. One route will be dedicated to the API and another to serving the HTML with the chart. | ||
| 26 | |||
| 27 | The schema below represents what we will try to achieve and how the different parts relate to each other. | ||
| 28 | |||
| 29 |  | ||
| 30 | |||
| 31 | ## Simple Python API | ||
| 32 | |||
| 33 | I have always been a fan of simplicity, so we will be using [Bottle: Python Web Framework](https://bottlepy.org/docs/dev/). It is a single-file web framework that seriously simplifies working with routes and templating, and it has a built-in web server that satisfies our needs in this case. | ||
| 34 | |||
| 35 | First we need to install the bottle package. This can be done by downloading ```bottle.py``` and placing it in the root of your application, or with pip: ```pip install bottle --user```. | ||
| 36 | |||
| 37 | If you are using Linux or MacOS, Python is already installed. If you want to test this on Windows, please install [Python for Windows](https://www.python.org/downloads/windows/). There may be some problems with your PATH when you try to launch ```python webapp.py```, so please take care of that before you continue. | ||
| 38 | |||
| 39 | ### Basic web application | ||
| 40 | |||
| 41 | The most basic Bottle application is quite simple. Paste the code below into a ```webapp.py``` file and save it. | ||
| 42 | |||
| 43 | ```python | ||
| 44 | # -*- coding: utf-8 -*- | ||
| 45 | |||
| 46 | import bottle | ||
| 47 | |||
| 48 | # initializing bottle app | ||
| 49 | app = bottle.Bottle() | ||
| 50 | |||
| 51 | # triggered when / is accessed from browser | ||
| 52 | # only accepts GET → no POST allowed | ||
| 53 | @app.route("/", method=["GET"]) | ||
| 54 | def route_default(): | ||
| 55 | return "howdy from python" | ||
| 56 | |||
| 57 | # starting server on http://0.0.0.0:5000 | ||
| 58 | if __name__ == "__main__": | ||
| 59 | bottle.run( | ||
| 60 | app = app, | ||
| 61 | host = "0.0.0.0", | ||
| 62 | port = 5000, | ||
| 63 | debug = True, | ||
| 64 | reloader = True, | ||
| 65 | catchall = True, | ||
| 66 | ) | ||
| 67 | ``` | ||
| 68 | |||
| 69 | To run this simple application, open a command prompt or terminal, go to the folder containing the file, and type ```python webapp.py```. If everything goes well, open your web browser and point it to ```http://0.0.0.0:5000```. | ||
| 70 | |||
| 71 | If you would like to change the port of your application (to, say, port 80) without running your app as root, you will hit a problem: TCP/IP port numbers below 1024 are privileged ports → this is a security feature. So for both simplicity and security, use a port number above 1024, as I have done with port 5000. | ||
| 72 | |||
| 73 | If this fails at any time please fix it before you continue, because nothing below will work otherwise. | ||
| 74 | |||
| 75 | We use 0.0.0.0 as the default host so that the app is available over your local network. If you find your local IP with ```ifconfig``` and try accessing the site from your phone (on the same network/router as your machine), it should work as well (an example of such an address is ```http://192.168.1.15:5000```). This is a must-have, because the Arduino will be accessing this application to send its data. | ||
| 76 | |||
| 77 | ### Web application security | ||
| 78 | |||
| 79 | There is a lot to be said about security; it is the topic of many books. All of it cannot be covered here, but to establish some basic security → you should always use SSL with your application. Fantastic free certificates are available from [Let's Encrypt - Free SSL/TLS Certificates](https://letsencrypt.org). With an SSL certificate installed, you should then make use of HTTP headers and send your "API key" via a header. If the key is sent via a header, it is encrypted by SSL and travels encrypted over the network. Never send your API keys as a GET parameter like ```http://example.com/?api_key=somekeyvalue```; the problem with that is that the key ends up visible in logs and to network sniffers. | ||
| 80 | |||
| 81 | There is a fantastic article covering more aspects of web security: [11 Web Application Security Best Practices](https://www.keycdn.com/blog/web-application-security-best-practices/). Please check it out. | ||
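As a small illustration of the header-based key check (a sketch of mine, not code from the app below): the standard-library function hmac.compare_digest compares the submitted key in constant time, which avoids leaking information through timing differences.

```python
import hmac

API_KEY = "JtF2aUE5SGHfVJBCG5SH"  # placeholder key, same style as the examples below

def is_authorized(header_value):
    # constant-time comparison; a plain == comparison can leak
    # how much of the key matched through response timing
    if header_value is None:
        return False
    return hmac.compare_digest(header_value, API_KEY)

print(is_authorized("JtF2aUE5SGHfVJBCG5SH"))  # → True
print(is_authorized("wrong-key"))             # → False
```

The app code below uses a plain equality check for brevity; swapping in a helper like this is a drop-in change.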
| 82 | |||
| 83 | ### Simple API for writing data-points | ||
| 84 | |||
| 85 | We will now take the boilerplate code from the example above and extend it to write data received by the API to local storage. For this example I will use SQLite3, because it plays well with Python and can store quite a large amount of data. I have used it to collect gigabytes of data in a single database without any corruption or problems → your experience may vary. | ||
| 86 | |||
| 87 | To avoid learning SQLite I will be using [Dataset: databases for lazy people](https://dataset.readthedocs.io/en/latest/index.html). This package abstracts away SQL and simplifies writing and reading data from the database. You can install it with ```pip install dataset --user```. | ||
| 88 | |||
| 89 | Because the API will use the POST method, I will test that the code works correctly using the [Restlet Client for Google Chrome](https://chrome.google.com/webstore/detail/restlet-client-rest-api-t/aejoelaoggembcahagimdiliamlcdmfm). This tool also allows you to set headers → needed for our basic API_KEY security. | ||
| 90 | |||
| 91 | To quickly generate passwords or API keys I usually use this nifty website [RandomKeygen](https://randomkeygen.com/). | ||
| 92 | |||
| 93 | Copy the code below over your previous code in ```webapp.py```. | ||
| 94 | |||
| 95 | ```python | ||
| 96 | # -*- coding: utf-8 -*- | ||
| 97 | |||
| 98 | import time | ||
| 99 | import bottle | ||
| 100 | import random | ||
| 101 | import dataset | ||
| 102 | |||
| 103 | # initializing bottle app | ||
| 104 | app = bottle.Bottle() | ||
| 105 | |||
| 106 | # connects to sqlite database | ||
| 107 | # check_same_thread=False allows using it in multi-threaded mode | ||
| 108 | app.config["dsn"] = dataset.connect("sqlite:///data.db?check_same_thread=False") | ||
| 109 | |||
| 110 | # api key that will be used in Arduino code | ||
| 111 | app.config["api_key"] = "JtF2aUE5SGHfVJBCG5SH" | ||
| 112 | |||
| 113 | # triggered when /api is accessed from browser | ||
| 114 | # only accepts POST → no GET allowed | ||
| 115 | @app.route("/api", method=["POST"]) | ||
| 116 | def route_default(): | ||
| 117 | status = 400 | ||
| 118 | ts = int(time.time()) # current timestamp | ||
| 119 | value = bottle.request.body.read() # data from device | ||
| 120 | api_key = bottle.request.get_header("Api_Key") # api key from header | ||
| 121 | |||
| 122 | # outputs received data to console for debugging | ||
| 123 | print ">>> {} :: {}".format(value, api_key) | ||
| 124 | |||
| 125 | # if api_key is correct and value is present | ||
| 126 | # then writes attribute to point table | ||
| 127 | if api_key == app.config["api_key"] and value: | ||
| 128 | app.config["dsn"]["point"].insert(dict(ts=ts, value=value)) | ||
| 129 | status = 200 | ||
| 130 | |||
| 131 | # we only need to return status | ||
| 132 | return bottle.HTTPResponse(status=status, body="") | ||
| 133 | |||
| 134 | # starting server on http://0.0.0.0:5000 | ||
| 135 | if __name__ == "__main__": | ||
| 136 | bottle.run( | ||
| 137 | app = app, | ||
| 138 | host = "0.0.0.0", | ||
| 139 | port = 5000, | ||
| 140 | debug = True, | ||
| 141 | reloader = True, | ||
| 142 | catchall = True, | ||
| 143 | ) | ||
| 144 | ``` | ||
| 145 | |||
| 146 | To run this, simply go to the folder containing the Python file and run ```python webapp.py``` from the terminal. If everything goes well, you should have a simple API available via the POST method on the /api route. | ||
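If you prefer the command line to a browser extension, the endpoint can also be exercised with a few lines of Python 3 standard library code. A sketch (the URL and key mirror the example above; the helper name is my own):

```python
import urllib.request

def post_point(value, url="http://0.0.0.0:5000/api", api_key="JtF2aUE5SGHfVJBCG5SH"):
    # send one data-point the same way the Arduino will:
    # a raw body plus the Api-Key header
    req = urllib.request.Request(url, data=str(value).encode(), method="POST")
    req.add_header("Api-Key", api_key)
    with urllib.request.urlopen(req) as resp:
        return resp.status

# post_point(42)  # returns 200 when the web app above is running
```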
| 147 | |||
| 148 | After testing the service with the Restlet Client, you should be able to see your data in the database file ```data.db```. | ||
| 149 | |||
| 150 |  | ||
| 151 | |||
| 152 | You can also check the contents of the new database file with a desktop client for SQLite → [DB Browser for SQLite](http://sqlitebrowser.org/). | ||
| 153 | |||
| 154 |  | ||
| 155 | |||
| 156 | The table structure is as simple as it can be: we have ts (timestamp) and value (the value from the Arduino). As you can see, the timestamp is generated on the API side. If you happened to have an atomic clock on the Arduino, it would be better to generate and send the timestamp along with the value. This would be particularly useful if we were collecting sensor data at a higher frequency and then sending it in bulk to the API. | ||
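To make the bulk idea concrete, here is a hypothetical sketch of parsing such a batched payload on the API side (the `[[ts, value], ...]` format and the function name are my own invention for illustration; the app in this post does not implement batching):

```python
import json
import time

def parse_batch(body):
    # the device sends [[ts, value], ...]; fall back to the
    # server clock when a reading carries no timestamp
    now = int(time.time())
    points = []
    for ts, value in json.loads(body):
        points.append({"ts": int(ts) if ts is not None else now, "value": value})
    return points

batch = parse_batch('[[1502406440, 933], [null, 743]]')
```

Each parsed point could then be inserted into the point table exactly like a single reading.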
| 157 | |||
| 158 | If you deploy this app with uWSGI in multi-threaded mode, keep the ```?check_same_thread=False``` parameter in the DSN (Data Source Name) URL. | ||
| 159 | |||
| 160 | Ok, now that we have a working API with some basic security, so unwanted visitors cannot post data to the database, we can proceed and program the Arduino to send data to the API. | ||
| 161 | |||
| 162 | ## Sending data to API with Arduino MKR1000 | ||
| 163 | |||
| 164 | First of all, you need an MKR1000 module and a micro-USB cable to proceed. If you have ever done any work with Arduino, you know that you also need the [Arduino IDE](https://www.arduino.cc/en/Main/Software); you can download and install it from the provided link. Once that is done and you have successfully run the blink example, proceed to the next step. | ||
| 165 | |||
| 166 | In order to use the wireless capabilities of the MKR1000, you first need to install the [WiFi101 library](https://www.arduino.cc/en/Reference/WiFi101) in the Arduino IDE. Check before you install; you may already have it. | ||
| 167 | |||
| 168 | The code below is a working example that sends data to the API. Before you test it, make sure the Python web application is running, then change the settings for the wifi network, the API endpoint, and the api_key. If for some reason the code below doesn't work for you, please leave a comment and I'll try to help. | ||
| 169 | |||
| 170 | Once you have opened the IDE and copied this code, try to compile and upload it. Then open the "Serial monitor" to see the output produced by the Arduino. | ||
| 171 | |||
| 172 | ```c | ||
| 173 | #include <WiFi101.h> | ||
| 174 | |||
| 175 | // wifi settings | ||
| 176 | char ssid[] = "ssid-name"; | ||
| 177 | char pass[] = "ssid-password"; | ||
| 178 | |||
| 179 | // api server endpoint | ||
| 180 | char server[] = "192.168.6.22"; | ||
| 181 | int port = 5000; | ||
| 182 | |||
| 183 | // api key that must be the same as the one in Python code | ||
| 184 | String api_key = "JtF2aUE5SGHfVJBCG5SH"; | ||
| 185 | |||
| 186 | // frequency data is sent in ms - every 5 seconds | ||
| 187 | int timeout = 1000 * 5; | ||
| 188 | |||
| 189 | int status = WL_IDLE_STATUS; | ||
| 190 | |||
| 191 | void setup() { | ||
| 192 | |||
| 193 | // initialize serial and wait for port to open: | ||
| 194 | Serial.begin(9600); | ||
| 195 | delay(1000); | ||
| 196 | |||
| 197 | // check for the presence of the shield | ||
| 198 | if (WiFi.status() == WL_NO_SHIELD) { | ||
| 199 | Serial.println("WiFi shield not present"); | ||
| 200 | while (true); | ||
| 201 | } | ||
| 202 | |||
| 203 | // attempt to connect to wifi network | ||
| 204 | while (status != WL_CONNECTED) { | ||
| 205 | Serial.print("Attempting to connect to SSID: "); | ||
| 206 | Serial.println(ssid); | ||
| 207 | status = WiFi.begin(ssid, pass); | ||
| 208 | // wait 10 seconds for connection | ||
| 209 | delay(10000); | ||
| 210 | } | ||
| 211 | |||
| 212 | // output wifi status to serial monitor | ||
| 213 | Serial.print("SSID: "); | ||
| 214 | Serial.println(WiFi.SSID()); | ||
| 215 | |||
| 216 | IPAddress ip = WiFi.localIP(); | ||
| 217 | Serial.print("IP Address: "); | ||
| 218 | Serial.println(ip); | ||
| 219 | |||
| 220 | long rssi = WiFi.RSSI(); | ||
| 221 | Serial.print("signal strength (RSSI):"); | ||
| 222 | Serial.print(rssi); | ||
| 223 | Serial.println(" dBm"); | ||
| 224 | } | ||
| 225 | |||
| 226 | void loop() { | ||
| 227 | |||
| 228 | WiFiClient client; | ||
| 229 | |||
| 230 | if (client.connect(server, port)) { | ||
| 231 | |||
| 232 | // I use random number generator for this example | ||
| 233 | // but you can use analog or digital inputs from arduino | ||
| 234 | String content = String(random(1000)); | ||
| 235 | |||
| 236 | client.println("POST /api HTTP/1.1"); | ||
| 237 | client.println("Connection: close"); | ||
| 238 | client.println("Api-Key: " + api_key); | ||
| 239 | client.println("Content-Length: " + String(content.length())); | ||
| 240 | client.println(); | ||
| 241 | client.println(content); | ||
| 242 | |||
| 243 | delay(100); | ||
| 244 | client.stop(); | ||
| 245 | Serial.println("Data sent successfully ..."); | ||
| 246 | |||
| 247 | } else { | ||
| 248 | Serial.println("Problem sending data ..."); | ||
| 249 | } | ||
| 250 | |||
| 251 | // waits for x seconds and continue looping | ||
| 252 | delay(timeout); | ||
| 253 | |||
| 254 | } | ||
| 255 | ``` | ||
| 256 | |||
| 257 | As seen in the example, the Arduino generates a random integer between 0 and 999 (```random(1000)``` excludes the upper bound). You can easily replace this with a temperature sensor or any other kind of sensor. | ||
| 258 | |||
| 259 | Now that we have the API under the hood and the Arduino is sending demo data, we can focus on data visualization. | ||
| 260 | |||
| 261 | ## Data visualization | ||
| 262 | |||
| 263 | Before we continue, let's examine our project folder structure. Currently we have only two files in the project: | ||
| 264 | |||
| 265 | _simple-iot-app/_ | ||
| 266 | |||
| 267 | * _webapp.py_ | ||
| 268 | * _data.db_ | ||
| 269 | |||
| 270 | We will now add an HTML template that contains CSS and JavaScript inline, for simplicity's sake. For the Bottle framework to be able to scan the root application folder for templates, we add ```bottle.TEMPLATE_PATH.insert(0, "./")``` to ```webapp.py```. By default the Bottle framework uses the ```views/``` subfolder to store templates. This is not the ideal setup; if you use Bottle to develop real web applications, you should follow the native behavior and store templates in its predefined folder, but for the sake of example we will override it. Be careful to fully replace your code with the new code provided below; avoid partially replacing code in the file :) New code for reading data-points is also included in the Python example below. | ||
| 271 | |||
| 272 | First we add a new route to the web application. It is triggered when the browser hits the root of the application, ```http://0.0.0.0:5000/```. This route does nothing more than render the ```frontend.html``` template, which is done with ```return bottle.template("frontend.html")```. Check the code below to examine exactly how this is done. | ||
| 273 | |||
| 274 | Now we expand the ```/api``` route to use different HTTP methods for writing and reading data-points. For writing a data-point we use the POST method, and for reading points we use the GET method. The GET method returns a JSON array with the latest readings and historical data. | ||
| 275 | |||
| 276 | There is a fantastic JavaScript library for plotting time-series charts called [MetricsGraphics.js](https://www.metricsgraphicsjs.org) that is based on [D3.js](https://d3js.org/) library for visualizing data. | ||
| 277 | |||
| 278 | MetricsGraphics.js requires a specific data schema, so we need to transform the data from the database into this format: | ||
| 279 | |||
| 280 | ```json | ||
| 281 | [ | ||
| 282 | { | ||
| 283 | "date": "2017-08-11 01:07:20", | ||
| 284 | "value": 933 | ||
| 285 | }, | ||
| 286 | { | ||
| 287 | "date": "2017-08-11 01:07:30", | ||
| 288 | "value": 743 | ||
| 289 | } | ||
| 290 | ] | ||
| 291 | ``` | ||
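A minimal sketch of that transformation (assuming rows with ```ts``` and ```value``` fields, as stored by the API code further down):

```python
import datetime

def to_mg_points(rows):
    # converts database rows into the schema MetricsGraphics.js expects
    return [
        {
            "date": datetime.datetime.fromtimestamp(int(row["ts"])).strftime("%Y-%m-%d %H:%M:%S"),
            "value": row["value"],
        }
        for row in rows
    ]

points = to_mg_points([{"ts": 1502406440, "value": 933}])
```

Note that ```fromtimestamp``` uses the server's local timezone, so the rendered dates follow the server clock.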
| 292 | |||
| 293 | The web application is now complete; we only need ```frontend.html```, which we will develop next. If you started the web app now and went to the application root, you would get an error because ```frontend.html``` doesn't exist yet. | ||
| 294 | |||
| 295 | ```python | ||
| 296 | # -*- coding: utf-8 -*- | ||
| 297 | |||
| 298 | import time | ||
| 299 | import bottle | ||
| 300 | import json | ||
| 301 | import datetime | ||
| 302 | import random | ||
| 303 | import dataset | ||
| 304 | |||
| 305 | # initializing bottle app | ||
| 306 | app = bottle.Bottle() | ||
| 307 | |||
| 308 | # adds root directory as template folder | ||
| 309 | bottle.TEMPLATE_PATH.insert(0, "./") | ||
| 310 | |||
| 311 | # connects to sqlite database | ||
| 312 | # check_same_thread=False allows using it in multi-threaded mode | ||
| 313 | app.config["db"] = dataset.connect("sqlite:///data.db?check_same_thread=False") | ||
| 314 | |||
| 315 | # api key that will be used in Arduino code | ||
| 316 | app.config["api_key"] = "JtF2aUE5SGHfVJBCG5SH" | ||
| 317 | |||
| 318 | # triggered when / is accessed from browser | ||
| 319 | # only accepts GET → no POST allowed | ||
| 320 | @app.route("/", method=["GET"]) | ||
| 321 | def route_default(): | ||
| 322 | return bottle.template("frontend.html") | ||
| 323 | |||
| 324 | # triggered when /api is accessed from browser | ||
| 325 | # accepts POST and GET | ||
| 326 | @app.route("/api", method=["GET", "POST"]) | ||
| 327 | def route_api(): | ||
| 328 | |||
| 329 | # if method is POST then we write datapoint | ||
| 330 | if bottle.request.method == "POST": | ||
| 331 | status = 400 | ||
| 332 | ts = int(time.time()) # current timestamp | ||
| 333 | value = bottle.request.body.read() # data from device | ||
| 334 | api_key = bottle.request.get_header("Api-Key") # api key from header | ||
| 335 | |||
| 336 | # prints received data to console for debugging | ||
| 337 | print(">>> {} :: {}".format(value, api_key)) | ||
| 338 | |||
| 339 | # if api_key is correct and value is present | ||
| 340 | # then writes attribute to point table | ||
| 341 | if api_key == app.config["api_key"] and value: | ||
| 342 | app.config["db"]["point"].insert(dict(ts=ts, value=value)) | ||
| 343 | status = 200 | ||
| 344 | |||
| 345 | # we only need to return status | ||
| 346 | return bottle.HTTPResponse(status=status, body="") | ||
| 347 | |||
| 348 | # if method is GET then we read datapoint | ||
| 349 | else: | ||
| 350 | response = [] | ||
| 351 | datapoints = app.config["db"]["point"].all() | ||
| 352 | |||
| 353 | for point in datapoints: | ||
| 354 | response.append({ | ||
| 355 | "date": datetime.datetime.fromtimestamp(int(point["ts"])).strftime("%Y-%m-%d %H:%M:%S"), | ||
| 356 | "value": point["value"] | ||
| 357 | }) | ||
| 358 | |||
| 359 | bottle.response.content_type = "application/json" | ||
| 360 | return json.dumps(response) | ||
| 361 | |||
| 362 | # starting server on http://0.0.0.0:5000 | ||
| 363 | if __name__ == "__main__": | ||
| 364 | bottle.run( | ||
| 365 | app = app, | ||
| 366 | host = "0.0.0.0", | ||
| 367 | port = 5000, | ||
| 368 | debug = True, | ||
| 369 | reloader = True, | ||
| 370 | catchall = True, | ||
| 371 | ) | ||
| 372 | ``` | ||
| 373 | |||
| 374 | And now, finally, we can implement ```frontend.html```. Create a file with this name and copy the code below. When you are done you can start the web application; the steps are listed below the code. | ||
| 375 | |||
| 376 | ```html | ||
| 377 | <!DOCTYPE html> | ||
| 378 | <html> | ||
| 379 | |||
| 380 | <head> | ||
| 381 | <meta charset="utf-8"> | ||
| 382 | <title>Simple IOT application</title> | ||
| 383 | </head> | ||
| 384 | |||
| 385 | <body> | ||
| 386 | |||
| 387 | <h1>Simple IOT application</h1> | ||
| 388 | |||
| 389 | <div class="chart-placeholder"> | ||
| 390 | <div id="chart"></div> | ||
| 391 | </div> | ||
| 392 | |||
| 393 | <!-- application main script --> | ||
| 394 | <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script> | ||
| 395 | <script src="https://cdnjs.cloudflare.com/ajax/libs/d3/4.10.0/d3.min.js"></script> | ||
| 396 | <script src="https://cdnjs.cloudflare.com/ajax/libs/metrics-graphics/2.11.0/metricsgraphics.min.js"></script> | ||
| 397 | <script> | ||
| 398 | function fetch_and_render() { | ||
| 399 | d3.json("/api", function(data) { | ||
| 400 | data = MG.convert.date(data, "date", "%Y-%m-%d %H:%M:%S"); | ||
| 401 | MG.data_graphic({ | ||
| 402 | data: data, | ||
| 403 | chart_type: "line", | ||
| 404 | full_width: true, | ||
| 405 | height: 270, | ||
| 406 | target: document.getElementById("chart"), | ||
| 407 | x_accessor: "date", | ||
| 408 | y_accessor: "value" | ||
| 409 | }); | ||
| 410 | }); | ||
| 411 | } | ||
| 412 | window.onload = function() { | ||
| 413 | // initial call for rendering | ||
| 414 | fetch_and_render(); | ||
| 415 | |||
| 416 | // updates chart every 5 seconds | ||
| 417 | setInterval(function() { | ||
| 418 | fetch_and_render(); | ||
| 419 | }, 5000); | ||
| 420 | } | ||
| 421 | </script> | ||
| 422 | |||
| 423 | <!-- application styles --> | ||
| 424 | <style> | ||
| 425 | body { | ||
| 426 | font: 13px sans-serif; | ||
| 427 | padding: 20px 50px; | ||
| 428 | } | ||
| 429 | .chart-placeholder { | ||
| 430 | border: 2px solid #ccc; | ||
| 431 | width: 100%; | ||
| 432 | user-select: none; | ||
| 433 | } | ||
| 434 | /* chart styles */ | ||
| 435 | .mg-line1-color { | ||
| 436 | stroke: red; | ||
| 437 | stroke-width: 2; | ||
| 438 | } | ||
| 439 | .mg-main-area, .mg-main-line { | ||
| 440 | fill: #fff; | ||
| 441 | } | ||
| 442 | .mg-x-axis line, .mg-y-axis line { | ||
| 443 | stroke: #b3b2b2; | ||
| 444 | stroke-width: 1px; | ||
| 445 | } | ||
| 446 | </style> | ||
| 447 | |||
| 448 | </body> | ||
| 449 | |||
| 450 | </html> | ||
| 451 | ``` | ||
| 452 | |||
| 453 | Now the folder structure should look like: | ||
| 454 | |||
| 455 | _simple-iot-app/_ | ||
| 456 | |||
| 457 | * _webapp.py_ | ||
| 458 | * _data.db_ | ||
| 459 | * _frontend.html_ | ||
| 460 | |||
| 461 | OK, let's now start the application and feed it data. | ||
| 462 | |||
| 463 | 1. ```python webapp.py``` | ||
| 464 | 2. connect Arduino MKR1000 to power source | ||
| 465 | 3. open browser and go to ```http://0.0.0.0:5000``` | ||
| 466 | |||
| 467 | If everything goes well you should see new data-points rendered on the chart every 5 seconds. | ||
| 468 | |||
| 469 | If you navigate to ```http://0.0.0.0:5000``` you should see the rendered chart as shown in the picture below. | ||
| 470 | |||
| 471 |  | ||
| 472 | |||
| 473 | The complete application with all the code is available for [download](/files/iot-application/simple-iot-application.zip). | ||
| 474 | |||
| 475 | ## Conclusion | ||
| 476 | |||
| 477 | I hope this clarifies some aspects of IoT application development. Of course this is a minimal example, and it is far from what can be done in real life with a deeper dive into other technologies. | ||
| 478 | |||
| 479 | If you would like to continue exploring the IoT world, here are some interesting resources to examine: | ||
| 480 | |||
| 481 | * [Reading Sensors with an Arduino](https://www.allaboutcircuits.com/projects/reading-sensors-with-an-arduino/) | ||
| 482 | * [MQTT 101 – How to Get Started with the lightweight IoT Protocol](http://www.hivemq.com/blog/how-to-get-started-with-mqtt) | ||
| 483 | * [Stream Updates with Server-Sent Events](https://www.html5rocks.com/en/tutorials/eventsource/basics/) | ||
| 484 | * [Internet of Things (IoT) Tutorials](http://www.tutorialspoint.com/internet_of_things/) | ||
| 485 | |||
| 486 | Any comments or additional ideas are welcome in the comments below. | ||
diff --git a/src/blog/simplifying-and-reducing-clutter.md b/src/blog/simplifying-and-reducing-clutter.md new file mode 100644 index 0000000..b435834 --- /dev/null +++ b/src/blog/simplifying-and-reducing-clutter.md | |||
| @@ -0,0 +1,21 @@ | |||
| 1 | title: Simplifying and reducing clutter in my life and work | ||
| 2 | date: 2019-10-14 | ||
| 3 | tags: blog | ||
| 4 | hide: false | ||
| 5 | ---- | ||
| 6 | |||
| 7 | I recently moved my main working machine back from Hackintosh to Linux. The experiment was interesting and I did some great work on macOS, but it was time to move back. | ||
| 8 | |||
| 9 | I actually really missed Linux. The simplicity of `apt-get` and the sheer amount of software that exists for Linux make it a no-brainer. I spent most of my time on macOS finding workarounds to make things work. Using [Brew](https://brew.sh/) was just a horrible experience, far from the package managers of Linux. At least they managed to get that `sudo` debacle sorted. | ||
| 10 | |||
| 11 | Not all was bad. macOS in general was a perfectly good environment. Things like Docker and similar tooling worked without any hiccups. My usual tools, like my coding IDE, worked flawlessly, and the whole look and feel is just superb. I have been using a MacBook Air for a couple of years, so I was used to the system, just never as a daily driver. | ||
| 12 | |||
| 13 | One of the things I did after installing Linux back on my machine was cleaning up my Dropbox folder. I have everything on Dropbox, even my projects folder. I write code for a living, so my whole life revolves around a couple of megs of code (with assets); it's not like I have huge files on my machine. I don't have movies, music, or pictures on my PC. All of that is in the cloud: I use Google Music and have a Netflix account, which is more than enough for me. | ||
| 14 | |||
| 15 | I also went and deleted some of the repositories on my GitHub account. I have deleted more code than I have deployed. People find this strange, but for me deleting something feels cathartic, and it forces me to write better code the next time I face a similar problem. That was a huge relief, if I am being totally honest. | ||
| 16 | |||
| 17 | The next step was to do something with my webpage. I had been using some scripts I wrote a while ago to generate static pages from markdown source posts. I kept adding stuff on top of them until they became a source of frustration, and this is just a simple blog, yet I was using gulp and npm. After a couple of hours of searching and testing static generators I found an interesting one, [https://github.com/piranha/gostatic](https://github.com/piranha/gostatic), and decided to use it. It was the only one with a simple templating engine, not that I really need one. The others had convoluted ways of trying to solve everything and in the end just demanded a bigger learning curve than I was ready to take on. So I deleted a couple of old posts, simplified the HTML, trashed most of the CSS, and went with the [https://motherfuckingwebsite.com/](https://motherfuckingwebsite.com/) aesthetic. Yes, the previous site was more visually stimulating, but all I really care about at this point is the content. And the Times New Roman font is kind of awesome. | ||
| 18 | |||
| 19 | I stopped working on most of my projects in the past couple of months because the overhead was just insane. There comes a point when you stretch yourself too thin: you stop progressing, and with that comes dissatisfaction. | ||
| 20 | |||
| 21 | So that's about it. Moving forward minimal style. | ||
diff --git a/src/blog/what-i-ve-learned-developing-ad-server.md b/src/blog/what-i-ve-learned-developing-ad-server.md new file mode 100644 index 0000000..527f9d0 --- /dev/null +++ b/src/blog/what-i-ve-learned-developing-ad-server.md | |||
| @@ -0,0 +1,133 @@ | |||
| 1 | title: What I've learned developing ad server | ||
| 2 | date: 2017-04-17 | ||
| 3 | tags: blog | ||
| 4 | hide: false | ||
| 5 | ---- | ||
| 6 | |||
| 7 | For the past year and a half I have been developing a native advertising server that contextually matches ads and displays them in different template forms on a variety of websites. The project grew from serving thousands of ads per day to millions. | ||
| 8 | |||
| 9 | The system is made of a couple of core components: | ||
| 10 | |||
| 11 | - API for serving ads, | ||
| 12 | - Utils - cronjobs and queue management tools, | ||
| 13 | - Dashboard UI. | ||
| 14 | |||
| 15 | The initial release used [MongoDB](https://www.mongodb.com/) for full-text search, but it was later replaced by [Elasticsearch](https://www.elastic.co/) for better CPU utilization and better search performance. This gave us many of Elasticsearch's amazing capabilities. You should check it out if you do any search-related work. | ||
| 16 | |||
| 17 | Because the premise of the server is to provide a native ad experience, ads are rendered on the client side via a simple templating engine. This ensures that ads can be displayed in a number of different ways based on the visual style of the page, which makes the JavaScript client library quite complex. | ||
| 18 | |||
| 19 | So now that you know the basics of the product, let's get into the lessons we learned. | ||
| 20 | |||
| 21 | ## Aggregate everything | ||
| 22 | |||
| 23 | After the beta version was released, everything (impressions, clicks, etc.) was written to the database at nanosecond resolution. At that time we were using [PostgreSQL](https://www.postgresql.org/) and the database quickly grew well above 200 GB of disk space. That was problematic: statistics took a disturbingly long time to aggregate, and indexes on the stats table stopped helping after we reached 500 million data-points. | ||
| 24 | |||
| 25 | > There is marketing product information and there is real-life experience, and they tend to be quite the opposite. | ||
| 26 | |||
| 27 | This is why everything is now aggregated on a daily basis and fed to Elasticsearch as a daily summary. With this we can track many more dimensions, such as zone, channel, and platform, and adapt the occurrence of ads in specific places more precisely. | ||
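As a rough sketch of such a daily roll-up (field names here are illustrative, not our actual schema; the real pipeline feeds these summaries into Elasticsearch):

```python
from collections import Counter

def daily_summary(events):
    # rolls raw impression events up into per-day, per-dimension counts
    totals = Counter()
    for event in events:
        day = event["ts"][:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
        totals[(day, event["zone"], event["channel"], event["platform"])] += 1
    return totals

summary = daily_summary([
    {"ts": "2017-04-17 10:01:22", "zone": "top", "channel": "web", "platform": "mobile"},
    {"ts": "2017-04-17 18:40:03", "zone": "top", "channel": "web", "platform": "mobile"},
    {"ts": "2017-04-18 09:12:45", "zone": "side", "channel": "web", "platform": "desktop"},
])
```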
| 28 | |||
| 29 | We have also adopted [Redis](https://redis.io/) as a first-class citizen in our stack. Because Redis also persists its data to local disk, we have some form of backup if the server suffers a failure. | ||
| 30 | |||
| 31 | All the real-time statistics for ad serving and redirecting are kept as counters in a Redis instance, then extracted daily and pushed to Elasticsearch. | ||
| 32 | |||
| 33 | ## Measure everything | ||
| 34 | |||
| 35 | The thing about software is that we really don't know how well it performs under load until that load actually arrives. When testing locally everything is fine, but in production things tend to fall apart. | ||
| 36 | |||
| 37 | As a solution, we measure everything we can: function execution times (by wrapping functions with timers), server performance (CPU, memory, disk, etc.), and Nginx and [uWSGI](https://uwsgi-docs.readthedocs.io/) performance. We sacrifice a bit of performance for this information, and we store all of it for later analysis. | ||
| 38 | |||
| 39 | **Example of function execution time** | ||
| 40 | |||
| 41 | ```json | ||
| 42 | { | ||
| 43 | "get_final_filtered_ads": { | ||
| 44 | "counter": 1931250, | ||
| 45 | "avg": 0.0066143431, | ||
| 46 | "elapsed": 12773.9500310003 | ||
| 47 | }, | ||
| 48 | "store_keywords_statistics": { | ||
| 49 | "counter": 1931011, | ||
| 50 | "avg": 0.0004605267, | ||
| 51 | "elapsed": 889.2821669996 | ||
| 52 | }, | ||
| 53 | "match_by_context": { | ||
| 54 | "counter": 1931011, | ||
| 55 | "avg": 0.0055960716, | ||
| 56 | "elapsed": 10806.0758889999 | ||
| 57 | }, | ||
| 58 | "match_by_high_performance": { | ||
| 59 | "counter": 262, | ||
| 60 | "avg": 0.0152770229, | ||
| 61 | "elapsed": 4.00258 | ||
| 62 | }, | ||
| 63 | "store_impression_stats": { | ||
| 64 | "counter": 1931250, | ||
| 65 | "avg": 0.0006189991, | ||
| 66 | "elapsed": 1195.4419869999 | ||
| 67 | } | ||
| 68 | } | ||
| 69 | ``` | ||
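Counters like these can be collected with a small decorator. A minimal sketch of the idea (not our production code) might look like this:

```python
import time
from collections import defaultdict
from functools import wraps

# per-function call count, total elapsed time, and running average
metrics = defaultdict(lambda: {"counter": 0, "elapsed": 0.0, "avg": 0.0})

def timed(fn):
    # wraps a function and accumulates its execution statistics
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return fn(*args, **kwargs)
        finally:
            m = metrics[fn.__name__]
            m["counter"] += 1
            m["elapsed"] += time.time() - start
            m["avg"] = m["elapsed"] / m["counter"]
    return wrapper

@timed
def match_by_context(x):
    return x * 2  # stands in for the real matching logic

results = [match_by_context(i) for i in range(3)]
```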
| 70 | |||
| 71 | We have also started profiling with [cProfile](https://pymotw.com/2/profile/) and visualizing the results with [KCachegrind](http://kcachegrind.sourceforge.net/). This gives a much more detailed look into code execution. | ||
| 72 | |||
| 73 | ## Cache control is your friend | ||
| 74 | |||
| 75 | Because we use a JavaScript library for rendering ads, we rely on this script extensively, and when needed we must be able to change its behavior quickly. | ||
| 76 | |||
| 77 | In our case we cannot simply replace the JavaScript URL in the HTML code. It usually takes a day or two for the people who maintain the sites to change the code or add a ```?ver=xxx``` parameter, which makes rapid deployment and testing very difficult and time-consuming. There is a limit to how much you can test locally. | ||
| 78 | |||
| 79 | We are now in the process of integrating [Google Tag Manager](https://www.google.com/analytics/tag-manager/), but a couple of the websites are built on the ASP.NET platform and have some problems with Tag Manager. With the solution below we can be certain we are serving the latest version of the script. | ||
| 80 | |||
| 81 | It only takes one mistake for users to end up with a stale script cached, and if it's cached for a year, you can see where the problem is. | ||
| 82 | |||
| 83 | ```nginx | ||
| 84 | # nginx ➜ /etc/nginx/sites-available/default | ||
| 85 | location /static/ { | ||
| 86 | alias /path-to-static-content/; | ||
| 87 | autoindex off; | ||
| 88 | charset utf-8; | ||
| 89 | gzip on; | ||
| 90 | gzip_types text/plain application/javascript application/x-javascript text/javascript text/xml text/css; | ||
| 91 | location ~* \.(ico|gif|jpeg|jpg|png|woff|ttf|otf|svg|woff2|eot)$ { | ||
| 92 | expires 1y; | ||
| 93 | add_header Pragma public; | ||
| 94 | add_header Cache-Control "public"; | ||
| 95 | } | ||
| 96 | location ~* \.(css|js|txt)$ { | ||
| 97 | expires 3600s; | ||
| 98 | add_header Pragma public; | ||
| 99 | add_header Cache-Control "public, must-revalidate"; | ||
| 100 | } | ||
| 101 | } | ||
| 102 | ``` | ||
| 103 | |||
| 104 | Also be careful when redirecting to a URL in your Python code. We noticed that if we didn't set the cache-control and expires headers precisely in the response, the request never hit the server and we couldn't measure clicks. Redirect as follows and there will be no problems. | ||
| 105 | |||
| 106 | ```python | ||
| 107 | # python ➜ bottlepy web micro-framework | ||
| 108 | response = bottle.HTTPResponse(status=302) | ||
| 109 | response.set_header("Cache-Control", "no-store, no-cache, must-revalidate") | ||
| 110 | response.set_header("Expires", "Thu, 01 Jan 1970 00:00:00 GMT") | ||
| 111 | response.set_header("Location", url) | ||
| 112 | return response | ||
| 113 | ``` | ||
| 114 | |||
| 115 | > Cache control in browsers is quite aggressive and you need to be precise to avoid future problems. We learned that lesson the hard way. | ||
| 116 | |||
| 117 | ## Learn NGINX | ||
| 118 | |||
| 119 | When deciding on a web server we went with Nginx as a reverse proxy for our applications. We adopted a microservice-oriented architecture early in the project so that when we scale we can easily add servers to our cluster, and Nginx was crucial for load balancing and static content delivery. | ||
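For illustration, a minimal load-balancing setup might look like this (the upstream name and server addresses are hypothetical, not our actual config):

```nginx
# nginx ➜ hypothetical upstream balancing two API instances
upstream adserver_api {
    least_conn;
    server 10.0.0.11:8000;
    server 10.0.0.12:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://adserver_api;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```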
| 120 | |||
| 121 | At first our config file was quite simple, but it grew larger over time. After much patching and adding of new settings, I sat down and learned more about the guts of Nginx. This proved very useful and let us squeeze much more out of our setup. So I advise you to take the time to read through the [documentation](https://nginx.org/en/docs/); it saved us a lot of headaches. Googling for solutions only goes so far. | ||
| 122 | |||
| 123 | ## Use Redis/Memcached | ||
| 124 | |||
| 125 | As explained above, we use caching for basically everything; it is the cornerstone of our services. At first we were very careful about how much we stored in [Redis](https://redis.io/), but we later found that the memory footprint stays very low even when storing large amounts of data. | ||
| 126 | |||
| 127 | So we gradually increased our usage to caching entire HTML outputs of the dashboard, which improved our performance by an order of magnitude. Redis's native TTL support goes hand in hand with our needs. | ||
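The pattern is plain cache-aside with an expiry. A minimal in-process sketch of the idea (Redis's ```SETEX```/```GET``` gives the same semantics across processes):

```python
import time

class TTLCache:
    # minimal in-memory stand-in for Redis SETEX/GET semantics
    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl):
        self._store[key] = (value, time.time() + ttl)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.time() > expires_at:
            del self._store[key]  # expired, drop it
            return None
        return value

builds = {"count": 0}

def render_dashboard(cache):
    # cache-aside: serve the cached HTML if present, otherwise build and cache it
    html = cache.get("dashboard")
    if html is None:
        builds["count"] += 1  # stands in for the expensive render
        html = "<html>dashboard</html>"
        cache.set("dashboard", html, ttl=60)
    return html

cache = TTLCache()
first = render_dashboard(cache)
second = render_dashboard(cache)  # served from cache, no rebuild
```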
| 128 | |||
| 129 | The reason we chose [Redis](https://redis.io/) over [Memcached](https://memcached.org/) was Redis's out-of-the-box scalability, but all of this can be achieved with Memcached as well. | ||
| 130 | |||
| 131 | ## Conclusion | ||
| 132 | |||
| 133 | There are many more details that could have been written, and every single topic here deserves its own post, but you probably get the idea of the problems we faced. | ||
