author Mitja Felicijan <mitja.felicijan@gmail.com> 2023-05-26 00:40:40 +0200
committer Mitja Felicijan <mitja.felicijan@gmail.com> 2023-05-26 00:40:40 +0200
commit 43b0708769eb61392050045b881f8e6ba39c5b66
tree 3939579a13b8325325d5ebb8e05324a41ed78a6d /content/posts/2017-04-17-what-i-ve-learned-developing-ad-server.md
parent 49e7e7d555a6cd9810d81561fa3e98e3d64502be
Massive update to posts, archetypes
Added archetypes for creating notes and posts so that fields are auto-populated. Fixed existing posts so they now follow the 80-column rule.
Diffstat (limited to 'content/posts/2017-04-17-what-i-ve-learned-developing-ad-server.md')
-rw-r--r-- content/posts/2017-04-17-what-i-ve-learned-developing-ad-server.md | 114
1 files changed, 89 insertions, 25 deletions
diff --git a/content/posts/2017-04-17-what-i-ve-learned-developing-ad-server.md b/content/posts/2017-04-17-what-i-ve-learned-developing-ad-server.md
index 6fca4ce..1b9be06 100644
--- a/content/posts/2017-04-17-what-i-ve-learned-developing-ad-server.md
+++ b/content/posts/2017-04-17-what-i-ve-learned-developing-ad-server.md
@@ -1,11 +1,14 @@
 ---
 title: What I've learned developing ad server
 url: what-i-ve-learned-developing-ad-server.html
-date: 2017-04-17
+date: 2017-04-17T12:00:00+02:00
 draft: false
 ---
 
-For the past year and half I have been developing native advertising server that contextually matches ads and displays them in different template forms on variety of websites. This project grew from serving thousands of ads per day to millions.
+For the past year and half I have been developing native advertising server
+that contextually matches ads and displays them in different template forms
+on variety of websites. This project grew from serving thousands of ads per
+day to millions.
 
 The system is made from couple of core components:
 
@@ -13,29 +16,56 @@ The system is made from couple of core components:
 - Utils - cronjobs and queue management tools,
 - Dashboard UI.
 
-Initial release was using [MongoDB](https://www.mongodb.com/) for full-text search but was later replaced by [Elasticsearch](https://www.elastic.co/) for better CPU utilization and better search performance. This provided us with many amazing functionalities of [Elasticsearch](https://www.elastic.co/). You should check it out if you do any search related operations.
+Initial release was using [MongoDB](https://www.mongodb.com/) for full-text
+search but was later replaced by [Elasticsearch](https://www.elastic.co/)
+for better CPU utilization and better search performance. This provided us
+with many amazing functionalities of [Elasticsearch](https://www.elastic.co/).
+You should check it out if you do any search related operations.
 
-Because the premise of the server is to provide native ad experience, they are rendered on the client side via simple templating engine. This ensures that ads can be displayed number of different ways based on the visual style of the page. And this makes JavaScript client library quite complex.
+Because the premise of the server is to provide native ad experience, they
+are rendered on the client side via simple templating engine. This ensures
+that ads can be displayed number of different ways based on the visual style
+of the page. And this makes JavaScript client library quite complex.
 
-So now that you know basic information about the product lets get into the lessons we learned.
+So now that you know basic information about the product lets get into the
+lessons we learned.
 
 ## Aggregate everything
 
-After beta version was released everything (impressions, clicks, etc) was written in nanosecond resolution in the database. At that time we were using [PostgreSQL](https://www.postgresql.org/) and database quickly grew way above 200GB in disk space. And that was problematic. Statistics took disturbingly long time to aggregate. Also using indexes on stats table in database was no help after we reached 500 million datapoints.
+After beta version was released everything (impressions, clicks, etc) was
+written in nanosecond resolution in the database. At that time we were using
+[PostgreSQL](https://www.postgresql.org/) and database quickly grew way above
+200GB in disk space. And that was problematic. Statistics took disturbingly
+long time to aggregate. Also using indexes on stats table in database was no
+help after we reached 500 million datapoints.
 
-> There is a marketing product information and there is real life experience. And the tend to be quite the opposite.
+> There is a marketing product information and there is real life experience.
+And the tend to be quite the opposite.
 
-This was the reason that now everything is aggregated on daily basis and this data is then fed to Elastic in form of daily summary. With this we achieved we can now track many more dimensions such as zone, channel and platform information. And with this information we can now adapt occurrences of ads on specific places more precisely.
+This was the reason that now everything is aggregated on daily basis and this
+data is then fed to Elastic in form of daily summary. With this we achieved we
+can now track many more dimensions such as zone, channel and platform information.
+And with this information we can now adapt occurrences of ads on specific
+places more precisely.
 
-We have also adapted [Redis](https://redis.io/) as a full-time citizen in our stack. Because Redis also stores information on a local disk we have some sort of backup if server would accidentally suffer some failure.
+We have also adapted [Redis](https://redis.io/) as a full-time citizen in our
+stack. Because Redis also stores information on a local disk we have some sort
+of backup if server would accidentally suffer some failure.
 
-All the real-time statistics for ad serving and redirecting is presented as counters in Redis instance and daily extracted and pushed to Elastic.
+All the real-time statistics for ad serving and redirecting is presented as
+counters in Redis instance and daily extracted and pushed to Elastic.
 
 ## Measure everything
 
-The thing about software is that we really don't know how well it is performing under load until such load is presented. When testing locally everything is fine but when on production things tend to fall apart.
+The thing about software is that we really don't know how well it is performing
+under load until such load is presented. When testing locally everything is
+fine but when on production things tend to fall apart.
 
-As a solution for this we are measuring everything we can. Function execution time (by encapsulating functions with timers), server performance (cpu, memory, disk, etc), Nginx and [uWSGI](https://uwsgi-docs.readthedocs.io/) performance. We sacrifice a bit of performance for the sake of this information. And we store all this information for later analysis.
+As a solution for this we are measuring everything we can. Function execution
+time (by encapsulating functions with timers), server performance (cpu, memory,
+disk, etc), Nginx and [uWSGI](https://uwsgi-docs.readthedocs.io/) performance.
+We sacrifice a bit of performance for the sake of this information. And we
+store all this information for later analysis.
 
 **Example of function execution time**
 
@@ -69,17 +99,28 @@ As a solution for this we are measuring everything we can. Function execution ti
 }
 ```
 
-We have also started profiling with [cProfile](https://pymotw.com/2/profile/) and then visualizing with [KCachegrind](http://kcachegrind.sourceforge.net/). This provides much more detailed look into code execution.
+We have also started profiling with [cProfile](https://pymotw.com/2/profile/)
+and then visualizing with [KCachegrind](http://kcachegrind.sourceforge.net/).
+This provides much more detailed look into code execution.
 
 ## Cache control is your friend
 
-Because we use Javascript library for rendering ads we rely on this script extensively and when in need we need to be able to change behavior of the script quickly.
+Because we use Javascript library for rendering ads we rely on this script
+extensively and when in need we need to be able to change behavior of the
+script quickly.
 
-In our case we can not simply replace javascript url in html code. It usually takes a day or two for the guys who maintain sites to change code or add ?ver=xxx attribute. And this makes rapid deployment and testing very difficult and time consuming. There is a limitation of how much you can test locally.
+In our case we can not simply replace javascript url in html code. It usually
+takes a day or two for the guys who maintain sites to change code or add
+?ver=xxx attribute. And this makes rapid deployment and testing very difficult
+and time consuming. There is a limitation of how much you can test locally.
 
-We are now in the process of integrating [Google Tag Manager](https://www.google.com/analytics/tag-manager/) but couple of websites are developed on ASP.net platform that have some problems with tag manager. With a solution below we are certain that we are serving latest version of the script.
+We are now in the process of integrating [Google Tag Manager](https://www.google.com/analytics/tag-manager/)
+but couple of websites are developed on ASP.net platform that have some
+problems with tag manager. With a solution below we are certain that we are
+serving latest version of the script.
 
-And it only takes one mistake and users have the script cached and in case of caching it for 1 year you probably know where the problem is.
+And it only takes one mistake and users have the script cached and in case of
+caching it for 1 year you probably know where the problem is.
 
 ```nginx
 # nginx ➜ /etc/nginx/sites-available/default
@@ -102,7 +143,10 @@ location /static/ {
 }
 ```
 
-Also be careful when redirecting to url in your python code. We noticed that if we didn't precisely setup cache control and expire headers in response we didn't get the request on the server and therefore couldn't measure clicks. So when redirecting do as follows and there will be no problems.
+Also be careful when redirecting to url in your python code. We noticed that
+if we didn't precisely setup cache control and expire headers in response we
+didn't get the request on the server and therefore couldn't measure clicks.
+So when redirecting do as follows and there will be no problems.
 
 ```python
 # python ➜ bottlepy web micro-framework
@@ -113,22 +157,42 @@ response.set_header("Location", url)
 return response
 ```
 
-> Cache control in browsers is quite aggressive and you need to be precise to avoid future problems. We learned that lesson the hard way.
+> Cache control in browsers is quite aggressive and you need to be precise
+to avoid future problems. We learned that lesson the hard way.
 
 ## Learn NGINX
 
-When deciding on a web server we went with Nginx as a reverse proxy for our applications. We adapted micro-service oriented architecture early in the project to ensure when we scale we can easily add additional servers to our cluster. And Nginx was crucial to perform load balancing and static content delivery.
+When deciding on a web server we went with Nginx as a reverse proxy for our
+applications. We adapted micro-service oriented architecture early in the
+project to ensure when we scale we can easily add additional servers to our
+cluster. And Nginx was crucial to perform load balancing and static content
+delivery.
 
-At first our config file was quite simple and later grew larger. After patching and adding new settings I sat down and learned more about the guts of Nginx. This proved to be very useful and we were able to squeeze much more out of our setup. So I advise you to take your time and read through the [documentation](https://nginx.org/en/docs/). This saved us a lot of headache. Googling for solutions only goes so far.
+At first our config file was quite simple and later grew larger. After patching
+and adding new settings I sat down and learned more about the guts of Nginx.
+This proved to be very useful and we were able to squeeze much more out of our
+setup. So I advise you to take your time and read through the
+[documentation](https://nginx.org/en/docs/). This saved us a lot of headache.
+Googling for solutions only goes so far.
 
 ## Use Redis/Memcached
 
-As explained above we are using caching basically for everything. It is the corner stone of our services. At first we were very careful about the quantity of things we stored in [Redis](https://redis.io/). But we later found out that the memory footprint is very low even when storing large amount of data in it.
+As explained above we are using caching basically for everything. It is the
+corner stone of our services. At first we were very careful about the quantity
+of things we stored in [Redis](https://redis.io/). But we later found out that
+the memory footprint is very low even when storing large amount of data in it.
 
-So we gradually increased our usage to caching whole HTML outputs of dashboard. This improved our performance in order of magnitude. And by using native TTL support this goes hand in hand with our needs.
+So we gradually increased our usage to caching whole HTML outputs of dashboard.
+This improved our performance in order of magnitude. And by using native TTL
+support this goes hand in hand with our needs.
 
-The reason why we choose [Redis](https://redis.io/) over [Memcached](https://memcached.org/) was the nature of scalability of Redis out of the box. But all this can be achieved with Memcached.
+The reason why we choose [Redis](https://redis.io/) over [Memcached](https://memcached.org/)
+was the nature of scalability of Redis out of the box. But all this can be
+achieved with Memcached.
 
 ## Conclusion
 
-There are a lot more details that could have been written and every single topic in here deserves it's own post but you probably got the idea about the problems we faced.
+There are a lot more details that could have been written and every single
+topic in here deserves it's own post but you probably got the idea about
+the problems we faced.
+