From d2da11feb54950d1181d2508238e1090ee66c68e Mon Sep 17 00:00:00 2001 From: Mitja Felicijan Date: Mon, 6 Aug 2018 07:02:42 +0200 Subject: update --- .jekyll-metadata | Bin 9105 -> 11393 bytes _posts/2015-11-10-software-development-pitfalls.md | 69 +++++++++++ ...lly-destroyed-the-joy-of-product-development.md | 37 ++++++ ...nd-why-i-choose-classic-vms-and-digitalocean.md | 41 +++++++ _posts/2017-03-07-golang-profiling-simplified.md | 110 +++++++++++++++++ ...-10-what-its-like-to-be-a-software-developer.md | 31 +++++ ...04-17-what-i-ve-learned-developing-ad-server.md | 133 +++++++++++++++++++++ ...ng-python-web-applications-with-visual-tools.md | 4 +- _posts/2017-08-11-simple-iot-application.md | 4 +- ...digitalocean-spaces-object-storage-with-fuse.md | 4 +- assets/site.css | 13 +- 11 files changed, 434 insertions(+), 12 deletions(-) create mode 100644 _posts/2015-11-10-software-development-pitfalls.md create mode 100644 _posts/2016-10-14-how-we-successfully-destroyed-the-joy-of-product-development.md create mode 100644 _posts/2017-01-12-gce-aws-docker-and-why-i-choose-classic-vms-and-digitalocean.md create mode 100644 _posts/2017-03-07-golang-profiling-simplified.md create mode 100644 _posts/2017-04-10-what-its-like-to-be-a-software-developer.md create mode 100644 _posts/2017-04-17-what-i-ve-learned-developing-ad-server.md diff --git a/.jekyll-metadata b/.jekyll-metadata index e841f08..cdffa01 100644 Binary files a/.jekyll-metadata and b/.jekyll-metadata differ diff --git a/_posts/2015-11-10-software-development-pitfalls.md b/_posts/2015-11-10-software-development-pitfalls.md new file mode 100644 index 0000000..862ae08 --- /dev/null +++ b/_posts/2015-11-10-software-development-pitfalls.md @@ -0,0 +1,69 @@ +--- +layout: post +title: Software development and my favorite pitfalls +description: Couple of observations regarding project management. 
+--- + +Over the years I have had the privilege to work on some very exciting projects, both in software development and in electronics, and every experience taught me some invaluable lessons about how NOT TO approach development. In this post I will try to point out some of the absurd, outdated techniques I find the most annoying and damaging during a development cycle. There will be swearing because this topic really gets on my nerves and I have never tried to explain it coherently in writing. So if I get heated up please bear with me :) + +As new methods of project management emerge, the underlying processes still stay old and outdated. This is mainly because we as people are unable to completely shift away from these approaches. + +I have always struggled with communication, and many times that cost me a relationship or two because I was not on the ball all the time. With every experience I became more convinced that I am the problem, and I never once considered that the problem may be that communication has not evolved a single step from email. And if you think about it for a second, not many things have changed around this topic. We just have different representations of email (message boards, chats, project management tools). And I believe this is the real problem we are facing now. + +There are many articles written about hyper connectivity and its direct effects. But the mainstream does nothing about it. We are just putting out fires and doing nothing to prevent them. I am certain this will be a major source of grief in the coming years. What we can all do to avoid this is change our mindset and experiment with our communication skills and development approaches. We need to maximize the output a person can give. And to achieve this we need to listen to them and encourage them. I know that not everybody is a natural born leader, but everybody has an opinion. 
+ +There is a lot of talk right now about methodologies such as Scrum, Kanban and Cleanroom, and they all fucking piss me off :). These are all boxes that imprison people and take away their freedom of thought. This is a straightforward mindfuck / amputation of creativity. + +Let me list a couple of things that I find really destructive and bad for a project and, in the long run, for the company. + +## Ping emails + +Ping emails are emails you have to write as soon as you receive an email. Their sole purpose is to inform the sender that you received their email and are working on it. Their only result is to reassure the sender that their task is being dealt with. Their intent basically is: I did my job by sending you this email, so I am in the clear. I categorize this as a fuck you email. This is one of the most irritating types of emails I need to write. This is the ultimate control freak show you can experience and it gives the sender a false feeling of control. Newsflash: we do not live in 1982, when there was a possibility that an email never reached its destination. I really fucking hate this from the bottom of my heart :) + +They should be like: “Yes, I am fucking alive and I am at your service my leash!”. I guess if I replied like this, I wouldn’t have to write any more of these messages :) + +## Everybody is a project manager + +Well, this is a tough one. I noticed that as soon as you let people give their suggestions you are basically fucked. There is truth in the saying: “Set low expectations and deliver a little bit more than you promised.”. + +People tend to take on the role of a manager as soon as they are presented with an opportunity. And by getting angry at them you only provoke yourself. They are not at fault. You just need to tell them at the beginning that they are only giving suggestions and not tasks, and everything will be alright. But if you give them the feeling that they are in control you will have immense problems explaining why their features are not in the current release. 
+ +The project mission must always be at the top of the project requirements, and any deviation from it will result in major project butchering. By this I mean that the project will take its own path and you will be left with half-done software that helps nobody. A clear mission goal and clean execution will allow you to develop software with clear intent. + +## We are never wrong + +I find this type of arrogance the worst. We must always conduct ourselves as if we are infallible and cannot make mistakes. As soon as a procedure or process is established there is no room for changes or improvements. This is the most idiotic thing someone can say or think. I think that processes need to evolve and change over time. This is imperative and a must-have in your organization if you want to improve and develop the company. We all need to grow balls and change everything in order to adapt to current situations. Being a prisoner of predefined processes kills creativity. + +I am constantly trying new software for project management and communication. I believe every team has its own dynamic and it needs to be discovered organically and naturally through many experiments. By putting a team in a box you are amputating their creativity and therefore minimizing their potential. But if you talk to an executive you will mainly find archetypical thinking and a strong need to compartmentalize everything from business processes to resource management. And this type of management, which often shows up as micromanagement, only works for short periods (a couple of years), and then employees either leave the company or become basically retarded drones on auto pilot. + +## Micromanaging + +This basically implies that everybody on the team is a fucking idiot who needs a todo list that they cannot write themselves. How about spoon feeding the team at lunch, because besides the team leader everybody must be a retarded idiot at best. 
+ +I prefer milestones as they give developers much more freedom and creativity in developing and do not waste their time checking some bizarre todo list that was not even thought through. A project always changes through the development cycle, and all you are left with at the end is a list of unchecked tasks and the wrath of management asking why they are not completed. Best WTF moment! + +## Human contact - no need for it! + +We are vigorously trying to eliminate physical contact by replacing short meetings with software, with no regard for the fact that we are not machines. Many times a simple 5 min meeting in the morning can solve most of the problems. In rapid development, short bursts of man-to-man communication are possibly the best way to go. + +We now have all this software available and all we get out of it is a huge clusterfuck. An obstacle and not a solution. So why do we still use it? Because we strive to better ourselves. + +## MVP is killing innovation + +Many will disagree with me on this one but I stand firmly by this statement. What I have noticed in my experience is that all these buzzwords surrounding us only mislead you and trap you in a circle of solving a problem that already has a solution, which we are unable to see without using some fancy word for it. The toughest thing to do for a developer is to minimize requirements. Well, this is tough only for bad developers. Yes, I said it. There are many types of developers out there. And those unable to minimize feature scope are the ones you don’t need on your team. Their only goal is to solve problems that exist only in their fucking heads. And then you have to argue with them and waste energy on them instead of developing your awesome product. They are a cancer and I suggest you cut them off. + +MVP as an idea is great but sadly people don’t understand the underlying philosophy and they spend too much time focusing and fixating on something that every sane person with normal IQ will understand without some made up acronym. 
And the result is a lot of talking and barely any execution. + +Well, MVP is not directly killing innovation but stupid people are when they try to understand it. + +## Pressure wasteland + +You must never allow yourself to be pressured into confirming a deadline if you are not sure. We often feel that we are in the service of others, which is true to some extent. But it is also true that others are in service to us to some extent. And we forget this. We are all pressured all the time to make decisions just to calm other people down. And when they leave your office you experience a WTF moment :) How the hell did they manage to fuck me up again :) + +People need to realize that the more pressure you put on somebody, the less they will be able to do. So update email requests every 5 min will only result in a mental breakdown and inability to work that day. Constant poking is probably the only thing that makes me lose my mind instantly. For all of you who are doing this: “We are not fucking idiots, so stop bothering us with your own insecurities and let us do our job. We will do it quicker and better without you moron breathing down our necks.” + +If this happens to me I end up with no energy at the end. Don’t you get it? You will get much more from and out of me if you ask me like a human being and not your personal butler. In the long run you are destroying your relationships and nobody will want to work with you. Your schizophrenic approach will damage only you in the long run. Nobody is anybody’s property. + +## Conclusion + +I am guilty of many things described in this post. And I find it hard sometimes to acknowledge this. And I lie to myself and try vigorously to find some explanation for why I do these things. There is always space for growth. And maybe you will also find some of yourself in this post and realize what needs to change in order to evolve. 
diff --git a/_posts/2016-10-14-how-we-successfully-destroyed-the-joy-of-product-development.md b/_posts/2016-10-14-how-we-successfully-destroyed-the-joy-of-product-development.md new file mode 100644 index 0000000..28c936a --- /dev/null +++ b/_posts/2016-10-14-how-we-successfully-destroyed-the-joy-of-product-development.md @@ -0,0 +1,37 @@ +--- +layout: post +title: How we successfully destroyed the joy of product development +description: My take on project development. +--- + +No matter how hard we try to reinvent processes in software development we still haven’t found the perfect solution. And to dismiss SDLC just because it’s something old is as ridiculous as the concept of designers being user experience gurus. As I have written a couple of times before, designers have their place and it is not in the UX community. Most of them have probably never heard of Jakob Nielsen and this says a lot. Don’t get me wrong. There are designers out there who are absolutely amazing at what they do, but most of them are not. Good design has little to do with how things look, in my opinion. But it has very much to do with how the product behaves. And to take a chance on looks alone is scary to me. + +I have this huge beef with so called UX “experts”. I really do. From the bottom of my heart. I almost hate them. Well, not the pure ghetto ones. There are many of them out there, I am sure. But I have not had the pleasure of working with such a person. + +A good UX expert should have a programming background and an eye for design. Being a UX expert requires you to be analytical and precise. Not really qualities of designers. Design is much more about feeling and emotional perception. And these two don’t dance well together. 
+ +The natural progression of a project focused on the user should be: + +- detailed requirements and fantastic prototypes/wireframes with detailed user journey diagrams, +- design focused and restricted to serve requirements, +- code written just to fulfill design and requirements → nothing more and nothing less → no additional dead code should be allowed, +- testing should be done on all targeted devices → avoid bugs and you will avoid brand failure. + +A designer should never be allowed to have a blank canvas. Good software gets written because there are many restrictions, either in the requirements or in the real world. And most importantly → good software solves only one problem at a time. I don’t see why this shouldn’t apply to design as well. + +Yes yes, we get it, but we don’t have the time or the money to do project development like that. Well, you better find it or you will slowly decline into the abyss of mediocre companies that have nothing to show for themselves. Clients are not dumb and they need quality products and services. It is not enough anymore for a product to just work. It has to be technically precise and functionally spot on. + +When developers and designers are forced to think and work from scratch many new doors open. New ideas are born for solving problems that were previously out of reach because they were living in a box of limited thought and patterns. If you always solve problems only with what you already know, nothing new can be invented. When there is no room for experimentation there is no room for improvement. If you want your developers and designers to be this fountain of innovation but you don’t really let them innovate, you are just slowly closing the front doors of your company. Good developers and designers are hard to find and even easier to lose. + +Being agile does not mean being a slave to constant changes. It does not mean that project managers can constantly change requirements at will. 
And it sure does not mean that a clear vision of product direction should be something we said goodbye to. We have perverted the initial intention of the Manifesto for Agile Software Development, as we always do. We have taken it so far that we have all become slaves to advertising by consulting companies trying to cash in on this “new - but old” concept. + +The Manifesto for Agile Software Development states: + +- individuals and interactions over processes and tools, +- working software over comprehensive documentation, +- customer collaboration over contract negotiation, +- responding to change over following a plan. + +This was written at a time when software was developed very differently from how we do it now. We have eliminated many of the problems of the old days just by listening to reason and not to trendy hyped words that are just tools for marketing strategists to avoid the real issues. Being flat, being agile, being stupid is what I say. + +Development and design should be about improving yourself and consequently the product you are working on. When this becomes a chore you should probably start thinking about changing companies. People make products, not management. diff --git a/_posts/2017-01-12-gce-aws-docker-and-why-i-choose-classic-vms-and-digitalocean.md b/_posts/2017-01-12-gce-aws-docker-and-why-i-choose-classic-vms-and-digitalocean.md new file mode 100644 index 0000000..3f202a8 --- /dev/null +++ b/_posts/2017-01-12-gce-aws-docker-and-why-i-choose-classic-vms-and-digitalocean.md @@ -0,0 +1,41 @@ +--- +layout: post +title: GCE, AWS, Docker and why I chose classic VMs and DigitalOcean for my current project +description: Reasons why I chose DigitalOcean for my project +--- + +I have been developing a product for the past few months and one of the product’s requirements is the ability to scale automatically and quickly on the system’s demand. + +As most of you probably know, system design is much more important than the actual code that will drive the product. 
And this was my main concern when developing this product. I have read anything I could get my hands on about Docker as it has been hyped so much in the media for the past two years. At first glance Docker was an ideal fit for this platform. But then, as I started to seriously experiment with it and develop around it, several problems occurred. Well, it would be unfair to call them problems, but let’s say drawbacks when developing rapidly. + +To put it in perspective: this project is basically an MVP that needs to scale automatically when new customers sign up. These customers send metrics to my system, which are later visualized and analyzed. There were some basic questions I needed answered before I chose the technology. + +- Pricing involving hardware and infrastructure. +- Ease of implementation/deployment and scaling. +- How much will this cost me per customer? + +The way I envisioned the architecture was straightforward → simple nodes in a cluster that each take care of x number of customers (1 node ~ 10 customers). I found that pricing in GCE and AWS is very hard to predict → what the cost will be when the system scales. And this was necessary for me to know in order to make a financial projection of costs. This is the most important thing for me at this time as I am deciding on the prices we should charge future customers and establishing a healthy revenue model and subsequently a business model. I want this product to scale organically and fuel its future development with money made by the product itself → very little startup capital (10 nodes for six months & capital for company expenses). I have made many simulations but could not figure out with at least some certainty what that cost will be. Because of this, neither provider currently suits me. So I chose DigitalOcean. They have a really straightforward pricing model and this allowed me to make a pretty accurate cost matrix for my infrastructure. + +I love hard metrics. 
By this I mean metrics I can test now and trust to hold in the future. This was the reason I found Docker too volatile, as containers are spawned and halted and there is really no way of predicting these numbers. I have no problem with spawning multiple VMs and not using them, but having basically limited control over that is unacceptable for me at this time. + +## Docker tools and complexity that comes with it + +Probably some of you will correct me on this one, but I find all these management tools like Kubernetes, Swarm etc. a bit overkill for a startup project. All these tools are able to scale really massively but they all require extensive knowledge of DevOps. When you are a one man band trying to push a product out, there just is no time to learn these tools and concepts in depth in order to really take advantage of their features. It is much easier to use the internal metrics of your app (uwsgi stats server, golang middleware stats) and simply fetch them to one server and visualize them. That task alone took me a couple of hours and I had a simple metrics system in place that, together with the DigitalOcean API, enabled me to auto spawn new VMs on demand when the number of users reached the maximum supported by the current number of nodes. There is something to be said for the simplicity of this solution. And I love simple solutions. + +## Lack of real life examples of Docker in action + +I found many HelloWorld examples and tutorials showing how to spawn containers and deploy simple python apps, but I haven’t found a really clear example showing how to handle permanent storage with containers, load balancing, disk management, ip & port management. + +This is neither Docker’s nor the community’s fault, to be absolutely clear. It just shows that it is not that simple to deploy a real-world application with Docker. Maybe my software architecture is not designed with Docker in mind. 
+ +## Ease of deployment + +What I really love about Docker is the ease of deploying your application code via a container. The multilayered architecture of Docker images also adds to the pros list. And the fact that containers sit on top of the host OS makes it very intriguing. But if you use the container engine from Google you basically spawn VMs and run containers in these machines, and this takes the bare-metal approach out of the equation. So in the end you still use a hypervisor. I guess if I had my own hardware servers I would be able to take full advantage of containers. + +Because most of my code in the nodes is written in Golang and C++, deployment becomes pretty easy. All I have to do is replace the binaries on a node and that’s that. To avoid downtime I have two instances of each node and I load balance between them. So when I am updating software I first update node1.A and then node1.B if the first one is successful. + +## Where to go from here + +Docker is an amazing technology. But the weird pricing model and the steep learning curve for deploying a real live application are too much of a hassle for me at this time. I am sure I could lower costs with the Docker approach but it would just take too much time at this stage to implement it properly. + +I am currently trying to adapt my project to fit Docker and I believe this would be an interesting solution. The idea is to use one container per customer. I would just need to find a solution for auto-spawning containers on demand for a specific customer. I would then need a flexible load balancer to correctly forward traffic to the container designated for this customer. The problem I have is that I need a very flexible storage solution, because the amount of data that will be aggregated will scale exponentially and I need to permanently store it on disk. And the VM approach allows me to calculate precisely, per customer per VM, how much disk I need. Maybe one of you has a better solution. 
diff --git a/_posts/2017-03-07-golang-profiling-simplified.md b/_posts/2017-03-07-golang-profiling-simplified.md new file mode 100644 index 0000000..f74c7b2 --- /dev/null +++ b/_posts/2017-03-07-golang-profiling-simplified.md @@ -0,0 +1,110 @@ +--- +layout: post +title: Golang profiling simplified +description: Golang profiling made easy +--- + +Many posts have been written about profiling in Golang, but I haven’t found a proper tutorial. Almost all of them are missing some important piece of information and it gets pretty frustrating when you have a deadline and cannot find a simple distilled solution. + +Nevertheless, after searching and experimenting I have found a solution that works for me and should probably work for you too. + +## Where are my pprof files? + +By default pprof files are generated in the /tmp/ folder. You can override the folder where these files are generated programmatically in your golang code, as we will see in the example below. + +## Why is my cpu profile empty? + +I have found that sometimes the CPU profile is empty because the program was not executing long enough. In my experience, programs that execute too quickly don’t produce a useful pprof file. Well, the file is generated but only contains 4KB of information. + +## Profiling + +As you can see from the examples, we execute the dummy_benchmark function to ensure some sort of execution. Memory profiling can be done without such a “complex” function. But CPU profiling needs it. + +The memory and CPU profiling examples are almost the same. Only the parameters passed to profile.Start in the main function are different. Setting profile.ProfilePath(".") tells the profiler to store pprof files in the current working directory (the folder you run the program from). 
+ +### Memory profiling + +```go +package main + +import ( + "fmt" + "time" + "github.com/pkg/profile" +) + +func dummy_benchmark() { + + fmt.Println("first set ...") + for i := 0; i < 918231333; i++ { + i *= 2 + i /= 2 + } + + <-time.After(time.Second*3) + + fmt.Println("second set ...") + for i := 0; i < 9182312232; i++ { + i *= 2 + i /= 2 + } +} + +func main() { + defer profile.Start(profile.MemProfile, profile.ProfilePath("."), profile.NoShutdownHook).Stop() + dummy_benchmark() +} +``` + +### CPU profiling + +```go +package main + +import ( + "fmt" + "time" + "github.com/pkg/profile" +) + +func dummy_benchmark() { + + fmt.Println("first set ...") + for i := 0; i < 918231333; i++ { + i *= 2 + i /= 2 + } + + <-time.After(time.Second*3) + + fmt.Println("second set ...") + for i := 0; i < 9182312232; i++ { + i *= 2 + i /= 2 + } +} + +func main() { + defer profile.Start(profile.CPUProfile, profile.ProfilePath("."), profile.NoShutdownHook).Stop() + dummy_benchmark() +} +``` + +### Generating profiling reports + +```bash +# memory profiling +go build mem.go +./mem +go tool pprof -pdf ./mem mem.pprof > mem.pdf + +# cpu profiling +go build cpu.go +./cpu +go tool pprof -pdf ./cpu cpu.pprof > cpu.pdf +``` + +This will generate a PDF document with the visualized profile. + +- [Memory PDF profile example](/files/golang-profiling-mem.pdf) +- [CPU PDF profile example](/files/golang-profiling-cpu.pdf) diff --git a/_posts/2017-04-10-what-its-like-to-be-a-software-developer.md b/_posts/2017-04-10-what-its-like-to-be-a-software-developer.md new file mode 100644 index 0000000..52f5861 --- /dev/null +++ b/_posts/2017-04-10-what-its-like-to-be-a-software-developer.md @@ -0,0 +1,31 @@ +--- +layout: post +title: What it's like to be a software developer +description: Couple of observations regarding project management +--- + +I get asked a lot what the hell I actually do. I find it funny but I guess it is my fault in most cases. 
I try not to be the kind of man who is always talking about his work. I live in a small village and most of my neighbours probably have no idea what I actually do. And I am ok with that. I prefer this. But on some occasions I find it disturbing how people judge other people just because they don't understand what they are all about. Many of them probably think I am some strange kind of loser who is awake all the time and works from home. He probably plays games and types on a computer :) What kind of a job is that? That is no job at all! :) You work for eight hours, then you go home and drink a beer and go work in your workshop. This is what real men do! + +Well, you know. It's just the way it is. And it takes time for people to understand. Being home after many years of living elsewhere really grounded me in some ways. Coming back to the place where you grew up brings some sort of humility back into your life. And this is ok. Nobody wants to be Icarus anyways. + +What I mean to say is that if you are in a similar situation to mine, it will take time for people to start understanding you. Don't get discouraged by this. Take it as it is. People judge what they don't understand. + +I have this saying that sleeping is for pussies and we will sleep when we die. I am 32 years old now and I haven't slowed down regarding my work hours. I have stepped up the pace. I usually work for about 16-18 hours a day, every day. It doesn't matter if it's Monday or Saturday. Work needs to be done. + +I know that there are other ways. But if you want to be good there really is no other way. There are no shortcuts. There is no easier way to get to the point where you really know what the hell you are doing. The myth of the genius programmer truly is one huge pile of bullshit. Without putting in the hours nothing can be achieved. There is no success without dedication. + +My friends and coworkers often ask me how the hell I learned so much stuff. 
Where do I find the time to go through all this material? And I have a simple response for them: "When you go to sleep I begin reading and prototyping. When you go on a trip I make prototype projects just for the sake of learning. When you take your time for fucking around I read articles and books hunting that single small piece of information that will help me one day." And often they don't believe me. They think I am just that smart and everything is easy for me. They have this misguided belief that I just had all this knowledge implanted in me at birth. And this is not the case. I have read so much in my lifetime and most of this information proved useful to me later in life. Having no immediate use for it never stopped me. This probably is the main difference between me and my friends. I don't learn because I need to but because I am piecing together this huge puzzle and I treat it like a game. This amazing game of enlightenment. + +I have had many burnouts in my career. Most of them come around New Year. I guess around this time things slow down a bit and right then, when you relax for a minute or two, things get real :). They say when you enter retirement you should never ever park your ass on a couch. You will die there :) When a burnout happens I fall into this huge depression and I start questioning my sanity. I question my decisions. I question my progress in life. I question everything. I try to understand if all this is worth it?! And every time this happens I struggle with these kinds of questions. And by the time all this is over I come to the same conclusion every single time. Yes, it fucking is worth it. And through the years I have noticed that this is some sort of reset for me. This helps me maintain my sanity in the long run :) I love it when things get tough. It gets me to the next level. This teaches me progress is life. + +I don't even count anymore how many programming languages I have learned. 
I have even stopped noticing projects. They just fly by. It's like I am hunting this revelation that is set for me. And this drives me. This helps me every day to step up my game. With every single problem I solve I come a little closer to my goal. My never-reached goal. And it's ok with me if I never reach this goal. + +The only problem I have now is time. There just ain't enough time to learn everything the day has to offer. It's like I am on a quest to become this mini search machine :). + +This obsession with learning has come to the point where I stopped watching TV and news altogether. I find this to be noise that clutters your mind. The whole point of the news is to frighten you and put your mind into a dangerous loop where you think that nothing matters anyways → the world is going to shit. And the truth is so far away from this. We are living in times where all these amazing possibilities are at hand. We just need to take control of our mindset and everything starts to look possible again. + +What else can I say after more than 10 years in this space? What else can be said anyways? I still love what I do as much as I did 10 years ago. I love it even more. And if I had a single suggestion for all of you, it would be to stop worrying about immediate benefits and focus on the long run. Learn, prototype, experiment and have fun. We all get frustrated at times but that doesn't mean we should stop. Doing this kind of work is a privilege. We are making and creating. In the purest sense we are creators. And there really is no better way to live your life. + +> A life without challenge, a life without hardship, a life without purpose, seems pale and pointless. With challenge come perseverance and gumption. With hardship come resilience and resolve. With purpose come strength and understanding. 
+> +> — Terry Fallis, The High Road diff --git a/_posts/2017-04-17-what-i-ve-learned-developing-ad-server.md b/_posts/2017-04-17-what-i-ve-learned-developing-ad-server.md new file mode 100644 index 0000000..772fb98 --- /dev/null +++ b/_posts/2017-04-17-what-i-ve-learned-developing-ad-server.md @@ -0,0 +1,133 @@ +--- +layout: post +title: What I've learned developing ad server +description: Lessons I learned developing contextual ad server +--- + +For the past year and a half I have been developing a native advertising server that contextually matches ads and displays them in different template forms on a variety of websites. This project grew from serving thousands of ads per day to millions. + +The system is made of a couple of core components: + +- API for serving ads, +- Utils - cronjobs and queue management tools, +- Dashboard UI. + +The initial release used [MongoDB](https://www.mongodb.com/) for full-text search, but it was later replaced by [Elasticsearch](https://www.elastic.co/) for better CPU utilization and better search performance. This also gave us access to many other useful features of [Elasticsearch](https://www.elastic.co/). You should check it out if you do any search-related operations. + +Because the premise of the server is to provide a native ad experience, ads are rendered on the client side via a simple templating engine. This ensures that ads can be displayed in a number of different ways based on the visual style of the page. And this makes the JavaScript client library quite complex. + +So now that you know the basic information about the product, let's get into the lessons we learned. + +## Aggregate everything + +After the beta version was released, everything (impressions, clicks, etc.) was written to the database in nanosecond resolution. At that time we were using [PostgreSQL](https://www.postgresql.org/) and the database quickly grew way above 200GB in disk space. And that was problematic. Statistics took a disturbingly long time to aggregate.
Also, indexes on the stats table in the database were no help after we reached 500 million datapoints. + +> There is marketing product information and there is real-life experience. And they tend to be quite the opposite. + +This is why everything is now aggregated on a daily basis and this data is then fed to Elastic in the form of a daily summary. With this we can now track many more dimensions such as zone, channel and platform information. And with this information we can now adapt occurrences of ads on specific places more precisely. + +We have also adopted [Redis](https://redis.io/) as a full-time citizen in our stack. Because Redis also stores information on a local disk we have some sort of backup if the server suffers a failure. + +All the real-time statistics for ad serving and redirecting are kept as counters in a Redis instance, then extracted daily and pushed to Elastic. + +## Measure everything + +The thing about software is that we really don't know how well it is performing under load until such load is presented. When testing locally everything is fine but in production things tend to fall apart. + +As a solution for this we are measuring everything we can: function execution time (by encapsulating functions with timers), server performance (CPU, memory, disk, etc.), Nginx and [uWSGI](https://uwsgi-docs.readthedocs.io/) performance. We sacrifice a bit of performance for the sake of this information. And we store all this information for later analysis.
+ +**Example of function execution time** + +```json +{ + "get_final_filtered_ads": { + "counter": 1931250, + "avg": 0.0066143431, + "elapsed": 12773.9500310003 + }, + "store_keywords_statistics": { + "counter": 1931011, + "avg": 0.0004605267, + "elapsed": 889.2821669996 + }, + "match_by_context": { + "counter": 1931011, + "avg": 0.0055960716, + "elapsed": 10806.0758889999 + }, + "match_by_high_performance": { + "counter": 262, + "avg": 0.0152770229, + "elapsed": 4.00258 + }, + "store_impression_stats": { + "counter": 1931250, + "avg": 0.0006189991, + "elapsed": 1195.4419869999 + } +} +``` + +We have also started profiling with [cProfile](https://pymotw.com/2/profile/) and then visualizing with [KCachegrind](http://kcachegrind.sourceforge.net/). This provides a much more detailed look into code execution. + +## Cache control is your friend + +Because we use a JavaScript library for rendering ads, we rely on this script extensively, and when needed we must be able to change the behavior of the script quickly. + +In our case we cannot simply replace the JavaScript URL in the HTML code. It usually takes a day or two for the guys who maintain the sites to change the code or add a ?ver=xxx query parameter. And this makes rapid deployment and testing very difficult and time consuming. There is a limit to how much you can test locally. + +We are now in the process of integrating [Google Tag Manager](https://www.google.com/analytics/tag-manager/) but a couple of websites are developed on the ASP.NET platform, which has some problems with Tag Manager. With the solution below we are certain that we are serving the latest version of the script. + +And it only takes one mistake for users to have the script cached, and if it is cached for 1 year you probably know where the problem is.
+ +```nginx +# nginx ➜ /etc/nginx/sites-available/default +location /static/ { + alias /path-to-static-content/; + autoindex off; + charset utf-8; + gzip on; + gzip_types text/plain application/javascript application/x-javascript text/javascript text/xml text/css; + location ~* \.(ico|gif|jpeg|jpg|png|woff|ttf|otf|svg|woff2|eot)$ { + expires 1y; + add_header Pragma public; + add_header Cache-Control "public"; + } + location ~* \.(css|js|txt)$ { + expires 3600s; + add_header Pragma public; + add_header Cache-Control "public, must-revalidate"; + } +} +``` + +Also be careful when redirecting to a URL in your Python code. We noticed that if we didn't precisely set up cache control and expire headers in the response, we didn't get the request on the server and therefore couldn't measure clicks. So when redirecting do as follows and there will be no problems. + +```python +# python ➜ bottlepy web micro-framework +response = bottle.HTTPResponse(status=302) +response.set_header("Cache-Control", "no-store, no-cache, must-revalidate") +response.set_header("Expires", "Thu, 01 Jan 1970 00:00:00 GMT") +response.set_header("Location", url) +return response +``` + +> Cache control in browsers is quite aggressive and you need to be precise to avoid future problems. We learned that lesson the hard way. + +## Learn NGINX + +When deciding on a web server we went with Nginx as a reverse proxy for our applications. We adopted a microservice-oriented architecture early in the project to ensure that when we scale we can easily add additional servers to our cluster. And Nginx was crucial for load balancing and static content delivery. + +At first our config file was quite simple and later grew larger. After patching and adding new settings I sat down and learned more about the guts of Nginx. This proved to be very useful and we were able to squeeze much more out of our setup. So I advise you to take your time and read through the [documentation](https://nginx.org/en/docs/).
This saved us a lot of headaches. Googling for solutions only goes so far. + +## Use Redis/Memcached + +As explained above we are using caching basically for everything. It is the cornerstone of our services. At first we were very careful about the quantity of things we stored in [Redis](https://redis.io/). But we later found out that the memory footprint is very low even when storing a large amount of data in it. + +So we gradually increased our usage to caching whole HTML outputs of the dashboard. This improved our performance by an order of magnitude. And the native TTL support goes hand in hand with our needs. + +The reason we chose [Redis](https://redis.io/) over [Memcached](https://memcached.org/) was the out-of-the-box scalability of Redis. But all this can be achieved with Memcached. + +## Conclusion + +There are a lot more details that could have been written and every single topic in here deserves its own post but you probably got the idea about the problems we faced. diff --git a/_posts/2017-04-21-profiling-python-web-applications-with-visual-tools.md b/_posts/2017-04-21-profiling-python-web-applications-with-visual-tools.md index e4eece4..5bbfe48 100644 --- a/_posts/2017-04-21-profiling-python-web-applications-with-visual-tools.md +++ b/_posts/2017-04-21-profiling-python-web-applications-with-visual-tools.md @@ -1,7 +1,7 @@ --- layout: post -title: "Profiling Python web applications with visual tools" -description: "Missing link when debugging and profiling python web applications" +title: Profiling Python web applications with visual tools +description: Missing link when debugging and profiling python web applications --- I have been profiling my software with KCachegrind for a long time now and I was missing this option when I am developing API's or other web services. I always knew that this is possible but never really took the time and dive into it.
diff --git a/_posts/2017-08-11-simple-iot-application.md b/_posts/2017-08-11-simple-iot-application.md index 699b146..f2bcabc 100644 --- a/_posts/2017-08-11-simple-iot-application.md +++ b/_posts/2017-08-11-simple-iot-application.md @@ -1,7 +1,7 @@ --- layout: post -title: "Simple IOT application supported by real-time monitoring and data history" -description: "Develop simple IOT application with Arduino MKR1000 and Python" +title: Simple IOT application supported by real-time monitoring and data history +description: Develop simple IOT application with Arduino MKR1000 and Python --- I have been developing these kind of application for the better part of my last 5 years and people keep asking me how to approach developing such application and I will give a try explaining it here. diff --git a/_posts/2018-01-16-using-digitalocean-spaces-object-storage-with-fuse.md b/_posts/2018-01-16-using-digitalocean-spaces-object-storage-with-fuse.md index c172a9b..7442956 100644 --- a/_posts/2018-01-16-using-digitalocean-spaces-object-storage-with-fuse.md +++ b/_posts/2018-01-16-using-digitalocean-spaces-object-storage-with-fuse.md @@ -1,7 +1,7 @@ --- layout: post -title: "Using DigitalOcean Spaces Object Storage with FUSE" -description: "Using DigitalOcean Spaces Object Storage with FUSE" +title: Using DigitalOcean Spaces Object Storage with FUSE +description: Using DigitalOcean Spaces Object Storage with FUSE --- Couple of months ago [DigitalOcean](https://www.digitalocean.com) introduced new product called [Spaces](https://blog.digitalocean.com/introducing-spaces-object-storage/) which is Object Storage very similar to Amazon's S3. This really peaked my interest, because this was something I was missing and even the thought of going over the internet for such functionality was in no interest to me. Also in fashion with their previous pricing this also is very cheap and pricing page is a no-brainer compared to AWS or GCE. 
[Prices are clearly and precisely defined and outlined](https://www.digitalocean.com/pricing/). You must love them for that :) diff --git a/assets/site.css b/assets/site.css index 1546328..6e3da46 100644 --- a/assets/site.css +++ b/assets/site.css @@ -9,7 +9,7 @@ body { line-height: 1.6; color: #000; margin: 0; - padding: 0; + padding: 0 0 50px 0; } article, @@ -96,7 +96,7 @@ main ul div { blockquote { margin: 40px 0 40px 20px; border-left: 5px solid #eee; - padding: 5px 0 10px 20px + padding: 5px 0 10px 20px; } .highlighter-rouge { @@ -104,22 +104,23 @@ blockquote { padding: 0 15px; font-size: 80%; border: 2px solid #f1f1f1; - border-radius: 2px + border-radius: 2px; + overflow: auto; } ::selection { background: #ff0; - color: #000 + color: #000; } ::-moz-selection { background: #ff0; - color: #000 + color: #000; } @media only screen and (max-width:768px) { body { - padding: 0 20px + padding: 0 20px; } footer, header, -- cgit v1.2.3