path: root/_posts/posts/2021-12-25-running-golang-application-as-pid1.md
author Mitja Felicijan <mitja.felicijan@gmail.com> 2024-03-10 14:59:14 +0100
committer Mitja Felicijan <mitja.felicijan@gmail.com> 2024-03-10 14:59:14 +0100
commit 1100562e29f6476448b656dbddd4cf22505523f6
tree 442eec492199104bd49dfd74474ce89ade8fcac9 /_posts/posts/2021-12-25-running-golang-application-as-pid1.md
parent a40d80be378e46a6c490e1b99b0d8f4acd968503
Move back to JBMAFP
Diffstat (limited to '_posts/posts/2021-12-25-running-golang-application-as-pid1.md')
-rw-r--r-- _posts/posts/2021-12-25-running-golang-application-as-pid1.md | 348
1 files changed, 0 insertions, 348 deletions
diff --git a/_posts/posts/2021-12-25-running-golang-application-as-pid1.md b/_posts/posts/2021-12-25-running-golang-application-as-pid1.md
deleted file mode 100644
index edd5a57..0000000
--- a/_posts/posts/2021-12-25-running-golang-application-as-pid1.md
+++ /dev/null
@@ -1,348 +0,0 @@
---
title: Running Golang application as PID 1 with Linux kernel
permalink: /running-golang-application-as-pid1.html
date: 2021-12-25T12:00:00+02:00
layout: post
type: post
draft: false
---

## Unikernels, kernels, and alike

I have been reading a lot about
[unikernels](https://en.wikipedia.org/wiki/Unikernel) lately and found them
very intriguing. When you push away all the marketing speak and look at the
idea, it makes a lot of sense.

> A unikernel is a specialized, single address space machine image constructed
> by using library operating systems. ([Wikipedia](https://en.wikipedia.org/wiki/Unikernel))

I really like the explanation from the article
[Unikernels: Rise of the Virtual Library Operating System](https://queue.acm.org/detail.cfm?id=2566628).
Really worth a read.

If we compare a normal operating system to a unikernel side by side, they would
look something like this.

![Virtual machines vs Containers vs Unikernels](/assets/posts/pid1/unikernels.webp){:loading="lazy"}

From this image, we can see how the complexity significantly decreases with
the use of unikernels. This comes with a price, of course. Unikernels are hard
to get running and require a lot of work, since you don't have a full
kernel running in the background providing network access, drivers, etc.

So as a half step to make the stack simpler, I started looking into using
the Linux kernel as a base and going from there. I came across this
[YouTube video talking about Building the Simplest Possible Linux System](https://www.youtube.com/watch?v=Sk9TatW9ino)
by [Rob Landley](https://landley.net), and apart from statically compiling the
application to be run as PID 1, there were really no other obstacles.

## What is PID 1?

PID 1 is the first userspace process that the Linux kernel starts once it has
finished booting. It also has a couple of properties that are unique to it.

- When the PID 1 of a PID namespace (for example, the init process of a
  container) dies for any reason, all other processes in that namespace are
  killed with the KILL signal.
- When any process that has children dies for any reason, its children are
  re-parented to the process with PID 1.
- Many signals that have a default action of Term do not have one for PID 1;
  it only receives signals for which it has installed a handler.
- When the PID 1 of the whole system dies for any reason, the kernel panics,
  which results in a system crash.

PID 1 is usually an init application, which takes care of starting and
supervising other services like:

- sshd,
- nginx,
- pulseaudio,
- etc.

If you are on a Linux machine, you can check which process has PID 1
by running the following.

```sh
$ cat /proc/1/status
Name: systemd
Umask: 0000
State: S (sleeping)
Tgid: 1
Ngid: 0
Pid: 1
PPid: 0
...
```

As we can see, on my machine the process with an ID of 1 is [systemd](https://systemd.io/),
which is a software suite that provides an array of system components for Linux
operating systems. If you look closely, you can also see that the `PPid`
(process ID of the parent process) is `0`, which additionally confirms that
this process doesn't have a parent.

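The same check can be done from Go, which is handy later when the application wants to know whether it is running as init. This is only a minimal sketch: `parseStatus` is a helper name made up for this example, and it assumes the `key: value` layout of `/proc/<pid>/status` shown above. On a machine without `/proc` it just prints the read error.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// parseStatus turns the "Key: value" lines of a /proc/<pid>/status file
// into a map. Lines without a colon are skipped.
func parseStatus(text string) map[string]string {
	fields := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		key, value, ok := strings.Cut(scanner.Text(), ":")
		if !ok {
			continue
		}
		fields[key] = strings.TrimSpace(value)
	}
	return fields
}

func main() {
	data, err := os.ReadFile("/proc/1/status")
	if err != nil {
		fmt.Println("cannot read /proc/1/status:", err)
		return
	}
	status := parseStatus(string(data))
	fmt.Printf("PID 1 is %q, parent PID %s\n", status["Name"], status["PPid"])
}
```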
## So why even run an application as PID 1 instead of just using a container?

Containers are wonderful, but they come with a lot of baggage. Because they
are layered by nature, the images require quite a lot of space and also a
lot of additional software to handle them. They are not as lightweight as they
seem, and many popular images require more than 500 MB of disk space.

Running the application as PID 1 results in a significantly smaller footprint,
as we will see later in the post.

> You could also run a simple init system inside a Docker container, as described
> in the article [Docker and the PID 1 zombie reaping problem](https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/).

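Because orphaned children are re-parented to PID 1, an init process is expected to collect their exit statuses, otherwise they stay around as zombies. The sketch below shows one way this could look in Go, under a few assumptions: `reapChildren` is a name invented for this example, the application in this post does not actually start any children, and a program that also uses `os/exec` would have to coordinate with that package's own waiting. Outside of PID 1 the program does a single pass and exits, so it can be tried on a normal machine.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// reapChildren collects every child that has already exited, without
// blocking, and returns how many were reaped.
func reapChildren() int {
	reaped := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err != nil || pid <= 0 {
			// err is ECHILD when there are no children at all;
			// pid is 0 when children exist but none has exited yet.
			return reaped
		}
		reaped++
	}
}

func main() {
	if os.Getpid() != 1 {
		// Not running as init: do a single pass so the sketch terminates.
		fmt.Println("reaped", reapChildren(), "children")
		return
	}
	// Running as PID 1: wait for SIGCHLD and reap orphans forever.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGCHLD)
	for range sigs {
		reapChildren()
	}
}
```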
## The master plan

1. Compile the Linux kernel with the default configuration.
2. Prepare a Hello World application in Golang that is statically compiled.
3. Run it with [QEMU](https://www.qemu.org/), providing the Golang application
   as the init application / PID 1.

For the sake of simplicity, we will not be cross-compiling any of it and will
just use the 64-bit version.

## Compiling Linux kernel

```sh
$ wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.15.7.tar.xz
$ tar xf linux-5.15.7.tar.xz

$ cd linux-5.15.7

$ make clean

# read more about this https://stackoverflow.com/a/41886394
$ make defconfig

$ time make -j `nproc`

$ cd ..
```

At this point we have a kernel image located at
`linux-5.15.7/arch/x86_64/boot/bzImage`. We will use it in QEMU later.

To make our lives a bit easier, let's move the kernel image to another place.
Let's create a folder `bin/` in the root of our project with `mkdir -p bin`.

At this point we can copy `bzImage` to the `bin/` folder with
`cp linux-5.15.7/arch/x86_64/boot/bzImage bin/bzImage`.

The folder structure of this experiment should look like this.

```txt
pid1/
  bin/
    bzImage
  linux-5.15.7/
  linux-5.15.7.tar.xz
```

## Preparing PID 1 application in Golang

This step is relatively easy. The only thing we must keep in mind is that we
need to compile the binary as a static one.

Let's create an `init.go` file in the root of the project.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	for {
		fmt.Println("Hello from Golang")
		time.Sleep(1 * time.Second)
	}
}
```

Notice that we have an infinite loop in `main`, with a simple sleep of 1
second to not overwhelm the CPU. This is because PID 1 should never complete
and/or exit. That would result in a kernel panic. Which is BAD!

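An unhandled panic in Go also ends the process, so a real PID 1 would probably want to guard the body of its loop. The following is a hedged sketch of that idea, not part of the original experiment: `safely` is a hypothetical helper, and the failure in the second iteration is simulated. `recover` only catches panics raised in the same goroutine; it does not help against `os.Exit` or fatal runtime errors. The loop is endless only when the program really is PID 1.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// safely runs work and converts a panic into an error instead of letting it
// terminate the process.
func safely(work func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	work()
	return nil
}

func main() {
	// As PID 1 the loop never ends; elsewhere it stops after three rounds
	// so the sketch can be tried on a normal machine.
	for i := 0; os.Getpid() == 1 || i < 3; i++ {
		err := safely(func() {
			if i == 1 {
				panic("simulated failure")
			}
			fmt.Println("Hello from Golang")
		})
		if err != nil {
			fmt.Println(err)
		}
		time.Sleep(1 * time.Second)
	}
}
```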
There are two ways of compiling a Golang application: statically and dynamically.

To statically compile the binary, use the following command.

```sh
$ go build -ldflags="-extldflags=-static" init.go
```

We can also check if the binary is statically compiled with:

```sh
$ file init
init: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=Ypu8Zw_4NBxm1Yxg2OYO/H5x721rQ9uTPiDVh-VqP/vZN7kXfGG1zhX_qdHMgH/9vBfmK81tFrygfOXDEOo, not stripped

$ ldd init
not a dynamic executable
```

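The check that `file` and `ldd` perform can be approximated in Go with the standard `debug/elf` package. This is a rough sketch under one assumption: it treats a binary as dynamically linked when it carries a `PT_INTERP` program header, which is how an ELF file asks for a dynamic loader. `hasInterpreter` is a name invented for this example, and the default path `init` is the binary built above.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// hasInterpreter reports whether the ELF file at path asks for a dynamic
// loader through a PT_INTERP program header.
func hasInterpreter(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()
	for _, prog := range f.Progs {
		if prog.Type == elf.PT_INTERP {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	path := "init" // binary produced by the build step above
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	dynamic, err := hasInterpreter(path)
	if err != nil {
		fmt.Println("cannot inspect binary:", err)
		return
	}
	if dynamic {
		fmt.Println(path, "needs a dynamic loader and will not start in an empty initramfs")
	} else {
		fmt.Println(path, "has no PT_INTERP header; it looks statically linked")
	}
}
```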
At this point, we need to create an [initramfs](https://www.linuxfromscratch.org/blfs/view/svn/postlfs/initramfs.html).
The name is short for "initial RAM file system" and it is the successor of
initrd: a cpio archive of the initial file system that gets loaded into memory
during the Linux startup process.

```sh
$ echo init | cpio -o --format=newc > initramfs
$ mv initramfs bin/initramfs
```

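To make the archive less of a black box, here is a sketch of what `cpio -o --format=newc` writes for a single file, assuming the usual newc layout: a 110-byte ASCII header (the magic `070701` followed by thirteen 8-digit hex fields), the NUL-terminated file name, the file data, and a final `TRAILER!!!` record, with name and data padded to four bytes. The function names are made up for this example, and the result has not been booted as part of this post, so treat it as an illustration of the format rather than a replacement for `cpio`.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// pad4 appends NUL bytes until the buffer length is a multiple of four.
func pad4(buf *bytes.Buffer) {
	for buf.Len()%4 != 0 {
		buf.WriteByte(0)
	}
}

// writeEntry appends one newc record: a 110-byte ASCII header, the
// NUL-terminated name, and the file data, each padded to four bytes.
func writeEntry(buf *bytes.Buffer, name string, mode uint32, data []byte) {
	fields := []uint32{
		0,                     // inode
		mode,                  // file type and permissions
		0, 0,                  // uid, gid
		1,                     // number of links
		0,                     // mtime
		uint32(len(data)),     // file size
		0, 0, 0, 0,            // dev major/minor, rdev major/minor
		uint32(len(name) + 1), // name size including the trailing NUL
		0,                     // checksum, not used by the newc format
	}
	buf.WriteString("070701") // newc magic
	for _, f := range fields {
		fmt.Fprintf(buf, "%08X", f)
	}
	buf.WriteString(name)
	buf.WriteByte(0)
	pad4(buf)
	buf.Write(data)
	pad4(buf)
}

// buildInitramfs returns an archive holding a single executable file.
func buildInitramfs(name string, data []byte) []byte {
	var buf bytes.Buffer
	writeEntry(&buf, name, 0o100755, data) // regular file, rwxr-xr-x
	writeEntry(&buf, "TRAILER!!!", 0, nil) // end-of-archive marker
	return buf.Bytes()
}

func main() {
	data, err := os.ReadFile("init")
	if err != nil {
		fmt.Println("cannot read init:", err)
		return
	}
	if err := os.WriteFile("initramfs", buildInitramfs("init", data), 0o644); err != nil {
		fmt.Println("cannot write initramfs:", err)
	}
}
```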
The project at this stage should look like this.

```txt
pid1/
  bin/
    bzImage
    initramfs
  linux-5.15.7/
  linux-5.15.7.tar.xz
  init.go
```

## Running all of it with QEMU

[QEMU](https://www.qemu.org/) is a free and open-source hypervisor. It emulates
the machine's processor through dynamic binary translation and provides a set
of different hardware and device models for the machine, enabling it to run a
variety of guest operating systems.

```sh
$ qemu-system-x86_64 -serial stdio -kernel bin/bzImage -initrd bin/initramfs -append "console=ttyS0" -m 128
```

Booting it produces output like the following.

```sh
$ qemu-system-x86_64 -serial stdio -kernel bin/bzImage -initrd bin/initramfs -append "console=ttyS0" -m 128
[ 0.000000] Linux version 5.15.7 (m@khan) (gcc (GCC) 11.2.1 20211203 (Red Hat 11.2.1-7), GNU ld version 2.37-10.fc35) #7 SMP Mon Dec 13 10:23:25 CET 2021
[ 0.000000] Command line: console=ttyS0
[ 0.000000] x86/fpu: x87 FPU will use FXSAVE
[ 0.000000] signal: max sigframe size: 1440
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000007fdffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000007fe0000-0x0000000007ffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.8 present.
[ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-6.fc35 04/01/2014
[ 0.000000] tsc: Fast TSC calibration failed
...
[ 2.016106] ALSA device list:
[ 2.016329] No soundcards found.
[ 2.053176] Freeing unused kernel image (initmem) memory: 1368K
[ 2.056095] Write protecting the kernel read-only data: 20480k
[ 2.058248] Freeing unused kernel image (text/rodata gap) memory: 2032K
[ 2.058811] Freeing unused kernel image (rodata/data gap) memory: 500K
[ 2.059164] Run /init as init process
Hello from Golang
[ 2.386879] tsc: Refined TSC clocksource calibration: 3192.032 MHz
[ 2.387114] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2e02e31fa14, max_idle_ns: 440795264947 ns
[ 2.387380] clocksource: Switched to clocksource tsc
[ 2.587895] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
Hello from Golang
Hello from Golang
Hello from Golang
```

The whole [log file here](/assets/posts/pid1/qemu.log).

## Size comparison

The cool thing about this approach is that the Linux kernel and the application
together only take around 12 MB, which is impressive as hell. Keep in mind
that the size of bzImage (the Linux kernel) could be greatly decreased
by going into `make menuconfig` and removing a ton of features from the kernel.
I managed to get the kernel size down to 2 MB with everything
still working properly.

```sh
total 12M
-rw-r--r--. 1 m m 9.3M Dec 13 10:24 bzImage
-rw-r--r--. 1 m m 1.9M Dec 27 01:19 initramfs
```

## Creating ISO image and running it with Gnome Boxes

First we need to create the proper folder structure with `mkdir -p iso/boot/grub`.

Then we need to download the [grub binary](https://github.com/littleosbook/littleosbook/raw/master/files/stage2_eltorito).
You can read more about this program on https://github.com/littleosbook/littleosbook.

```sh
$ wget -O iso/boot/grub/stage2_eltorito https://github.com/littleosbook/littleosbook/raw/master/files/stage2_eltorito
```

Once all the steps in this section are done, the folder structure will look like this.

```sh
$ tree iso/boot/
iso/boot/
├── bzImage
├── grub
│   ├── menu.lst
│   └── stage2_eltorito
└── initramfs
```

Let's copy the remaining files into the proper folders. The `wget` command above
already saved `stage2_eltorito` into `iso/boot/grub/`, so only the kernel and
the initramfs need to be copied.

```sh
$ cp bin/bzImage iso/boot/
$ cp bin/initramfs iso/boot/
```

Let's create a GRUB config file at `iso/boot/grub/menu.lst` (for example with
`nano iso/boot/grub/menu.lst`) with the following contents.

```ini
default=0
timeout=5

title GoAsPID1
kernel /boot/bzImage
initrd /boot/initramfs
```

Let's create the ISO file by using genisoimage:

```sh
genisoimage -R \
  -b boot/grub/stage2_eltorito \
  -no-emul-boot \
  -boot-load-size 4 \
  -A os \
  -input-charset utf8 \
  -quiet \
  -boot-info-table \
  -o GoAsPID1.iso \
  iso
```

This will produce `GoAsPID1.iso`, which you can use with [VirtualBox](https://www.virtualbox.org/)
or [Gnome Boxes](https://apps.gnome.org/app/org.gnome.Boxes/).

<video src="/assets/posts/pid1/boxes.mp4" controls></video>

## Is running applications as PID 1 even worth it?

Well, the answer to this is not as simple as one would think. Sometimes it is
and sometimes it's not. For embedded systems and very specialized applications
it is worth it for sure. But for normal use, I don't think so. It was an interesting
exercise in compiling kernels and looking at the guts of the Linux kernel,
but sticking to containers for most things is the better option in my
opinion.

An interesting experiment would be creating an image that supports networking
and could be deployed to AWS as an EC2 instance, and observing how it fares.
But in that case, we would need to write some sort of supervisor, running
on a separate EC2 instance, that would check whether the other EC2 instances
are running properly. Remember that if your application fails, the kernel
panics and the whole machine is inoperable.