---
title: Using DigitalOcean Spaces Object Storage with FUSE
permalink: /using-digitalocean-spaces-object-storage-with-fuse.html
date: 2018-01-16T12:00:00+02:00
layout: post
type: post
draft: false
---

A couple of months ago [DigitalOcean](https://www.digitalocean.com) introduced
a new product called
[Spaces](https://blog.digitalocean.com/introducing-spaces-object-storage/),
an Object Storage offering very similar to Amazon's S3. This really piqued my
interest, because it was something I had been missing, and going over the
internet to another provider for such functionality held no appeal for me. In
keeping with their previous pricing, it is also very cheap, and the pricing
page is a no-brainer compared to AWS or GCE. [Prices are clearly and precisely
outlined](https://www.digitalocean.com/pricing/). You must love them for that
:)

## Initial requirements

* Is it possible to use them as a mounted drive with FUSE? (tl;dr YES)
* Will the performance degrade over time and over different sizes of objects?
  (tl;dr NO&YES)
* Can storage be mounted on multiple machines at the same time and be writable?
  (tl;dr YES)

> Let me be clear: the scripts I use here were made just for benchmarking and
> are not intended for real-life use. That said, I am looking into using these
> approaches with a caching service in front, dumping everything to storage as
> objects afterwards. That could be an interesting post in itself. But if you
> need real-time data without eventual consistency, please take these scripts
> for what they are: not usable in such situations.

## Is it possible to use them as a mounted drive with FUSE?

Well, actually they can be used in such a manner. Because they are similar to
[AWS S3](https://aws.amazon.com/s3/), many tools are available and you can
find many articles and [Stackoverflow
items](https://stackoverflow.com/search?q=s3+fuse).

To make this work you will need a DigitalOcean account. If you don't have one,
you will not be able to test this code. But if you do, go and [create a new
Droplet](https://cloud.digitalocean.com/droplets/new?size=s-1vcpu-1gb&region=ams3&distro=debian&distroImage=debian-9-x64&options=private_networking,install_agent).
If you use this link you will get Debian 9 and the smallest VM option
preselected.

* Please be sure to add your SSH key, because we will log in to this machine
  remotely.
* If you change your region, remember which one you chose, because we will
  need this information when we mount the Space on our machine.

Instructions on how to use SSH keys and how to set them up are available in
the article [How To Use SSH Keys with DigitalOcean
Droplets](https://www.digitalocean.com/community/tutorials/how-to-use-ssh-keys-with-digitalocean-droplets).

After we have created the Droplet, it's time to create a new Space. This is
done by clicking the [Create](https://cloud.digitalocean.com/spaces/new)
button (top right corner) and selecting Spaces. Choose a pronounceable
`Unique name`, because we will use it in the examples below. You can choose
either Private or Public; it doesn't matter in our case, and you can always
change it in the future.

Once you have created the new Space, we should [generate an Access
key](https://cloud.digitalocean.com/settings/api/tokens). This link will take
you to the page where you can generate the key. After you create a new one,
save the provided Key and Secret, because the Secret will not be shown again.

Now that we have a new Space and an Access key, we should SSH into our machine.

```bash
# replace IP with the IP of your newly created Droplet
ssh root@IP

# this will install utilities for mounting storage objects via FUSE
apt install s3fs

# we now need to provide credentials (the access key we created earlier)
# replace KEY and SECRET with your own credentials but leave the colon between them
# we also need to set proper permissions
echo "KEY:SECRET" > .passwd-s3fs
chmod 600 .passwd-s3fs

# now we mount the Space on our machine
# replace UNIQUE-NAME with the name you chose earlier
# if you chose a different region for your Space, adjust the -ourl option (ams3)
s3fs UNIQUE-NAME /mnt/ -ourl=https://ams3.digitaloceanspaces.com -ouse_cache=/tmp

# now we try to create a file
# once you mount, it may take a couple of seconds to retrieve data
echo "Hello cruel world" > /mnt/hello.txt
```

After all this, you can return to your browser, go to [DigitalOcean
Spaces](https://cloud.digitalocean.com/spaces) and click on the Space you
created. If the file hello.txt is present, you have successfully mounted the
Space on your machine and written data to it.

I chose the same region for my Droplet and my Space, but you don't have to;
they can be in different regions. What that actually does to performance, I
don't know.

Additional information on FUSE:

* [Github project page for s3fs](https://github.com/s3fs-fuse/s3fs-fuse)
* [FUSE - Filesystem in Userspace](https://en.wikipedia.org/wiki/Filesystem_in_Userspace)

## Will the performance degrade over time and over different sizes of objects?

For this task I didn't want to just read and write text files or upload
images. I actually wanted to figure out whether using something like SQLite is
viable in this case.

### Measurement experiment 1: File copy

```bash
# first we create some dummy files of different sizes
dd if=/dev/zero of=10KB.dat bs=1024 count=10 #10KB
dd if=/dev/zero of=100KB.dat bs=1024 count=100 #100KB
dd if=/dev/zero of=1MB.dat bs=1024 count=1024 #1MB
dd if=/dev/zero of=10MB.dat bs=1024 count=10240 #10MB

# now we set the time command to only return real time
TIMEFORMAT=%R

# now let's test it
(time cp 10KB.dat /mnt/) |& tee -a 10KB.results.txt

# and now we automate
# this will perform the same operation 100 times
# and output results into separate files based on object size
n=0; while (( n++ < 100 )); do (time cp 10KB.dat /mnt/10KB.$n.dat) |& tee -a 10KB.results.txt; done
n=0; while (( n++ < 100 )); do (time cp 100KB.dat /mnt/100KB.$n.dat) |& tee -a 100KB.results.txt; done
n=0; while (( n++ < 100 )); do (time cp 1MB.dat /mnt/1MB.$n.dat) |& tee -a 1MB.results.txt; done
n=0; while (( n++ < 100 )); do (time cp 10MB.dat /mnt/10MB.$n.dat) |& tee -a 10MB.results.txt; done
```

| 141 | |||
| 142 | Files of size 100MB were not successfully transferred and ended up displaying | ||
| 143 | error (cp: failed to close '/mnt/100MB.1.dat': Operation not permitted). | ||
| 144 | |||
As I suspected, object size is not really that important. Sadly, I don't have
the time to test performance over longer periods; if any of you do, please
send me your data. I would be interested in seeing the results.

**Here are the plotted results**

You can download the [raw results here](/assets/posts/do-fuse/copy-benchmarks.tsv).
Measurements are in seconds.

<script src="//cdn.plot.ly/plotly-latest.min.js"></script>
<div id="copy-benchmarks"></div>
<script>
(function(){
  // fetch the TSV of timings and plot one trace per column
  var request = new XMLHttpRequest();
  request.open("GET", "/assets/posts/do-fuse/copy-benchmarks.tsv", true);
  request.onload = function() {
    if (request.status >= 200 && request.status < 400) {
      var payload = request.responseText.trim();
      var tsv = payload.split("\n");
      for (var i=0; i<tsv.length; i++) { tsv[i] = tsv[i].split("\t"); }
      var traces = [];
      var headers = tsv[0];
      tsv.shift(); // drop the header row, keep only the data
      Array.prototype.forEach.call(headers, function(el, idx) {
        var x = [];
        var y = [];
        for (var j=0; j<tsv.length; j++) {
          x.push(j);
          y.push(parseFloat(tsv[j][idx].replace(",", "."))); // tolerate comma decimals
        }
        traces.push({ x: x, y: y, type: "scatter", name: el, line: { width: 1, shape: "spline" } });
      });
      Plotly.newPlot("copy-benchmarks", traces, { legend: {"orientation": "h"}, height: 400, margin: { l: 40, r: 0, b: 20, t: 30, pad: 0 }, yaxis: { title: "execution time in seconds", titlefont: { size: 12 } }, xaxis: { title: "fn(i)", titlefont: { size: 12 } } });
    }
  };
  request.onerror = function() { };
  request.send(null);
})();
</script>

As far as these tests show, performance is quite stable and predictable, which
is fantastic. But this is a small test spanning only a couple of hours, so you
should not trust it completely.

### Measurement experiment 2: SQLite performance

I was unable to use the database file directly from the mounted drive, so that
is a no-go, as I suspected. Instead, I ran the code below against a local disk
just to get some baseline numbers: 1000 repetitions of DROPTABLE, CREATETABLE,
INSERTMANY (1000 records), FETCHALL and COMMIT. As you can see, the
performance of SQLite is quite amazing. You could then potentially just copy
the finished file to the mounted drive and be done with it; a sketch of that
idea follows, and after it the benchmark script itself.

```python
import time
import sqlite3
import sys

if len(sys.argv) < 4:
    print("usage: python sqlite-benchmark.py DB_PATH NUM_RECORDS REPEAT")
    exit()

def data_iter(x):
    for i in range(x):
        yield "m" + str(i), "f" + str(i*i)

header_line = "%s\t%s\t%s\t%s\t%s\n" % ("DROPTABLE", "CREATETABLE", "INSERTMANY", "FETCHALL", "COMMIT")
with open("sqlite-benchmarks.tsv", "w") as fp:
    fp.write(header_line)

start_time = time.time()
conn = sqlite3.connect(sys.argv[1])
c = conn.cursor()
end_time = time.time()
result_time = CONNECT = end_time - start_time
print("CONNECT: %g seconds" % (result_time))

start_time = time.time()
c.execute("PRAGMA journal_mode=WAL")
c.execute("PRAGMA temp_store=MEMORY")
c.execute("PRAGMA synchronous=OFF")
end_time = time.time()
result_time = PRAGMA = end_time - start_time
print("PRAGMA: %g seconds" % (result_time))

for i in range(int(sys.argv[3])):
    print("#%i" % (i))

    start_time = time.time()
    c.execute("drop table if exists test")
    end_time = time.time()
    result_time = DROPTABLE = end_time - start_time
    print("DROPTABLE: %g seconds" % (result_time))

    start_time = time.time()
    c.execute("create table if not exists test(a,b)")
    end_time = time.time()
    result_time = CREATETABLE = end_time - start_time
    print("CREATETABLE: %g seconds" % (result_time))

    start_time = time.time()
    c.executemany("INSERT INTO test VALUES (?, ?)", data_iter(int(sys.argv[2])))
    end_time = time.time()
    result_time = INSERTMANY = end_time - start_time
    print("INSERTMANY: %g seconds" % (result_time))

    start_time = time.time()
    c.execute("select count(*) from test")
    res = c.fetchall()
    end_time = time.time()
    result_time = FETCHALL = end_time - start_time
    print("FETCHALL: %g seconds" % (result_time))

    start_time = time.time()
    conn.commit()
    end_time = time.time()
    result_time = COMMIT = end_time - start_time
    print("COMMIT: %g seconds" % (result_time))

    log_line = "%f\t%f\t%f\t%f\t%f\n" % (DROPTABLE, CREATETABLE, INSERTMANY, FETCHALL, COMMIT)
    with open("sqlite-benchmarks.tsv", "a") as fp:
        fp.write(log_line)

start_time = time.time()
conn.close()
end_time = time.time()
result_time = CLOSE = end_time - start_time
print("CLOSE: %g seconds" % (result_time))
```
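
An invocation matching the usage line and the numbers above (1000 records,
repeated 1000 times; the database path here is just an example):

```bash
python sqlite-benchmark.py /root/bench.db 1000 1000
```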
| 274 | |||
| 275 | You can download [raw result here](/assets/posts/do-fuse/sqlite-benchmarks.tsv). And | ||
| 276 | again, these results are done on a local block storage and do not represent | ||
| 277 | capabilities of object storage. With my current approach and state of the test | ||
| 278 | code these can not be done. I would need to make Python code much more robust | ||
| 279 | and check locking etc. | ||
| 280 | |||
<div id="sqlite-benchmarks"></div>
<script>
(function(){
  // same plotting routine as above, pointed at the SQLite timings
  var request = new XMLHttpRequest();
  request.open("GET", "/assets/posts/do-fuse/sqlite-benchmarks.tsv", true);
  request.onload = function() {
    if (request.status >= 200 && request.status < 400) {
      var payload = request.responseText.trim();
      var tsv = payload.split("\n");
      for (var i=0; i<tsv.length; i++) { tsv[i] = tsv[i].split("\t"); }
      var traces = [];
      var headers = tsv[0];
      tsv.shift(); // drop the header row, keep only the data
      Array.prototype.forEach.call(headers, function(el, idx) {
        var x = [];
        var y = [];
        for (var j=0; j<tsv.length; j++) {
          x.push(j);
          y.push(parseFloat(tsv[j][idx].replace(",", ".")));
        }
        traces.push({ x: x, y: y, type: "scatter", name: el, line: { width: 1, shape: "spline" } });
      });
      Plotly.newPlot("sqlite-benchmarks", traces, { legend: {"orientation": "h"}, height: 400, margin: { l: 50, r: 0, b: 20, t: 30, pad: 0 }, yaxis: { title: "execution time in seconds", titlefont: { size: 12 } } });
    }
  };
  request.onerror = function() { };
  request.send(null);
})();
</script>

## Can storage be mounted on multiple machines at the same time and be writable?

Well, this one didn't take long to test, and the answer is **YES**. I mounted
the Space on two machines and measured the same performance on both. But
because a file is downloaded before a write and uploaded when the write
completes, there could be problems if another process tries to access the same
file, as sketched below.

## Observations and conclusion

Using Spaces this way makes files easier to access and manage. But beyond
that, you would need to write additional code to make it play nice with your
applications.

Nevertheless, this was extremely simple to set up and use, and it is just
another excellent product in the DigitalOcean line. I found this exercise very
valuable and am thinking about implementing some sort of mechanism for SQLite,
so data can be stored on Spaces and accessed by many VMs. For a project where
data doesn't need to be accessible in real time and can be a couple of minutes
old, this would be very interesting. If any of you find this proposal
interesting, please write in the comment box below or shoot me an email and I
will keep you posted.
