author     Mitja Felicijan <mitja.felicijan@gmail.com>  2019-01-10 19:24:18 +0100
committer  Mitja Felicijan <mitja.felicijan@gmail.com>  2019-01-10 19:24:18 +0100
commit     ad974810d43e1d5f70bca269665c25230e6a3221
tree       2396d87e409379d6ad4066b7caf62729650541e4
parent     591a568ab2223f8ed79c50b53f3533858fe2e68e
charts to plot.ly
-rw-r--r--  .jekyll-metadata  bin 44635 -> 44770 bytes
-rw-r--r--  _posts/2019-01-03-encoding-binary-data-into-dna-sequence.md  70
2 files changed, 56 insertions, 14 deletions
diff --git a/.jekyll-metadata b/.jekyll-metadata
index 472f75c..32951be 100644
--- a/.jekyll-metadata
+++ b/.jekyll-metadata
Binary files differ
diff --git a/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md b/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md
index abd0164..56e96dd 100644
--- a/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md
+++ b/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md
@@ -189,12 +189,12 @@ FASTA format was extended by [FASTQ](https://en.wikipedia.org/wiki/FASTQ_format)
 
 ### PNG encoded DNA sequence
 
 | Nucleotides | RGB | Color name |
-| ------------ | ----------- | ----------- |
+| ------------ | ----------- | ---------- |
 | A (Adenine) | (0,0,255) | Blue |
 | G (Guanine) | (0,100,0) | Green |
 | C (Cytosine) | (255,0,0) | Red |
 | T (Thymine) | (255,255,0) | Yellow |
 
 With this in mind we can create a simple algorithm to create PNG representation of a DNA sequence.
 
@@ -335,12 +335,12 @@ Our freshly generated 1KB file looks something like this (its full of garbage da
 ![Sample binary file 1KB](/files/dna-sequence/sample-binary-file.png)
 
 We create following binary files:
-- 1KB
-- 10KB
-- 100KB
-- 1MB
-- 10MB
-- 100MB
+- 1KB.bin
+- 10KB.bin
+- 100KB.bin
+- 1MB.bin
+- 10MB.bin
+- 100MB.bin
 
 After this we create FASTA files for all the binary files by encoding them into DNA sequence.
 
@@ -354,13 +354,55 @@ Then we GZIP all the FASTA files to see how much the can be compressed.
 gzip -9 < 10MB.fa > 10MB.fa.gz
 ```
 
+<script src="/assets/plotly-latest.min.js"></script>
+
 **Speed of encoding binary file into FASTA format.**
 
-![Chart: encoding speed](/files/dna-sequence/chart-encoding-speed.png)
+<div id="encoding-benchmarks"></div>
+<script>
+(function(){
+  var trace1 = {
+    x: ['1KB.bin', '10KB.bin', '100KB.bin', '1MB.bin', '10MB.bin', '100MB.bin'],
+    y: [5.625224, 32.679975, 112.864416, 872.887675, 8472.693202, 85525.178217],
+    type: 'scatter',
+  };
+  var data = [trace1];
+  Plotly.newPlot("encoding-benchmarks", data, {
+    legend: {"orientation": "h"},
+    height: 300,
+    margin: { l: 50, r: 0, b: 50, t: 30, pad: 0 },
+    yaxis: { title: "execution time in milliseconds", titlefont: { size: 12 } },
+  });
+})();
+</script>
 
 **File sizes of encoded files and also GZIP-ed variations.**
 
-![Chart: file sizes](/files/dna-sequence/chart-file-sizes.png)
+<div id="size-benchmarks"></div>
+<script>
+(function(){
+  var trace1 = {
+    x: ['1KB.bin', '10KB.bin', '100KB.bin', '1MB.bin', '10MB.bin', '100MB.bin'],
+    y: [4.1, 40.7, 406.7, 4100, 40700, 406700],
+    name: 'FASTA file size',
+    type: 'bar',
+  };
+  var trace2 = {
+    x: ['1KB.bin', '10KB.bin', '100KB.bin', '1MB.bin', '10MB.bin', '100MB.bin'],
+    y: [1.4, 13, 121, 1200, 12000, 118000],
+    name: 'FASTA GZIPPED file size',
+    type: 'bar',
+  };
+  var data = [trace1, trace2];
+  Plotly.newPlot("size-benchmarks", data, {
+    legend: {"orientation": "h"},
+    height: 300,
+    margin: { l: 50, r: 0, b: 50, t: 30, pad: 0 },
+    yaxis: { title: "size in kilobytes", titlefont: { size: 12 } },
+    barmode: 'stack'
+  });
+})();
+</script>
 
 [Download ODS file with benchmarks.](/files/dna-sequence/benchmarks.ods).
 
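
The two series this commit adds to the `size-benchmarks` chart can be compared directly. The following is a minimal sketch, not part of the commit, that computes what fraction of each FASTA file remains after `gzip -9`. The helper name `gzipRatios` is hypothetical, and the numbers are copied from the `trace1` and `trace2` arrays in the diff above.

```javascript
// Values copied from the two traces of the size chart (kilobytes).
const labels = ['1KB.bin', '10KB.bin', '100KB.bin', '1MB.bin', '10MB.bin', '100MB.bin'];
const fastaKB = [4.1, 40.7, 406.7, 4100, 40700, 406700];
const gzippedKB = [1.4, 13, 121, 1200, 12000, 118000];

// Hypothetical helper: fraction of the plain size that remains after gzip,
// computed element by element for two equally long arrays.
function gzipRatios(plain, gzipped) {
  return plain.map((size, i) => gzipped[i] / size);
}

const ratios = gzipRatios(fastaKB, gzippedKB);

// Print one line per benchmark file.
labels.forEach((label, i) => {
  console.log(`${label}: gzip keeps ${(ratios[i] * 100).toFixed(1)}% of the FASTA size`);
});
```

For these values the ratio stays between roughly 29% and 34% across all six files.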