From ad974810d43e1d5f70bca269665c25230e6a3221 Mon Sep 17 00:00:00 2001
From: Mitja Felicijan
Date: Thu, 10 Jan 2019 19:24:18 +0100
Subject: charts to plot.ly

---
 ...01-03-encoding-binary-data-into-dna-sequence.md | 70 +++++++++++++++++-----
 1 file changed, 56 insertions(+), 14 deletions(-)

(limited to '_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md')

diff --git a/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md b/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md
index abd0164..56e96dd 100644
--- a/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md
+++ b/_posts/2019-01-03-encoding-binary-data-into-dna-sequence.md
@@ -189,12 +189,12 @@ FASTA format was extended by [FASTQ](https://en.wikipedia.org/wiki/FASTQ_format)
 
 ### PNG encoded DNA sequence
 
-| Nucleotides  | RGB         | Color name  |
-| ------------ | ----------- | ----------- |
-| A (Adenine)  | (0,0,255)   | Blue        |
-| G (Guanine)  | (0,100,0)   | Green       |
-| C (Cytosine) | (255,0,0)   | Red         |
-| T (Thymine)  | (255,255,0) | Yellow      |
+| Nucleotides  | RGB         | Color name |
+| ------------ | ----------- | ---------- |
+| A (Adenine)  | (0,0,255)   | Blue       |
+| G (Guanine)  | (0,100,0)   | Green      |
+| C (Cytosine) | (255,0,0)   | Red        |
+| T (Thymine)  | (255,255,0) | Yellow     |
 
 With this in mind we can create a simple algorithm to create PNG representation of a DNA sequence.
 
@@ -335,12 +335,12 @@ Our freshly generated 1KB file looks something like this (its full of garbage da
 ![Sample binary file 1KB](/files/dna-sequence/sample-binary-file.png)
 
 We create following binary files:
-- 1KB
-- 10KB
-- 100KB
-- 1MB
-- 10MB
-- 100MB
+- 1KB.bin
+- 10KB.bin
+- 100KB.bin
+- 1MB.bin
+- 10MB.bin
+- 100MB.bin
 
 After this we create FASTA files for all the binary files by encoding them into DNA sequence.
 
@@ -354,13 +354,55 @@ Then we GZIP all the FASTA files to see how much the can be compressed.
 gzip -9 < 10MB.fa > 10MB.fa.gz
 ```
+
+
 **Speed of encoding binary file into FASTA format.**
-![Chart: encoding speed](/files/dna-sequence/chart-encoding-speed.png)
+
+
 **File sizes of encoded files and also GZIP-ed variations.**
-![Chart: file sizes](/files/dna-sequence/chart-file-sizes.png)
+
+
 [Download ODS file with benchmarks.](/files/dna-sequence/benchmarks.ods).
-- 
cgit v1.2.3