From ae24d9a8869c497537839f330384cbadb2cf687c Mon Sep 17 00:00:00 2001
From: Mitja Felicijan
Date: Tue, 31 Oct 2023 10:17:43 +0100
Subject: Updated theme

---
 public/encoding-binary-data-into-dna-sequence.html | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

(limited to 'public/encoding-binary-data-into-dna-sequence.html')

diff --git a/public/encoding-binary-data-into-dna-sequence.html b/public/encoding-binary-data-into-dna-sequence.html
index bdd4543..48ce1b2 100755
--- a/public/encoding-binary-data-into-dna-sequence.html
+++ b/public/encoding-binary-data-into-dna-sequence.html
@@ -41,7 +41,7 @@ We are made of starstuff. -- Carl Sagan, Cosmos

The nucleotide in DNA consists of a sugar (deoxyribose), one of four bases (cytosine (C), thymine (T), adenine (A), guanine (G)), and a phosphate. Cytosine and thymine are pyrimidine bases, while adenine and guanine are purine bases. The sugar and the base together are called a nucleoside.

DNA

DNA (a) forms a double-stranded helix, and (b) adenine pairs with thymine and cytosine pairs with guanine. (credit a: modification of work by Jerome Walker, Dennis Myts)

Encode binary data into DNA sequence

As an input file you can use any file you want:

  • ASCII files,
  • Compiled programs,
  • Multimedia files (MP3, MP4, MKV, etc.),
  • Images,
  • Database files,
  • etc.

Note: If you copy bytes from RAM to a file, or pipe data into a file, you can encode that data as well, as long as you provide a file pointer to the encoder.

Basic Encoding

As already mentioned, the Basic Encoding is based on a simple mapping. Since DNA

@@ -143,7 +143,7 @@ making progress.

2019/01/10 00:40:09 Output image file length is 1.1 kB
2019/01/10 00:40:09 Process took 19.036117ms
2019/01/10 00:40:09 Done ...

After encoding into PNG format this file looks like this.

Encoded Quote in PNG format

The larger the input stream, the larger the PNG file.

A basic Hello World C program compiled with GCC would look like this.

// gcc -O3 -o sample sample.c
 #include <stdio.h>
@@ -178,14 +178,14 @@ like.
      --version           Show application version.
 

Benchmarks

First we generate some binary sample data with dd.

dd if=<(openssl enc -aes-256-ctr  -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt < /dev/zero) of=1KB.bin bs=1KB count=1 iflag=fullblock
 

Our freshly generated 1KB file looks something like this (it's full of garbage data, as intended).

Sample binary file 1KB

We create the following binary files:

  • 1KB.bin
  • 10KB.bin
  • 100KB.bin
  • 1MB.bin
  • 10MB.bin
  • 100MB.bin
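
The sample files listed above could also be produced with a short script. The following is a minimal sketch, not the method used for the benchmarks: it assumes decimal sizes (GNU dd treats `KB` as 1000 bytes) and uses `os.urandom` in place of the openssl keystream. The names `SIZES`, `write_random_file` and `generate_samples` are hypothetical.

```python
import os

# Assumed sizes in bytes, decimal units to match GNU dd's "KB" = 1000 bytes.
SIZES = {
    "1KB.bin": 1_000,
    "10KB.bin": 10_000,
    "100KB.bin": 100_000,
    "1MB.bin": 1_000_000,
    "10MB.bin": 10_000_000,
    "100MB.bin": 100_000_000,
}


def write_random_file(path, size, chunk=1 << 20):
    """Write `size` bytes of OS-provided random data to `path`, in chunks."""
    remaining = size
    with open(path, "wb") as fh:
        while remaining > 0:
            n = min(chunk, remaining)
            fh.write(os.urandom(n))
            remaining -= n


def generate_samples(directory=".", sizes=SIZES):
    """Create one random file per entry in `sizes` inside `directory`."""
    for name, size in sizes.items():
        write_random_file(os.path.join(directory, name), size)
```

Calling `generate_samples()` writes all six files to the current directory; the 100MB file is written in 1 MiB chunks so memory use stays small.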

After this, we create FASTA files for all the binary files by encoding them into DNA sequences.

./dnae-encode -i 100MB.bin -o 100MB.fa
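
To make the encoding step more concrete, here is a minimal sketch of a two-bits-per-nucleotide encoder that emits FASTA text. The mapping (00→A, 01→C, 10→G, 11→T), the header text, and the 60-character line width are assumptions for illustration; the actual mapping and output layout of `dnae-encode` are not shown in this excerpt.

```python
# Assumed mapping: index 0b00 -> A, 0b01 -> C, 0b10 -> G, 0b11 -> T.
BITS_TO_BASE = "ACGT"


def encode_bytes(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bits first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def decode_bases(seq: str) -> bytes:
    """Inverse of encode_bytes; expects a length that is a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BITS_TO_BASE.index(base)
        out.append(byte)
    return bytes(out)


def to_fasta(data: bytes, header: str = "encoded", width: int = 60) -> str:
    """Wrap the encoded sequence into a FASTA record with fixed-width lines."""
    seq = encode_bytes(data)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"
```

With this mapping, `encode_bytes(b"\x1b")` gives `ACGT`, and `decode_bases` recovers the original bytes. Every input byte becomes four characters, so the sequence is four times the input size before compression.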
 

Then we gzip all the FASTA files to see how much they can be compressed.

gzip -9 < 10MB.fa > 10MB.fa.gz
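
The compression ratio can also be measured from a script. This is a rough sketch using Python's standard `gzip` module at level 9, not the tooling behind the benchmark figures; `gzip_ratio` and `gzip_ratio_of_file` are hypothetical helper names.

```python
import gzip


def gzip_ratio(data: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    return len(gzip.compress(data, compresslevel=level)) / len(data)


def gzip_ratio_of_file(path: str, level: int = 9) -> float:
    """Same as gzip_ratio, but reads the whole file into memory first."""
    with open(path, "rb") as fh:
        return gzip_ratio(fh.read(), level)
```

For example, `gzip_ratio_of_file("10MB.fa")` returns the fraction of the original size left after compression. A sequence over a four-letter alphabet carries at most two bits per character, so for random input a ratio somewhat above 0.25 would be expected.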

Encode to FASTA

The speed increase that occurs when encoding to FASTA format.

File sizes

Size of the output file after encoding.

Download CSV file with benchmarks.
