Cleaned up some content

author: Mitja Felicijan <mitja.felicijan@gmail.com> 2020-03-25 05:19:49 +0100
committer: Mitja Felicijan <mitja.felicijan@gmail.com> 2020-03-25 05:19:49 +0100
commit: f7eefe654a8eb27b4ac2ac10c033cbdfa85af567 (patch)
tree: c42431d9b175a0d66e05393b7843361fd726acba /content/encoding-binary-data-into-dna-sequence.md
parent: ed161e7fb20a697ecba070ef7db4c231d700f245 (diff)
1 files changed, 11 insertions, 11 deletions
diff --git a/content/encoding-binary-data-into-dna-sequence.md b/content/encoding-binary-data-into-dna-sequence.md
index a4f8b86..068aa32 100644
--- a/content/encoding-binary-data-into-dna-sequence.md
+++ b/content/encoding-binary-data-into-dna-sequence.md
@@ -39,7 +39,7 @@ My interests in this field are purely in encoding processes and experimental tes
 ## Data encoding
-**TL;DR:** Encoding involves the use of a code to change original data into a form that can be used by an external process [^1].
+**TL;DR:** Encoding involves the use of a code to change original data into a form that can be used by an external process.
 Encoding is the process of converting data into a format required for a number of information processing needs, including:
@@ -47,7 +47,7 @@ Encoding is the process of converting data into a format required for a number o
 - Data transmission, storage and compression/decompression
 - Application data processing, such as file conversion
-Encoding can have two meanings[^1]:
+Encoding can have two meanings:
 - In computer technology, encoding is the process of applying a specific code, such as letters, symbols and numbers, to data for conversion into an equivalent cipher.
 - In electronics, encoding refers to analog to digital conversion.
@@ -69,7 +69,7 @@ Encoding can have two meanings[^1]:
 - **2000** – Genetic code of the fruit fly is decoded.
 - **2002** – Mouse is the first mammal to have its genome decoded.
 - **2003** – The Human Genome Project is completed.
-- **2013** – DNA Worldwide and Eurofins Forensic discover identical twins have differences in their genetic makeup [^2].
+- **2013** – DNA Worldwide and Eurofins Forensic discover identical twins have differences in their genetic makeup.
 ## What is DNA?
@@ -83,7 +83,7 @@ The nucleotide in DNA consists of a sugar (deoxyribose), one of four bases (cyto
 ![DNA](/assets/dna-sequence/dna-basics.jpg#center)
-*DNA (a) forms a double stranded helix, and (b) adenine pairs with thymine and cytosine pairs with guanine. (credit a: modification of work by Jerome Walker, Dennis Myts) [^3]*
+*DNA (a) forms a double stranded helix, and (b) adenine pairs with thymine and cytosine pairs with guanine. (credit a: modification of work by Jerome Walker, Dennis Myts)*
 ## Encode binary data into DNA sequence
@@ -135,13 +135,13 @@ begin
 end
 ```
-Another encoding would be **Goldman encoding**. Using this encoding helps with Nonsense mutation (amino acids replaced by a stop codon) that occurs and is the most problematic during translation because it leads to truncated amino acid sequences, which in turn results in truncated proteins. [^4]
+Another encoding would be **Goldman encoding**. Using this encoding helps with Nonsense mutation (amino acids replaced by a stop codon) that occurs and is the most problematic during translation because it leads to truncated amino acid sequences, which in turn results in truncated proteins.
 [Where to store big data? In DNA: Nick Goldman at TEDxPrague](https://www.youtube.com/watch?v=a4PiGWNsIEU)
 ### FASTA file format
-In bioinformatics, FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede the sequences. The format originates from the FASTA software package, but has now become a standard in the field of bioinformatics. [^5]
+In bioinformatics, FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede the sequences. The format originates from the FASTA software package, but has now become a standard in the field of bioinformatics.
 The first line in a FASTA file started either with a ">" (greater-than) symbol or, less frequently, a ";" (semicolon) was taken as a comment. Subsequent lines starting with a semicolon would be ignored by software. Since the only comment used was the first, it quickly became used to hold a summary description of the sequence, often starting with a unique library accession number, and with time it has become commonplace to always use ">" for the first line and to not use ";" comments (which would otherwise be ignored).
@@ -339,8 +339,8 @@ gzip -9 < 10MB.fa > 10MB.fa.gz
 ## References
-[^1]: https://www.techopedia.com/definition/948/encoding
+- https://www.techopedia.com/definition/948/encoding
-[^2]: https://www.dna-worldwide.com/resource/160/history-dna-timeline
+- https://www.dna-worldwide.com/resource/160/history-dna-timeline
-[^3]: https://opentextbc.ca/biology/chapter/9-1-the-structure-of-dna/
+- https://opentextbc.ca/biology/chapter/9-1-the-structure-of-dna/
-[^4]: https://arxiv.org/abs/1801.04774
+- https://arxiv.org/abs/1801.04774
-[^5]: https://en.wikipedia.org/wiki/FASTA_format
+- https://en.wikipedia.org/wiki/FASTA_format