So I'm sure you've registered my objection that the word 'Information' refers to a human record of something going on in the universe. The real issue is that there are complex codes and relationships in the universe... that are there for us to observe! So whether 'information' is the *enumeration* of complex parts of the universe or the complex parts themselves - let's talk about the complex parts. Back to the **ZEBRAFISH**.

There are 1700 million bases in the zebrafish genome (a little over half the size of the human genome). To recap: each base can be Adenine, Thymine, Guanine or Cytosine. Therefore 1700 million bases can encode a vast amount of 'information'.

QUESTION - how much information?

Well, we could describe it by the minimum number of binary digits that would be required. Four possible bases need two binary digits each, e.g. A=10, C=01, T=00, G=11 ... => 2 bits/base position
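To make the 2 bits/base idea concrete, here is a minimal sketch in Python. The bit assignments are the ones given above; the function names `encode` and `decode` are just my own labels for illustration.

```python
# Bit assignments as given in the post: A=10, C=01, T=00, G=11
CODE = {"A": "10", "C": "01", "T": "00", "G": "11"}
DECODE = {bits: base for base, bits in CODE.items()}


def encode(seq):
    """Return the sequence as a string of binary digits, 2 per base."""
    return "".join(CODE[base] for base in seq.upper())


def decode(bits):
    """Turn a string of binary digits back into bases, 2 digits at a time."""
    return "".join(DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


example = "GATTACA"          # hypothetical 7-base sequence
example_bits = encode(example)
print(example_bits, len(example_bits), "bits for", len(example), "bases")
```

Any sequence of N bases comes out as exactly 2N binary digits, and `decode` recovers the original sequence, so nothing is lost at 2 bits per position.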

But, going by 'information theory', we could also have calculated this from Shannon's informational entropy:

*H = -Σ p(i) × log2 p(i)*, summed over all nucleotides i.

If we assume each nucleotide occurs with roughly equal frequency, i.e. p(A) = 0.25, p(T) = 0.25, etc., this equation resolves to:

*H (for a base position) = 4 × [ -0.25 × log2(0.25) ] = 2 bits.*

=> The Zebrafish genome has an information capacity of 3400 million bits or 425 Megabytes (2/3 of a CD-ROM).
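For anyone who wants to check the arithmetic, here is a rough sketch of the entropy and capacity calculation in Python. The equal-frequency assumption and the 1700 million figure are from above; the skewed frequencies are made up by me purely to show that unequal frequencies give less than 2 bits per base.

```python
import math


def shannon_entropy(probs):
    """H = -sum(p * log2(p)) over all symbols, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Equal frequencies for A, T, G, C, as assumed in the post
bits_per_base = shannon_entropy([0.25, 0.25, 0.25, 0.25])

genome_bases = 1_700_000_000              # 1700 million bases
capacity_bits = bits_per_base * genome_bases
capacity_megabytes = capacity_bits / 8 / 1_000_000

print(bits_per_base, "bits per base")
print(capacity_bits / 1_000_000, "million bits")
print(capacity_megabytes, "megabytes")

# Hypothetical skewed frequencies (not measured from any genome)
skewed_bits_per_base = shannon_entropy([0.4, 0.4, 0.1, 0.1])
print(skewed_bits_per_base, "bits per base with skewed frequencies")
```

The 2 bits/base figure is a ceiling: it is only reached when all four bases are equally likely and each position is independent of the others.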

Redundant information

A random sequence of DNA 1700 million bases long would also have an 'information capacity' of 425 megabytes. However, there is most certainly a complex structure in the arrangement of the Zebrafish genome that would be missing from the hypothetical random sequence. The Zebrafish genome is arranged in chromosomes, genes, promoters, start codons, stop codons, exons and introns - complex patterns. Have a look for yourself.

By the very fact that there are patterns, the sequence is not random. BUT remember - even complex patterns are more predictable than random sequence - therefore the Zebrafish genome contains quite a bit of redundant, 'predictable' sequence. In information theory terms, this redundancy, this reduced 'surprise factor', means the genome falls short of using its maximal information capacity.

Granted, it is nowhere near as simple or redundant as 1700 million 'A' nucleotides in a row {see how such a hypothetical 'All-A' system is so 'ordered', so 'simple' or so low-in-information that I can describe it in one line of text} - but the Zebrafish genome is also not so unpredictable/information-dense as to be totally random.
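One rough way to see the two extremes for yourself is to hand both sequences to a general-purpose compressor. This is only a sketch: zlib is my own stand-in for 'how short a description can we get away with', not a true measure of Shannon entropy, and the 100,000-base length is an assumption chosen so it runs quickly rather than the full 1700 million.

```python
import random
import zlib

N_BASES = 100_000  # assumption: a small stand-in for 1700 million bases

random.seed(0)  # fixed seed so repeated runs use the same random sequence
all_a = "A" * N_BASES
random_seq = "".join(random.choice("ACGT") for _ in range(N_BASES))


def compressed_size(seq):
    """Number of bytes zlib (level 9) needs to store the sequence."""
    return len(zlib.compress(seq.encode("ascii"), 9))


all_a_size = compressed_size(all_a)
random_size = compressed_size(random_seq)

print("All-A :", all_a_size, "bytes")
print("Random:", random_size, "bytes")
print("Size at 2 bits/base:", N_BASES * 2 // 8, "bytes")
```

The All-A sequence collapses to almost nothing, while the random sequence cannot be squeezed below roughly 2 bits per base. A real genome, fed through the same function, would be expected to land somewhere between the two, though how far a simple compressor like zlib can exploit its patterns is a separate question.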

Fascinatingly complex natural systems like genomes, earthquakes and traffic jams are often characterised by fractal patterning called '1/f' or 'pink noise', rather than being totally random, which is called 'white noise'.

An old friend

Now I'm sure it's obvious to us all that there are a number of special things about the Zebrafish genome - things that a random sequence would be useless at. Making Zebrafish, for example...

So to ask your question for you - and, if I may, speaking teleologically - some DNA makes people, some makes fish. Where do these *useful* biological codes come from ...

**Edited by pantrog, 26 June 2005 - 10:58 AM.**