Absolutely, but that's true of most information as well. For example, the information in the article is relative to the context of understanding our language and the body of assumed knowledge and references of a reader of the New Yorker.
My understanding of biology is very limited. I've heard how physically small microbes are in every article out there. But never how information-small they are. Fascinating from a software developer's perspective.
I wonder what code golf for a virus would look like.
Like viroids, which make a virus look huge by comparison:
> Viroids are plant pathogens that consist of a short stretch (a few hundred nucleobases) of highly complementary, circular, single-stranded RNA. Viroid genomes are extremely small in size, ranging from 246 to 467 nucleotides (nt), and consisting of fewer than 10,000 atoms. In comparison, the genome of the smallest known viruses capable of causing an infection by themselves are around 2,000 nucleobases in size. The human pathogen hepatitis D virus is similar to viroids.
> Viroid RNA does not code for any protein. Their replication mechanism uses RNA polymerase II, a host cell enzyme normally associated with synthesis of messenger RNA from DNA, which instead catalyzes "rolling circle" synthesis of new RNA using the viroid's RNA as template. Some viroids are ribozymes, having catalytic properties which allow self-cleavage and ligation of unit-size genomes from larger replication intermediates.
> A ribozyme (ribonucleic acid enzyme) is an RNA molecule that is capable of catalyzing specific biochemical reactions, similar to the action of protein enzymes.
The action of ribozymes led to the RNA world hypothesis, which proposes a simple RNA-only system from which DNA and proteins could arise as later optimizations of particular aspects. Some ribozymes can go as far as catalyzing the building of their own RNA structure in the right environments (albeit with limited success so far).
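To put the quoted sizes in software terms: with four possible bases, each nucleotide carries at most 2 bits, so the raw size in bytes is roughly the nucleotide count divided by 4. A minimal sketch of that arithmetic, using only the figures from the quoted passage. The function name is made up for illustration, and the flat 2-bits-per-base assumption ignores any structure or compressibility in the sequence:

```python
import math

BITS_PER_BASE = 2  # assumption: 4 possible bases (A, C, G, U) -> log2(4) = 2 bits


def packed_size_bytes(nucleotides: int) -> int:
    """Bytes needed to store a sequence at 2 bits per base, rounded up."""
    return math.ceil(nucleotides * BITS_PER_BASE / 8)


# Nucleotide counts taken from the quoted passage above.
sizes = {
    "smallest viroid (246 nt)": 246,
    "largest viroid (467 nt)": 467,
    "smallest self-sufficient virus (~2,000 nt)": 2000,
}

for label, nt in sizes.items():
    print(f"{label}: {packed_size_bytes(nt)} bytes")
```

By this rough measure a whole viroid fits in about 62 to 117 bytes, and the smallest self-sufficient virus in about 500.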
Right. It's so context-dependent that the "hardware" of a different animal may react very differently to it - perhaps even ignoring it altogether.
The genetic information is really the 4.8 kB of "code" PLUS the entire information already contained in the cellular hardware reading it. One doesn't make sense without the other.
At the very bottom, the whole thing depends on the laws of quantum mechanics in this universe, governing the minute details of molecular interaction. That, too, should be considered part of the "code". Make a tiny change to the Planck constant, and the Zaire ebolavirus code will do something very different.
Thank you for saying that! It's so often repeated that our DNA contains the entire program for a human being. That's patently false. The cellular machinery provides almost all of the OS; DNA is just a script.
I liken DNA to a paper tape containing one of two punches: MAN or MOUSE. Feed it into a bio-replicator and get a man or a mouse. Does the paper tape define the man? Of course not.
> I liken DNA to a paper tape containing one of two punches: MAN or MOUSE
I don't think that really captures it. Yes, it requires external machinery to actually do anything, but DNA is much more information dense and carries much more of an exact definition of the organism to be produced.
Personally, I prefer the analogy of compiler source code. Sure, it can't do anything on its own. But it defines how a working external system (another compiler or a functional cellular environment) can produce a second, possibly different system.
Yet the dynamic biochemistry of the cell is orders of magnitude larger and more complex than DNA. So it's larger than a paper tape, sure, but the comparison is pretty good really.
And yet the paper tape of a plant or animal genome (as opposed to a virus) also contains the instructions for the OS and the bio-replicator, which is part of what's so fascinating about it.
I don't think that's accurate at all. The DNA has no effect on the cellular soup (the RNA etc.) that makes up the bioreactor. That you got from some ancestral Eve. It changes perhaps over time, like anything else, through random chance. But it's independent of the DNA, which is a tiny part of the whole.
No, the article does not contain more information. It contains more data, but when dealing with genomics you must keep in mind that codons translate to amino acids, which chain into various proteins, and each of those proteins serves various functions depending on its shape... and proteins can assume different shapes, which changes their effects. This is the importance of protein-folding research.
You hit near it when you gzipped the article -- consider a genome to be an incredibly compressed format, able to explode into a truly stunning amount of information, stored in a relative paucity of raw data.
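The codon step works roughly like a lookup table: each three-base codon maps to one amino acid (or a stop signal), and the resulting chain is what then folds into a protein. A toy sketch, assuming a deliberately partial slice of the standard genetic code. The function name and the example sequence are made up for illustration, and nothing here models folding, which is where most of the "expansion" happens:

```python
# Partial slice of the standard genetic code (DNA letters, one-letter amino acids).
# "*" marks a stop codon. Codons missing from this toy table come back as "X".
CODON_TABLE = {
    "ATG": "M",                                      # methionine (start)
    "TTT": "F", "TTC": "F",                          # phenylalanine
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "AAA": "K", "AAG": "K",                          # lysine
    "TGG": "W",                                      # tryptophan
    "TAA": "*", "TAG": "*", "TGA": "*",              # stop
}


def translate(dna: str) -> str:
    """Read codons three bases at a time until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    # Stepping to len(dna) - 2 ignores a trailing partial codon.
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGTTTGGCAAATGGTAA"))  # MFGKW
```

Eighteen bases in, five amino acids out. The table itself lives in the cell's machinery, not in the sequence, which is the same point made elsewhere in this thread about the "reader" carrying much of the information.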
It is simply incredible to me that it could be so small (in an information sense).
This article (gzipped) is 22KB, more than 4 times as much information.
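For anyone who wants to repeat the comparison, here is a rough sketch. It assumes the 4.8 kB genome figure cited in this thread and the 22 KB gzipped article size, both taken as multiples of 1000 bytes; the helper name and the sample text are made up. Note that a gzipped size is only a crude upper bound on information content, not a measurement of it:

```python
import gzip

GENOME_BYTES = 4_800            # the thread's 4.8 kB figure (assumed 1 kB = 1000 bytes)
ARTICLE_GZIPPED_BYTES = 22_000  # the 22 KB gzipped article size quoted above


def gzipped_size(text: str) -> int:
    """Size in bytes of the UTF-8 text after gzip compression."""
    return len(gzip.compress(text.encode("utf-8")))


ratio = ARTICLE_GZIPPED_BYTES / GENOME_BYTES
print(f"gzipped article / genome: {ratio:.1f}x")

# Try it on any text you have at hand; this sample is just repetitive filler.
sample = "The genome is small in an information sense. " * 200
print(f"raw: {len(sample.encode('utf-8'))} bytes, gzipped: {gzipped_size(sample)} bytes")
```

With those two figures the ratio comes out at about 4.6, consistent with "more than 4 times".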