Second-order information encoded in DNA

Jonathan Widom is one of the scholars who has made a remarkable discovery about information encoded within DNA. There is, he tells us, an additional layer of information expressed in our DNA, which controls the three-dimensional organization of that DNA.

Our DNA is a long double helix, a string of millions of base pairs that encode proteins. To fit into our cells, these long strings wrap around themselves: short stretches (about 150 letters) wind around a “protein spool” to form a nucleosome, and this happens millions of times along the length of a strand. It matters where these nucleosomes form. Control proteins need to bind to the DNA, but they can’t bind where the DNA is wrapped in a nucleosome. So the sequences not bound up in nucleosomes are the ones control proteins are more likely to reach.

This may help explain how the same DNA allows us to create very different cells, like nerve and muscle cells. We know that cells express different proteins to get their unique identity, and which proteins get expressed depends on these control proteins. “The genomic landscape helps govern which genes get expressed due to nucleosome development.”

It’s hard to bend DNA into a nucleosome, because it’s a very tight bend. Having certain letters in certain positions makes it easier for the DNA to bend: having the “backbone” of the DNA face in makes curvature easier. Sequences of AA, TT or TA in one phase and GC in another phase make bending easier. This code of “bendability” is “superimposed” on the genetic information.
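To make the idea of a code “in phase” a little more concrete, here is a minimal Python sketch that tallies where AA/TT/TA and GC steps fall relative to a repeating period. The 10-letter period (roughly one turn of the helix), the function name `phase_counts` and the toy sequence are all my assumptions for illustration, not figures or data from the talk.

```python
from collections import Counter

AT_STEPS = {"AA", "TT", "TA"}  # letter pairs the talk associates with one phase
GC_STEPS = {"GC"}              # letter pair associated with the other phase

def phase_counts(seq, period=10):
    """Count AA/TT/TA and GC steps at each offset modulo `period`.

    `period=10` is an assumed value (about one helical turn), not a
    number quoted in the talk.
    """
    seq = seq.upper()
    at = Counter()
    gc = Counter()
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        if step in AT_STEPS:
            at[i % period] += 1
        elif step in GC_STEPS:
            gc[i % period] += 1
    return at, gc

# Made-up toy sequence: "AA" recurs every 10 letters, "GC" sits 5 letters later.
toy = "AACCCGCCCC" * 3
at, gc = phase_counts(toy)
print(dict(at), dict(gc))
```

In a sequence that follows the pattern, the two kinds of steps pile up at different offsets; in a random sequence they would be spread roughly evenly across all offsets.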

Widom and colleagues have tested this theory both by looking at a huge (5 x 10^12 unique sets) sample of DNA molecules, and by looking at actual nucleosomal DNA from living organisms. There’s a strongly periodic graph of AA/TT/TA sequences, with GC sequences out of phase, in nucleosomes, and this pattern repeats in yeast and in chicken cells, two organisms that are not closely related.
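For readers wondering what goes into a graph like that, here is a rough Python sketch, under my own assumptions: take a set of nucleosome-bound sequences lined up to the same length, and at each position record the fraction of them that have AA, TT or TA starting there. The function name `dinucleotide_profile` and the four short toy sequences are invented for illustration; real nucleosomal sequences are about 150 letters long, and this is not Widom's actual analysis pipeline.

```python
def dinucleotide_profile(aligned, steps=("AA", "TT", "TA")):
    """Fraction of aligned sequences with one of `steps` starting at each position."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must all have the same length")
    profile = []
    for i in range(length - 1):
        hits = sum(1 for s in aligned if s[i:i + 2].upper() in steps)
        profile.append(hits / len(aligned))
    return profile

# Made-up toy data, NOT real nucleosome sequences.
toy_set = ["AACGGCCGAACG", "TTCGGCCGTTCG", "TACGCCCGTACC", "CCCGGCCGAACG"]
profile = dinucleotide_profile(toy_set)
for position, fraction in enumerate(profile):
    print(position, fraction)
```

Plotting the fraction against position for real data is what would show peaks recurring at regular intervals, if the periodic pattern is there.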

Why does this matter? It’s a mystery how transcription factors know to bind to certain sequences and not to others when all they have to go on is the chemical sequence. The answer is that the three-dimensional geography of the genome helps explain it: “The ones not supposed to be recognized are occluded in nucleosomes with high probability whereas the ones supposed to be recognized are not occluded with high probability.”

Apologies for everything I’ve gotten wrong in this talk – it was quite a technical talk and not in a field I know anything about. Chris Anderson tried to jump in about ten minutes in and get Dr. Widom to talk about some of the larger implications of this research… with little success. Clearly this is a hugely important discovery, but I suspect it’s pretty hard for most people at TED to understand just what this means for science in the long run. (Biologists in my readerbase, please feel free to use the comment thread and set us all straight…)

This entry was posted in TED2007.

One Response to Second-order information encoded in DNA

  1. quixote says:

    (Well, I’m a biologist, but not all that close to the type of molecular biology you’re talking about. Here goes anyway, though.)

    The relationship between DNA packaging, regulation of gene expression, and transcription has been known for some time. Widom’s contribution is, I think, more the discovery that the sequences necessary for “bendability” occur in specific patterns. Which is very interesting.

    The significance of the research on this topic is that once we really understand DNA packaging and gene regulation, we’ll be able to turn genes on and off. Ultimately, that could mean regrowing limbs, or damaged spinal nerves, or Islets of Langerhans, or, in fact, just about anything.

    Exciting times.
