THE EVOLUTION LIST: New Definitions of a "Gene"

Thursday, October 12, 2006

AUTHOR: RPM

SOURCE: Evolgen

COMMENTARY: Allen MacNeill

RPM at Evolgen has a new post about evolving definitions of a "gene". Here are my thoughts on the subject:

For years I have been teaching my students that a gene is a segment of DNA that codes for a single RNA molecule with a complementary sequence, regardless of whether that RNA molecule is translated or not. This definition takes into account the genes for the various rRNAs and tRNAs, which are not translated, and also other forms of non-translated RNA that have recently been discovered. By this definition, genes that code for mRNAs that are actually translated are distinguished as "structural genes," using terminology that was first developed to describe the Jacob-Monod model of the lactose operon. Using this same terminology, the gene that codes for the lactose repressor protein is a "regulatory gene," insofar as the repressor does not function in an "extrinsic" biochemical pathway, but rather participates in the regulation of other structural genes.

However, the distinction between "structural" and "regulatory" genes outlined above is insufficient to describe the various kinds of genetically significant DNA sequences now known. For example, it does not include regions of the DNA to which protein regulators bind, but which are not themselves transcribed. It also does not distinguish between sequences whose transcripts are translated into proteins (either enzymes or repressor/regulator proteins) and sequences that are transcribed into RNA but never translated (such as those for rRNA, tRNA, and the newer non-translated RNAs).

Given the foregoing, it appears to me that there are four (possibly five) functionally different kinds of DNA coding sequences:

(1) translatable sequences: those DNA sequences that are both transcribed into mRNA and later translated into proteins, regardless of function (these can be further subdivided into proteins that participate in non-DNA related biochemical pathways and those that directly regulate DNA, but those seem to me to be classifications of the proteins, not the DNA sequences that code for them);

(2) transcribable sequences: those DNA sequences that are transcribed into RNA (e.g., rRNA and tRNA), but are not later translated into proteins/polypeptide chains. Again, what the RNAs do after being transcribed is not a function of the DNA, but rather of the RNAs, and therefore should not really be used to classify DNA coding sequences;

(3) binding sequences: those DNA sequences that are neither transcribed into RNA nor translated into protein, but which function as binding sites for regulatory molecules such as repressor proteins, homeotic gene products, etc. While such sequences do not code for the production of a transcribed or translated gene product, they still participate in the regulation of other genes by serving as regulatory binding sites; and

(4) non-binding sequences: those DNA sequences that are not transcribed into RNA, not translated into protein, and do not function as binding sites for regulatory molecules. Such sequences would include highly repetitive sequences, tandem repeats, "spacer DNA", pseudogenes, retroviral and transposon inserts (both "dead" and potentially "alive"), etc. This latter category could be further subdivided into "functional" non-coding/non-binding DNA sequences versus "non-functional/parasitic" non-coding/non-binding DNA sequences, depending on whether they arise as part of the functional architecture of the DNA (primarily of eukaryotes), or whether they arise as side-effects of the action of parasitic genetic elements, such as retroviruses or transposons.
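For readers who think in code, the four categories above can be read as a small decision procedure over three yes/no properties of a sequence. What follows is only a rough sketch of that reading, not a bioinformatics tool: the names `SequenceKind`, `DnaSequence`, and `classify` are hypothetical, the example entries are illustrative labels rather than annotated data, and the rule that transcription takes precedence over binding is an assumption made to keep the categories mutually exclusive.

```python
from dataclasses import dataclass
from enum import Enum


class SequenceKind(Enum):
    """The four categories from the list above (names are hypothetical)."""
    TRANSLATABLE = "translatable"      # (1) transcribed and translated
    TRANSCRIBABLE = "transcribable"    # (2) transcribed, never translated
    BINDING = "binding"                # (3) untranscribed regulatory binding site
    NON_BINDING = "non-binding"        # (4) none of the above


@dataclass(frozen=True)
class DnaSequence:
    """A DNA sequence described only by the three properties the scheme uses."""
    name: str
    transcribed: bool
    translated: bool
    binds_regulator: bool


def classify(seq: DnaSequence) -> SequenceKind:
    """Assign a sequence to one of the four categories.

    Assumption: a transcribed sequence is classified by its transcript,
    even if it also contains a regulatory binding site.
    """
    if seq.translated and not seq.transcribed:
        # Translation requires an RNA intermediate, so this combination is invalid.
        raise ValueError(f"{seq.name}: cannot be translated without being transcribed")
    if seq.transcribed:
        return SequenceKind.TRANSLATABLE if seq.translated else SequenceKind.TRANSCRIBABLE
    return SequenceKind.BINDING if seq.binds_regulator else SequenceKind.NON_BINDING


# Illustrative examples only; the property values are simplifications.
EXAMPLES = [
    DnaSequence("lacZ", transcribed=True, translated=True, binds_regulator=False),
    DnaSequence("tRNA gene", transcribed=True, translated=False, binds_regulator=False),
    DnaSequence("repressor binding site", transcribed=False, translated=False, binds_regulator=True),
    DnaSequence("tandem repeat", transcribed=False, translated=False, binds_regulator=False),
]

if __name__ == "__main__":
    for seq in EXAMPLES:
        print(f"{seq.name}: {classify(seq).value}")
```

The sketch forces every sequence into exactly one category, which makes the scheme easy to state but hides any overlap between categories, for instance a binding site that lies inside a transcribed region.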

There may be other categories of DNA sequences that have other functions, but right now I can't think of any. Therefore, this is how I intend to teach the concept of a "gene" to my students at Cornell from now on.

So much for the Beadle/Tatum "one gene, one enzyme" model, eh? And the classical Mendelian definition of "one gene, one phenotypic trait" is no longer viable either...

--Allen

Labels: biology, gene definitions, genetics, junk DNA, non-coding DNA, non-transcribed DNA, regulatory DNA

7 Comments:

At 10/13/2006 09:51:00 AM, Anonymous said...: And of course the One Gene -> One RNA definition is one that has been overturned with the elucidation of differential gene expression pathways in eukaryotic systems.

It really is an exciting time to be a biologist.
At 10/15/2006 02:52:00 PM, Anonymous said...: Has Dembski banned you yet? They've been banning scientists like crazy lately, David Heddle and Tom English being the two most recent.

(You might find Heddle interesting. He's an ex-physicist ID supporter who recently read Dembski's work and declares him to be a fraud.)
At 10/17/2006 09:37:00 AM, Alan Fox said...: BTW, one of John's pet refrains is "It is not the genes but the chromosomes that do the evolving" which seems to show a basic lack of understanding about the genetic code and how genetic information is stored and inherited.
At 10/17/2006 10:09:00 PM, Allen MacNeill said...: Neurobiologists have shown that humans, along with other mammals, have quite a few more than six senses. The skin alone has at least five: heat, cold, light touch, pressure, and pain. The eyes have at least two: light intensity and color/fine discrimination. The ears and semicircular canals have another three: sound, body movement in space, and gravity, and so forth. This list doesn't include the many receptors for internal physiological processes, such as blood pressure, blood glucose concentration, etc.
At 10/18/2006 11:09:00 AM, Anonymous said...: "Using this same terminology, the gene that codes for the lactose repressor protein is a "regulatory gene," insofar as the repressor does not function in an "extrinsic" biochemical pathway, but rather participates in the regulation of other structural genes."

I've used that and have come to dislike the term as an unacceptable shorthand, at least for describing things to people just starting off.

Regarding the lacI, I'd call it a gene that *codes* for a regulatory protein. In this case, the DNA sequence itself does not actively regulate the lac operon. Additionally, there are proteins that participate directly in metabolic pathways or have roles as structural elements in the cell that also have direct roles in gene expression. Thus the relationship is "one to many" (one protein, multiple functions) and things don't pigeonhole nicely into the structure/regulation dichotomy.

How about this: A gene corresponds to a transcribed sequence (i.e. DNA to RNA or RNA to RNA), that influences phenotypic traits in an organism, either directly as RNA or indirectly via translation to protein. Associated with the transcribed sequence are structural and regulatory regions that directly influence transcription of the gene.

"Given the foregoing, it appears to me that there are four (possibly five) functionally different kinds of DNA coding sequences:"

There is overlap here as well... Actively transcribed sequences can also act structurally as spacers, and regulatory binding sites are also found inside transcribed sequences.
At 10/18/2006 04:38:00 PM, Anonymous said...: johnadavison: don't worry about being banned. I'd have thought you'd be used to it by now. They're all crazy anyway.

Zero: mankind has many more than 6 senses, as Allen says. Why don't they generate an axis of symmetry indicating something deep also? Snowflakes are symmetrical also, and there are many more of them than humans. I've read your posts over at ATBC. Get to the point please :)
At 11/13/2013 04:44:00 PM, Unknown said...: If binding sites for regulators can be considered a gene of sorts, could the shape of the chromosome itself be considered a form of gene? Can shape dictate function, much like nucleotide sequence does?
