In this week’s eSkeptic, we present William Stansfield’s article from the archives of Skeptic magazine Volume 10, Number 4 in which he critiques the typing monkeys metaphor generated by Richard Hardison and Richard Dawkins as being too unlike the biological realities of natural selection.
William Stansfield is Emeritus Professor, Biological Sciences Department, California Polytechnic State University. He is the author of textbooks in genetics, evolution, molecular and cell biology, immunology, and science history. He is coauthor with R.C. King of a dictionary of genetics.
How Evolution Really Works
by William Stansfield
MICHAEL SHERMER’S ARTICLE “To Be or Not to Be a Weasel: Hamlet, Intelligent Design, and How Evolution Works” (Skeptic magazine Vol. 9, No. 4, 16–20) discussed the metaphor originally produced (independently) by Richard Hardison and Richard Dawkins, in which imaginary monkeys pound on a typewriter to produce a line from Shakespeare’s Hamlet. Using Hardison’s example, “To Be Or Not To Be,” he tacitly assumes that every letter of the alphabet has an equal opportunity of being typed each time the monkey hits a key. If we ignore the five spaces in the phrase and discount capital versus lower case letters, the probability that the 13-letter sequence “tobeornottobe” could be typed by chance is the inverse of “26 to the 13th power.” However, it is possible to use a computer program to “randomize alphabet selection until a T is drawn. Then it will be programmed to do the same for the O and continue accordingly for all desired 13 letters.” In theory, using this procedure, the phrase is expected to be completed in about 26 x 13 = 338 trials, since each letter takes an average of 26 draws. Hardison claims this procedure (which Richard Dawkins calls “ratcheted cumulative selection”) is more akin to how natural selection works than the previous nonselective method, where none of the individual letters have any adaptive value of their own apart from the complete 13-letter sequence.
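The two figures above can be checked with a short simulation. This is only a sketch of the procedure as quoted, not Hardison’s or Dawkins’ actual program; the function name `ratcheted_draws`, the seed, and the number of runs are assumptions made for illustration.

```python
import random
import string

TARGET = "tobeornottobe"
ALPHABET = string.ascii_lowercase  # 26 equally likely letters (the article's assumption)


def ratcheted_draws(target, rng):
    """Count the random draws needed when each correct letter is locked in, left to right."""
    draws = 0
    for wanted in target:
        while True:
            draws += 1
            if rng.choice(ALPHABET) == wanted:
                break  # the ratchet: keep this letter and move to the next position
    return draws


# Number of equally likely 13-letter strings; one unselected attempt succeeds 1 time in this many.
single_shot_odds = len(ALPHABET) ** len(TARGET)

# Expected draws under the ratchet: 26 per letter on average, 13 letters.
expected_draws = len(ALPHABET) * len(TARGET)

rng = random.Random(0)  # fixed seed, chosen arbitrarily so the run is repeatable
runs = 2000
average = sum(ratcheted_draws(TARGET, rng) for _ in range(runs)) / runs

print(f"1 chance in {single_shot_odds:,} without selection")
print(f"expected draws with the ratchet: {expected_draws}")
print(f"simulated average over {runs} runs: {average:.1f}")
```

The simulated average should land close to 338, whereas the unselected odds are on the order of one in 10 to the 18th power.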
Shermer’s article includes several critiques of Hardison’s method by other correspondents. R. Reece suggests that perhaps “Two be or knot too be” is “close enough for evolutionary forces to generate.” Reece thus allows 3 letters to be added to the sequence without destroying its sense when verbalized.
Bill Kittler would “allow ‘nonsense’ sequences between the correspondence elements (e.g., ‘cAStYOUsLIKE’ or even ‘AcSYOgULIzKE’)” as a developmental stage for producing the title of Shakespeare’s play AS YOU LIKE IT. All that is needed for perfection is a process for removing nonsense sequences.
All of these artificial programs are merely metaphors for the processes of organic evolution, i.e., generation of heritable variations by random mutations, recombination of homologous DNA sequences during the formation of gametes (meiosis), and preservation of the more favorable variants by differential reproductive success via natural selection.
I can’t resist the temptation to try to put these metaphors into a more biological context. Each gene consists of a unique sequence built from four kinds of nucleotides (A, T, G, and C) in a DNA molecule. Gene expression involves the transcription of the gene sequence into a kind of “mirror image” molecule called messenger RNA (mRNA). Ribosomes translate the mRNA sequence into a string of amino acids that comprise a protein molecule. The traits that a multicellular organism possesses are functions of the kind, amount, locations, and timing of the proteins made by its various cells. An average size protein chain contains about 300 amino acids in a specific sequence, like the letters of our phrase “tobeornottobe.” Some proteins consist of a single chain of amino acids; other proteins must contain two or more identical chains to be functional; still others may consist of two or more dissimilar chains (each kind of chain specified by its own gene).
There are 20 kinds of amino acids from which biological proteins are naturally made. Each amino acid is encoded by a triplet of 3 adjacent nucleotides (a codon) in DNA (and its mirror image in mRNA). There are 4 x 4 x 4 = 64 possible codon permutations. Three of these triplets are referred to as “nonsense codons.” They do not code for any amino acid. Instead they serve as signals to terminate translation of the protein chain; Kittler’s use of the term “nonsense sequences” is unfortunately inappropriate here. All of the remaining 61 possible triplets specify an amino acid and are called “sense codons.” Some amino acids have only one corresponding codon; others may have 2, 3, 4, or 6 synonymous triplets that are called “samesense” codons. Thus, there is a lot of redundancy in the genetic code, and selection cannot discriminate between samesense mutations. For example, consider a short segment of a protein chain containing 5 amino acids. One amino acid is represented in the genetic code by only one codon, another has two samesense codons, another three, another four, and another six. The number of different DNA molecules that could code for this string of 5 amino acids is 1 x 2 x 3 x 4 x 6 = 144. “Missense codons” specify amino acids that are different from those of the original DNA in which they occurred. Proteins bearing altered amino acid sequences may be harmful or beneficial (both in various degrees) or selectively neutral depending on several factors.
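The codon arithmetic in this paragraph is easy to verify by enumeration. The following is a minimal sketch; the variable names are illustrative, and the list of synonym counts is simply the hypothetical 5-amino-acid segment described above.

```python
from itertools import product
from math import prod

NUCLEOTIDES = "ATGC"

# Every ordered triplet of the four nucleotides is one possible codon.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

STOP_CODONS = 3  # the "nonsense" codons, as stated in the article
sense_codons = len(codons) - STOP_CODONS

# Hypothetical 5-amino-acid segment: number of samesense codons for each residue.
synonyms_per_residue = [1, 2, 3, 4, 6]
coding_sequences = prod(synonyms_per_residue)

print(len(codons), sense_codons, coding_sequences)  # prints: 64 61 144
```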
Most protein chains spontaneously fold back and forth on themselves into intricate patterns produced by various kinds of bonds between nonadjacent amino acids that tend to produce the energetically most stable molecules. The natural configuration of each kind of protein in a defined environment is thus unique. Each protein enzyme contains a set of amino acids in a specific configuration (called its reactive site) that can associate closely with its substrate like a “lock and key” or a “hand and glove.” The same is true for the interaction of a protein antibody and its cognate antigen or, in general, the way that any protein chains associate with one another. The recognition or reactive site may contain amino acids that are not closely linked on the same chain (or perhaps even on other chains) because of the characteristic way that a functional protein folds up into its most stable configuration. Any mutation that can increase the stability of a protein (e.g., to a wider range of temperatures, hydrogen ion concentrations, or other environmental variables), enhance its biological activity, reaction specificity, and circulation half-life, or control the expression level in an adaptive way would tend to be perpetuated in a population by natural selection.
Since there are 26 letters in the alphabet and only 20 amino acids, let’s delete the last six letters of the alphabet (uvwxyz). This will leave 20 letters, each of which can be assigned to one of the amino acids. Let us envision a segment of an ancestral protein that contains none of the 13 amino acids in our phrase “tobeornottobe.” The original protein was obviously adaptive, or else organisms could not have survived and reproduced. We can imagine that over millions of years of evolution the 13 original amino acid sites have gradually been replaced by those in our phrase. Let the order of these sites be numbered 1 to 13, left to right in the protein chain.
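A quick check confirms that the phrase survives the deletion of the last six letters, so every one of its letters can stand for an amino acid. This is a sketch only; no particular letter-to-amino-acid assignment is implied, and the names are illustrative.

```python
import string

# The 20 retained letters (a through t), one per amino acid.
AMINO_ALPHABET = string.ascii_lowercase[:20]
PHRASE = "tobeornottobe"

# Letters of the phrase that would have no amino acid assigned to them.
missing = sorted(set(PHRASE) - set(AMINO_ALPHABET))

# Number of possible 13-residue segments drawn from 20 amino acids.
possible_segments = len(AMINO_ALPHABET) ** len(PHRASE)

print(AMINO_ALPHABET, missing, f"{possible_segments:,}")
```

The `missing` list comes out empty, because the phrase uses only the letters b, e, n, o, r, and t.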
It is highly unlikely that the order with which each of these sites was replaced would be exactly the same as their linear order in the chain (as suggested by Hardison’s model). For example, the 7th letter (n) in the phrase might have been the oldest replacement, whereas the most recent replacement could be the 3rd letter (b). It is also highly unlikely that all 13 replacements in a chain of about 300 amino acids would be grouped together in one contiguous segment. It is much more likely that the 13 sites where replacements have occurred would be nonadjacent and separated by perhaps as many as 12 segments containing the original amino acids of the ancestral sequence (as suggested by Kittler). If each of the amino acid replacements made the organism in which it occurred more adaptive to the local environment in which it lived, then organisms possessing each stage of mutated protein (produced by ratcheted cumulative selection) survived and reproduced more offspring than those with the immediate ancestral sequence. However, there may be one or more “nonessential” segments of a protein chain outside of its reactive/recognition site(s) that could be modified by certain kinds of DNA missense mutations and yet have no effect on the functioning of that protein. Such a “selectively neutral” mutation would be unlikely to be retained in a population unless it was closely linked (hitchhiking) on the same DNA molecule to an otherwise well adapted nucleotide sequence.
In either event, over many generations, the frequency of the ancestral amino acids at each of the 13 sites would gradually diminish as the frequency of the more adaptive amino acid substitutions (or selectively neutral hitchhikers) increased in the population. It is unrealistic to think that any one of these adaptive replacements at any one site must spread throughout the population (gradually reducing the incidence of the ancestral amino acid at that site to a rare occurrence) before natural selection could begin the process of replacing one or more other amino acid sites. At any given time in the evolution of a protein, there may have been two or more sites in the process of being replaced by more adaptive amino acids. In the human population, for example, there currently are more than 100 amino acid substitutions known in the beta chains of normal adult human hemoglobin. Most of these variants produce a functional protein, so most people who carry them are usually unaware of it. By contrast, the full meaning (function) of the English phrase “tobeornottobe” makes no sense (is not fully functional) until all 13 letters are in their respective positions.
At the level of the gene, it may be more difficult to make the typing metaphor understandable in biological terms than it is at the level of the protein. This is partly because a string of 13 x 3 = 39 nucleotides is minimally required to code for a string of 13 amino acids. The number of mutations occurring in a gene cannot be accurately estimated from the number of amino acid substitutions in the protein product of the gene. The reason for this is that many mutations may be “samesense” and produce no change in the amino acid sequence of the protein. In addition, many genes in the nucleated cells of plants and animals have one or more coding regions (exons) interrupted by noncoding regions (introns). These “split genes” are transcribed in the normal way into mRNA molecules within the cell’s nucleus. However, the mRNAs must have their introns removed and their exons spliced together before they are released into the cytoplasm to be translated into protein chains by ribosomes. If Kittler’s “nonsense” sequences are considered to be noncoding introns, there does exist a mechanism for removing them during the nuclear processing of mRNA molecules. As long as mutations within introns do not interfere with the excision and splicing processes, we would not expect them to be subject to selection that operates at the level of the trait determined by the protein. (Introns and DNA segments between genes do not code for proteins, and used to be called “junk DNA.” Recently, however, some of this “junk” has been found to be transcribed into small RNA molecules that can have a profound influence on the behavior of cells by interacting with other RNA molecules, with DNA, with proteins, or with small chemical molecules. Our failure to recognize the significance of these microRNAs “may well go down as one of the biggest mistakes in the history of molecular biology.”)
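Treating Kittler’s lowercase “nonsense” letters as introns, the removal step can be sketched in a few lines. This is an analogy only: real splicing recognizes sequence signals at intron boundaries, not letter case, and the function name `splice` is an assumption made for illustration.

```python
def splice(transcript):
    """Remove lowercase 'intron' letters and join the uppercase 'exon' letters."""
    return "".join(ch for ch in transcript if ch.isupper())


# Kittler's two example sequences, read as unprocessed transcripts.
for pre_mrna in ("cAStYOUsLIKE", "AcSYOgULIzKE"):
    print(pre_mrna, "->", splice(pre_mrna))

# Minimum number of nucleotides needed to encode the 13-letter phrase.
min_nucleotides = 3 * len("tobeornottobe")
print(min_nucleotides)  # prints: 39
```

Both of Kittler’s sequences reduce to ASYOULIKE once the lowercase letters are excised.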
Reece’s sequence “twobeorknottoobe” introduced 3 additional letters to the standard 13-letter phrase. Messenger RNA molecules are translated by ribosomes reading three adjacent nucleotides at a time coding for each amino acid. Introducing one additional nucleotide to a gene results in mRNA molecules that have shifted their “reading frame” by one nucleotide from the point of the addition on through the remainder of the molecule. Ordinarily a frameshift mutation (single nucleotide addition or deletion) would create many “missense” mutations that specify the wrong amino acids in the protein chain. They could render the protein useless if they occurred in the reactive site of the chain or caused the chain to fold improperly. If the shifted reading frame created a nonsense triplet, translation would stop there prematurely and the truncated protein would probably not function. In a few cases it has been possible to show that two or more frameshift mutations within the sequence coding for a “nonessential region” of a protein may restore the correct reading frame. For example, one nucleotide addition and one nucleotide deletion compensate for one another to restore the correct reading frame; only the codon sequence between the two mutations would be in the wrong reading frame. Three additions (or three deletions) would also restore the reading frame. The region between the first and third mutations would be in the wrong reading frame; for 3 additions, the protein chain would be one amino acid longer than normal; for 3 deletions, the protein chain would be one amino acid shorter than normal.
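The effect of frameshifts on the reading frame can be illustrated by splitting a sequence into triplets before and after each change. This is a sketch under stated assumptions: the 18-nucleotide sequence is made up and is not a real gene, and the helper name `codons` and the chosen mutation positions are illustrative.

```python
def codons(seq):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


original = "ATGGCCAAAGGTTTCTGA"  # made-up sequence for illustration only

# One added nucleotide after the first codon shifts every downstream codon.
plus_one = original[:3] + "C" + original[3:]

# A later single deletion compensates: codons after it are back in frame.
plus_then_minus = plus_one[:10] + plus_one[11:]

# Three additions also restore the frame and leave the chain one codon longer.
plus_three = (original[:3] + "C" + original[3:6] + "C"
              + original[6:9] + "C" + original[9:])

for label, seq in [("original", original), ("+1", plus_one),
                   ("+1 then -1", plus_then_minus), ("+3", plus_three)]:
    print(f"{label:>10}: {' '.join(codons(seq))}")
```

In the compensated case only the codons between the addition and the deletion differ from the original; in the triple-addition case the final three codons match the original and the total count rises from six to seven.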
New genes can be created by bringing together, as exons of a single gene, several coding sequences that had previously specified different proteins or different structural or functional domains of the same protein, through intron-mediated recombination. This process is called exon shuffling. Each different letter of our phrase could represent a different exon in a gene assembled by exon shuffling.
Occasionally an entire gene or even longer DNA nucleotide sequence may become duplicated. The duplicate gene may then experience one or more mutations of various kinds that may allow its protein product to perform new adaptive functions. These mutational changes can evolve without diminishing the organism’s fitness because the original gene continues to carry out its normal function while the duplicate gene goes its own evolutionary way. For example, it is thought that the red-sensitive opsin (a photosensitive protein) of primates differs from that of birds because the primate version arose anew in Old World primates from a duplication of the green opsin gene. Over time, the duplicated gene accumulated mutations that made it encode a protein sensitive to red wavelengths.
Metaphors can sometimes be very useful educational tools. However, I believe that the typing-monkeys metaphors generated by Hardison and Dawkins are so unlike biological realities and the way that natural selection operates that they will only tend to confuse students, rather than help them learn. My advice to teachers is “quit monkeying around.” Just tell it like it is.