Heritability studies suggest a substantial genetic component to many aspects of physical capacity (2). Although none of the genes that determine the difference between low and high capacity have yet been identified, recent developments in genetic theory and molecular tools have accelerated the attainment of this goal. The central purpose of this review is to describe the development and use of animal genetic models for determining biologic cause and effect via cosegregation studies. Rat genetic models of aerobic treadmill running capacity will be the primary trait discussed.
From the genetic perspective, traits can be classified as either Mendelian or quantitative. Mendelian traits are those for which a genetic difference at a single locus is sufficient to cause a difference in phenotypic expression of a given character. Quantitative traits do not manifest as discrete phenotypes in populations, but distribute with continuous variation. Continuous variation is the result of the variable presence and expression of many genes (i.e., polygenic) within a population as they interact with the environment. Many physical traits of interest, such as height, body weight, blood pressure, coordination, and strength, are examples of quantitative traits. The inherent complexity of quantitative traits enhances the importance of using a unified conceptual and analytical approach as a starting point for genetic analysis, as given by the expression (4) Yx = μx + Gx + Ex + (GE)x + Errorx, where: Yx is the value for trait x, μx is the population mean for trait x, Gx is the genetic variation for trait x, Ex is the environmental variation for trait x, (GE)x is the genetic-environmental interaction for trait x, and Errorx represents the random and systematic errors in measure of trait x.
Animal models in which genetic and environmental variations approach a minimum are of substantial value. Animal models first selectively bred for contrasting low and high measures of a trait and then subsequently developed into inbred strains are the most useful. An inbred strain is defined as one that has been exclusively sister-brother mated for at least 20 generations. Each generation of inbreeding increases the probability of attaining homozygosity at any given genetic locus; after 20 generations, approximately 97.5% of the loci are homozygous. The major value of inbred strains emanates from their close genetic uniformity that facilitates genotyping, phenotyping, and the opportunity for multiple investigators to evaluate the same genetic substrate repeatedly; this type of uniformity cannot be approached in human studies.
COSEGREGATION AS A GENERAL APPROACH TO EVALUATING GENETIC CAUSATION
Animal strains widely divergent for a trait and methods of genetic analysis merge to provide a pathway for determining genes that cause natural trait variation within populations. The essential element is to evaluate the association of genes or downstream gene products (such as mRNA, protein, or subordinate physiological traits) with values of the phenotype of interest. This path works because genetic variation causes trait variation, not vice versa. In a segregating population in which alleles (alternate forms of the same gene) recombine randomly to yield new genotypes, genes that cause a given trait will remain associated with that trait and other genes will segregate randomly with respect to the trait of interest. The most informative segregating population is produced from two sequential breeding crosses of inbred strains (Fig. 1). The first cross is between two different inbred parental strains (P1 and P2) that contrast widely for a trait and presumably have contrasting alleles that dictate the difference in that trait. These original contrasting strains are often referred to as the “low strain” and “high strain” to indicate the difference in trait measure. The P1 × P2 cross produces F1 (first filial) animals that are all essentially identical heterozygotes. An F1 × F1 cross yields a segregating F2 (second filial) population in which allelic variants recombine randomly, producing a wide distribution of genotypes and thus a wide distribution of phenotypic trait values.
A simple example of inheritance based on a single Mendelian trait can lead to understanding cosegregation when extended to the use of genetic markers in the polygenic condition of a quantitative trait. Consider the existence of two allelic variants (A1 and A2) that produce enough variation in a phenotype that the two homozygotic (A1A1 and A2A2) and the singular heterozygotic (A1A2) genotypes can be distinguished by measurement of the trait of interest. Presume the low strain (P1) has the homozygous genotype A1A1 and that the high strain (P2) has the homozygous genotype A2A2. A mating between P1 and P2 will produce only A1A2 heterozygotes in the F1 offspring. That is:
P1 (A1A1) × P2 (A2A2) → F1 (all A1A2)
When F1 hybrids are intercrossed, a segregating population will be produced that has all possible genotypes in the ratio of 1:2:1 (1A1A1, 2A1A2, 1A2A2):
F1 (A1A2) × F1 (A1A2) → F2 (1 A1A1 : 2 A1A2 : 1 A2A2)
Thus, for a Mendelian trait of large effect, segregation of the phenotype can be followed in an F2 population. That is, the phenotypes of the homozygotes (A1A1 and A2A2) would segregate and display the low and high values for the trait, whereas the heterozygotes would display values in between the low and high.
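The two crosses described above can be checked by enumerating every equally likely pairing of one gamete from each parent. The following is a minimal sketch for a single locus with two alleles; the function name `cross` is an illustrative choice, not part of any published software.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Return offspring genotype counts for a single-locus cross.

    Each parent is a tuple holding its two alleles, e.g. ("A1", "A2").
    Every pairing of one gamete from each parent is treated as equally likely.
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that A1A2 and A2A1 are counted as the same heterozygote.
        genotype = "".join(sorted((allele1, allele2)))
        offspring[genotype] += 1
    return offspring


# P1 (low strain, A1A1) x P2 (high strain, A2A2) gives the F1 generation.
f1 = cross(("A1", "A1"), ("A2", "A2"))
# F1 x F1 gives the segregating F2 generation.
f2 = cross(("A1", "A2"), ("A1", "A2"))

print(dict(f1))  # only A1A2 heterozygotes
print(dict(f2))  # A1A1 : A1A2 : A2A2 in the ratio 1 : 2 : 1
```

The same enumeration extends to a marker locus, which is why marker genotypes in an F2 population are also expected in a 1:2:1 ratio.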
Genetic markers are identifiable physical locations on a chromosome (loci) whose inheritance can be followed similarly as described above for Mendelian traits. Genetic markers are of fundamental value because they can be used to make linkage maps of chromosomes for genetic analysis. The most widely used markers are short tandem repeated sequences of DNA such as CACACACACACACACA (an 8 tandem dinucleotide repeat) that occur throughout introns and in nontranslated DNA flanking many alleles. Di-, tri-, and tetra-nucleotide repeats are termed microsatellites and are widely distributed throughout the genome. The variation in the number of repeated elements and thus length of the repeats permits them to function as markers. For example, two strains can differ for the length of a single microsatellite marker found between identical flanking DNA regions; such differences are referred to as polymorphisms and allow us to distinguish one strain from another at a genetic marker. Forward and reverse pairs of oligonucleotide primers (single-stranded DNA) can be synthesized to amplify DNA across the microsatellite via the polymerase chain reaction (PCR). Primers about 20 nucleotides in length reduce the probability of a perfect sequence match elsewhere in the genome to a very low value. The PCR products thus contain variable tandem sequences and identical flanking DNA that can be separated by gel electrophoresis even if the length differs by only 1 repeat (i.e., two base pairs) (Fig. 2).
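To make the idea of a length polymorphism concrete, the sketch below counts the repeats in the dinucleotide example above and compares the lengths of two PCR products that share identical flanking DNA. The flank lengths (40 and 35 base pairs) and the 9-repeat allele are hypothetical values chosen only for illustration.

```python
def count_tandem_repeats(sequence, unit):
    """Count consecutive copies of `unit` starting at the first base."""
    count = 0
    position = 0
    while sequence.startswith(unit, position):
        count += 1
        position += len(unit)
    return count


def pcr_product_length(left_flank, right_flank, unit, repeats):
    """Length (base pairs) of an amplicon that spans the repeat tract."""
    return left_flank + repeats * len(unit) + right_flank


marker = "CA" * 8  # the 8 tandem dinucleotide repeat from the text
print(count_tandem_repeats(marker, "CA"))  # 8

# Hypothetical alleles: identical flanks, 8 versus 9 CA repeats.
short_allele = pcr_product_length(40, 35, "CA", 8)
long_allele = pcr_product_length(40, 35, "CA", 9)
print(long_allele - short_allele)  # 2 base pairs, i.e., one repeat
```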
Markers such as microsatellites are used in the construction of genetic linkage maps of chromosomes. Linkage maps reveal the relative position between two markers based on the frequency that crossing over occurs between the markers during meiotic divisions. During the meiotic phase of gametogenesis, chromosomes duplicate (one derived from each parent) and homologous chromosomes align to form a tetrad. Exchange of chromosomal regions between the two nonsister chromatids of the tetrad can occur and is termed crossing over (Fig. 3). Such crossovers produce recombination of genetic material not present in the original homologous pairs of chromosomes. Two loci located very near each other on the same chromosome have a recombination frequency that approaches zero because the probability of being separated by a crossover is low. Two loci far apart on a chromosome will have a recombination frequency of 50% if crossing over always occurs, thus behaving as if they are on separate chromosomes. It is easy to envision that more distance between two different markers equates with a greater probability of the markers being separated by a crossover. Thus, the frequency of production of recombinants in the gametes is used as the basis for estimating map distance and assigning the order of markers along chromosomes. The centimorgan (cM) is the unit of map distance and is equal to the formation of 1% gametic recombinants. Information regarding almost 10,000 microsatellite markers is available in electronic form for the rat (Whitehead Institute rat genetic map web site, http://www.genome.wi.mit.edu/rat/public) and mouse (Jackson Laboratory, http://www.informatics.jax.org), and primer pairs are available from Research Genetics (Birmingham, AL).
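The relation between recombinant gametes and map distance can be sketched as follows. The direct conversion (1% recombinants = 1 cM) follows the definition above and is reasonable only for closely spaced markers. The Haldane mapping function is included as one common correction for undetected multiple crossovers; it is an assumption added here for illustration and is not discussed in this review. The gamete counts are hypothetical.

```python
import math


def recombination_fraction(parental, recombinant):
    """Fraction of gametes that are recombinant between two markers."""
    return recombinant / (parental + recombinant)


def direct_map_distance_cm(r):
    """Map distance in cM, taking 1% recombinants as 1 cM."""
    return 100.0 * r


def haldane_map_distance_cm(r):
    """Map distance in cM from the Haldane mapping function.

    Assumes crossovers occur independently (no interference).
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical counts: 90 parental and 10 recombinant gametes.
r = recombination_fraction(parental=90, recombinant=10)
print(round(direct_map_distance_cm(r), 2))   # 10.0
print(round(haldane_map_distance_cm(r), 2))  # 11.16
```

As the recombination fraction approaches 50%, the Haldane distance grows without bound, which reflects the point made above that distant loci behave as if they were on separate chromosomes.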
A quantitative trait locus (QTL) is a segment of a chromosome that contains one or more loci affecting a given trait. The first step toward identifying a QTL is to phenotype a large population of F2 animals for the quantitative trait. The second step is to genotype each animal at a marker locus every 10 to 15 cM throughout the genome. Alleles that influence the trait of interest and that are physically close on either side of a marker will be inherited (i.e., linked) with the marker. Markers linked to loci causative of trait variation will thus cosegregate with the phenotype for which they are responsible. Markers linked to alleles that do not cause trait variation will be distributed randomly (i.e., not cosegregate) with the phenotype.
Figure 4 depicts the principle of differential cosegregation of a single polymorphic marker with phenotype in a simulated F2 population. In practice, QTLs are identified by interval mapping with computer programs such as MapMaker/QTL (11,13) or Map Manager QT (12). These interval mapping programs evaluate simultaneously sets of linked markers for their association with trait value in a segregating population using estimates based upon maximum likelihood methodology (11). The likelihood calculation indicates how much more likely a hypothesis of linkage is true compared with the hypothesis of nonlinkage. Computerization of this approach is essential because it is calculation-intensive and nontrivial in both theory and usage. Once a QTL is identified, regions of the QTL containing alleles causative of trait difference can be fine-mapped by creating congenic strains (strains that are identical except for one defined chromosomal segment) and alleles involved can ultimately be identified by positional cloning, as described in detail by Strachan and Read (14).
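The principle of cosegregation can be illustrated with a much simpler calculation than interval mapping. The sketch below simulates an F2 population and computes a single-marker LOD score by regressing phenotype on marker genotype; it is not the algorithm implemented in MapMaker/QTL or Map Manager QT. All names and parameter values (baseline of 500, effect of 100 per allele, noise SD of 50) are hypothetical, and the "linked" marker is assumed to lie exactly at the causative locus.

```python
import math
import random


def single_marker_lod(genotypes, phenotypes):
    """LOD score from a linear regression of phenotype on genotype.

    Genotypes are coded 0, 1, or 2 (number of high-strain alleles).
    LOD = (n / 2) * log10(RSS_null / RSS_marker), assuming normal errors.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    rss_null = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or rss_null == 0:
        return 0.0  # no genotype or phenotype variation to explain
    rss_marker = rss_null - sxy ** 2 / sxx
    if rss_marker <= 0:
        return math.inf  # marker explains the phenotype perfectly
    return (n / 2) * math.log10(rss_null / rss_marker)


def simulate_f2(n_animals, effect_per_allele, noise_sd, seed):
    """Simulate one causative marker, one unlinked marker, and a phenotype."""
    rng = random.Random(seed)
    linked, unlinked, phenotypes = [], [], []
    for _ in range(n_animals):
        g_linked = rng.randint(0, 1) + rng.randint(0, 1)
        g_unlinked = rng.randint(0, 1) + rng.randint(0, 1)
        y = 500.0 + effect_per_allele * g_linked + rng.gauss(0.0, noise_sd)
        linked.append(g_linked)
        unlinked.append(g_unlinked)
        phenotypes.append(y)
    return linked, unlinked, phenotypes


linked, unlinked, trait = simulate_f2(200, 100.0, 50.0, seed=1)
print("linked marker LOD:  ", round(single_marker_lod(linked, trait), 1))
print("unlinked marker LOD:", round(single_marker_lod(unlinked, trait), 1))
```

In such a simulation the linked marker is expected to give a far larger LOD score than the unlinked marker, which is the numerical counterpart of cosegregation versus random distribution with respect to the phenotype.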
ARTIFICIAL SELECTIVE BREEDING FOR COMPLEX TRAITS
Artificial selection means breeding individual organisms expressing the extreme values of a complex phenotype. This process produces somewhat ideal genetic models because contrasting allelic variation for the trait will be concentrated from one generation to the next. A phenotypic response to selection is possible if sufficient additive genetic variance (variance associated with the average effects of substituting one allele for another) exists in a population for that trait. Based on Fisher’s Fundamental Theorem of Natural Selection (5), traits peripherally associated with evolutionary fitness, such as morphology and complex physiology, demonstrate more additive genetic variance because of less pressure from natural selection. Selective breeding begins by measuring the trait of interest in a large group of animals that as a population has wide genetic heterogeneity. Genetic heterogeneity of the founder population provides a pool from which selection of choice breeders can concentrate an array of alleles expressing the extreme values of the phenotype. At each generation, progeny are phenotyped, selected as the “best” relative to the trait of interest, and bred to create the next generation. This process is repeated until the change in the population mean produced by selection (response to selection) plateaus, which typically indicates exhaustion of additive genetic variance for the trait (Fig. 5).
In reality, it is not a trait itself that is selected upon, but a test or measure of a trait. Thus, a major step is to devise a measure that exemplifies the trait. These criteria are useful guides for developing a measure: it should be 1) relatively simple to perform, 2) objectively interpretable, 3) gradable on a continuous numerical scale, and 4) able to demonstrate a wide range in magnitude between the low and high values of the phenotype. Based on the above criteria, artificial selection was feasible for the trait of treadmill running capacity. A small population of Sprague-Dawley rats was used to determine if a test of aerobic treadmill running capacity demonstrated a measurable response to selection (10). Patterned after clinical stress tests of cardiovascular capacity, the measure consisted of running each rat on a motorized treadmill up a 15° slope with incremental increases in velocity until the rat was exhausted. The initial velocity was 10 m/min and was increased by 1 m/min every 2 min. The total distance run in meters until exhaustion was designated as the standard for the estimate of treadmill running capacity. Two pairs of the highest and lowest performers were selectively bred through three successive generations. Figure 6 shows that three generations of selection resulted in a 70% difference in mean running capacity between the low and high selected lines (low, 388 m, versus high, 659 m). Narrow sense heritability (h²) is the extent to which phenotypes in offspring are determined by alleles transmitted from the parents. Heritability can be estimated from the regression of individual offspring values on the mean of the parents (mid-parent value). If h² = 1, then the trait was on average inherited with complete fidelity, whereas an h² = 0 demonstrates no heritable similarity between parents and offspring. Estimated heritability for treadmill running capacity across the three generations of rats shown in Figure 6 was 0.39.
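The regression estimate of heritability described above can be sketched as the least-squares slope of offspring values on mid-parent values. The distances below are hypothetical numbers invented for illustration; they are not data from the rat study.

```python
def midparent_heritability(midparent_values, offspring_values):
    """Slope of the least-squares regression of offspring on mid-parent.

    Under the usual assumptions this slope estimates narrow sense
    heritability.
    """
    n = len(midparent_values)
    mean_x = sum(midparent_values) / n
    mean_y = sum(offspring_values) / n
    sxx = sum((x - mean_x) ** 2 for x in midparent_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(midparent_values, offspring_values))
    return sxy / sxx


# Hypothetical distances run to exhaustion (m).
midparents = [400.0, 500.0, 600.0, 700.0]
offspring = [470.0, 520.0, 540.0, 590.0]
print(round(midparent_heritability(midparents, offspring), 2))  # 0.38
```

A slope near 1 would mean that offspring reproduce the parental deviation from the mean almost completely, whereas a slope near 0 would mean that parental values carry no predictive information.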
Having established an apparent measure of treadmill running capacity that demonstrated heritability, a large-scale breeding paradigm was undertaken with the ultimate goal of establishing permanent models of aerobic treadmill running capacity. In designing an approach to selection, five related factors determine the magnitude of the response to selection: breeding value, accuracy of selection, intensity of selection, population size, and inbreeding (4).
Breeding Value
The breeding value is determined by the sum of the additive effects of all alleles that act upon the trait. Breeding value can be measured directly only by breeding, so in practice it is estimated by ranking animals on their measure for a trait. The animals that rank the furthest from the population mean are estimated to have the most extreme breeding values. Rats of the N:NIH stock are favorable to select from because they were developed and have been maintained for wide genetic heterogeneity (6). Such heterogeneity increases the probability of obtaining animals that differ widely for breeding value.
Accuracy of Selection
Accuracy of selection is the correlation between the phenotypic measure of a trait and the true breeding value and is estimated as the square root of the heritability (h²). If the accuracy of selection is zero, the measure of the trait will produce a ranking not different from random selection and selective breeding will produce a response of zero. Accuracy of selection of 100% would rank animals by their true breeding value and result in maximal selection response. Based on a heritability of 0.39 for aerobic treadmill running capacity as measured in our population of Sprague-Dawley rats (Fig. 6), the estimated accuracy of selection for capacity was 62%.
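The 62% figure follows directly from the square-root relation. A minimal sketch, with a function name chosen here for illustration:

```python
import math


def accuracy_of_selection(heritability):
    """Estimated correlation between phenotype and true breeding value."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return math.sqrt(heritability)


print(round(accuracy_of_selection(0.39), 2))  # 0.62
```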
Intensity of Selection
Intensity of selection is the superiority (or inferiority) of the selected parents, standardized to the amount of variation in the trait. Superiority of the parents is given by the selection differential (S). S is the mean of the selected parents minus the mean of the population from which they were selected. Selection intensity (i) is often measured as S divided by the SD of the population (i = S/σp, where σp is the phenotypic SD), which indicates by how many SDs the mean of the selected individual animals exceeds the mean of the population. The larger the selection intensity, the greater the response to selection. Although selecting only a few mating pairs at the extremes increases the intensity of selection, this simultaneously increases the inbreeding. Thus, one needs to use a population size that minimizes inbreeding while retaining sufficient intensity of selection, as explained next.
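The selection differential and selection intensity defined above can be computed as in the sketch below. The predicted response uses the standard breeder's equation (response = heritability × S) from quantitative genetics textbooks; that equation is an addition for illustration and is not stated in this review. The population values are hypothetical, and the sample SD is used as the estimate of the population SD.

```python
import statistics


def selection_differential(population, selected):
    """S: mean of the selected parents minus the population mean."""
    return statistics.mean(selected) - statistics.mean(population)


def selection_intensity(population, selected):
    """i: selection differential in units of the population SD."""
    s = selection_differential(population, selected)
    return s / statistics.stdev(population)


def predicted_response(heritability, differential):
    """Breeder's equation (an assumption added here): R = h^2 * S."""
    return heritability * differential


# Hypothetical distances run (m); the two best runners are selected.
population = [300.0, 400.0, 500.0, 600.0, 700.0]
selected = [600.0, 700.0]

s = selection_differential(population, selected)
i = selection_intensity(population, selected)
print(s)                                     # 150.0
print(round(i, 2))                           # 0.95
print(round(predicted_response(0.39, s), 1)) # 58.5
```

Selecting only the single best animal would raise S and i further, at the cost of the increased inbreeding noted above.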
Population Size
The smaller the population selected for breeding, the greater the intensity of selection, which increases the short-term response to selection. A smaller population, however, also increases the amount of inbreeding, which lowers the maximum possible selection limit because less genetic variation is included in the selected breeders. In general, for relatively short-term breeding programs, inbreeding of about 1% per generation is probably an acceptable yet somewhat arbitrary compromise between: a) the expense of maintaining larger populations to reduce random loss of genetic substrate and b) accepting some loss of genetic variation in a population of manageable size. It should be recognized that mean differences between low and high lines may not necessarily be caused by inheritance of the selected trait but can result from genetic drift. Genetic drift means changes in allele frequency that are attributable entirely to chance. Modest values for the initial selection differential, heritability, and/or population size increase the probability of genetic drift accounting for the differences between the low and high lines. Under such conditions, it is important to establish replicate selected and control lines that can be used to estimate the contribution of genetic drift.
Inbreeding
Planned breeding designed to give the lowest possible rate of inbreeding is termed minimal inbreeding. The rate of inbreeding can be reduced by making the contributions from each family more equal; this can be accomplished by taking the “best” female and male animal from each mating and using them as parents in the next generation (within-family selection). Within-family selection, coupled with systematic rotational breeding, which keeps mating between relatives at the minimum, can maintain the rate of inbreeding per generation (ΔF) at just under 1% if 13 families are used in each of the low and high selected lines: ΔF = 1/(4N), where N is the number of individual parents in each line at each generation (4).
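The statement that 13 families keep the per-generation rate of inbreeding just under 1% can be verified with the 1/(4N) relation given above, taking N = 26 parents (one female and one male from each of 13 families). The cumulative formula in the sketch is the standard textbook expression for constant-rate inbreeding and is an assumption added for illustration.

```python
def inbreeding_rate(n_parents):
    """Per-generation rate of inbreeding under within-family selection."""
    return 1.0 / (4.0 * n_parents)


def cumulative_inbreeding(rate, generations):
    """Inbreeding accumulated after a number of generations.

    Assumes a constant per-generation rate: F_t = 1 - (1 - rate)**t.
    """
    return 1.0 - (1.0 - rate) ** generations


families = 13
parents = 2 * families  # one female and one male parent per family
rate = inbreeding_rate(parents)

print(f"{rate:.4f}")                            # 0.0096, just under 1%
print(f"{cumulative_inbreeding(rate, 6):.3f}")  # accumulated over 6 generations
```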
Based on the above criteria, low and high lines for aerobic treadmill running capacity are under development. The founder population was 192 N:NIH rats, and after six generations of selection, the lines have separated substantially (Koch and Britton, unpublished data). From these models one can begin to evaluate subordinate traits that might cause the difference between the lines at each generation, continue selective breeding until the response to selection reaches a plateau, and then develop low and high inbred strains of treadmill running capacity for genetic study.
Swallow et al. (15) have taken a unique approach to creation of an exercise model in mice. Rather than use a forced type of stimulus, they selected for voluntary wheel running. After 10 generations of selection, the low and high lines differed by 75% in voluntary wheel running capacity and 6% in maximal oxygen consumption. These mice may prove to be excellent models to uncover the genetic substrate causative of motivational aspects of exercise. The work of Dohm et al. (3) demonstrated a significant narrow-sense heritability of endurance capacity in mice that ranged from 0.17 to 0.33. In addition, their work demonstrated that the additive genetic covariance between sprint speed and endurance was negative. These results suggest that our selection in rats for treadmill running capacity will produce a decrease in sprint speed across the generations.
IDENTIFICATION OF ALREADY-AVAILABLE INBRED STRAINS
Another useful path is to identify wide differences for a trait of interest in already-available inbred strains. Numerous inbred strains have been developed that either were, or were not, first selectively bred for a specific trait. Like populations of individual animals, differences between inbred strains can demonstrate wide variation for any given trait and thus serve as models to explore the cause of this variation. Although these are useful models, they are less ideal than strains selectively bred on the basis of a trait because the alleles have not been concentrated for the extremes of a trait.
Using a treadmill test of aerobic capacity as described above, Barbato et al. (1) screened 11 inbred strains of rats and found a 2.7-fold difference between the lowest and highest performing strains (Fig. 7). The COP rats ran to exhaustion at 298 m and the DA rats at 840 m. The wide divide in running capacity between these strains makes them suitable models for genomic analysis and for decomposition into subordinate phenotypes. Knowledge that originates from the seminal work of Hill and Lupton (7) supports the contention that the ability of the heart to deliver oxygen is the predominant singular factor that limits maximal aerobic capacity. Joyner (8) developed a model in which maximal endurance running capacity is the product of three physiological variables: 1) the maximal rate at which oxygen and nutrient substrates can be utilized to produce energy in the form of ATP, 2) the percent of V̇O2max at the threshold for lactate release, and 3) the efficiency of running. These ideas suggested a working hypothesis that differential cardiac function would be a primary trait accounting for part of the difference between the low (COP) and high (DA) capacity strains. Intrinsic cardiac performance was estimated from the cardiac output generated using an isolated Langendorff-Neely working heart preparation (1). The preload was set at 15 mm Hg and the afterload was set at 70 mm Hg throughout the perfusion. Figure 8 shows that distance run to exhaustion was significantly correlated (r = 0.87, P = 0.003) with intrinsic cardiac performance across the 11 strains. Control for differences in body weights was accomplished with a multiple regression approach. These data are consonant with the hypothesis that variation in cardiac function accounts for part of the variation in aerobic running capacity.
In a subsequent study (9), five autonomically regulated cardiovascular traits were evaluated in COP and DA rats during rest and exercise. At rest, the DA rats had significantly more sympathetic (24%) and parasympathetic (192%) tonus for heart rate control and more sympathetic support of blood pressure (84%) compared with the COP rats. During three graded levels of treadmill exercise, the DA rats had higher blood pressures (16%) and higher heart rates (4%) than the COP rats. In addition, the DA rats had a 27% greater heart weight/body weight than the COP strain of rats. All six of these intermediate phenotypes could participate as variables causative of the difference in treadmill running capacity between the DA and COP strains of rats.
Soon, most of the transcribed regions of the human genome will have been sequenced. Although this represents a major achievement, the function of only a relatively small part of the genome is known. A next logical step will be the use of genetic models to connect genes with function. Identification of genes responsible for trait differences between low and high physical capacity would form a broader base for understanding the genetic origins of complex physiology and its relationship to health and lead to new paths for the prevention and treatment of disease.
This work was supported by a grant from the United States Public Health Service (National Institutes of Health grant HL64270).
1. Barbato, J.C., Koch, L.G., Darvish, A., Cicila, G.T., Metting, P.J., and Britton, S.L. Spectrum of aerobic endurance running performance in eleven inbred strains of rats. J. Appl. Physiol. 85: 530–536, 1998.
2. Bouchard, C., Lesage, R., Lortie, G., Simoneau, J.-A., Hamel, P., Boulay, M.R., Perusse, L., Theriault, G., and LeBlanc, C. Aerobic performance in brothers, dizygotic and monozygotic twins. Med. Sci. Sports Exerc. 18: 639–646, 1986.
3. Dohm, M.R., Hayes, J.P., and Garland, T., Jr. Quantitative genetics of sprint running speed and swimming endurance in laboratory house mice (Mus domesticus). Evolution 50: 1688–1701, 1996.
4. Falconer, D.S., and Mackay, T.F.C. Introduction to Quantitative Genetics. Essex, England: Addison Wesley Longman, Ltd., 1996.
5. Fisher, R.A. The Genetical Theory of Natural Selection. Oxford, England: Clarendon Press, 1930.
6. Hansen, C., and Spuhler, K. Development of the National Institutes of Health genetically heterogeneous rat stock. Alcohol Clin. Exp. Res. 8: 477–479, 1984.
7. Hill, A.V., and Lupton, H. Muscular exercise, lactic acid, and the supply and utilization of oxygen. Q. J. Med. 16: 135–171, 1923.
8. Joyner, M.J. Modeling: optimal marathon performance on the basis of physiological factors. J. Appl. Physiol. 70: 683–687, 1991.
9. Koch, L.G., Britton, S.L., Barbato, J.C., Rodenbaugh, D.W., and DiCarlo, S.E. Phenotypic differences in cardiovascular regulation in inbred rat models of aerobic capacity. Physiol. Genomics 1: 63–69, 1999.
10. Koch, L.G., Meredith, T.A., Fraker, T.D., Metting, P.J., and Britton, S.L. Heritability of treadmill running endurance in rats. Am. J. Physiol. 275: R1455–R1460, 1998.
11. Lander, E.S., and Botstein, D. Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199, 1989.
12. Manly, K.F., and Olson, J.M. Overview of QTL mapping software and introduction to Map Manager QT. Mamm. Genome 10: 327–334, 1999.
13. Paterson, A., Lander, E., Lincoln, S., Hewitt, J., Peterson, S., and Tanksley, S. Resolution of quantitative traits into Mendelian factors using a complete RFLP linkage map. Nature 335: 721–726, 1988.
14. Strachan, T., and Read, A.P. Human Molecular Genetics 2. New York: Wiley-Liss, 1999.
15. Swallow, J.G., Carter, P.A., and Garland, T., Jr. Artificial selection for increased wheel-running behavior in house mice. Behav. Genet. 28: 227–237, 1998.
Keywords: exercise; aerobic; endurance; treadmill; running; genes; genomics
© 2001 Lippincott Williams & Wilkins, Inc.