Genetic polymorphism is a difference in DNA sequence among individuals, groups, or populations. Sources include single nucleotide polymorphisms (SNPs), sequence repeats, insertions, deletions, and recombinations (e.g. a genetic polymorphism might give rise to blue eyes vs. brown eyes, or straight hair vs. curly hair). Genetic polymorphisms may be the result of chance processes, or may have been induced by external agents such as viruses or radiation. If a difference in DNA sequence among individuals has been shown to be associated with disease, it will usually be called a genetic mutation. Changes in DNA sequence that have been confirmed to be caused by external agents are also generally called ‘mutations’ rather than ‘polymorphisms’ (http://genomics.phrma.org/lexicon).
Polymorphisms arise through mutations. The mutation may be due to a change from one type of nucleotide to another, an insertion or deletion, or a rearrangement of nucleotides. Once formed, a polymorphism can be inherited like any other DNA sequence, allowing its inheritance to be tracked from parent to child.
Polymorphisms are also found outside of genes, in the vast quantity of DNA that does not code for protein. Indeed, regions of DNA that do not code for proteins tend to have more polymorphisms. This is because a change in the DNA sequences that encode proteins may have a harmful effect on the individual who carries it. Synonymous polymorphisms are those that do not have any effect on the organism and are said to be selectively neutral as the substitution causes no amino acid change in the protein produced. This is also called a silent mutation. A nonsynonymous substitution results in an alteration of the encoded amino acid. A missense mutation changes the protein by causing a change in the codon. A nonsense mutation results in a misplaced termination codon. One half of all coding sequence SNPs result in nonsynonymous codon changes (http://genomics.phrma.org/lexicon).
Types of DNA polymorphisms
Tandem repeat polymorphisms
DNA sequences that are repeated in tandem are common throughout the genomes of a wide range of species, including humans, and are often highly conserved, suggesting that they have an important function (Kashi and King, 2006; Fondon et al., 2008; Usdin, 2008).
Most research on tandem repeats has focused on those in the human genome, such as microsatellites used as genetic markers and as repeat-expansion mutations causing dominant or recessive disorders with Mendelian inheritance patterns. The term ‘dynamic mutations’ has been used to describe the variation in tandem repeat length that has been found in many human genes to cause a range of repeat-expansion disorders, particularly those affecting the nervous system including Huntington’s and other polyglutamine diseases, Friedreich’s ataxia, and fragile X syndrome (Sutherland et al., 1998; Nithianantharajah and Hannan, 2007; Usdin, 2008). It should be noted that one of the genes, FMR1, which contains a trinucleotide expansion implicated in fragile X syndrome, shows further associations of premutation repeat lengths (below the disease threshold necessary for fragile X syndrome) with more complex disorders, including tremor/ataxia, Parkinsonism, neuropsychiatric symptoms, and premature ovarian failure (Coffey et al., 2008; Bourgeois et al., 2009). A much larger array of tandem repeats are present in, and between, genes that are not known to be involved directly in diseases of Mendelian inheritance. Tandem repeats are also referred to as simple sequence repeats, microsatellites, or minisatellites. These repetitive sequences can be located in exons, introns, or intergenic regions, providing opportunities for the modulation of gene expression and for that of the structure and function of RNAs and proteins (e.g. codon repeats translated into amino acid runs).
Bioinformatic analysis indicates that there is some bias regarding the distribution of the hundreds of thousands of unique tandem repeats throughout the human genome (O’Dushlaine and Shields, 2008; Molla et al., 2009). A large proportion of tandem repeats are located in introns and intergenic regions. There is also some specificity with respect to different types of repeat motifs and their genomic distribution patterns. For example, trinucleotide and hexanucleotide repeats are more likely to be found in exons and coding regions in particular, where their expansion or contraction will lead to altered lengths of amino acid runs (e.g. polyhistidine tracts) but not catastrophic frameshifts (Subramanian et al., 2003; Salichs et al., 2008).
The term tandem repeats is used to describe tandemly repeated DNA sequences also known as simple sequence repeats (SSRs) or satellite DNA (which includes both microsatellites and minisatellites). Tandem repeats can involve mononucleotides, dinucleotides, trinucleotides (triplets), tetranucleotides, etc. Definitions of microsatellites vary, with the repeated motif taken to be either 1–5 or 1–10 bp in length. Thus, tandem repeats with a motif longer than the upper limit for microsatellites (i.e. >10 base pairs in length) are generally called minisatellites. Tandem repeat polymorphisms, or SSR polymorphisms, have also been referred to as variable-number tandem repeats or repeat-length polymorphisms.
Short tandem repeats
Short tandem repeats (STRs) are very short stretches of DNA that are repeated back to back at various locations throughout the human genome. Typically, the repeating sequence is just 2, 3, or 4 bp in length and the number of copies found back to back is variable across a wide range. Unlike the case for the DNA sequences of coding genes, there is no ‘correct’ number of repeats for any specific STR in the genome; they are simply areas within the genome in which variation is normal and healthy. For any specific STR, each person will have two copies, one that is inherited from his/her mother at conception and the other that is inherited from his/her father. STRs are helpful in forensic and paternity testing. Because there are a lot of natural variations in STRs, the chance of two people matching for the exact number of repeats on both inherited copies of the STR is fairly small. On combining analysis of many STRs across the genome, the probability of two people matching by random chance was found to be extremely low.
For example, if the normal variation for a certain STR (e.g. STR-A) is from seven to 20 copies, there will be 14 different lengths that could be passed on from either parent. As each person has two copies of STR-A, the total number of possible combinations would be 196 (14×14). If each of these different STR lengths had an equal probability (1/14), and every possible combination was equally likely, the chance of two people matching on the pattern of STR-A would be 1/196. Now suppose that there are many other STRs available for study (e.g. STR-B, STR-C, STR-D, …, STR-Z), each with the same probabilities we chose for STR-A. The likelihood of matching by random chance alone across these many markers would then be found by multiplying together the probabilities for the individual markers.
For two markers, the probability of a random chance match drops to 1/38 416. Adding a third marker drops the probability to 1/7 529 536. The addition of a fourth marker decreases the likelihood to about one in 1.5 billion (1/1 475 789 056) (Hochmeister et al., 1991; Hammond et al., 1994).
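The multiplication described above can be sketched in a few lines of Python. This is only an illustration under the simplifying assumptions of the example (equally likely repeat lengths, every ordered pair of inherited lengths counted as one combination, independent markers); the function name `random_match_probability` is hypothetical, and real forensic calculations use measured allele frequencies.

```python
from fractions import Fraction

def random_match_probability(min_repeats, max_repeats, n_markers):
    """Chance that two people match at n_markers independent STRs.

    Simplifying assumptions taken from the worked example: every repeat
    length is equally likely and each ordered pair of inherited lengths
    counts as one combination.
    """
    n_lengths = max_repeats - min_repeats + 1   # 7..20 copies -> 14 lengths
    per_marker = Fraction(1, n_lengths ** 2)    # 1/196 for a single marker
    return per_marker ** n_markers              # independent markers multiply

for k in range(1, 5):
    p = random_match_probability(7, 20, k)
    print(f"{k} marker(s): 1 in {p.denominator:,}")
```

Using exact fractions avoids rounding, so each denominator is simply a power of 196.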
The use of Y-chromosome STR loci has become increasingly common in forensic science, partly because these loci are paternally inherited and free of recombination during meiosis. Y-chromosome STR analysis is very important in testing potential paternal relationships, especially in cases in which a potential father is not available. In sexual assault cases, Y-chromosome STRs are used to identify the male offender (Betz et al., 2001). Prinz et al. (1997) reported that male DNA was detectable by Y-chromosome STR analysis even at a male : female ratio of 1 : 2000. Y-chromosome STR analysis has also been applied in studies of human evolution (Jobling and Tyler-Smith, 1995), genealogy (Anslinger et al., 2000), and populations (Sasaki and Dahiya, 2000).
Copy-number polymorphism or variation
Genetic association studies generally evaluate SNPs, which are variations in single nucleotides at specific genomic locations between individuals of the same species. Recent results indicate that the human genome contains another frequent type of polymorphism, copy-number variations (CNVs; Conrad et al., 2010). A CNV is a variation in which a segment of DNA can be found in various copy numbers in the genomes of different individuals. CNVs range in size from a few hundred nucleotides to several megabases. Compared with SNPs, CNVs affect a more significant fraction of the genome and arise more frequently. Hence, CNVs significantly contribute to human evolution, genetic diversity, and an increasing number of phenotypic traits (Stankiewicz and Lupski, 2010).
CNVs of DNA sequences are abundant in natural populations and are functionally significant but still need to be fully ascertained. CNVs are generated by both recombination and replication mechanisms, and their de-novo locus-specific mutation rate is higher than that of SNPs. CNVs can cause Mendelian or sporadic traits, or be associated with complex diseases, and they contribute to gene duplication, exon shuffling, and genome diversity and evolution (Zhang et al., 2009). Mechanisms of changes causing CNV evolution in humans, through deletions and duplications of chromosomal segments, were described by Hastings et al. (2009).
CNVs in humans were examined by SNP genotyping arrays and clone-based comparative genomic hybridization (Redon et al., 2006). Large (>100 kb) CNVs affect a much smaller portion of the genome than initially reported. Approximately 80% of observed copy-number differences between pairs of individuals were because of common copy-number polymorphisms with an allele frequency greater than 5%, and more than 99% were derived from inheritance rather than new mutations. Most common were the diallelic copy-number polymorphisms in strong linkage disequilibrium with SNPs, and most low-frequency CNVs segregated onto specific SNP haplotypes.
Single nucleotide polymorphism
A single nucleotide polymorphism (SNP) is present at a particular nucleotide site when the DNA molecules in the population often differ in the identity of the nucleotide pair that occupies the site. For example, some DNA molecules in a population may have a T–A base pair at a particular nucleotide site, whereas other DNA molecules in the same population may have a C–G base pair at the same site. This difference constitutes an SNP. The SNP defines two ‘alleles’, for which there could be three genotypes among individuals in the population: homozygous with T–A at the site in both homologous chromosomes, homozygous with C–G at the site in both homologous chromosomes, or heterozygous with T–A in one chromosome and C–G in the homologous chromosome.
The word allele is in quotation marks above because the SNP need not be in a coding sequence, or even in a gene. In the human genome, any two randomly chosen DNA molecules are likely to differ at about one SNP site every 1000 bp in noncoding DNA and at about one SNP site every 3000 bp in protein-coding DNA. The definition of a SNP that stipulates that DNA molecules must differ at a nucleotide site excludes rare genetic variations of the sort found in less than 1% of the DNA molecules in a population. The reason for the exclusion is that genetic variants that are too rare are not generally as useful in genetic analysis as the more common variants. SNPs are the most common form of genetic differences among people.
About 3 million SNPs that are relatively common in the human population have been identified, of which about 1 million are typically used in a search for SNPs that might be associated with complex diseases such as diabetes or high blood pressure (Cargill et al., 1999). In some instances, a variant form of a gene may confer an evolutionary advantage to the species and is eventually incorporated into the DNA for many or most members of the species, and effects of the variant form may be both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal.
In many cases, both progenitor and variant forms survive and coexist in a species population. The coexistence of multiple forms of a genetic sequence gives rise to genetic polymorphisms, including SNPs.
SNPs may arise from a substitution of one nucleotide by another at the polymorphic site. Substitutions can be transitions or transversions. A transition is the replacement of one purine nucleotide by another purine nucleotide or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine or vice versa. An SNP may also be a single base insertion or deletion variant referred to as an ‘indel’ (Weber et al., 2002). A synonymous codon change, or silent mutation/SNP (the terms ‘mutation’, ‘polymorphism’, ‘mutant’, and ‘variation’ are used interchangeably here), is one that does not result in a change of amino acid because of the degeneracy of the genetic code. A substitution that changes a codon coding for one amino acid to a codon coding for a different amino acid (i.e. a nonsynonymous codon change) is referred to as a missense mutation. A nonsense mutation is a type of nonsynonymous codon change in which a stop codon is formed, thereby leading to premature termination of a polypeptide chain and formation of a truncated protein. A read-through mutation is another type of nonsynonymous codon change that causes the destruction of a stop codon, thereby resulting in an extended polypeptide product. Although SNPs can be biallelic, triallelic, or tetra-allelic, the vast majority of SNPs are biallelic and are thus often referred to as ‘biallelic markers’ or ‘diallelic markers’ (Goto et al., 2001).
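The categories described above can be illustrated with a short Python sketch. It classifies a single-base substitution as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine or vice versa), and classifies a codon change using the standard genetic code. The function names are hypothetical, and codons are assumed to be given as DNA coding-strand triplets.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}

def substitution_type(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"

def codon_effect(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"      # same amino acid (or stop to stop)
    if alt_aa == "*":
        return "nonsense"        # new stop codon, truncated protein
    if ref_aa == "*":
        return "read-through"    # stop codon destroyed, extended protein
    return "missense"            # different amino acid

print(substitution_type("A", "T"), codon_effect("GAG", "GTG"))
print(substitution_type("G", "A"), codon_effect("GAG", "GAA"))
```

The first call uses the GAG to GTG change (glutamate to valine), which is an A to T transversion and a missense change; the second is a G to A transition that leaves the amino acid unchanged.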
References to SNPs and SNP genotypes include individual SNPs and/or haplotypes, which are groups of SNPs that are generally inherited together. Haplotypes can have stronger correlations with diseases or other phenotypic effects compared with individual SNPs, and may therefore provide increased diagnostic accuracy in some cases (Stephens et al., 2001). Furthermore, in the case of nonsense mutations, an SNP may lead to premature termination of a polypeptide product. Such variant products can result in a pathological condition, for example a genetic disease. Examples of genetic diseases caused by an SNP within a coding sequence include sickle cell anemia and cystic fibrosis.
SNPs responsible for a disease do not necessarily have to occur in coding regions; they can occur in, for example, any genetic region that can ultimately affect the expression, structure, and/or activity of the protein encoded by a nucleic acid. Such genetic regions include, for example, those involved in transcription. Examples are SNPs in transcription factor binding domains, in promoter regions, in areas involved in transcript processing, such as SNPs at intron–exon boundaries, which may cause defective splicing, or in mRNA processing signal sequences such as polyadenylation signal regions (Li and Pritchard, 2000; Artiga et al., 2002).
Nevertheless, some SNPs that are not causative SNPs are in close association with, and therefore segregate with, a disease causing sequence. In this situation, the presence of an SNP correlates with the presence of, the predisposition to, or an increased risk of developing the disease. These SNPs, although not causative, are nonetheless also useful in diagnostics, disease predisposition screening, and other applications.
Applications of DNA markers
Disease gene risk factors with multifactor diseases
The key goal of studying DNA polymorphisms in human genetics is to identify the chromosomal location of mutant genes associated with hereditary diseases. In the context of disorders caused by the interaction of multiple genetic and environmental factors, such as heart disease, cancer, diabetes, depression, and so forth, it is important to think of a harmful allele as a risk factor for the disease, which increases the probability of occurrence of the disease, rather than as a sole causative agent. This needs to be emphasized, especially because genetic risk factors are often called disease genes. For example, the major disease gene for breast cancer in women is the gene BRCA1.
For women who carry a mutant allele of BRCA1, the lifetime risk for breast cancer is about 36%, and hence, most women with this genetic risk factor do not develop breast cancer. Among women who are not carriers, the lifetime risk for breast cancer is about 12%. Indeed, BRCA1 mutations are found in only 16% of affected women who have a family history of breast cancer. The importance of the genetic risk factor can be expressed quantitatively as relative risk, which equals the risk for disease in individuals who carry the risk factor divided by the risk in those who do not. The relative risk for the disease in women carrying a mutant BRCA1 allele is equal to 3.0 (calculated as 36%/12%; Hartl and Jones, 2009).
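As a small illustration of this calculation, the following Python sketch computes the relative risk from the two lifetime risks quoted above. The function name `relative_risk` is hypothetical, and the inputs are simply the figures given in the text.

```python
def relative_risk(risk_in_carriers, risk_in_noncarriers):
    """Risk of disease in carriers divided by risk in non-carriers."""
    return risk_in_carriers / risk_in_noncarriers

# Lifetime breast cancer risks quoted in the text for BRCA1.
rr = relative_risk(0.36, 0.12)
print(f"relative risk = {rr:.1f}")
```

A relative risk of 3 means carriers are three times as likely as non-carriers to develop the disease, even though most carriers remain unaffected.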
Many SNPs do not directly cause disease; however, they can help determine the likelihood that someone will develop a particular illness. One of the genes associated with Alzheimer’s disease, apolipoprotein E or ApoE, is a good example of how SNPs affect disease development. ApoE contains two SNPs that result in three possible alleles for this gene: E2, E3, and E4. Each allele differs by one DNA base, and the protein product of each allele differs by one amino acid. Each individual inherits one maternal copy of ApoE and one paternal copy of ApoE. Research has shown that a person who inherits at least one E4 allele will have a greater chance of developing Alzheimer’s disease. Apparently, the change of one amino acid in the E4 protein alters its structure and function enough to make disease development more likely. Inheriting the E2 allele, however, seems to indicate that a person is less likely to develop Alzheimer’s disease (Coon et al., 2007).
Genetic mapping and linkage
Each DNA polymorphism serves as a genetic marker for its own location in the chromosome. The importance of genetic linkage is that DNA markers that are sufficiently close to the disease gene will tend to be inherited together with the disease gene, and the closer the markers, the stronger this association. The first approach in the identification of the disease gene is to find DNA markers that are genetically linked with the disease in order to identify its chromosomal location, a procedure known as genetic mapping. Once the chromosomal position is known, other methods can be used to pinpoint the disease gene itself and to study its function. The human genome contains ∼30 000 genes. If genetic linkage did not exist, then we would have to examine 30 000 DNA polymorphisms, one in each gene, in order to identify a disease gene. But the human genome has only 23 pairs of chromosomes, and because of genetic linkage and the power of genetic mapping, it actually requires only a few hundred DNA polymorphisms to identify the chromosome and approximate location of a genetic risk factor.
Pharmacogenetics and its applications
Individual response to a drug is governed by many factors such as genetics, age, sex, environment, and disease. The influence of genetic factors on the response of a drug is a known fact. Study of the influence of genetic factors on drug response and metabolism is termed as pharmacogenetics. If the knowledge of pharmacogenetics is applied during drug dosing or drug selection, one can avoid adverse reactions, predict toxicity or therapeutic failure, and thus enhance therapeutic efficiency with improvement in clinical outcomes (Abraham and Adithan, 2001). Polymorphism exhibited by drug metabolizing enzymes is a well known phenomenon. Currently, exploration in the field of pharmacogenetics focuses mainly on the characterization of enzymes responsible for drug biotransformation as well as on describing the various sources of variability in enzyme activity (Ma et al., 2002). Pharmacogenetics attempts to identify genetic variations leading to unexpected drug effects, to clarify the underlying molecular mechanisms, to evaluate the clinical relevance, and to develop appropriate phenotyping and genotyping tests (Linder et al., 1997).
SNPs can be expressed in the phenotype of the extensive metabolizer and the poor metabolizer. Accordingly, SNPs may lead to allelic variations of a protein in which one or more of the protein functions in one population are different from those in another population. SNPs and the encoded variant peptides thus provide targets to ascertain a genetic predisposition that can affect treatment modality. For example, in a ligand-based treatment, SNPs may give rise to amino terminal extracellular domains and/or other ligand-binding regions of a receptor that are more or less active in ligand binding, thereby affecting subsequent protein activation. Accordingly ligand dosage would necessarily be modified to maximize the therapeutic effect within a given population containing particular SNP alleles or haplotypes. As an alternative to genotyping, specific variant proteins containing variant amino acid sequences encoded by alternative SNP alleles could be identified. Thus pharmacogenomic characterization of an individual permits the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic uses based on the individual’s SNP genotype, thereby enhancing and optimizing the effectiveness of the therapy (Pfost et al., 2000).
Polymorphisms in cytochrome P450 (CYP), which is one of the most important drug metabolizing systems, can lead to different drug responses or toxicities. Studies on SNP variation can therefore help us to evaluate the phenotype status of the study population and to understand more about drug metabolism (Yogesh and Reena, 2011). Individuals with polymorphisms in xenobiotic-metabolizing enzymes such as CYP, glutathione S-transferase, or N-acetyl transferase have shown an altered susceptibility toward environmentally induced diseases such as cancer, central nervous system diseases, and asthma (Patel et al., 2005).
Warfarin is an anticoagulant drug prescribed for the prevention of stroke and thrombotic disease. The CYP2C9*1 allele metabolizes warfarin normally, whereas CYP2C9*2 reduces warfarin metabolism by 30% and CYP2C9*3 reduces it by 90%. Depending on the genotype, conventional dose titration with warfarin could lead to either an increased risk for bleeding events or an increase in the time required to achieve therapeutic anticoagulation. In August 2007, the Food and Drug Administration required that a warning explaining the relationship between genotype and warfarin clearance be added to the warfarin label (Goldstein, 2001).
Abacavir is a nucleoside analog reverse transcriptase inhibitor used to treat acquired immune deficiency syndrome. However, abacavir hypersensitivity reaction, which is a reversible immune-mediated systemic reaction, can occur within the first 6 weeks of use and is potentially a treatment-limiting factor. Several polymorphisms within the human leukocyte antigen gene-B (HLA-B) region were found to occur more frequently in individuals exhibiting abacavir hypersensitivity. Screening for the HLA-B*5701 gene before abacavir therapy has resulted in a decrease in abacavir hypersensitivity reaction (Kupiec and Shimasaki, 2010).
DNA markers have several other applications, including epidemiology and food safety science, use as ecological indicators, evolutionary genetics, population studies, and the determination of evolutionary relationships among species.
Detection methods of polymorphisms
Restriction fragment length polymorphism
Although most SNPs require DNA sequencing to be studied, those that happen to be located within a restriction site can be analyzed using a restriction enzyme. For example, suppose that an SNP consists of a T–A nucleotide pair in some molecules and a C–G pair in others, and that the polymorphic nucleotide site is included in a cleavage site for the restriction enzyme EcoRI (5′-GAATTC-3′). In this situation, DNA molecules with T–A at the SNP will be cleaved at both flanking sites and also at the middle site, yielding two EcoRI restriction fragments. In contrast, DNA molecules with C–G at the SNP will be cleaved at both flanking sites but not at the middle site (because the presence of C–G destroys the EcoRI restriction site) and thus will yield only one larger restriction fragment. An SNP that eliminates a restriction site is known as a restriction fragment length polymorphism. Because restriction fragment length polymorphisms change the number and size of DNA fragments produced by digestion with a restriction enzyme, they can be detected by the Southern blotting procedure. In this case the labeled probe DNA hybridizes near the restriction site at the far left and identifies the position of this restriction fragment in the electrophoresis gel. The duplex molecule labeled ‘allele A’ has a restriction site in the middle, and when cleaved and subjected to electrophoresis it yields a small band that contains sequences homologous to the probe DNA. The duplex molecule labeled ‘allele a’ lacks the middle restriction site and yields a larger band. In this situation there can be three genotypes, AA, Aa, or aa, depending on which alleles are present in the homologous chromosomes, and all three genotypes can be distinguished because the heterozygous genotype Aa carries one copy of each allele and therefore shows both bands (Hartl and Jones, 2009).
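The effect of the SNP on the digestion pattern can be sketched in Python. This is a toy illustration, not a laboratory protocol: the two short sequences are invented for the example, the function name `ecori_fragments` is hypothetical, and only complete digestion of a linear molecule on one strand is modeled. EcoRI cuts between the G and the first A of GAATTC.

```python
import re

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence; cut falls after the G

def ecori_fragments(seq):
    """Fragment lengths after complete EcoRI digestion of a linear molecule."""
    cuts = [m.start() + 1 for m in re.finditer(ECORI_SITE, seq)]
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]

# Invented toy sequences: two flanking EcoRI sites and a polymorphic middle site.
LEFT, SEG1, SEG2, RIGHT = "CC", "ACGTACGTAC", "TTGGCCAAGGTTCCAAGGTT", "GG"
allele_A = LEFT + "GAATTC" + SEG1 + "GAATTC" + SEG2 + "GAATTC" + RIGHT
allele_a = LEFT + "GAATTC" + SEG1 + "GAACTC" + SEG2 + "GAATTC" + RIGHT  # T->C destroys the site

print("allele A:", ecori_fragments(allele_A))
print("allele a:", ecori_fragments(allele_a))
```

In this toy case the two internal fragments of allele A (16 and 26 bp) are replaced by a single 42 bp fragment in allele a, which is the size difference that a Southern blot would reveal.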
Dynamic allele-specific hybridization
In the first step, a genomic segment is amplified and attached to a bead through a PCR reaction with a biotinylated primer. In the second step, the amplified product is attached to a streptavidin column and washed with NaOH to remove the unbiotinylated strand. An allele-specific oligonucleotide is then added in the presence of a molecule that fluoresces when bound to double-stranded DNA. The intensity is then measured with an increase in temperature until the Tm can be determined. Presence of an SNP will result in a lower than expected Tm. Because dynamic allele-specific hybridization genotyping measures a quantifiable change in Tm, it is capable of measuring all types of mutations, not just SNPs. Other benefits of dynamic allele-specific hybridization include its ability to work with label-free probes and its simple design and performance conditions (Howell et al., 1999).
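The Tm readout described above can be sketched numerically. The following Python example is a minimal illustration on synthetic data, not the published analysis method: it estimates Tm as the midpoint of the temperature interval in which fluorescence falls most steeply, and the logistic melting curves, the 2 °C sampling step, and the function names are all assumptions made for the example.

```python
from math import exp

def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest signal drop."""
    best_slope, tm = 0.0, None
    points = list(zip(temps, fluorescence))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        slope = (f0 - f1) / (t1 - t0)     # positive while the duplex is melting
        if slope > best_slope:
            best_slope, tm = slope, (t0 + t1) / 2
    return tm

def synthetic_curve(temps, true_tm, width=2.0):
    """Hypothetical logistic melting curve used only to exercise the sketch."""
    return [1.0 / (1.0 + exp((t - true_tm) / width)) for t in temps]

temps = list(range(40, 91, 2))                  # 40, 42, ..., 90 degrees C
matched = synthetic_curve(temps, true_tm=65)    # probe perfectly matches the target
mismatched = synthetic_curve(temps, true_tm=59) # one mismatched base lowers Tm

print(melting_temperature(temps, matched), melting_temperature(temps, mismatched))
```

A sample whose estimated Tm is clearly below that of the matched control would be read as carrying a variant under the probe.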
Single nucleotide polymorphism microarray
This technique uses thousands of probes arrayed on a small chip, therefore allowing for many SNPs to be detected simultaneously. By comparing the differential amount of hybridization of the target DNA with each of these redundant probes, it is possible to determine specific homozygous and heterozygous alleles (Rapley and Harbron, 2004). Affymetrix Human SNP 5.0 (Santa Clara, California, USA) GeneChip is used to carry out a genome-wide assay that can genotype over 500 000 human SNPs (Affymetrix, 2007). Microarray is also used to characterize genetic diversity and drug responses, to identify new drug targets, and to assess the toxicological properties of chemicals and pharmaceuticals.
Molecular beacons for real-time polymerase chain reaction
Molecular beacons are hairpin-shaped oligonucleotide probes that carry a fluorophore and a quencher and fluoresce only when hybridized to their complementary target. This design allows for a simple diagnostic assay to identify SNPs at a given location. If one molecular beacon is designed to match a wild-type allele and another to match a mutant of the allele, the two can be used to identify the genotype of an individual. If only the first probe’s fluorophore wavelength is detected during the assay, then the individual is homozygous for the wild-type allele. If only the second probe’s wavelength is detected, then the individual is homozygous for the mutant allele. Finally, if both wavelengths are detected, then both molecular beacons must be hybridizing to their complements, and thus the individual must contain both alleles and be heterozygous.
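The decision rule in this paragraph amounts to a small truth table, sketched below in Python. The function name `call_genotype`, the normalized signal scale, and the detection threshold of 0.5 are hypothetical; a real assay would set thresholds from controls.

```python
def call_genotype(wild_type_signal, mutant_signal, threshold=0.5):
    """Call a genotype from two beacon fluorescence signals (arbitrary units).

    The threshold is a hypothetical cutoff above which a probe's
    fluorophore is considered detected.
    """
    wt_detected = wild_type_signal >= threshold
    mut_detected = mutant_signal >= threshold
    if wt_detected and mut_detected:
        return "heterozygous"
    if wt_detected:
        return "homozygous wild type"
    if mut_detected:
        return "homozygous mutant"
    return "no call"   # neither probe hybridized, e.g. failed amplification

for signals in [(0.9, 0.1), (0.1, 0.8), (0.7, 0.6), (0.1, 0.1)]:
    print(signals, "->", call_genotype(*signals))
```

The extra "no call" outcome is not discussed in the text; it is included only so that the sketch handles the case in which neither signal is detected.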
Next-generation sequencing technologies, such as pyrosequencing, sequence fewer than 250 bases in a read, which limits their ability to sequence whole genomes. However, their ability to generate results in real time and their potential to be massively scaled up make them a viable option for sequencing small regions to perform SNP genotyping. Compared with other SNP genotyping methods, sequencing is particularly suited to identifying multiple SNPs in a small region, such as the highly polymorphic major histocompatibility complex region of the genome (Rapley and Harbron, 2004).
The sample size is calculated for a case–control study design using the Epi Info program (http://www.cdc.gov/epiinfo).
We chose the ratio of controls to cases to be 1 : 1, the odds ratio to be 2, the percentage of exposure among controls to be 20%, and a 95% confidence level.
We recalculate the sample size manually using the following equations:
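where p2=p1×OR/[1+p1(OR−1)], p=(p2+cp1)/(1+c), q=1−p, OR is the odds ratio worth detecting, c is the ratio of controls to cases, Zα is the standard normal deviate corresponding to the α risk, Z(1−β) is the standard normal deviate corresponding to the power, p1 is the proportion of exposure among the control population, and p2 is the corresponding proportion of exposure among cases.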
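The sample size equation itself is not reproduced in the text. As a hedged sketch, the Python code below assumes the commonly used unmatched case–control formula that is consistent with the definitions given, n = (1 + 1/c) × p × q × (Zα + Z(1−β))² / (p2 − p1)², where n is the number of cases. The two-sided α of 0.05 follows from the 95% confidence level; the power of 80% is an assumption, because no power is stated. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def case_control_sample_size(p1, odds_ratio, c=1.0, alpha=0.05, power=0.80):
    """Number of cases needed, under the assumed unmatched case-control formula.

    p1: proportion exposed among controls; c: controls per case.
    alpha is two-sided; power=0.80 is an assumption, not given in the text.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p2 = p1 * odds_ratio / (1 + p1 * (odds_ratio - 1))   # exposure among cases
    p = (p2 + c * p1) / (1 + c)                          # pooled exposure
    q = 1 - p
    n_cases = (1 + 1 / c) * p * q * (z_alpha + z_beta) ** 2 / (p2 - p1) ** 2
    return ceil(n_cases)

n_cases = case_control_sample_size(p1=0.20, odds_ratio=2.0, c=1.0)
print(f"cases: {n_cases}, controls: {n_cases}")
```

With a 1 : 1 ratio the same number of controls is required; a different assumed power or a continuity correction would change the result.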
Conflicts of interest
There are no conflicts of interest.
Abraham BK, Adithan C. Genetic polymorphism of CYP2D6. Indian J Pharmacol. 2001;33:147–169
Anslinger K, Keil W, Weichhold G, Eisenmenger W. Y-chromosomal STR haplotypes in a population sample from Bavaria. Int J Legal Med. 2000;113:189–192
Artiga MJ, Saez AI, Romero C, Sanchez-Beato M, Mateo MS, Navas C, et al. A short mutational hot spot in the first intron of BCL-6 is associated with increased BCL-6 expression and with longer overall survival in large B-cell lymphomas. Am J Pathol. 2002;160:1371–1380
Betz A, Bassler G, Dietl G, Steil X, Weyermann G, Pflug W. DYS STR analysis with epithelial cells in a rape case. Forensic Sci Int. 2001;118:126–130
Bourgeois JA, Coffey SM, Rivera SM, Hessl D, Gane LW, Tassone F, et al. A review of fragile X premutation disorders: expanding the psychiatric perspective. J Clin Psychiatry. 2009;70:852–862
Cargill M, Altshuler D, Ireland J, Sklar P, Ardlie K, Patil N, et al. Characterization of single nucleotide polymorphisms in coding regions of human genes. Nat Genet. 1999;22:231–238
Coffey SM, Cook K, Tartaglia N, Tassone F, Nguyen DV, Pan R, et al. Expanded clinical phenotype of women with the FMR1 premutation. Am J Med Genet A. 2008;146A:1009–1016
Conrad DF, Pinto D, Redon R, Feuk L, Gokcumen O, Zhang Y, et al. Origins and functional impact of copy number variation in the human genome. Nature. 2010;464:704–712
Coon KD, Myers AJ, Craig DW, Webster JA, Pearson JV, Lince DH, et al. A high-density whole-genome association study reveals that APOE is the major susceptibility gene for sporadic late-onset Alzheimer’s disease. J Clin Psychiatry. 2007;68:613–618
Fondon JW III, Hammock EA, Hannan AJ, King DG. Simple sequence repeats: genetic modulators of brain function and behavior. Trends Neurosci. 2008;31:328–334
Goldstein JA. Clinical relevance of genetic polymorphisms in the human CYP2C subfamily. Br J Clin Pharmacol. 2001;52:349–355
Goto Y, Yue L, Yokoi A, Nishimura R, Uehara T, Koizumi S, Saikawa Y. A novel single-nucleotide polymorphism in the 3′-untranslated region of the human dihydrofolate reductase gene with enhanced expression. Clin Cancer Res. 2001;7:1952–1956
Hammond HA, Jin L, Zhong Y, Caskey CT, Chakraborty R. Evaluation of 13 short tandem repeat loci for use in personal identification applications. Am J Hum Genet. 1994;55:175–189
Hartl DL, Jones EW. Genetics: analysis of genes and genomes. Chapter 2. DNA structure and genetic variation. 7th ed. MA, USA: Jones and Bartlett Publishers Inc; 2009. pp. 42–87
Hastings PJ, Lupski JR, Rosenberg SU, Grzegorz I. Mechanisms of change in gene copy number. Nat Rev Genet. 2009;10:551–564
Hochmeister MN, Budowle B, Jung J, Borer UV, Comey CT, Dirnhofer R. PCR-based typing of DNA extracted from cigarette butts. Int J Legal Med. 1991;104:229–233
Howell W, Jobs M, Gyllensten U, Brookes A. Dynamic allele-specific hybridization. A new method for scoring single nucleotide polymorphisms. Nat Biotechnol. 1999;17:87–88
Jobling MA, Tyler-Smith C. Fathers and sons? The Y chromosome and human evolution. Trends Genet. 1995;11:449–456
Kashi Y, King DG. Simple sequence repeats as advantageous mutators in evolution. Trends Genet. 2006;22:253–259
Li M, Pritchard PH. Characterization of the effects of mutations in the putative branchpoint sequence of intron 4 on the splicing within the human lecithin: cholesterol acyltransferase gene. J Biol Chem. 2000;275:18079–18084
Linder MW, Prough RA, Valdes R. Pharmacogenetics: a laboratory tool for optimizing the therapeutic efficiency. Clin Chem. 1997;43:254–266
Ma MK, Woo MH, Mcleod HL. Genetic basis of drug metabolism. Am J Health Syst Pharm. 2002;59:2061–2069
Molla M, Delcher A, Sunyaev S, Cantor C, Kasif S. Triplet repeat length bias and variation in the human transcriptome. Proc Natl Acad Sci USA. 2009;106:17095–17100
Nithianantharajah J, Hannan AJ. Dynamic mutations as digital genetic modulators of brain development, function and dysfunction. Bioessays. 2007;29:525–535
O’Dushlaine CT, Shields DC. Marked variation in predicted and observed variability of tandem repeat loci across the human genome. BMC Genomics. 2008;9:175–188
Patel S, Parmar D, Gupta YK, Singh MP. Contribution of genomics, proteomics, and single-nucleotide polymorphism in toxicology research and Indian scenario. Indian J Hum Genet. 2005;11:61–75
Pfost DR, Boyce-Jacino MT, Grant DM. A SNPshot: pharmacogenetics and the future of drug therapy. Trends Biotechnol. 2000;18:334–338
Prinz M, Boll K, Baum H, Shaler B. Multiplexing of Y chromosome specific STRs and performance for mixed samples. Forensic Sci Int. 1997;85:209–218
Rapley R, Harbron S. Molecular analysis and genome discovery. Chichester: John Wiley & Sons Ltd; 2004
Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, et al. Global variation in copy number in the human genome. Nature. 2006;444:444–454
Salichs E, Ledda A, Mularoni L, Albà MM, de la Luna S. Genome-wide analysis of histidine repeats reveals their role in the localization of human proteins to the nuclear speckles compartment. PLoS Genet. 2008;5:e1000397
Sasaki M, Dahiya R. The polymorphisms of various short tandem repeats on the Y chromosome in Japanese and German populations. Int J Legal Med. 2000;113:181–188
Stankiewicz P, Lupski JR. Structural variation in the human genome and its role in disease. Annu Rev Med. 2010;61:437–455
Stephens JC, Schneider JA, Tanguay DA, Choi J, Acharya T, Stanley SE, et al. Haplotype variation and linkage disequilibrium in 313 human genes. Science. 2001;293:489–493
Subramanian S, Mishra RK, Singh L. Genome-wide analysis of microsatellite repeats in humans: their abundance and density in specific genomic regions. Genome Biol. 2003;4:R13.1–R13.10
Sutherland GR, Baker E, Richards RI. Fragile sites still breaking. Trends Genet. 1998;14:501–516
Usdin K. The biological effects of simple tandem repeats: lessons from the repeat expansion diseases. Genome Res. 2008;18:1011–1019
Weber JL, David D, Heil J, Fan Y, Zhao C, Marth G. Human diallelic insertion/deletion polymorphisms. Am J Hum Genet. 2002;71:854–862
Yogesh J, Reena P. A review on genetic polymorphism of cytochrome p-450 2C19 and its clinical validity. Review article. Int Res J Pharm. 2011;2:7–12
Zhang F, Gu W, Hurles ME, Lupski JR. Copy number variation in human health, disease, and evolution. Annu Rev Genomics Hum Genet. 2009;10:451–481