Background: The serotonin transporter gene (SLC6A4) and its promoter polymorphism (5-HTTLPR) have been the focus of many association studies of behavioral traits and psychiatric disorders. However, large-scale genotyping of this polymorphism has been very difficult. We report the development and validation of a 5-HTTLPR genotype prediction model.
Methods: Single nucleotide polymorphisms (SNPs) from the 2000 kb region surrounding 5-HTTLPR were used to construct a prediction model with a newly developed machine learning method, multicategory vertex discriminant analysis. The model was built with 2147 individuals from the Northern Finnish Birth Cohort who were genotyped with the Illumina 370K SNP array and manually genotyped for the 5-HTTLPR polymorphism. The prediction model was then applied to SNP genotypes in a Dutch/German schizophrenia case–control sample of 3318 individuals to test the association of the polymorphism with schizophrenia.
Results: The eight-SNP prediction model achieved an accuracy of 92.4% and an area under the receiver operating characteristic curve of 0.98±0.01. Evidence for an association of the polymorphism with schizophrenia was observed (P=0.05, odds ratio=1.105).
Conclusion: This prediction model provides an effective substitute for manually genotyped 5-HTTLPR alleles, offering a new approach for large-scale association studies of this polymorphism.
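
As an illustration of the two summary statistics reported in the Results, the following minimal Python sketch shows how a prediction accuracy and an odds ratio could be computed. It is not the authors' analysis code. The genotype labels (`LL`, `LS`, `SS`), the function names, and all counts are hypothetical, and the sketch assumes an allelic 2×2 comparison with a Wald confidence interval, which the abstract does not specify.

```python
from math import exp, log, sqrt


def accuracy(predicted, observed):
    """Fraction of individuals whose predicted genotype matches the manual call."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds ratio and Wald 95% confidence interval from a 2x2 allele-count table.

    Assumption: an allelic test; the study may have used a different model.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    lower = exp(log(odds_ratio) - 1.96 * se_log_or)
    upper = exp(log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, (lower, upper)


# Hypothetical toy data, not taken from the study.
predicted = ["LL", "LS", "SS", "LS", "LL"]
observed = ["LL", "LS", "SS", "LL", "LL"]
print(accuracy(predicted, observed))

# Hypothetical allele counts: cases (risk, other), controls (risk, other).
odds_ratio, ci = allelic_odds_ratio(60, 40, 50, 50)
print(round(odds_ratio, 2), ci)
```

In practice, the `predicted` list would hold the model's genotype calls and the `observed` list the manual 5-HTTLPR calls for the same individuals; an odds ratio close to 1, such as the 1.105 reported here, indicates a small effect.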