Inferences from Genomic Models in Stratified Populations

Publication: Contribution to journal/Conference contribution in journal/Contribution to newspaper; Journal article; Research; Peer-reviewed

  • Luc Janss
  • Gustavo de los Campos, Section on Statistical Genetics, Biostatistics, University of Alabama, Birmingham, USA
  • Nuala Sheehan, Department of Health Sciences and Department of Genetics, University of Leicester, United Kingdom
  • Daniel Sorensen
Unaccounted-for population stratification can lead to spurious associations in genome-wide association studies (GWAS), and several methods have been proposed to deal with this problem in that context. An alternative line of research uses whole-genome random regression (WGRR) models that fit all markers simultaneously. Important objectives in WGRR studies are to estimate the proportion of variance accounted for by the markers and the effects of individual markers, to predict genetic values for complex traits, and to predict genetic risk of disease. Proposals to account for stratification in this context are unsatisfactory. Here we address this problem and describe a reparameterization of a WGRR model, based on an eigenvalue decomposition, for simultaneous inference of parameters and unobserved population structure. This allows estimation of genomic parameters with and without inclusion of marker-derived eigenvectors that account for stratification. The method is illustrated with grain yield in wheat typed for 1279 genetic markers, and with height, HDL cholesterol and systolic blood pressure from the British 1958 cohort study typed for 1 million SNP genotypes. Both data sets show signs of population structure, but with different consequences for inference. The method is compared to an advocated approach that includes eigenvectors as fixed-effect covariates in a WGRR model. We show that this approach, used in the context of WGRR models, is ill-posed, and we illustrate the advantages of the proposed model. In summary, our method permits a unified approach to the study of population structure and inference of parameters, is computationally efficient, and is easy to implement.
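
The eigenvalue-based reparameterization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a generic model y = mu + g + e with g ~ N(0, G var_g), e ~ N(0, I var_e), and G = Xc Xc'/p built from centered marker genotypes; under that assumption, rotating the phenotypes by the eigenvectors U of G gives independent records with variances d_i var_g + var_e. The function names, the simulated two-stratum genotypes, and all parameter values are hypothetical.

```python
import numpy as np


def eigen_reparameterize(X, y):
    """Rotate phenotypes into the eigenvector basis of G = Xc Xc' / p.

    X : (n, p) marker genotype matrix, y : (n,) phenotypes.
    Returns eigenvalues d (descending), eigenvectors U (in columns),
    and the rotated phenotypes U'(y - mean(y)).
    """
    p = X.shape[1]
    Xc = X - X.mean(axis=0)            # center each marker column
    G = Xc @ Xc.T / p                  # marker-derived relationship matrix (n x n)
    d, U = np.linalg.eigh(G)           # eigh returns eigenvalues in ascending order
    order = np.argsort(d)[::-1]        # reorder so the largest eigenvalue comes first
    d = np.clip(d[order], 0.0, None)   # remove tiny negative values from round-off
    U = U[:, order]
    y_rot = U.T @ (y - y.mean())       # rotated phenotypes (sample mean as a simple stand-in for mu)
    return d, U, y_rot


def rotated_variances(d, var_g, var_e):
    """Variance of each rotated record under the assumed model: d_i * var_g + var_e."""
    return d * var_g + var_e


# Hypothetical stratified sample: two subpopulations with different allele frequencies.
rng = np.random.default_rng(2012)
n_per_group, p = 20, 200
X = np.vstack([
    rng.binomial(2, 0.2, size=(n_per_group, p)),
    rng.binomial(2, 0.8, size=(n_per_group, p)),
]).astype(float)
group = np.repeat([0, 1], n_per_group)
y = 1.0 * group + rng.normal(size=2 * n_per_group)   # phenotype shifted by stratum

d, U, y_rot = eigen_reparameterize(X, y)
print("share of marker variance on leading eigenvector:", d[0] / d.sum())
print("leading eigenvector, group 0 mean:", U[group == 0, 0].mean())
print("leading eigenvector, group 1 mean:", U[group == 1, 0].mean())
```

In this simulated example the leading eigenvector separates the two strata, so comparing inferences with and without the leading eigenvectors is a natural way to see how much of the marker-explained variance reflects population structure.
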
Journal: Genetics (Online)
Pages (from-to): 693-704
Number of pages: 9
Status: Published - 2012

ID: 46010063