Estimating the effective sample size in association studies of quantitative traits

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review

  • Andrey Ziyatdinov, Harvard T.H. Chan School of Public Health, Boston, MA
  • Jihye Kim, Harvard T.H. Chan School of Public Health, Boston, MA
  • Dmitry Prokopenko, Massachusetts General Hospital; Harvard Medical School
  • Florian Privé
  • Fabien Laporte, Institut Pasteur
  • Po-Ru Loh, Brigham and Women’s Hospital, Boston, MA; Harvard Medical School; The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  • Peter Kraft, Harvard T.H. Chan School of Public Health, Boston, MA
  • Hugues Aschard, Harvard T.H. Chan School of Public Health, Boston, MA; Institut Pasteur

The effective sample size (ESS) is a metric used to summarize in a single term the amount of correlation in a sample. It is of particular interest when predicting the statistical power of genome-wide association studies (GWAS) based on linear mixed models. Here, we introduce an analytical form of the ESS for mixed-model GWAS of quantitative traits and relate it to empirical estimators recently proposed. Using our framework, we derived approximations of the ESS for analyses of related and unrelated samples and for both marginal genetic and gene-environment interaction tests. We conducted simulations to validate our approximations and to provide a quantitative perspective on the statistical power of various scenarios, including power loss due to family relatedness and power gains due to conditioning on the polygenic signal. Our analyses also demonstrate that the power of gene-environment interaction GWAS in related individuals strongly depends on the family structure and exposure distribution. Finally, we performed a series of mixed-model GWAS on data from the UK Biobank and confirmed the simulation results. We notably found that the expected power drop due to family relatedness in the UK Biobank is negligible.

Original language: English
Article number: jkab057
Journal: G3 (Bethesda, Md.)
Volume: 11
Issue: 6
Number of pages: 10
ISSN: 2160-1836
Publication status: Published - Jun 2021

    Research areas

  • Effective sample size, GWAS, Linear mixed models, Variants, Power, Genome, Models, Linkage, Joint


ID: 213691235