Backing out effective sample size

Andrew_Grotzinger · March 10, 2023, 4:58pm

Hi everyone,

There were several questions yesterday about how to obtain the appropriate effective sample size in various situations (e.g., when analyzing GWAS data that include related individuals, or for case/control traits where cohort-specific sample size information is not available). In these instances, the effective sample size can be estimated directly from the GWAS data as:
4 / (2pq x SE^2)
where 2pq is the SNP variance, calculated as 2 x MAF x (1 - MAF);
MAF is the minor allele frequency;
and SE is the GWAS standard error (on the logistic scale for binary traits).
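
As a minimal sketch of the formula above, the per-SNP calculation could look like this in Python. The function name effective_n and the example MAF and SE values are hypothetical, chosen only for illustration; they are not taken from the paper or from the GenomicSEM code.

```python
def effective_n(maf, se):
    """Per-SNP effective sample size implied by 4 / (2pq x SE^2).

    maf: minor allele frequency, strictly between 0 and 1
    se:  GWAS standard error (logistic scale for binary traits)
    """
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must be strictly between 0 and 1")
    if se <= 0.0:
        raise ValueError("se must be positive")
    snp_variance = 2.0 * maf * (1.0 - maf)  # 2pq
    return 4.0 / (snp_variance * se ** 2)


# Hypothetical SNP: MAF = 0.25, SE = 0.02
n_eff = effective_n(0.25, 0.02)
print(round(n_eff, 2))
```

With these made-up inputs, 2pq = 0.375 and SE^2 = 0.0004, so the implied effective sample size is roughly 26,667.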

This is described in the Online Supplement (p. 10) of a recent 2023 paper (reference below). We also recommend capping the effective sample size estimate at a maximum of 1.1 times and a minimum of 0.5 times the total effective sample size calculated using the aggregate number of cases and controls. There is an example of calculating the effective sample size from GWAS data for anxiety disorders on this page of the GenomicSEM GitHub:
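
The capping step can be sketched in Python as follows. This is an illustration under assumptions, not the GenomicSEM implementation: the post does not spell out how the total effective sample size is obtained from the aggregate counts, so the sketch assumes the common case/control formula 4 x Ncases x Ncontrols / (Ncases + Ncontrols). The function names and the case/control counts are hypothetical.

```python
def total_effective_n(n_cases, n_controls):
    """Total effective N from aggregate counts.

    ASSUMPTION: uses 4 * Ncases * Ncontrols / (Ncases + Ncontrols),
    which is not stated explicitly in the post.
    """
    return 4.0 * n_cases * n_controls / (n_cases + n_controls)


def cap_effective_n(n_eff, total_neff, lower=0.5, upper=1.1):
    """Clamp a per-SNP estimate to [lower * total_neff, upper * total_neff]."""
    return min(max(n_eff, lower * total_neff), upper * total_neff)


# Hypothetical aggregate counts: 1,000 cases and 3,000 controls
total = total_effective_n(1000, 3000)  # 3000.0 under the assumed formula

# Per-SNP estimates above, below, and inside the allowed range
capped = [cap_effective_n(x, total) for x in (5000.0, 1000.0, 2000.0)]
print(capped)
```

Here the upper bound is 1.1 x 3000 = 3300 and the lower bound is 0.5 x 3000 = 1500, so an estimate of 5000 is pulled down to the upper bound, 1000 is raised to the lower bound, and 2000 is left unchanged.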

Grotzinger, A. D., de la Fuente, J., Privé, F., Nivard, M. G., & Tucker-Drob, E. M. (2023). Pervasive downward bias in estimates of liability-scale heritability in genome-wide association study meta-analysis: a simple solution. Biological Psychiatry, 93(1), 29-36.