HE vs REML estimator in GWAS

Ralph_Porneso · May 16, 2024, 11:34am

Hi,

Are there any good references on, or does anyone know, how large the difference in computational cost is between REML and HE (Haseman-Elston) regression when estimating the variance components in a mixed-effects model, say in GCTA fastGWA?

I have seen some papers showing that HE gives wider CIs, so I am quite hesitant to use it, but I have a longitudinal model, which I know takes a lot longer to run than a cross-sectional GWAS. I am trying to balance speed against the accuracy of the point estimates, but I currently have no idea how large the run-time difference between these two estimators is.

Thanks in advance!!

(Note: as a last resort, I am considering running a GWAS on the change in phenotype between t1, t2 and t3.)

baptiste.CD · May 20, 2024, 12:07am

Hi Ralph,
Here is an article I really like about mixed models: "Advantages and pitfalls in the application of mixed-model association methods" (Nature Genetics). It gives an idea of the computational cost of GCTA/REML.
Computationally, HE is a linear regression of the phenotype covariance on the genotype covariance, taken across all pairs of individuals in your sample. So the number of observations in the regression is N*(N-1)/2, where N is the number of individuals, which quickly gets large.
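
To make the pairwise regression concrete, here is a minimal sketch in Python/NumPy on simulated data. It is only an illustration of the idea, not how GCTA implements it: the function name `he_regression`, the simulated genotypes and the sample sizes are all assumptions made for the example.

```python
import numpy as np

def he_regression(y, grm):
    """Cross-product Haseman-Elston regression (illustrative sketch).

    Regresses the products of standardised phenotypes for each pair of
    individuals on the matching off-diagonal GRM entry. The slope is the
    estimate of the proportion of variance explained by the GRM.
    """
    y = (y - y.mean()) / y.std()
    iu = np.triu_indices(len(y), k=1)   # the N*(N-1)/2 distinct pairs
    x = grm[iu]                         # pairwise genetic relatedness
    z = np.outer(y, y)[iu]              # pairwise phenotype cross-products
    xc = x - x.mean()
    slope = np.dot(xc, z - z.mean()) / np.dot(xc, xc)  # simple OLS slope
    return slope, x.size

# Simulated data (hypothetical sizes, chosen only to keep the example fast)
rng = np.random.default_rng(0)
n, m = 500, 1000                        # individuals, SNPs
freqs = rng.uniform(0.1, 0.5, size=m)
geno = rng.binomial(2, freqs, size=(n, m)).astype(float)
w = (geno - geno.mean(axis=0)) / geno.std(axis=0)   # standardised genotypes
grm = w @ w.T / m                       # genetic relationship matrix

beta = rng.normal(0.0, np.sqrt(0.5 / m), size=m)    # simulated h2 of about 0.5
y = w @ beta + rng.normal(0.0, np.sqrt(0.5), size=n)

h2, n_pairs = he_regression(y, grm)
print(f"pairs used: {n_pairs}, HE estimate of h2: {h2:.3f}")
```

Even with only 500 individuals the regression already runs over 124,750 pairs, and that count grows quadratically with N. With a sample this small the estimate is noisy, so it should not be read as a benchmark of either method.
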
This paper by Valentin Hivert et al. uses both methods: "Estimation of non-additive genetic variance in human complex traits from a large sample of unrelated individuals" (PMC). They discuss the differences in statistical power and also include considerations about computation.
I am aware of a couple of GWAS that use longitudinal data, e.g. "Genetic variants associated with longitudinal changes in brain structure across the lifespan" (Nature Neuroscience). They can be very interesting in their own right.
I hope this helps.