Is it better to have different cutoffs for HWE filtering

Is it better to have different cutoffs for HWE filtering based on sample size (considering working with different cohorts)? And if so, how should we set these cutoffs?

We usually use one threshold for all sample sizes. As you will see when you do the QC prac, the HWE check isn’t right at the start, so we don’t usually lose many SNPs at this step.

You can use less stringent thresholds (that is, a smaller p-value cutoff) for very large cohorts (e.g., UK Biobank), because they have more power to detect deviations that are not problematic. A bit more info about that here: https://www.medrxiv.org/content/10.1101/2024.02.07.24301951v1.full.pdf
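To illustrate why a fixed threshold becomes harsher as cohorts grow, here is a minimal sketch. It uses the simple 1-df chi-square HWE test rather than the exact test that Plink uses, and the allele frequency (0.3), the 2% heterozygote deficit, and both function names are assumptions made up for the example.

```python
import math

def hwe_chisq_p(n_aa, n_ab, n_bb):
    """P-value of the 1-df chi-square HWE test (an approximation, not the exact test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chisq / 2))

def counts_with_deficit(n, p, f):
    """Hypothetical helper: genotype counts for n samples with allele
    frequency p and a fractional heterozygote deficit f."""
    q = 1 - p
    n_ab = round(n * 2 * p * q * (1 - f))
    n_aa = round(n * (p * p + f * p * q))
    n_bb = n - n_aa - n_ab
    return n_aa, n_ab, n_bb

# Same relative deviation from HWE, three different cohort sizes
for n in (1_000, 10_000, 400_000):
    p_value = hwe_chisq_p(*counts_with_deficit(n, p=0.3, f=0.02))
    print(f"n={n:>7}: HWE p = {p_value:.2e}")
```

The deviation is identical in relative terms at every sample size, but the p-value shrinks sharply as n grows, so a SNP that passes comfortably in a small cohort can fail the same cutoff in a biobank-sized one.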

For the practical we use a single cutoff.

The latest recommendations (implemented in Plink 2 but not Plink 1.9) involve a formula that accounts for sample size when calculating a cutoff.

Yes, in plink2, --hwe with the keep-fewhet modifier helps account for the variable power with sample size, so you can use the same threshold. This is a very recent

Some people will also keep SNPs that are out of HWE if it’s because they have too few heterozygotes.
When a SNP violates HWE because it has too many heterozygotes, that can indicate a problem with the genotype calling, e.g. an error in the data.
When a SNP violates HWE because it has too few heterozygotes, that can be a population-based pattern. So it’s not necessarily an error in the data; the pattern may be real, and you may not want to exclude that SNP. (It will be a judgement call specific to your analysis.)

That is separate from the formula for the cutoff based on sample size, but Plink added both options around the same time.
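The too-many versus too-few heterozygotes distinction above can be checked by comparing the observed heterozygote count with the count expected under HWE. The sketch below is only an illustration of that comparison; the function name and the example genotype counts are assumptions, and it is not how Plink implements its filter.

```python
def het_direction(n_aa, n_ab, n_bb):
    """Compare observed heterozygotes with the HWE expectation.

    Returns "excess", "deficit", or "none". This reports direction only;
    it says nothing about whether the deviation is statistically significant.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    expected_het = 2 * n * p * (1 - p)   # expected heterozygotes under HWE
    if n_ab > expected_het:
        return "excess"
    if n_ab < expected_het:
        return "deficit"
    return "none"

# Made-up genotype counts (AA, AB, BB) for three SNPs
print(het_direction(10, 80, 10))  # more heterozygotes than expected
print(het_direction(40, 20, 40))  # fewer heterozygotes than expected
print(het_direction(25, 50, 25))  # matches the HWE expectation
```

In practice you would combine this direction with the HWE p-value: a SNP that fails the threshold with an excess is the more likely genotyping artefact, while one that fails with a deficit may deserve a closer look before you drop it.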