Does anyone know how to handle data where the assumptions of equal means and variances across twin order and zygosity are violated (or where any combination of these assumptions is violated)?
My advice would be to look at the data. Different means or variances often arise from a small number of outliers in the distributions. You may want to exclude those individuals, or Winsorise the distribution (Winsorizing - Wikipedia).
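A minimal sketch of what Winsorizing looks like in practice, using numpy and simulated data (the 5%/95% cut points are illustrative, not a recommendation):

```python
import numpy as np

def winsorize(x, lower=0.05, upper=0.95):
    """Clamp values below/above the given quantiles to those quantiles."""
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

rng = np.random.default_rng(42)
# Simulated trait with a handful of extreme outliers mixed in
scores = np.concatenate([rng.normal(0, 1, 995), rng.normal(8, 1, 5)])

w = winsorize(scores)
print(scores.var(), w.var())  # variance shrinks once the outliers are clamped
```

Unlike outright exclusion, Winsorizing keeps the sample size intact, which matters when pairs are dropped listwise in a twin model.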
The problem may also come from non-normal distributions (e.g. heavy tails when working with clinical scores). In this case, you may want to apply a transformation (e.g. a log transform) to prevent the tails from causing false positives or unreliable results.
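As a quick illustration of how a log transform tames a heavy right tail, here is a sketch on simulated lognormal data (log1p is used so that zero scores are handled gracefully):

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
# Heavy right tail, e.g. a clinical symptom score
raw = rng.lognormal(mean=0.0, sigma=1.0, size=2000)

# log1p handles zeros gracefully; a plain log would need strictly positive data
transformed = np.log1p(raw)

print(skew(raw), skew(transformed))  # skewness drops sharply after the transform
```

It is worth re-checking the equal-means/equal-variances assumptions after transforming, since the transform changes the scale on which they are tested.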
If in doubt, repeat the analyses with several transforms or exclusion rules to evaluate whether the results are robust (this is called a sensitivity analysis). Settle on the heritability estimates that are found in most analysis scenarios, and do not hesitate to report the sensitivity analyses in the paper (or supplementary material), as they can only add confidence to the reported results.
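A sensitivity loop of this kind can be sketched as below. For brevity it uses Falconer's formula, h2 = 2(rMZ − rDZ), on simulated twin pairs rather than a full SEM fit, and the transforms and correlations are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_pairs(n, r):
    """Simulate n twin pairs with within-pair trait correlation r."""
    cov = [[1.0, r], [r, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)

mz = simulate_pairs(2000, 0.8)  # assumed MZ correlation
dz = simulate_pairs(2000, 0.5)  # assumed DZ correlation

transforms = {
    "raw":        lambda x: x,
    "winsorized": lambda x: np.clip(x, *np.quantile(x, [0.05, 0.95])),
    "rank":       lambda x: np.argsort(np.argsort(x, axis=0), axis=0).astype(float),
}

for name, f in transforms.items():
    r_mz = np.corrcoef(f(mz).T)[0, 1]
    r_dz = np.corrcoef(f(dz).T)[0, 1]
    h2 = 2 * (r_mz - r_dz)  # Falconer's formula as a quick, rough check
    print(f"{name:11s} h2 = {h2:.2f}")
```

If the estimates agree across scenarios, that agreement is exactly what the sensitivity analysis should report; large divergence flags that the choice of transform is driving the result.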
Agreed. Consider using ordinal variable analysis instead of continuous data analysis. This approach can help a lot, but it isn't a panacea. The analysis of sum-score scales frequently encounters weird-looking distributions, say with a big pile of zeroes when the scale is a clinical instrument and most of the population has none of the symptoms. It may be OK to analyze such a distribution as ordinal, but it's still not ideal, because there are several different ways to end up with the same sum score. Ideally, the items should be jointly analyzed in a multivariate model, but this can be challenging if there are many items.

The direction I think we should look in is to calculate factor scores and their standard errors, and use the standard errors to moderate the amount of residual variance associated with each item. How well this would work in practice would be a good exercise in simulation.
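For the zero-inflated sum-score case above, the ordinal recoding step can be sketched like this (the cut points and the zero-inflation rate are purely illustrative, not clinical standards):

```python
import numpy as np

rng = np.random.default_rng(2)
# Zero-inflated sum score: most of the sample has no symptoms at all
n = 1000
has_symptoms = rng.random(n) < 0.3
sum_scores = np.where(has_symptoms, rng.poisson(4, n), 0)

# Collapse into ordered categories: none / mild / moderate / severe
bins = [1, 3, 6]                          # illustrative cut points
ordinal = np.digitize(sum_scores, bins)   # values 0, 1, 2, 3

print(np.bincount(ordinal))  # most of the mass sits in the "none" category
```

The ordinal variable would then be modeled with thresholds (e.g. under a liability-threshold model) rather than as a continuous trait, which sidesteps the distributional assumptions the original question is about.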
Thank you!! This gives me some good direction in terms of moving forward.