Edit: Most of my questions in this post were answered during the tutorial. I am including the answers as I think I understand them below, plus one remaining question I’m still confused about at the bottom.
Questions that were answered:
- On slide 32 (around 1:51 in the video), the notation is introduced that VE = VE11, VC = VC11, etc. What does the 11 part mean?
The 11 part refers to the row/column subscript for a matrix. For a univariate twin model, each of VC, VA, and VE is a 1x1 matrix. For more complex models with multiple traits, the matrices may be larger.
- In the twin model diagram on slide 32, VC11 is used to label var(C1), var(C2), and cov(C1,C2). I understand why we can assume var(C1) = var(C2), but why is cov(C1,C2) equal to the variance?
- A similar question applies for VA11, which is also used as a label for both variance and covariance.
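cov(C1,C2) is equal to var(C1) because the correlation between C1 and C2 is 1 and the two variances are equal. In general, cov(x,y) = cor(x,y) × sd(x) × sd(y), so with cor(C1,C2) = 1 and var(C1) = var(C2), cov(C1,C2) = sd(C1) × sd(C2) = var(C1). Put another way, the shared environment is treated as the same variable for both twins, and cov(x,x) = var(x).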
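To check the arithmetic for myself, here is a minimal Python sketch of the relation cov = cor × sd × sd. The variance values are made up, and `implied_cov` is just a helper name I chose, not anything from OpenMx. The correlations are the standard twin-model assumptions as I understand them: 1 for the shared environment, 1 for the additive genetic component in MZ pairs, and 0.5 in DZ pairs.

```python
import math

def implied_cov(r, var_1, var_2):
    """Covariance implied by a correlation r and two variances."""
    return r * math.sqrt(var_1) * math.sqrt(var_2)

# Made-up variance components, for illustration only.
VC11 = 0.3
VA11 = 0.5

# Shared environment: correlation 1 and equal variances.
cov_c = implied_cov(1.0, VC11, VC11)
# Additive genetic component, MZ pairs: correlation 1.
cov_a_mz = implied_cov(1.0, VA11, VA11)
# Additive genetic component, DZ pairs: correlation 0.5.
cov_a_dz = implied_cov(0.5, VA11, VA11)

print(cov_c, cov_a_mz, cov_a_dz)
```

With a correlation of 1 the covariance comes out equal to the variance (up to floating-point rounding), which would be why the same label can be used for both. With a correlation of 0.5 it is half the variance.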
- On slide 39 (around 19:35 in the video), it shows that OpenMx gives confidence intervals for the variance components. Are there ever circumstances where the answer you’d get for a likelihood ratio test for dropping a variable will differ from the answer you’d get from looking at that variable’s confidence intervals in the model that includes it?
OpenMx gives confidence intervals, but those confidence intervals are not symmetric. Because the confidence intervals aren’t symmetric, you can’t do a regular significance test the way you would with a point estimate and a standard error. That’s why the fit test is used to get a p-value.
However, the fit test should agree with the confidence interval; there shouldn’t be cases where they disagree. That’s because when you drop a parameter and do a fit test, you’re doing the equivalent of fixing that parameter to 0. If 0 is not in the confidence interval for the parameter, fixing the parameter to 0 (by dropping it) would produce significantly worse fit.
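To convince myself of this, I wrote a small toy example in Python. It is not OpenMx code and not a twin model: it assumes a one-parameter Poisson model with rate exp(b), made-up counts, and a confidence interval defined by the likelihood itself (all values of b whose fit is not significantly worse than the best fit). All function names are my own. Under those assumptions, the interval excludes 0 exactly when the likelihood ratio test for fixing b to 0 is significant.

```python
import math

# 95th percentile of the chi-square distribution with 1 degree of freedom.
CHI2_CRIT_1DF = 3.841458820694124

def loglik(b, total, n):
    """Poisson log-likelihood (up to a constant) for rate exp(b)."""
    return total * b - n * math.exp(b)

def lrt_stat(b, b_hat, total, n):
    """Twice the drop in log-likelihood when b is fixed instead of estimated."""
    return 2.0 * (loglik(b_hat, total, n) - loglik(b, total, n))

def bisect(f, lo, hi, iters=200):
    """Find a root of f in [lo, hi], assuming f changes sign there."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def likelihood_ci(counts):
    """Estimate of b plus the 95% likelihood-based interval around it."""
    total, n = sum(counts), len(counts)
    b_hat = math.log(total / n)  # maximum-likelihood estimate (needs total > 0)
    g = lambda b: lrt_stat(b, b_hat, total, n) - CHI2_CRIT_1DF
    lower = bisect(g, b_hat - 10.0, b_hat)
    upper = bisect(g, b_hat, b_hat + 10.0)
    return b_hat, lower, upper

def compare(counts):
    """Return (is the LRT for b = 0 significant?, does the CI exclude 0?)."""
    b_hat, lower, upper = likelihood_ci(counts)
    stat = lrt_stat(0.0, b_hat, sum(counts), len(counts))
    return stat > CHI2_CRIT_1DF, not (lower <= 0.0 <= upper)

# Made-up data sets: one with a mean well above 1, one with a mean near 1.
for counts in ([3, 1, 2, 4, 2, 3, 1, 2], [1, 0, 2, 1, 3, 1, 1, 1]):
    print(counts, compare(counts))
```

For both data sets the two answers match, and the interval is not symmetric around the estimate. I would expect the same logic to carry over to the twin model only to the extent that its intervals are built from the likelihood in the same way.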
- Question 4 was related to definition variables and has been moved to a separate thread.
Questions that I am still confused about:
I’ve seen people often fix something (one or more coefficients or variances) to 1, and usually it has something to do with making variances or coefficients for latent variables interpretable. How do you know/decide what you’re going to fix to 1 to achieve that?
(I also noticed that for the model shown in this video, the raw estimates shown for the variance components were not “proportion of variance in the phenotype explained” and needed to be scaled. So in this model, is setting the straight-arrow path coefficients to 1 just done to make the model identified? And if the variance of the phenotype were 1, would those coefficients being 1 then make the raw estimates for the variance components be “proportion of variance explained” without needing to be scaled?)
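To make the scaling part of that question concrete, this is the arithmetic as I currently understand it, sketched in Python. It assumes the path coefficients are fixed to 1, so that the model-implied phenotype variance is simply VA + VC + VE. The numbers are made up and `standardize` is a name I chose for illustration.

```python
def standardize(va, vc, ve):
    """Scale raw variance components to proportions of the total variance."""
    total = va + vc + ve  # implied phenotype variance when all paths are 1
    return {"A": va / total, "C": vc / total, "E": ve / total}

# Made-up raw estimates where the phenotype variance is 4.0.
raw_scale = standardize(2.0, 1.2, 0.8)

# Made-up raw estimates where the phenotype variance is already 1.0.
unit_scale = standardize(0.5, 0.3, 0.2)

print(raw_scale)
print(unit_scale)
```

In the second case the proportions equal the raw values, which is what I would expect if the phenotype variance were 1. Whether that is the right way to think about the model is the part I would like confirmed.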
Sorry to ask so many questions!! I’ve never really understood SEM, and I’m hoping to finally wrap my head around it this time! (And thank you for all the great videos!)