C versus D

Why are C and D negatively confounded?

The background here is that in a univariate analysis of monozygotic (MZ) and dizygotic (DZ) twins, parameter estimates are based upon three observed statistics: the variance of the phenotype, the covariance between MZ twins, and the covariance between DZ twins. Note that there is only one variance statistic, as all variances are parameterized to have the same expectation across MZ and DZ twins, and for first- and second-born twins. Each statistic may be expressed as a function of the model parameters:

VAR(P) = V_A + V_D + V_C + V_E

COV(MZ) = V_A + V_D + V_C

COV(DZ) = 0.5*V_A + 0.25*V_D + V_C

where VAR(P) is the phenotypic variance of the variable under study, COV(MZ) is the covariance between MZ twins, COV(DZ) is the covariance between DZ twins, and V_A, V_D, V_C and V_E are the latent additive genetic, dominance genetic, common environmental and unique environmental variances that are being estimated (i.e. the parameters).

With three observed statistics, you can estimate a maximum of three parameters, lest the model be unidentified. Typically investigators estimate either an ACE model (if the DZ correlation > 0.5*MZ correlation), assuming that D is zero, or an ADE model (if the DZ correlation < 0.5*MZ correlation), assuming that C is zero.
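To make the "three statistics, three parameters" point concrete, here is a minimal sketch that solves the three expectations above by hand for either ACE or ADE. It is only an illustration: the function name fit_ace_or_ade and the input values are made up, and in practice these models are fitted by maximum likelihood to the raw data or covariance matrices rather than by solving the equations directly.

```python
def fit_ace_or_ade(var_p, cov_mz, cov_dz):
    """Hypothetical helper: solve the three expectations for three parameters.

    ACE (V_D fixed at 0) is chosen when COV(DZ) > 0.5*COV(MZ),
    otherwise ADE (V_C fixed at 0), mirroring the rule of thumb above.
    """
    # VAR(P) - COV(MZ) = V_E in both models.
    v_e = var_p - cov_mz

    if cov_dz > 0.5 * cov_mz:
        # ACE: COV(MZ) = V_A + V_C and COV(DZ) = 0.5*V_A + V_C
        v_a = 2.0 * (cov_mz - cov_dz)
        v_c = 2.0 * cov_dz - cov_mz
        return {"model": "ACE", "V_A": v_a, "V_D": 0.0, "V_C": v_c, "V_E": v_e}

    # ADE: COV(MZ) = V_A + V_D and COV(DZ) = 0.5*V_A + 0.25*V_D
    v_a = 4.0 * cov_dz - cov_mz
    v_d = 2.0 * cov_mz - 4.0 * cov_dz
    return {"model": "ADE", "V_A": v_a, "V_D": v_d, "V_C": 0.0, "V_E": v_e}


# Made-up statistics for illustration only.
ace_example = fit_ace_or_ade(var_p=1.0, cov_mz=0.6, cov_dz=0.4)
ade_example = fit_ace_or_ade(var_p=1.0, cov_mz=0.6, cov_dz=0.2)
print(ace_example)
print(ade_example)
```

Each branch uses up all three statistics, so there is nothing left over to estimate a fourth parameter.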

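The reason why C and D are “negatively correlated” is that C makes DZ twins more similar to each other, whereas D makes MZ twins more similar to each other relative to DZ twins. In other words, C pushes the DZ covariance above half the MZ covariance and D pushes it below half, so when both are present their effects offset each other and the twin data alone cannot separate them.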
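The offsetting can be seen directly from the expectations: subtracting twice COV(DZ) from COV(MZ) leaves 0.5*V_D - V_C, so D and C enter that contrast with opposite signs. The sketch below is only an illustration with made-up parameter values (the function names are hypothetical); it shows two different sets of variance components that produce exactly the same three observed statistics.

```python
def expected_stats(v_a, v_d, v_c, v_e):
    """Expected VAR(P), COV(MZ), COV(DZ) from the equations in this post."""
    var_p = v_a + v_d + v_c + v_e
    cov_mz = v_a + v_d + v_c
    cov_dz = 0.5 * v_a + 0.25 * v_d + v_c
    return var_p, cov_mz, cov_dz


def mz_dz_contrast(cov_mz, cov_dz):
    """COV(MZ) - 2*COV(DZ), which algebraically equals 0.5*V_D - V_C."""
    return cov_mz - 2.0 * cov_dz


# Made-up values: a pure AE world ...
stats_ae = expected_stats(v_a=0.5, v_d=0.0, v_c=0.0, v_e=0.5)
# ... and a world with less A but both D and C present.
stats_adce = expected_stats(v_a=0.2, v_d=0.2, v_c=0.1, v_e=0.5)

print(stats_ae)
print(stats_adce)
print(mz_dz_contrast(stats_adce[1], stats_adce[2]))  # 0.5*V_D - V_C for the second set
```

In the second set the positive contribution of D (0.5*0.2) is cancelled by C (0.1), so the contrast is the same as in the AE case and the two sets cannot be told apart from MZ and DZ twins alone.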

Finally, it’s possible to estimate all four sources of variation by including additional relatives or other informative individuals (e.g. parents, adopted individuals, etc.).

Some of these issues are discussed in greater detail here:

The Boulder Workshop Question Box - PubMed

Hope that helps.

Dave