I have a rather general question about how psychometric properties, e.g. reliability measures, relate to ACE estimates, power calculations etc. In a preprint from 2022, Plomin et al. write that: “A first step in this direction is to assess the extent to which differential reliability underlies differential heritability, because reliability, especially test-retest reliability rather than internal consistency, creates a ceiling for heritability” (with no further references for this statement).
I would highly appreciate any comments, suggested readings etc. on this topic.
Heidi Umbach Hansen
Hi Heidi: Suppose we are analysing neuroticism (N) test scores. We administer the N test comprising (say) 20 items in a sample of twins, and based on the item scores we calculate the N sum scores. The N sum scores are used in ADE or ACE twin modeling. Here is my take on the psychometric background: 1) the items are supposed to be unidimensional: each item is directly (and causally) dependent on the latent variable N; 2) the item scores are subject to measurement (m) error; 3) consequently the N sum score is subject to measurement (m) error; 4) the reliability of the test scores (the N sum scores) is defined as rel = var(TRUE SCORE) / (var(TRUE SCORE) + var(m error)); 5) the m errors of the items are mutually uncorrelated. In fitting, say, the ADE twin model to the N scores, we obtain A, D, and E variance components. The E variance component will include the term var(m error). For instance, suppose that rel = .8 and var(N sum score) = 10. That means that 2 of the 10 is due to error (8/10 = rel). Suppose that we obtain the following estimates of the variances of A, D, E:
5 (A), 2 (D), and 3 (E) (5 + 2 + 3 = 10). Suppose we standardize: var(A) = 5/10 = .5, var(D) = 2/10 = .2, var(E) = 3/10 = .3. So .5 + .2 = .7 is the broad-sense heritability. So far we have not taken the error into account. var(E) = 3, but E includes the measurement error, and the m error accounts for 2 of that 3. So let’s express this explicitly: A + D + E + m error = 5 + 2 + 1 + 2. Standardized: .5 + .2 + .1 + .2. But we want the results “corrected for unreliability”… We can obtain those as follows: 10 - 2 = 8, i.e., total variance (10) minus m error variance (2) is true score variance (8). The decomposition of the true score variance is 8 = 5 + 2 + 1. The standardized results, corrected for m error, are 5/8 (.625), 2/8 (.250), and 1/8 (.125).
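The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the correction described here, not part of any twin-modeling package: the function name `correct_for_unreliability` and its arguments are made up for this example, and it assumes the reliability is known and that all measurement error sits in the E component.

```python
def correct_for_unreliability(var_a, var_d, var_e, reliability):
    """Standardize ADE variance components before and after removing
    measurement error (hypothetical helper, for illustration only).

    Assumes: reliability = var(true score) / var(observed score), and the
    measurement error variance is entirely contained in var_e.
    """
    total = var_a + var_d + var_e              # observed variance, e.g. 10
    true_total = reliability * total           # true score variance, e.g. 8
    var_m_error = total - true_total           # measurement error variance, e.g. 2
    var_e_true = var_e - var_m_error           # E without measurement error, e.g. 1

    if var_e_true < -1e-12:
        # The assumed reliability implies more error variance than E contains.
        raise ValueError("measurement error variance exceeds var(E)")

    uncorrected = {
        "A": var_a / total,
        "D": var_d / total,
        "E": var_e / total,
    }
    corrected = {
        "A": var_a / true_total,
        "D": var_d / true_total,
        "E": var_e_true / true_total,
    }
    return uncorrected, corrected


# The numbers from the example: A = 5, D = 2, E = 3, rel = .8
uncorrected, corrected = correct_for_unreliability(5.0, 2.0, 3.0, 0.8)
print("uncorrected:", uncorrected)
print("corrected:  ", corrected)
```

With the example values, the broad-sense heritability (A + D) goes from .7 before the correction to .875 (.625 + .250) after it.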
So IF you know the reliability of the test in your population of interest with great accuracy, you can correct the ADE decomposition results for the measurement error. Here is a quiz question: Why is the standardized var(E) in the ADE or ACE twin model an upper bound estimate of the test’s unreliability (so that 1 - var(E) is a lower bound estimate of its reliability)? Another psychometric theme concerns the dimensionality of the items. This is touched upon in the practical on 13/06 on the common pathway model.