I'm struggling a bit to understand the ‘wide’ versus ‘long’ distinction with respect to twin data. How do I know how to arrange the data to get my analyses up and running smoothly?
What I'm trying to ask is: when do I want the data to be long, and when do I want it to be wide? And how do I convert between the two?
All the best
Hi @anonymous9! If you would like to run any of the twin models that we have been doing as part of the workshop so far, you will need to make your data “wide”. In other words, each row will contain the data from both members of a twin pair. Further, we need to denote whether a variable belongs to Twin 1 or Twin 2 with a suffix on the variable name (e.g., Weight_Tw1 and Weight_Tw2).
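If your data currently have one row per individual, the reshape to wide can be sketched as below. This is only a minimal illustration in plain Python, not a workshop script: the column names (zyg, famnr, member, Weight), the toy values, and the helper long_to_wide are all assumptions made up for the example. Most statistical packages have a built-in reshape command that does the same job.

```python
# Toy long-format data: one dict per individual (values are made up).
long_rows = [
    {"zyg": 1, "famnr": 101, "member": 1, "Weight": 1.3},
    {"zyg": 1, "famnr": 101, "member": 2, "Weight": 1.5},
    {"zyg": 1, "famnr": 102, "member": 1, "Weight": 0.95},
    {"zyg": 1, "famnr": 102, "member": 2, "Weight": 1.01},
    {"zyg": 2, "famnr": 103, "member": 1, "Weight": 1.2},
    {"zyg": 2, "famnr": 103, "member": 2, "Weight": 1.4},
]


def long_to_wide(rows, pair_id="famnr", member="member", shared=("zyg",)):
    """Collapse one-row-per-twin records into one row per twin pair.

    Hypothetical helper: `pair_id` identifies the pair, `member` says
    which twin the row belongs to, and `shared` lists pair-level columns.
    """
    pairs = {}
    for row in rows:
        # Start a new pair row the first time a family number appears.
        if row[pair_id] not in pairs:
            pairs[row[pair_id]] = {pair_id: row[pair_id]}
            pairs[row[pair_id]].update({k: row[k] for k in shared})
        wide = pairs[row[pair_id]]
        # Every remaining column is a twin-level variable: add a suffix.
        for name, value in row.items():
            if name not in (pair_id, member, *shared):
                wide[f"{name}_Tw{row[member]}"] = value
    return list(pairs.values())


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

Each resulting row holds the pair-level columns once and every twin-level variable twice, with the _Tw1 / _Tw2 suffix marking which twin it came from.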
For today’s practical the data are given in the wide (aka horizontal) format.
As applied to twin data:
wide means each row is a twin pair (each row includes the data of both the 1st and the 2nd twin)
long means each row is an individual (each row includes the data of a single individual)
Example (2 MZ twin pairs and 1 DZ twin pair; zyg 1 = MZ, 2 = DZ)

Wide:
zyg   famnr   pheno_twin1   pheno_twin2
1     101     1.3           1.5
1     102     0.95          1.01
2     103     1.2           1.4

Long:
zyg   famnr   member   pheno
1     101     1        1.3
1     101     2        1.5
1     102     1        0.95
1     102     2        1.01
2     103     1        1.2
2     103     2        1.4
The long format is used in linear mixed modeling (a.k.a. multilevel modeling), which includes GCTA-type analyses.
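Going from wide to long, for example before fitting a mixed model, can be sketched along these lines. Again this is only an illustration in plain Python: the column names pheno_tw1 / pheno_tw2 and the helper wide_to_long are assumptions for the example, and your own software will usually have a built-in reshape command for this.

```python
# Wide-format data: one dict per twin pair.
# Column names pheno_tw1 / pheno_tw2 are hypothetical.
wide_rows = [
    {"zyg": 1, "famnr": 101, "pheno_tw1": 1.3, "pheno_tw2": 1.5},
    {"zyg": 1, "famnr": 102, "pheno_tw1": 0.95, "pheno_tw2": 1.01},
    {"zyg": 2, "famnr": 103, "pheno_tw1": 1.2, "pheno_tw2": 1.4},
]


def wide_to_long(rows, suffixes=("_tw1", "_tw2")):
    """Split each twin-pair row into one row per individual.

    Hypothetical helper: columns ending in one of `suffixes` are treated
    as twin-level variables; all other columns are shared by the pair.
    """
    long_rows = []
    for row in rows:
        # Columns without a twin suffix (zyg, famnr) belong to the pair.
        shared = {k: v for k, v in row.items() if not k.endswith(suffixes)}
        for member, suffix in enumerate(suffixes, start=1):
            person = dict(shared, member=member)
            for name, value in row.items():
                if name.endswith(suffix):
                    # Strip the suffix so both twins share one column name.
                    person[name[: -len(suffix)]] = value
            long_rows.append(person)
    return long_rows


long_rows = wide_to_long(wide_rows)
for row in long_rows:
    print(row)
```

The family number (famnr) is repeated on both rows of a pair, so it can serve as the grouping variable when the model needs to know which individuals belong together.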