Can you discuss further the inheritance of parental alleles that are informative vs non-informative/ambiguous? It appears that non-ambiguous alleles are those that would not bond on complementary strands. Thanks!
I think this may have come up from my somewhat garbled description of transmitted vs. non-transmitted alleles, about which Day 8 will provide MUCH more information. In a nutshell, the idea is that a polygenic score derived from non-transmitted alleles in the parents should be independent of the phenotype of the child. However, if such correlations are non-zero, they might indicate the effects of assortative mating or parent-to-child cultural transmission. And probably a few other things.
I think we spoke about this on Zoom, but I'll answer here in case this is a different post.
In short: yes. Alleles that are not ambiguous are those that do not bond to each other across the strands.
Here’s a more detailed run-down…
Each of us has 2 copies of each chromosome: one from each biological parent.
Each chromosome has 2 strands: let’s call them strand1 and strand2.
Each strand is a string of nucleotide bases.
Those bases bond to a complementary base (A - T and C - G). This is what binds the two strands together.
Because of the complementary bonding we only need to mention or measure the base on one strand to communicate or know what both are.
Because we have 1 copy of each chromosome from each parent we have 2 versions of every locus. I’ll call them A1 and A2.
The issue of ambiguous alleles happens because we cannot be certain that the same strand has been measured by all genotyping methods.
If an allele in the population might be either an A or a G on strand1, then the complementary base on strand2 is a T or a C, respectively.
I’ll use an example for the next bit:
Consider a situation where, in cohort 1, a genotyping method measured strand1. Those data would call people as having genotypes AA, AG, or GG.
Another cohort (cohort 2) used a company that measured strand2. Those data would call people as having genotypes TT, TC, or CC.
Now, at this locus the human reference genome has only two alleles: an A or a G.
If we “flip” the genotypes from cohort 2 (this is converting those genotypes to their complementary bases), then we end up with the genotypes AA, AG, or GG. And they are consistent with the reference genome.
This is an example of unambiguous alleles. A and G do not bond with each other across the strands.
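The flip in this first example can be sketched in a few lines of Python. This is only an illustration of the idea, not a real harmonisation tool; the names COMPLEMENT and flip_genotype are made up for this post.

```python
# Complementary base pairing: A bonds with T, C bonds with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def flip_genotype(genotype):
    """Convert a genotype call to the opposite strand, base by base."""
    return "".join(COMPLEMENT[base] for base in genotype)

# Cohort 2 was measured on strand2 in the example above.
cohort2_calls = ["TT", "TC", "CC"]
flipped = [flip_genotype(g) for g in cohort2_calls]
print(flipped)  # ['AA', 'AG', 'GG']
```

After flipping, cohort 2 uses the same A/G labels as cohort 1 and the reference.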
At a different locus, cohort 1 has genotyped people as being AA, AT, or TT on strand1.
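Cohort 2 has genotyped people as TT, TA, or AA on strand2.
We know from the reference genome that there are two alleles at this locus and they are an A or a T.
But we do not know which cohort needs to be flipped. Flipping AA, AT, TT gives TT, TA, AA, which is still a valid set of calls for this locus, so the labels alone cannot tell us which strand was measured.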
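A quick way to spot these loci is to check whether the two alleles are each other's complement. The sketch below assumes single-base alleles; the function name is_strand_ambiguous is hypothetical. C/G loci are ambiguous for the same reason as A/T loci, since C and G also bond to each other.

```python
# Complementary base pairing: A bonds with T, C bonds with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def is_strand_ambiguous(allele1, allele2):
    """True when the two alleles bond to each other (A/T or C/G)."""
    return COMPLEMENT[allele1] == allele2

print(is_strand_ambiguous("A", "G"))  # False: the first example
print(is_strand_ambiguous("A", "T"))  # True: the second example
print(is_strand_ambiguous("C", "G"))  # True
```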
The unit of interest is the base-pair across both strands when speaking of allelic effect. For simplicity we only need to speak of one because of the bonding.
In our first example, let’s say that the A allele increases the trait of interest compared to the G allele. Then we can readily flip cohort 2 to have our data coded as A or G and have our allelic effects consistent.
But in our second example, let’s say that the A allele increases the trait of interest compared to the T allele. Then we cannot tell which cohort needs to be flipped to make our data align to either each other or the reference.
Now all is not lost for those cases.
We might be able to use the allele frequencies to compare against the reference genome or across cohorts.
In our ambiguous example, if the frequency of A was .95 in the population and T was .05, then it would be fairly clear whether our data corresponded to the population frequencies.
If the frequencies of both A and T were similar, then we would probably give up on trying to resolve those alleles in our data.
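The frequency check could look something like the sketch below. The function name resolve_by_frequency and the cutoff values (maf_cutoff, tolerance) are assumptions chosen for illustration, not recommended settings.

```python
def resolve_by_frequency(cohort_freq_a, ref_freq_a, maf_cutoff=0.4, tolerance=0.1):
    """Guess the strand at an A/T locus from the frequency of the allele labelled A.

    Returns "keep", "flip", or "unresolved".
    maf_cutoff and tolerance are illustrative values only.
    """
    # If both alleles are about equally common, frequency cannot separate them.
    if min(ref_freq_a, 1 - ref_freq_a) > maf_cutoff:
        return "unresolved"
    # Same strand: the cohort's A frequency matches the reference A frequency.
    if abs(cohort_freq_a - ref_freq_a) <= tolerance:
        return "keep"
    # Opposite strand: the cohort's "A" is really the reference T.
    if abs(cohort_freq_a - (1 - ref_freq_a)) <= tolerance:
        return "flip"
    return "unresolved"

print(resolve_by_frequency(0.94, 0.95))  # keep
print(resolve_by_frequency(0.06, 0.95))  # flip
print(resolve_by_frequency(0.52, 0.50))  # unresolved
```

With A at .95 in the reference, a cohort showing A at .06 is almost certainly reporting the other strand, while a locus near .50/.50 is left unresolved.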
I didn’t intend to write such a mammoth response!
Hopefully it has confirmed or clarified what you were thinking.