What is the typical rate of sex mismatches found during genetic QC? I’d like to know what is considered normal and what might be unusually high.
1 or 2 people per 1000
Mostly it's data entry errors
Or transgender participants
There was a recent paper on this using UK Biobank data. Many of the “sex discordant” samples had other data suggesting that the participants may have been transgender or had intersex traits: https://www.pnas.org/doi/abs/10.1073/pnas.2218700120
So in some cases it may not be desirable to exclude them.
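If it helps, here is a minimal sketch of how you could compare your own cohort against that rule of thumb. Everything in it is an assumption for illustration: the `Sample` fields, the function name `sex_mismatch_summary`, and the default threshold of 2 per 1000, which is just the rough figure mentioned above and not an established cutoff. It also assumes genetic sex has already been inferred by whatever your QC pipeline uses. Note that it only flags discordant samples for review; it does not exclude them.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical record layout; adapt to your own sample sheet.
    sample_id: str
    reported_sex: str  # sex as recorded in the phenotype data, e.g. "F" or "M"
    genetic_sex: str   # sex as inferred from genotypes by your QC pipeline


def sex_mismatch_summary(samples, expected_max_per_1000=2.0):
    """Summarise reported-vs-genetic sex discordance in a cohort.

    expected_max_per_1000 is an assumed rule-of-thumb ceiling,
    not a published standard.
    """
    # Collect discordant sample IDs for manual review rather than dropping them.
    discordant = [s.sample_id for s in samples if s.reported_sex != s.genetic_sex]
    rate = 1000 * len(discordant) / len(samples) if samples else 0.0
    return {
        "discordant_ids": discordant,
        "rate_per_1000": rate,
        "above_expected": rate > expected_max_per_1000,
    }


# Toy cohort: 1000 samples, of which the first 5 are discordant.
cohort = [
    Sample(f"S{i}", "F", "M" if i < 5 else "F")
    for i in range(1000)
]
summary = sex_mismatch_summary(cohort)
print(summary["rate_per_1000"], summary["above_expected"])
```

A rate well above the rule of thumb would be a reason to look for something systematic, such as a sample swap or a shifted column in the sample sheet, before deciding what to do with individual samples.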