Thank you for the statistical introduction this morning! My background is in bioinformatics and clinical medicine. One of my goals for the workshop is to identify more resources to help “shore up” my understanding of the underlying statistical methods and their pitfalls for genetic analysis. This morning the problem with ratio methods was alluded to. Is there a good resource to dive into this more fully?
As a follow-up, I was poking around to gain some level of understanding of the underlying mathematics. A few general introductions to PCA that may be useful are:
While these are OK for me, there are probably more mathematically oriented introductions which might be helpful?
I am sure there are others and PLEASE let me know if there are better ones.
Question: for two- (or three-) dimensional data, I assume that the first principal component from PCA will be the linear best-fit line?
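To make this question concrete, here is a minimal sketch in Python (NumPy) on simulated data. The data, the variable names, and the helper `mean_sq_orthogonal_dist` are my own assumptions for illustration, not anything from the workshop material. As far as I understand it, the first principal component is the line through the centroid that minimizes squared perpendicular distances to the points, which is generally not the same line as ordinary least-squares regression (which minimizes vertical distances).

```python
import numpy as np

# Simulated 2-D data (an assumption for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(scale=0.5, size=500)
data = np.column_stack([x, y])

# Center the data; PCA directions are defined relative to the centroid.
centered = data - data.mean(axis=0)

# PCA via SVD: rows of vt are the principal directions, ordered by variance.
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pc1 = vt[0]
pca_slope = pc1[1] / pc1[0]

# Ordinary least-squares regression of y on x, for comparison.
ols_slope = np.polyfit(x, y, 1)[0]


def mean_sq_orthogonal_dist(slope):
    """Mean squared perpendicular distance from the centered points
    to the line through the origin with the given slope."""
    direction = np.array([1.0, slope]) / np.hypot(1.0, slope)
    projection = centered @ direction
    residual = centered - np.outer(projection, direction)
    return np.mean(np.sum(residual**2, axis=1))


print("PCA slope:", pca_slope)
print("OLS slope:", ols_slope)
print("orthogonal error, PCA line:", mean_sq_orthogonal_dist(pca_slope))
print("orthogonal error, OLS line:", mean_sq_orthogonal_dist(ols_slope))
```

If I have this right, the two slopes differ, and the PCA line has the smaller perpendicular error, so "best fit" for PCA would mean best in the orthogonal sense rather than in the regression sense. Corrections welcome.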
On a second point, it was a bit unclear to me how/when Fst is used to “correct” for population substructure. Is this preferred to PCA? Any insights would be helpful.