I did a South Asian PCA + Mclust analysis last month. Here are the PCA plots from that analysis.
First, the eigenvectors in the plots are not scaled by their eigenvalues, so here's a table showing how much of the total variation each eigenvector explains.
Eigenvector | Percentage variation explained |
---|---|
1 | 1.134% |
2 | 0.452% |
3 | 0.351% |
4 | 0.263% |
5 | 0.254% |
6 | 0.236% |
7 | 0.228% |
8 | 0.224% |
9 | 0.215% |
10 | 0.209% |
11 | 0.207% |
12 | 0.205% |
13 | 0.203% |
14 | 0.201% |
15 | 0.198% |
16 | 0.194% |
17 | 0.191% |
18 | 0.189% |
19 | 0.189% |
20 | 0.188% |
21 | 0.188% |
22 | 0.187% |
23 | 0.186% |
24 | 0.185% |
25 | 0.184% |
26 | 0.184% |
27 | 0.183% |
28 | 0.182% |
29 | 0.180% |
30 | 0.180% |
31 | 0.179% |
32 | 0.179% |
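
For anyone wondering how percentages like these are obtained: each one is that eigenvector's eigenvalue divided by the sum of all the eigenvalues from the PCA, not just the ones listed. Here is a minimal Python sketch of that calculation. The function name `percent_variation` and the eigenvalues in `example_eigenvalues` are made up for illustration and are not the values from this analysis.

```python
def percent_variation(eigenvalues):
    """Return each eigenvalue's share of the total variation, in percent."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    return [100.0 * value / total for value in eigenvalues]


# Hypothetical eigenvalues for illustration only (not from this analysis).
example_eigenvalues = [4.0, 2.0, 1.0, 1.0]
shares = percent_variation(example_eigenvalues)

for index, share in enumerate(shares, start=1):
    print(f"Eigenvector {index}: {share:.3f}%")
```

The denominator has to include every eigenvalue, which is why the first 32 eigenvectors in the table add up to far less than 100%.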
Eigenvector 1 looks like the Indian cline, but it's actually a West-East Eurasian cline. It's quite similar to Reich et al.'s Indian cline for their subset of populations (the correlation between PC1 and ASI is 0.998869). However, since there are no East Asian samples here to separate out East Asian ancestry, we get a mix of East Asian and Ancestral South Indian towards the right of the plot.
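
A correlation like the one quoted above can be computed as a plain Pearson correlation between each population's PC1 coordinate and its ASI estimate. The sketch below assumes that is the kind of correlation meant. The helper `pearson` and the numbers in `pc1_scores` and `asi_percent` are hypothetical placeholders, not data from this analysis or from Reich et al.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / spread


# Hypothetical per-population values for illustration only.
pc1_scores = [-0.04, -0.01, 0.02, 0.05]
asi_percent = [25.0, 40.0, 52.0, 65.0]

r = pearson(pc1_scores, asi_percent)
print(f"correlation between PC1 and ASI: {r:.6f}")
```

Keep in mind that the sign of a principal component is arbitrary, so the absolute value of the correlation is what matters here.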
Eigenvector 2 separates Kalash from everyone else.
Thanks Zack, excellent bunch of plots. As you've mentioned, Eigenvector 1 seems to illustrate the Indian cline best. Now if only we were able to find ourselves on this plot :p. Curiously enough, there's one HAP participant clustering with a Balochi (green X symbol). I wonder who that is. Also, Zack, I've noticed that in some other plots, as in eigenvector 1, the Burusho and Kashmiri Pandits cluster together. What could be the cause of this? Similar total West Eurasian percentages?
I'm working on a plot where participants can find their own position.
We do have a half-Baloch participant, who must be the one clustering with the Baloch.