Continuing the admixture analysis on my reference dataset I, let's look at K=6 ancestral components.
As before, all the results are listed in a spreadsheet.
For K=6, we get the following plot:
Admixture: Reference populations K=6
C1 (red) is the South Asian ancestral component. However, the Austronesian (Papuan/Melanesian) component has now separated from it as C5 (blue). You can see small proportions of the Papuan component among South Indian and Southeast Asian (Malay and Cambodian) populations.
C3 (green) is exactly the same as C3 in the K=5 run and represents East Asian populations. C6 (magenta) is exactly the same component as C5 in the K=5 run and represents African ancestry. C4 (cyan) is the same as C4 in the K=5 run and represents Southwest/West Asia.
C2 (yellow) is the European component (highest among North Europeans) and is almost the same as C2 in the K=5 analysis. The major difference is that C2 is lower among South Asians at K=6 than at K=5, because their South Asian component is correspondingly higher.
Fst divergences between estimated populations for K=6:
| | C1 | C2 | C3 | C4 | C5 |
|---|---|---|---|---|---|
| C2 | 0.053 | | | | |
| C3 | 0.084 | 0.114 | | | |
| C4 | 0.068 | 0.052 | 0.130 | | |
| C5 | 0.178 | 0.205 | 0.184 | 0.218 | |
| C6 | 0.148 | 0.165 | 0.186 | 0.157 | 0.260 |
When we increase the number of ancestral components to K=7, we get the following plot:
Admixture: Reference populations K=7
The South Asian component (C1/red) is the same. Note the significant drop from about 51% among the Makranis to 29% among the Iranians (ignore the Paniya, as there are only 4 samples and one of them is very different). Looking at the 19 individual Iranian samples from our reference dataset, their South Asian ancestral component values vary from 17% to 33%.
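A per-population mean and range like the ones quoted above can be computed from the individual results with a few lines of Python. This is only a minimal sketch: the population names and fractions below are invented placeholders, not the project's data, and the `summarize` helper is a hypothetical name used for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-individual results: (population, fraction of one component).
# These numbers are invented for illustration and are NOT the project's data.
SAMPLES = [
    ("PopA", 0.17), ("PopA", 0.25), ("PopA", 0.33),
    ("PopB", 0.48), ("PopB", 0.54),
]


def summarize(samples):
    """Group fractions by population and return {pop: (n, mean, min, max)}."""
    by_pop = defaultdict(list)
    for pop, fraction in samples:
        by_pop[pop].append(fraction)
    return {
        pop: (len(vals), mean(vals), min(vals), max(vals))
        for pop, vals in by_pop.items()
    }


for pop, (n, avg, lo, hi) in summarize(SAMPLES).items():
    print(f"{pop}: n={n} mean={avg:.0%} range={lo:.0%}-{hi:.0%}")
```

Reporting the range next to the mean makes it easier to spot populations where a small sample size or a single unusual individual is driving the average.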
The Southwest/West Asian component (C2/yellow) is now higher among West Asians and lower among East Africans compared to the K=6 run. C3/green is the European component, which now almost disappears from the Southwest Asian populations.
The East Asian component (C4/bluish green) is the same as before, as is the Papuan component (C5/light blue).
The African ancestry breaks into West African (C6/blue) and East African (C7/magenta).
Note that the order of splits here differs from the batch 1 run, where the East Asian component split into two at K=7 and the African split happened at K=8, the opposite of what happened here.
Fst divergences between estimated populations for K=7:
| | C1 | C2 | C3 | C4 | C5 | C6 |
|---|---|---|---|---|---|---|
| C2 | 0.058 | | | | | |
| C3 | 0.052 | 0.034 | | | | |
| C4 | 0.082 | 0.122 | 0.113 | | | |
| C5 | 0.176 | 0.210 | 0.204 | 0.184 | | |
| C6 | 0.152 | 0.159 | 0.167 | 0.190 | 0.264 | |
| C7 | 0.112 | 0.113 | 0.122 | 0.153 | 0.229 | 0.037 |
At K=8, the East Asian component forks into two: a Southeast Asian one (C4/bright green) that is highest among the Dai, Malay, Cambodians and Lahu; and a Northeast Asian one (C6/blue) that is highest among the Yakut, Oroqen, Japanese, Hezhen and Daur.
Admixture: Reference populations K=8
Among most South Asian groups in our reference dataset, the Southeast Asian component is much more common than the Northeast Asian one.
Fst divergences between estimated populations for K=8:
| | C1 | C2 | C3 | C4 | C5 | C6 | C7 |
|---|---|---|---|---|---|---|---|
| C2 | 0.058 | | | | | | |
| C3 | 0.052 | 0.034 | | | | | |
| C4 | 0.096 | 0.133 | 0.125 | | | | |
| C5 | 0.177 | 0.211 | 0.205 | 0.201 | | | |
| C6 | 0.093 | 0.131 | 0.122 | 0.046 | 0.195 | | |
| C7 | 0.152 | 0.159 | 0.167 | 0.200 | 0.266 | 0.201 | |
| C8 | 0.113 | 0.113 | 0.113 | 0.163 | 0.231 | 0.163 | 0.037 |
Here's the plot for K=9 ancestral components:
Reference Populations Admixture K=9
The new component here is the Kalash component which is at 94% among the Kalash but is in the 30-40% range for Caucasian and Pakistani populations. It is also present among West Asians, Europeans and Central Asians to a small degree.
The Kalash samples look fairly uniform, with only a little of the other ancestral components, except for one sample which has only 70% of the Kalash component.
Kalash Admixture K=9
Fst divergences between estimated populations for K=9:
| | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 |
|---|---|---|---|---|---|---|---|---|
| C2 | 0.056 | | | | | | | |
| C3 | 0.064 | 0.072 | | | | | | |
| C4 | 0.088 | 0.126 | 0.136 | | | | | |
| C5 | 0.064 | 0.061 | 0.039 | 0.131 | | | | |
| C6 | 0.167 | 0.208 | 0.214 | 0.202 | 0.211 | | | |
| C7 | 0.084 | 0.124 | 0.133 | 0.045 | 0.127 | 0.195 | | |
| C8 | 0.152 | 0.173 | 0.161 | 0.201 | 0.172 | 0.266 | 0.200 | |
| C9 | 0.115 | 0.133 | 0.117 | 0.165 | 0.129 | 0.233 | 0.164 | 0.036 |
From the Fst values, we can see that the Kalash component is closest to the South Asian component and then to the European component.
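This kind of "closest component" reading can be checked mechanically. The following is a minimal Python sketch that copies the K=9 Fst values from the table above, expands the lower triangle into a symmetric matrix, and ranks the nearest neighbours of each component. The function names are hypothetical and used only for illustration. The table labels the components only as C1 to C9, so the sketch works with those labels rather than with the named components.

```python
# Lower-triangular Fst values copied from the K=9 table.
# Row i holds the divergences between component C(i+2) and C1..C(i+1).
FST_K9_LOWER = [
    [0.056],
    [0.064, 0.072],
    [0.088, 0.126, 0.136],
    [0.064, 0.061, 0.039, 0.131],
    [0.167, 0.208, 0.214, 0.202, 0.211],
    [0.084, 0.124, 0.133, 0.045, 0.127, 0.195],
    [0.152, 0.173, 0.161, 0.201, 0.172, 0.266, 0.200],
    [0.115, 0.133, 0.117, 0.165, 0.129, 0.233, 0.164, 0.036],
]


def to_symmetric(lower):
    """Expand a lower triangle (without diagonal) into a full symmetric matrix."""
    n = len(lower) + 1
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower, start=1):
        for j, value in enumerate(row):
            full[i][j] = full[j][i] = value
    return full


def nearest_components(full, component):
    """Return (label, Fst) pairs for all other components, closest first.

    `component` is 1-based, matching the C1..C9 labels in the table.
    """
    idx = component - 1
    others = [(f"C{j + 1}", full[idx][j]) for j in range(len(full)) if j != idx]
    return sorted(others, key=lambda pair: pair[1])


fst = to_symmetric(FST_K9_LOWER)
for c in range(1, len(fst) + 1):
    ranked = nearest_components(fst, c)
    # Show the two closest components for each label.
    print(f"C{c}: " + ", ".join(f"{label} ({value:.3f})" for label, value in ranked[:2]))
```

The same two helpers work unchanged for the K=6, K=7 and K=8 tables if their lower triangles are substituted for `FST_K9_LOWER`.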
To summarize, here are the ancestral components inferred for different values of K.
| K=2 | K=3 | K=4 | K=5 | K=6 | K=7 | K=8 | K=9 |
|---|---|---|---|---|---|---|---|
| Eurasian | European | S Asian | S Asian | S Asian | S Asian | S Asian | S Asian |
| African | E Asian | European | European | European | SW Asian | SW Asian | Kalash |
| | African | E Asian | E Asian | E Asian | European | European | SW Asian |
| | | African | SW Asian | SW Asian | E Asian | SE Asian | SE Asian |
| | | | African | Papuan | Papuan | Papuan | European |
| | | | | African | W African | NE Asian | Papuan |
| | | | | | E African | W African | NE Asian |
| | | | | | | E African | W African |
| | | | | | | | E African |
Note that for each value of K, the components are listed in approximately decreasing order of their average percentage among the South Asian samples in our reference dataset I.
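The ordering rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used for the table: the component names are borrowed from the K=4 column, but the per-sample fractions are invented, and `order_by_mean` is a hypothetical helper.

```python
from statistics import mean

# Hypothetical admixture fractions per sample (each row sums to 1).
# The values are invented for illustration only, NOT the project's results.
COMPONENTS = ["European", "S Asian", "E Asian", "African"]
SOUTH_ASIAN_SAMPLES = [
    [0.30, 0.60, 0.08, 0.02],
    [0.20, 0.70, 0.09, 0.01],
    [0.25, 0.65, 0.07, 0.03],
]


def order_by_mean(components, rows):
    """Return (component, mean fraction) pairs sorted by decreasing mean."""
    # zip(*rows) transposes the sample rows into one column per component.
    means = [mean(col) for col in zip(*rows)]
    return sorted(zip(components, means), key=lambda pair: pair[1], reverse=True)


for name, avg in order_by_mean(COMPONENTS, SOUTH_ASIAN_SAMPLES):
    print(f"{name}: {avg:.1%}")
```

Averaging over samples rather than over populations means that heavily sampled groups weigh more in the ordering, which is one reason the ranking is only approximate.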