Reference I Admixture Analysis K=10-12

A week later, here is some more Admixture analysis of the Reference I dataset.

As usual, the results are available in a spreadsheet, which is also listed on my sidebar.

Let's start with K=10.

Admixture: Reference I populations K=10

C1 South Asian C2 Kalash
C3 Southwest Asian C4 Southeast Asian
C5 European C6 Papuan
C7 Northeast Asian C8 Siberian
C9 West African C10 East African

The new component here is the Siberian one (C8), which is highest among the Yakut.

Fst divergences between estimated populations for K=10:

     C1     C2     C3     C4     C5     C6     C7     C8     C9
C2   0.057
C3   0.064  0.073
C4   0.089  0.127  0.136
C5   0.063  0.061  0.038  0.131
C6   0.167  0.209  0.215  0.202  0.210
C7   0.080  0.120  0.129  0.032  0.123  0.190
C8   0.085  0.117  0.127  0.059  0.118  0.203  0.039
C9   0.152  0.174  0.161  0.201  0.171  0.266  0.195  0.199
C10  0.115  0.133  0.117  0.166  0.128  0.233  0.160  0.163  0.036
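For anyone who wants to work with these numbers rather than read them off the table, here is a minimal Python sketch that parses the lower-triangular K=10 rows and lists the least divergent pairs. The function names (`parse_lower_triangle`, `closest_pairs`) are made up for this illustration and are not part of Admixture or of my own pipeline.

```python
# Lower-triangular Fst rows for K=10, copied from the table in this post.
FST_K10 = """
C2 0.057
C3 0.064 0.073
C4 0.089 0.127 0.136
C5 0.063 0.061 0.038 0.131
C6 0.167 0.209 0.215 0.202 0.210
C7 0.080 0.120 0.129 0.032 0.123 0.190
C8 0.085 0.117 0.127 0.059 0.118 0.203 0.039
C9 0.152 0.174 0.161 0.201 0.171 0.266 0.195 0.199
C10 0.115 0.133 0.117 0.166 0.128 0.233 0.160 0.163 0.036
"""


def parse_lower_triangle(text):
    """Return {(Ci, Cj): fst} with i < j from rows like 'C3 0.064 0.073'."""
    pairs = {}
    for line in text.strip().splitlines():
        label, *values = line.split()
        row = int(label[1:])
        # The n-th value on a row is the divergence from component Cn.
        for col, value in enumerate(values, start=1):
            pairs[(f"C{col}", f"C{row}")] = float(value)
    return pairs


def closest_pairs(pairs, n=3):
    """Return the n component pairs with the smallest Fst."""
    return sorted(pairs.items(), key=lambda item: item[1])[:n]


pairs_k10 = parse_lower_triangle(FST_K10)
for (a, b), fst in closest_pairs(pairs_k10):
    print(a, b, fst)
```

On the K=10 table, the three closest pairs come out as C4/C7 (Southeast Asian and Northeast Asian, 0.032), C9/C10 (West African and East African, 0.036) and C3/C5 (Southwest Asian and European, 0.038).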

Now for K=11:

Admixture: Reference I populations K=11

C1 South Asian C2 Kalash
C3 Southwest Asian C4 Southeast Asian
C5 European C6 Papuan
C7 Siberian C8 Northeast Asian
C9 East African Bantus C10 West African
C11 East African

C8 at K=11 is now modal among the Han instead of the Japanese. This affected the Southeast Asian component (C4), which is now more of a truly Southeast Asian one.

The new ancestral component (C9) is found among the Bantus of eastern and southern Africa. It is highest among the Luhya and the Bantus of Kenya.

Fst divergences between estimated populations for K=11:

     C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
C2   0.055
C3   0.062  0.072
C4   0.081  0.120  0.128
C5   0.063  0.063  0.038  0.124
C6   0.169  0.211  0.215  0.195  0.213
C7   0.089  0.128  0.135  0.057  0.130  0.203
C8   0.083  0.122  0.131  0.031  0.127  0.194  0.039
C9   0.143  0.165  0.150  0.185  0.162  0.259  0.195  0.189
C10  0.152  0.174  0.160  0.194  0.172  0.268  0.203  0.198  0.014
C11  0.104  0.122  0.101  0.149  0.115  0.226  0.158  0.152  0.037  0.043

At K=12:

Admixture: Reference I populations K=12

C1 South Asian C2 Balochistan/Caucasus
C3 Kalash C4 Southeast Asian
C5 Southwest Asian C6 European
C7 Papuan C8 Northeast Asian
C9 Siberian C10 East African Bantus
C11 West African C12 East African

The Kalash component has split, with an assist from the Southwest Asian component, into a pure Kalash component (C3) and a Balochistan/Caucasus component (C2). C2 is highest in southwestern Pakistan (Brahui, Makrani, Balochi) at 57-60%, followed by Georgians, Lezgins, Adygei, Azerbaijan Jews and Iranian Jews (50-56%).

The Southwest Asian component (C5) is now more of a Southwest Asian and North/Northwest African component. The West Asian element in it has been reduced.

The Northeast Asian component (C8) is now centered on Japan again. I have a fix for this back-and-forth shifting, which I'll apply in my next round of analysis.

Fst divergences between estimated populations for K=12:

     C1     C2     C3     C4     C5     C6     C7     C8     C9     C10    C11
C2   0.057
C3   0.066  0.060
C4   0.089  0.124  0.136
C5   0.075  0.057  0.087  0.142
C6   0.066  0.040  0.073  0.130  0.048
C7   0.167  0.205  0.219  0.202  0.220  0.210
C8   0.080  0.117  0.128  0.032  0.134  0.122  0.190
C9   0.085  0.114  0.126  0.059  0.133  0.117  0.203  0.039
C10  0.145  0.154  0.176  0.192  0.154  0.162  0.258  0.187  0.190
C11  0.154  0.163  0.186  0.201  0.164  0.172  0.266  0.195  0.199  0.014
C12  0.107  0.109  0.135  0.157  0.105  0.116  0.225  0.151  0.154  0.035  0.041
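To see at a glance which component each K=12 component is closest to, the table can be expanded into a symmetric lookup. The following is only a rough Python sketch under that assumption; `symmetric_fst` and `nearest_neighbours` are hypothetical helper names invented here, and the labels are the ones listed for K=12 in this post.

```python
# Component labels at K=12, as listed in this post.
LABELS_K12 = {
    "C1": "South Asian", "C2": "Balochistan/Caucasus", "C3": "Kalash",
    "C4": "Southeast Asian", "C5": "Southwest Asian", "C6": "European",
    "C7": "Papuan", "C8": "Northeast Asian", "C9": "Siberian",
    "C10": "East African Bantus", "C11": "West African", "C12": "East African",
}

# Lower-triangular Fst rows for K=12, copied from the table in this post.
FST_K12 = """
C2 0.057
C3 0.066 0.060
C4 0.089 0.124 0.136
C5 0.075 0.057 0.087 0.142
C6 0.066 0.040 0.073 0.130 0.048
C7 0.167 0.205 0.219 0.202 0.220 0.210
C8 0.080 0.117 0.128 0.032 0.134 0.122 0.190
C9 0.085 0.114 0.126 0.059 0.133 0.117 0.203 0.039
C10 0.145 0.154 0.176 0.192 0.154 0.162 0.258 0.187 0.190
C11 0.154 0.163 0.186 0.201 0.164 0.172 0.266 0.195 0.199 0.014
C12 0.107 0.109 0.135 0.157 0.105 0.116 0.225 0.151 0.154 0.035 0.041
"""


def symmetric_fst(text):
    """Expand the lower triangle into fst[a][b] == fst[b][a]."""
    fst = {}
    for line in text.strip().splitlines():
        label, *values = line.split()
        for col, value in enumerate(values, start=1):
            other = f"C{col}"
            fst.setdefault(label, {})[other] = float(value)
            fst.setdefault(other, {})[label] = float(value)
    return fst


def nearest_neighbours(fst):
    """Map each component to (closest component, Fst)."""
    return {comp: min(row.items(), key=lambda kv: kv[1])
            for comp, row in fst.items()}


nearest_k12 = nearest_neighbours(symmetric_fst(FST_K12))
for comp in sorted(nearest_k12, key=lambda c: int(c[1:])):
    other, value = nearest_k12[comp]
    print(f"{comp} {LABELS_K12[comp]} -> {other} {LABELS_K12[other]} ({value:.3f})")
```

On these numbers, Balochistan/Caucasus (C2) and European (C6) are each other's nearest neighbours at 0.040, while the smallest divergence in the whole table is the 0.014 between East African Bantus (C10) and West African (C11).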

Admixture analysis at higher values of K will continue.

26 Comments.

  1. Is C2 the "Dagestani" component?

    • Since it's a little higher in southwestern Pakistan than in Daghestan, the label Daghestani is not as appropriate in my opinion.

      It seems similar to Dodecad's Daghestani component, but I think this one is higher among the Punjabis etc. than the Daghestani component was.

  2. A geographic projection of the results as contour maps for selected admixture components would be incredibly useful for determining labels.

  3. At K=12, the European and Pakistani/Caucasian components have one of the lowest Fst divergences on the table. The only one I see lower is the one between the East African and East African Bantu components.

  4. Zack, I'm trying to tie this in with these pieces.

    http://blogs.discovermagazine.com/gnxp/2010/12/some-of-the-indo-europeans-found/

    http://blogs.discovermagazine.com/gnxp/2010/12/south-asians-too-are-sons-of-the-farmers/

    I am also having a Sepia Mutiny discussion on genes with Razib; just to clarify, I'm not versed in the science, so I just like skimming through the analysis.

    I'm aiming for a cohesive narrative but then I probably will be making myself more confused since I don't understand many of the constituent parts. We need more theories people and random speculations lol 😛

    • If you look at Dienekes's bar plot which has the Dagestani component, you'll notice that it peaks among the Lezgin and is fairly low among the Baloch.
