Reference I Admixture Analysis K=10-12

Posted by Zack on February 17, 2011

A week later, here is some more Admixture analysis of the Reference I dataset.

As usual, the results are available in a spreadsheet, which is also listed on my sidebar.

Let's start with K=10.

Admixture: Reference I populations K=10

C1	South Asian	C2	Kalash
C3	Southwest Asian	C4	Southeast Asian
C5	European	C6	Papuan
C7	Northeast Asian	C8	Siberian
C9	West African	C10	East African

The addition at K=10 is basically the Siberian component, which is highest among the Yakut.

Fst divergences between estimated populations for K=10:

	C1	C2	C3	C4	C5	C6	C7	C8	C9
C2	0.057
C3	0.064	0.073
C4	0.089	0.127	0.136
C5	0.063	0.061	0.038	0.131
C6	0.167	0.209	0.215	0.202	0.210
C7	0.080	0.120	0.129	0.032	0.123	0.190
C8	0.085	0.117	0.127	0.059	0.118	0.203	0.039
C9	0.152	0.174	0.161	0.201	0.171	0.266	0.195	0.199
C10	0.115	0.133	0.117	0.166	0.128	0.233	0.160	0.163	0.036

Now for K=11:

Admixture: Reference I populations K=11

C1	South Asian	C2	Kalash
C3	Southwest Asian	C4	Southeast Asian
C5	European	C6	Papuan
C7	Siberian	C8	Northeast Asian
C9	East African Bantus	C10	West African
C11	East African

The Northeast Asian component (C8) at K=11 is now modal among the Han instead of the Japanese. This affected the Southeast Asian component (C4), which is now more of a real Southeast Asian one.

The new ancestral component (C9) is found among the Bantus of eastern and southern Africa. It is highest among the Luhya and the Bantus of Kenya.

Fst divergences between estimated populations for K=11:

	C1	C2	C3	C4	C5	C6	C7	C8	C9	C10
C2	0.055
C3	0.062	0.072
C4	0.081	0.120	0.128
C5	0.063	0.063	0.038	0.124
C6	0.169	0.211	0.215	0.195	0.213
C7	0.089	0.128	0.135	0.057	0.130	0.203
C8	0.083	0.122	0.131	0.031	0.127	0.194	0.039
C9	0.143	0.165	0.150	0.185	0.162	0.259	0.195	0.189
C10	0.152	0.174	0.160	0.194	0.172	0.268	0.203	0.198	0.014
C11	0.104	0.122	0.101	0.149	0.115	0.226	0.158	0.152	0.037	0.043

At K=12:

Admixture: Reference I populations K=12

C1	South Asian	C2	Balochistan/Caucasus
C3	Kalash	C4	Southeast Asian
C5	Southwest Asian	C6	European
C7	Papuan	C8	Northeast Asian
C9	Siberian	C10	East African Bantus
C11	West African	C12	East African

The Kalash component has split, with an assist from Southwest Asian, into a pure Kalash component (C3) and a Balochistan/Caucasus component (C2). C2 is highest in southwestern Pakistan (Brahui, Makrani, Balochi) at 57-60%, followed by Georgians, Lezgin, Adygei, Azerbaijan Jews and Iranian Jews (50-56%).

The Southwest Asian component (C5) is now more of a Southwest Asian and North/Northwest African component. The West Asian element in it has been reduced.

The Northeast Asian component (C8) is once again centered on Japan. I have a fix for this shifting back and forth, which I'll apply in my next round of analysis.

Fst divergences between estimated populations for K=12:

	C1	C2	C3	C4	C5	C6	C7	C8	C9	C10	C11
C2	0.057
C3	0.066	0.060
C4	0.089	0.124	0.136
C5	0.075	0.057	0.087	0.142
C6	0.066	0.040	0.073	0.130	0.048
C7	0.167	0.205	0.219	0.202	0.220	0.210
C8	0.080	0.117	0.128	0.032	0.134	0.122	0.190
C9	0.085	0.114	0.126	0.059	0.133	0.117	0.203	0.039
C10	0.145	0.154	0.176	0.192	0.154	0.162	0.258	0.187	0.190
C11	0.154	0.163	0.186	0.201	0.164	0.172	0.266	0.195	0.199	0.014
C12	0.107	0.109	0.135	0.157	0.105	0.116	0.225	0.151	0.154	0.035	0.041
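
For anyone who would rather rank the component pairs than eyeball the table, here is a minimal Python sketch. It is not part of the Admixture output; it just re-reads the K=12 lower-triangular table above and sorts the pairs by Fst. The names `LABELS`, `LOWER_TRIANGLE`, `parse_lower_triangle` and `closest_pairs` are hypothetical helpers made up for this illustration.

```python
# Component labels and Fst values copied by hand from the K=12 tables above.
LABELS = {
    "C1": "South Asian", "C2": "Balochistan/Caucasus", "C3": "Kalash",
    "C4": "Southeast Asian", "C5": "Southwest Asian", "C6": "European",
    "C7": "Papuan", "C8": "Northeast Asian", "C9": "Siberian",
    "C10": "East African Bantus", "C11": "West African", "C12": "East African",
}

# Lower triangle: row Cn lists its Fst against C1, C2, ..., C(n-1).
LOWER_TRIANGLE = """
C2 0.057
C3 0.066 0.060
C4 0.089 0.124 0.136
C5 0.075 0.057 0.087 0.142
C6 0.066 0.040 0.073 0.130 0.048
C7 0.167 0.205 0.219 0.202 0.220 0.210
C8 0.080 0.117 0.128 0.032 0.134 0.122 0.190
C9 0.085 0.114 0.126 0.059 0.133 0.117 0.203 0.039
C10 0.145 0.154 0.176 0.192 0.154 0.162 0.258 0.187 0.190
C11 0.154 0.163 0.186 0.201 0.164 0.172 0.266 0.195 0.199 0.014
C12 0.107 0.109 0.135 0.157 0.105 0.116 0.225 0.151 0.154 0.035 0.041
"""


def parse_lower_triangle(text):
    """Return {(column_component, row_component): fst} for every pair."""
    pairs = {}
    for line in text.strip().splitlines():
        row, *values = line.split()
        # The i-th value on a row is that row's Fst against component Ci.
        for i, value in enumerate(values, start=1):
            pairs[(f"C{i}", row)] = float(value)
    return pairs


def closest_pairs(pairs, n=6):
    """Return the n pairs with the smallest Fst, smallest first."""
    return sorted(pairs.items(), key=lambda item: item[1])[:n]


fst = parse_lower_triangle(LOWER_TRIANGLE)
for (a, b), value in closest_pairs(fst):
    print(f"{value:.3f}  {LABELS[a]} ({a}) - {LABELS[b]} ({b})")
```

On the K=12 table this puts East African Bantus and West African (0.014) as the closest pair, with Balochistan/Caucasus and European (0.040) fifth. The same parser should work on the K=10 and K=11 tables if the rows and labels are swapped in.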

Admixture analysis at higher K values will continue.


26 Comments.

Zachary Latif February 17, 2011 at 9:31 am

Is C2 the "Dagestani" component?
- Zack February 17, 2011 at 10:31 am
  
  Since it's a little higher in southwestern Pakistan than in Daghestan, the label Daghestani is not as appropriate in my opinion.
  
  It seems similar to Dodecad's Daghestani component, but I think this one is higher among the Punjabis etc. than the Daghestani component was.
Simranjits February 17, 2011 at 10:17 am

A geographic projection of the results, contour-style, for selected admixture components would be incredibly useful for determining labels.
- Zack February 17, 2011 at 10:33 am
  
  I agree. If I can figure out a way it would be great.
  
  I know I can do country-level maps easily but we need some better detail in South Asia.
  
  Anyone know of any software we can use to do gradient maps of the world?
  - RK February 19, 2011 at 4:11 pm
    
    I think MATLAB's Mapping Toolbox can do that: http://www.mathworks.com/products/mapping/
    - Zack February 19, 2011 at 9:52 pm
      
      Thanks! Unfortunately I don't have Matlab at home. So I am looking at R's mapping libraries. It'll take some effort to associate the ethnicities with different regions but a map will be ready one day. 🙂
sv February 17, 2011 at 11:27 am

At K=12, the European and Pakistani/Caucasian components have one of the lowest Fst divergences on the table. The only one I see lower is the one between the East African and East African Bantu components.
- sv February 17, 2011 at 11:29 am
  
  Oh, and K=12 would be a good K value for future runs using project participants, since the presence of the Pakistani/Caucasian component fits the project's focus.
  - Zack February 17, 2011 at 11:50 am
    
    Already on it. 🙂
Zachary Latif February 17, 2011 at 5:09 pm

Zack, I'm trying to tie this in with these pieces.

http://blogs.discovermagazine.com/gnxp/2010/12/some-of-the-indo-europeans-found/

http://blogs.discovermagazine.com/gnxp/2010/12/south-asians-too-are-sons-of-the-farmers/

I am also having a Sepia Mutiny discussion on genes with Razib; just to clarify, I'm not versed in the science, so I just like skimming through the analysis.

I'm aiming for a cohesive narrative but then I probably will be making myself more confused since I don't understand many of the constituent parts. We need more theories people and random speculations lol 😛
- Zack February 17, 2011 at 5:55 pm
  
  If you look at Dienekes's bar plot which has the Dagestani component, you'll notice that it peaks among the Lezgin and is fairly low among the Baloch.
{ Brown Pundits } » Genes in the Desisphere - pingback on February 18, 2011 at 5:15 pm
Admixture K=12, HRP0011-HRP0020 | Harappa Ancestry Project - pingback on February 23, 2011 at 7:40 am
Reference I Dendrogram | Harappa Ancestry Project - pingback on February 25, 2011 at 12:33 pm
Fst for Reference I Admixture K=12 | Harappa Ancestry Project - pingback on February 27, 2011 at 12:28 am
Admixture K=12, HRP0001 to HRP0040 | Harappa Ancestry Project - pingback on February 28, 2011 at 6:15 am
Model-Based Clusters of Admixture Results | Harappa Ancestry Project - pingback on March 2, 2011 at 7:45 am
Reference I Admixture Analysis K=13 | Harappa Ancestry Project - pingback on March 4, 2011 at 6:15 am
Admixture K=12, HRP0041-HRP0050 | Harappa Ancestry Project - pingback on March 8, 2011 at 4:51 pm
Admixture K=10-12, HRP0001 to HRP0010 | Harappa Ancestry Project - pingback on March 16, 2011 at 6:51 am
My Harappa Project Results | Procrastination - pingback on March 16, 2011 at 11:00 am
Admixture K=12, HRP0051-HRP0060 | Harappa Ancestry Project - pingback on March 21, 2011 at 1:09 pm
Admixture K=12, HRP0061-HRP0070 | Harappa Ancestry Project - pingback on March 28, 2011 at 11:16 pm
Admixture K=12, HRP0071-HRP0080 | Harappa Ancestry Project - pingback on April 5, 2011 at 2:10 pm
Admixture K=12, HRP0081-HRP0090 | Harappa Ancestry Project - pingback on April 25, 2011 at 12:09 pm
Reference I Admixture Errors | Harappa Ancestry Project - pingback on May 6, 2011 at 1:24 am

Harappa Ancestry Project

Genetics and South Asia