West, Central, South & Southeast Asian Admixture

Posted by Zack on May 7, 2011

Another set of admixture runs. This one uses the South Asian, Middle Eastern, Caucasian, Central Asian, Southeast Asian and Oceanian samples from Reference 3.

Basically I consider these to be our target populations. The idea is to build out from here by adding a few samples from other populations to make the results better.

Right now, the absence of African, European, East Asian and Siberian populations makes some of the other populations substitute for them. For example, Siddi works as an African substitute while Aonaga works as an East Asian substitute.

Here are the admixture results. You can choose the number of ancestral components, K, from the dropdown below.

I find K=11 and K=14 to be the most interesting. They have the two lowest cross-validation errors too.
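For anyone who wants to make the same comparison on their own runs, here is a minimal sketch. It assumes the runs were done with ADMIXTURE's --cv option, which writes a summary line of the form "CV error (K=11): 0.52305" to each log. The helper names and the log excerpt are hypothetical, and the numbers are placeholders, not the errors from this run.

```python
import re

# Matches the summary line ADMIXTURE prints when run with --cv,
# e.g. "CV error (K=11): 0.52305".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def parse_cv_errors(log_text):
    """Return {K: cross-validation error} for every CV line in the text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def rank_k(cv_errors, top=2):
    """Return the `top` values of K with the lowest CV error, best first."""
    return sorted(cv_errors, key=cv_errors.get)[:top]


# Hypothetical log excerpt; placeholder values, not real results.
example_logs = """
CV error (K=10): 0.5510
CV error (K=11): 0.5490
CV error (K=12): 0.5505
CV error (K=13): 0.5502
CV error (K=14): 0.5495
"""

errors = parse_cv_errors(example_logs)
best = rank_k(errors)
print(best)
```

A lower cross-validation error only says that K predicts held-out genotypes better; it does not by itself make the components at that K easier to interpret.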

21 Comments.

HRP0101 May 7, 2011 at 8:08 am

Very interesting!

May I ask, do you have any idea when you'll be adding your new data (i.e. mine?)

Thanks so much!!!
- Zack May 7, 2011 at 10:18 am
  
  I am going to post the results for HRP0090 to HRP0100 in a day or so. Still waiting to reach 10 for the batch for HRP0101 onwards. As soon as that happens, I will run the analysis.
Parasar May 7, 2011 at 9:54 pm

At K=14, the Siddi have 10% Onge.
- Zack May 8, 2011 at 10:32 am
  
  I am not sure how seriously to take that number. Remember that I don't have any African samples in this run and Siddi are majority African. So a Siddi component appears to take most of the variation there.
  
  These isolated group components are sometimes a bit weird. For example, in some Ref3 runs at K>12, Siddi show up as 10% Kalash.
  - Onur May 8, 2011 at 10:40 am
    
    The "Siddi" component seems to be only partially Negroid, the rest of it being Caucasoid (maybe also some ASI).
Zachary Latif May 8, 2011 at 6:49 am

It seems the West Asian component is associated with the south-west Levant, Druze with the area north of it, and Caucasus with the area to the north of that (Anatolia too).

Arabian is a proxy for Bedouin (but there were waves of pre-Islamic Bedouin migrants as well).
- Onur May 8, 2011 at 8:00 am
  
  Zachary, these clusters predate all of the ethnic groups, so it is wrong to use them as proxies for any of them. There are only correlations with ethnic groups, not associations.
  - Onur May 8, 2011 at 8:09 am
    
    For instance, many ethnic groups that haven't been genetically affected by Arabian migrations have the Arabian component in non-trivial amounts.
    - Onur May 8, 2011 at 8:20 am
      
      So from now on, I will enclose the component names in quotation marks when mentioning them. E.g., "Arabian" component instead of Arabian component.
      - Zack May 8, 2011 at 10:39 am
        
        Onur, we have always said that these ancestral component names should be taken with a grain of salt and just as rough mnemonics. They are not named for ancestral populations but basically after modern distributions and peaks to make discussion easier.
    - Zachary Latif May 9, 2011 at 9:49 am
      
      Onur, Arabia was a source population well before Islam; there was always a constant influx of Bedouins replenishing the Middle Eastern populations.
      - Onur May 9, 2011 at 10:23 am
        
        There were small-scale migrations of Bedouin to the north before Islam, but I don't think they were constant or of high magnitude. Also, their range must have been largely limited to Greater Syria and Mesopotamia. So, for instance, Anatolians, Armenians, Iranians and nearby populations must have been minimally, if at all, affected by the Bedouin migrations, whether pre-Islamic or even post-Islamic. You cannot explain the non-trivial "Arabian" component in Turks, Armenians, Iranians and nearby populations with Bedouin migrations.
      - Zachary Latif May 10, 2011 at 7:40 am
        
        Must be some basal element perhaps?
      - Zack May 10, 2011 at 8:03 am
        
        Possibly.
- Zack May 8, 2011 at 10:38 am
  
  There's also the fact that I am not using any African or European samples here. So things get a bit distorted. While the peaks correspond with your map, all these three components are widely spread.
Vasishta May 8, 2011 at 9:50 am

K=14 is a most interesting run. The generic West Asian component has split into West Asian (Samaritan), Balochistan and Caucasus. What exactly differentiates the two, Zack? Could this new Balochistan component perhaps be deemed some sort of "Indo-West Asian" component? The Chenchu component seems to be a stand-in for the Onge component, but what would the South Asian component represent now? I assume both form the bulk of the ASI as of now. It would be absolutely awesome if a similar run were done on the project participants.
- Onur May 8, 2011 at 10:15 am
  
  Both the "South Asian" component and the "Chenchu" component seem to be only partially ASI, the rest of them being Caucasoid (this is especially true for the "Chenchu" component).
- Zack May 8, 2011 at 11:11 am
  
  The Balochistan component is not limited to Balochistan and areas to its west, as it was in the Reference I runs; instead, it is more widespread in northwest India and Pakistan.
Onur May 8, 2011 at 10:28 am

Highly inbred populations like Samaritans, Druze, Jews and various isolated South Asian groups distort the genetic picture to some extent. An alternative run without them would be nice.
Onur May 8, 2011 at 10:50 am

"Onur, we have always said that these ancestral component names should be taken with a grain of salt and just as rough mnemonics. They are not named for ancestral populations but basically after modern distributions and peaks to make discussion easier."

I know. But as in the example of Zachary Latif, some people are misled by those labels. I have no problem with the labeling of the components; the labels are useful for visualizing the peak regions/populations of the components. Still, I will continue to put them in quotation marks to draw attention to the fact that they are just mnemonics.
Zachary Latif May 9, 2011 at 9:48 am

Onur, I was matching the clusters to potential geographic source regions, not ethnicities. Apologies if it came across otherwise.

Harappa Ancestry Project

Genetics and South Asia