Tag Archives: iran

Participation Changes

Posted by Zack on May 16, 2012 5 comments

Now that DIY HarappaWorld is out, I am changing the participation requirements slightly: the requirements for South Asians now differ somewhat from those for other regions.

If you have any real ancestry from a South Asian origin, you are eligible to participate. Partial South Asian ancestry is okay. The countries of origin I count as South Asian are as follows:

Afghanistan
Bangladesh
Bhutan
India
Maldives
Nepal
Pakistan
Sri Lanka

Note that a 2-3% South Asian result from Dr. McDonald's BGA or the Dodecad Project does not count as South Asian ancestry.

If you have all four of your grandparents from one of the following countries or regions, you can also send me your data.

Burma
Tibet
Uyghur from Xinjiang, China
Tajikistan
Kyrgyzstan
Kazakhstan
Uzbekistan
Turkmenistan
Iran
Turkey
Azerbaijan
Armenia
Georgia
North Caucasian Federal District, Russia
Iraq
Syria
Lebanon
Jordan

Relatives will only be accepted when they are a better replacement for current participants. For example, replacing a participant with both parents, or with a maternal uncle and a paternal aunt, gets us two unrelated participants (assuming, of course, that the two sides of the family are not related by blood). Another example is a participant of partial South Asian ancestry being replaced by a relative who has more South Asian ancestry.

Everyone else can use DIY HarappaWorld. It's fairly easy to use on both Windows and Linux. The only hard part right now is that you have to install R to standardize your genome file. I might look into creating an executable for that to make it easier.

Finally, please be honest.

June Update

Posted by Zack on June 4, 2011 Comments Off

I have a total of 123 participants in the project right now who have sent me their raw data. Six of those have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages and the like, where I divide participants into small groups.

The following groups are represented:

Most are 23andme data while 4 are from FTDNA.

We are getting close to 100 South Asian participants.

April Update

Posted by Zack on May 1, 2011 5 comments

I have a total of 97 participants in the project right now who have sent me their raw data. Six of those have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages and the like, where I divide participants into small groups.

The following groups are represented:

Let's try to get to a hundred soon.

And yes, I am accepting FTDNA Family Finder (new Illumina chip) now.

End of March Update

Posted by Zack on March 27, 2011 10 comments

I have a total of 67 participants in the project right now who have sent me their raw data. This is not counting those who have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages and the like, where I divide participants into small groups.

The following groups are represented:

I need to post analyses of Tamils, Bengalis and Punjabis soon.

Iranians

Posted by Zack on March 24, 2011 21 comments

Since we have 7 Iranians in the project, it's time to look at them as a group. We also have 19 Iranians from the Behar et al dataset.

Let's look at their admixture results at K=12.

The big difference between Harappa Project Iranians and Behar et al Iranians is African admixture. Only one Harappa Iranian (HRP0046) has 1% African admixture while three Behar Iranians have more than 10%.

Let's do hierarchical clustering with complete linkage using the Euclidean distance between admixture components. First a caveat or two. This is not a phylogeny. Also, the Euclidean distance measure is not a good one for measuring differences in admixture but I am not sure what would be better.
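As a rough illustration of the clustering just described, here is a minimal pure-Python sketch of agglomerative clustering with complete linkage on Euclidean distances between admixture vectors. The function name `complete_linkage`, the sample IDs, and the three-component profiles are all made up for the example; they are not project data, and this is not necessarily how the dendrogram in this post was produced.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    profiles: dict mapping sample ID -> list of admixture fractions.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a sorted tuple of sample IDs.
    """
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Complete linkage: the distance between two clusters is the
            # largest distance between any pair of their members.
            d = max(dist(profiles[x], profiles[y]) for x in a for y in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Invented admixture profiles (each row sums to 1); not real participants.
toy = {
    "S1": [0.50, 0.40, 0.10],
    "S2": [0.52, 0.38, 0.10],
    "S3": [0.10, 0.20, 0.70],
    "S4": [0.13, 0.17, 0.70],
}

merges = complete_linkage(toy)
for a, b, d in merges:
    print(a, b, round(d, 4))
```

The merge history is what a dendrogram draws: the earliest merges join the most similar samples, and the height of each join is the complete-linkage distance. As the caveat above says, this is a similarity tree over admixture proportions, not a phylogeny.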

HRP0010, who is an Assyrian, actually clusters better with Caucasian, Iranian and Iraqi Jews than with Iranians.

I'll run an MDS or PCA of the whole region from Punjab/Kashmir to the Levant and Caucasus soon which should be more interesting for clustering.

UPDATE: Since Palisto wondered, I checked and found that he, an Iraqi Kurd, is very similar to the Iranians in his admixture results. So I have included him (HRP0059).

Another Update

Posted by Zack on March 12, 2011 28 comments

I have a total of 51 participants in the project right now who have sent me their raw data. This is not counting three people who have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages and the like, where I divide participants into small groups.

The following groups are represented:

Punjab: 7
Iran: 7
Tamil: 6
Bengal: 5
Andhra Pradesh: 2
Bihar: 2
Karnataka: 2
Caribbean Indian: 2
Kashmir: 2
Uttar Pradesh: 2
Sri Lankan: 2
Kerala: 2
Iraqi Arab: 2
Anglo-Indian: 1
Roma: 1
Goa: 1
Rajasthan: 1
Baloch: 1
Unknown: 1
Egyptian/Iraqi Jew: 1
Maharashtra: 1

I haven't received data from any new participants for more than a week, which is the longest lull since I started the Harappa Ancestry Project. So go out there and get people to send me their 23andme raw data.

Also, does anyone know if there are a significant number of South Asians who have done FamilyTreeDNA's Family Finder test? Is there a good overlap of SNPs between their test and 23andme's?

We have enough Punjabis, Iranians, Tamils and Bengalis that they deserve separate analysis posts.

Project Update

Posted by Zack on February 20, 2011 16 comments

I have a total of 42 participants in the project right now who have sent me their raw data. This is not counting two people who have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages and the like, where I divide participants into small groups.

The following groups are represented:

Punjab: 7
Iran: 6
Tamil: 5
Andhra Pradesh: 2
Bengal: 2
Bihar: 2
Karnataka: 2
Caribbean Indian: 2
Kashmir: 2
Anglo-Indian: 1
Roma: 1
Goa: 1
Uttar Pradesh: 1
Sri Lankan: 1
Rajasthan: 1
Kerala: 1
Baloch: 1
Unknown: 1

The unknown is Manu Sporny, who has put his genetic data in the public domain, so I have drafted him into our project.

In addition, out of curiosity, I have accepted data from the following:

Iraqi Arab: 2
Egyptian/Iraqi Jew: 1

I know a bunch of you have done a lot to make this project known and gotten people to submit their data. But we really do need more participants of every ethnicity and geographic region in and around South Asia. So keep on!

I am working on K=12 admixture runs for the batches we have already done. In addition, the reference I dataset will be used for even higher values of K admixture components to see where the limit is.

Also, I am looking into doing chromosome by chromosome admixture (and other analysis). I have done some experimental runs and once I have pored over that data, I'll have something to report.

As we have seen, even with the removal of the San and Pygmy, the Africans take up 3 ancestral components and most South Asians (excepting me of course) do not have any African admixture. So I am working on a reference dataset without any Africans. I have my own take on how to do that which I'll share in the next few days.

In short, my home computer is running admixture, plink, eigensoft, etc. 24x7.

Latest on Participants

Posted by Zack on February 6, 2011 1 comment

I have a total of 31 participants in the project right now who have sent me their raw data. The following groups are represented:

Punjab: 7
Tamil: 4
Iran: 4
Andhra Pradesh: 2
Bengal: 2
Bihar: 2
Karnataka: 2
Caribbean Indian: 2
Anglo-Indian: 1
Roma: 1
Kashmir: 1
Goa: 1
Uttar Pradesh: 1
Sri Lankan: 1

Keep them coming!

I am going to get some admixture analysis on the second batch (HRP0011 to HRP0020) done this week.

Participation Update

Posted by Zack on January 31, 2011 8 comments

I have a total of 23 participants in the project right now who have sent me their raw data. The following groups are represented:

Punjab: 7
Tamil: 4
Iran: 3
Bengal: 2
Andhra Pradesh: 2
Bihar: 1
Anglo-Indian: 1
Roma: 1
Karnataka: 1
Kashmir: 1

There are still a lot of ethnicities and regions missing. Uttar Pradesh comes to mind as the biggest one.

Behar et al Data

Posted by Zack on January 28, 2011 16 comments

In their paper "The genome-wide structure of the Jewish people", Behar et al analyzed the genomes of some Jewish groups. More important than the Jewish samples (which include two South Asian Jewish groups) for us are the different South Asian, Middle Eastern, and European groups they sampled:

Ethnic group	Count
Saudis	20
Jordanians	20
Georgians	20
Turks	19
Iranians	19
Hungarians	19
Ethiopians	19
Armenians	19
Lezgins	18
Chuvashs	17
Syrians	16
Romanians	16
Uzbeks	15
Spaniards	12
Egyptians	12
Cypriots	12
Moroccans	10
Lithuanians	10
North Kannadi	9
Belorussian	9
Yemenese	8
Lebanese	7
Sakilli	4
Paniya	4
Cochin Jews	4
Bene Israel	4
Samaritans	2
Russian	2
Malayan	2

Of the 466 samples, I excluded 8 because they were either duplicates or too similar in their genomes to others.

The series matrix files that I downloaded were in a somewhat different format. To convert them to Plink format, I had to look up the platform file for the Illumina genotyping BeadChip they used. Also, Illumina uses a system of A/B alleles and Top/Bot strands instead of the regular ACGT alleles and forward/reverse strands. This Illumina Technote explained it, and I found a Perl script to convert between the two.
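To make the conversion step concrete, here is a small Python sketch of the final translation only, not the Perl script mentioned above. It assumes a lookup table has already been built from the platform file; the `MANIFEST` dictionary, its SNP IDs, and its three-field layout are hypothetical, and working out which nucleotides A and B stand for and whether they sit on the forward strand is exactly the part the Technote and the script deal with.

```python
# Hypothetical lookup built beforehand from the platform file:
# SNP ID -> (nucleotide for allele A, nucleotide for allele B,
#            True if those nucleotides are already on the forward strand).
MANIFEST = {
    "snp1": ("A", "G", True),
    "snp2": ("T", "C", False),
}

# Single-base complement table for flipping reverse-strand calls.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def ab_to_acgt(snp_id, genotype):
    """Translate an A/B genotype call (e.g. 'AB') into forward-strand nucleotides.

    A no-call, written here as '--' (an assumption about the input), is
    returned as '00', using Plink's default missing-genotype code 0.
    """
    if genotype == "--":
        return "00"
    allele_a, allele_b, is_forward = MANIFEST[snp_id]
    lookup = {"A": allele_a, "B": allele_b}
    # A KeyError here means the call contained something other than A or B.
    calls = "".join(lookup[c] for c in genotype)
    return calls if is_forward else calls.translate(COMPLEMENT)


print(ab_to_acgt("snp1", "AB"))
print(ab_to_acgt("snp2", "AB"))
print(ab_to_acgt("snp1", "--"))
```

Keeping the strand flag in the lookup table means the per-genotype step stays a plain substitution, so all of the strand logic lives in one place where the table is built.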

Harappa Ancestry Project

Genetics and South Asia
