In their paper "The genome-wide structure of the Jewish people", Behar et al. analyzed the genomes of several Jewish groups. For us, the different South Asian, Middle Eastern, and European groups they sampled matter more than the Jewish samples (which include two South Asian Jewish groups):
Ethnic group | Count |
---|---|
Saudis | 20 |
Jordanians | 20 |
Georgians | 20 |
Turks | 19 |
Iranians | 19 |
Hungarians | 19 |
Ethiopians | 19 |
Armenians | 19 |
Lezgins | 18 |
Chuvash | 17 |
Syrians | 16 |
Romanians | 16 |
Uzbeks | 15 |
Spaniards | 12 |
Egyptians | 12 |
Cypriots | 12 |
Moroccans | 10 |
Lithuanians | 10 |
North Kannadi | 9 |
Belarusians | 9 |
Yemenis | 8 |
Lebanese | 7 |
Sakilli | 4 |
Paniya | 4 |
Cochin Jews | 4 |
Bene Israel | 4 |
Samaritans | 2 |
Russians | 2 |
Malayan | 2 |
Of the 466 samples, I excluded 8 because they were either duplicates or genetically too similar to other samples.
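To make the similarity check concrete, here is a minimal sketch of one way to flag duplicate or near-duplicate samples by pairwise allele sharing. It is not necessarily the method used for this dataset: the genotype coding (count of B alleles per SNP), the function names `sharing` and `flag_similar`, and the 0.95 threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def sharing(g1, g2):
    """Fraction of alleles shared (identical by state) between two samples.

    Genotypes are assumed to be coded as the count of B alleles per SNP
    (0, 1 or 2), with None marking a missing call.
    """
    shared = 0
    total = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs where either sample has no call
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles in common
        total += 2
    return shared / total if total else 0.0


def flag_similar(samples, threshold=0.95):
    """Return pairs of sample names whose allele sharing reaches the threshold.

    The threshold is a hypothetical value for this sketch, not a recommendation.
    """
    return [
        (name1, name2)
        for (name1, g1), (name2, g2) in combinations(samples.items(), 2)
        if sharing(g1, g2) >= threshold
    ]


# Toy data: s1 and s2 are identical, s3 differs at half of its alleles.
samples = {
    "s1": [0, 1, 2, 1],
    "s2": [0, 1, 2, 1],
    "s3": [2, 1, 0, 1],
}
flagged = flag_similar(samples)
```

In practice one sample from each flagged pair would be dropped. On real data a dedicated tool is a better fit than a pure-Python loop, since the number of pairs grows quadratically with the sample count.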
The series matrix files I downloaded were not in Plink format. To convert them, I had to look up the platform file for the Illumina genotyping BeadChip they used. Illumina also reports genotypes using A/B alleles and Top/Bot strands instead of the usual ACGT alleles and forward/reverse strands. This Illumina Technote explained the system, and I found a Perl script to convert between the two.
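The core of the A/B conversion can be sketched in a few lines of Python. This is not the Perl script mentioned above, only an illustration under some assumptions: the platform file lists each SNP's alleles in a form like `[A/G]`, with the first allele being allele A and the second allele B, and whether a SNP needs complementing to reach the forward strand has already been worked out separately (the hypothetical `flip` argument). The function name `ab_to_nucleotides` is also made up for this sketch.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def ab_to_nucleotides(genotype, snp, flip=False):
    """Translate an A/B genotype call such as "AB" into nucleotide alleles.

    genotype -- two-character call, e.g. "AA", "AB", "BB", or "NC" for no call
    snp      -- allele pair as assumed to appear in the platform file, e.g. "[A/G]";
                the first allele is taken as allele A, the second as allele B
    flip     -- hypothetical flag: complement both alleles when the SNP
                has to be reported on the opposite strand
    """
    allele_a, allele_b = snp.strip("[]").split("/")
    lookup = {"A": allele_a, "B": allele_b}
    if flip:
        lookup = {call: COMPLEMENT[base] for call, base in lookup.items()}
    # Anything that is not an A or B call becomes "0", Plink's missing allele code.
    return tuple(lookup.get(call, "0") for call in genotype)


het = ab_to_nucleotides("AB", "[A/G]")
het_flipped = ab_to_nucleotides("AB", "[A/G]", flip=True)
hom_b = ab_to_nucleotides("BB", "[T/C]")
no_call = ab_to_nucleotides("NC", "[A/G]")
```

The resulting allele pairs are what goes into the genotype columns of a Plink `.ped` file, two alleles per SNP, with `0` for missing calls. Deciding the `flip` value per SNP is the harder part and depends on the strand information in the platform file.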