Someone asked how to convert an FTDNA Family Finder CSV data file to the Plink format. I threw together a very simple Unix shell script to do that, and I am sharing it here:
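#!/bin/bash
# Convert an FTDNA Family Finder raw data file to Plink binary format.

if test -z "$1"
then
    echo "FTDNA raw data filename not supplied as argument."
    exit 1
fi

# Ask for the values that go into the .tfam file
echo "Family ID: "
read fid
echo "Individual ID: "
read id
echo "Paternal ID: "
read pid
echo "Maternal ID: "
read mid
echo "Sex (m/f/u): "
read sexchr

# Plink sex codes: 1 = male, 2 = female, 0 = unknown
if [[ $sexchr == m* ]]
then
    sex=1
elif [[ $sexchr == f* ]]
then
    sex=2
else
    sex=0
fi

# 0 = missing phenotype
pheno=0
echo "$fid $id $pid $mid $sex $pheno" > "$id.tfam"

# Convert line endings (in place), then drop the header line
dos2unix "$1"
sed '1d' "$1" > "$id.nocomment"

# Strip quotes and reorder columns to tped order:
# chromosome, SNP ID, genetic distance (0), position, allele 1, allele 2
awk -F, '{gsub(/"/,""); print $2,$1,"0",$3,substr($4,1,1),substr($4,2,1)}' "$id.nocomment" > "$id.tped"
rm "$id.nocomment"

# Build the binary fileset; FTDNA no-calls ("--") become "-" per allele
plink --tfile "$id" --out "$id" --make-bed --missing-genotype - --output-missing-genotype 0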
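This script creates three files, named after the Individual ID you enter: *.bed, *.bim and *.fam, which are the binary format files for Plink. The intermediate *.tped and *.tfam files are left in place as well. You can then use Plink to merge multiple files, filter SNPs or individuals and do other processing.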
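The awk command is the heart of the conversion, so here is a small sketch of what it does on its own. It assumes the FTDNA file has the four columns the script implies (RSID, chromosome, position, result) with quoted values; the two rows below are made up for illustration and are not real data.

```shell
# Two made-up rows in the assumed layout: RSID,CHROMOSOME,POSITION,RESULT
sample='"rs0000001","1","742429","AG"
"rs0000002","1","766409","--"'

# Same awk command as in the script: strip the quotes, then print
# chromosome, SNP ID, genetic distance (0), position, allele 1, allele 2
tped=$(printf '%s\n' "$sample" |
  awk -F, '{gsub(/"/,""); print $2,$1,"0",$3,substr($4,1,1),substr($4,2,1)}')

echo "$tped"
```

Each genotype is split into two allele columns, and a no-call ("--") ends up as two "-" characters, which is why the plink command is told to treat "-" as the missing genotype.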