population structure prediction system

PCAj: Population Structure Prediction System for Japanese  | 日本語ページ

  

EXE (Windows) | JAR (Multi-platform) | Test Data  

This application predicts population structure of Japanese 
samples using genome-wide SNP genotypes.  It creates a 2D 
scatterplot of predicted principal components based on the
probabilistic PCA.  It also provides posterior probabilities
from which an individual has descended from each of several 
Japanese ancestries:

- Hokkaido
- Tohoku
- Kanto-Koshinetsu
- Tokai-Hokuriku
- Kinki
- Kyushu
- Okinawa

based on the result of LDA.  The application is based on
SNP markers included in the Illumina HumanHap 550K chip.
However you may use SNP genotype data observed from other 
platforms as far as a set of SNPs overlaps with that of the 
550K chip.  Note that the more SNPs you input, the better 
prediction result you get (at least 10,000 SNPs 
are recommended).

usage: <genotype_file>

Example: java -jar pca.jar test.txt 

The <genotype_file> parameter should be a file that contains
SNP genotype data. The first column of the data is SNP rs_ID
which should be sorted as dictionary order.  The second column
is SNP genotypes for all subjects in the sample set without
any separator.  The first and second columns should be TAB 
separated.  Here genotypes should be encoded as 0, 1 and 2
for a homozygote pair of A/T, a heterozygote pair of A/T and C/G
and the other homozygote pair of C/G, respectively.  The missing
genotype should be encoded as 9.

Example: AC, AA, CC, TT, AG, GG, TC, NN -> 1, 0, 2, 0, 1, 2, 1, 9 

For Windows users, you can drag & drop your data onto the icon of
the execulable file (pca.exe).

If you use this software for publication, please send a note
with the reference or a link. When citing this software, use:
Kumasaka et al. (2010) Establishment of a Standardized System 
to Perform Population Structure Analyses with Limited Sample 
Size or with Different Sets of SNP Genotypes. 
Journal of Human Genetics, 55(8):525-33.


(c) Natsuhiko Kumasaka (kumasaka AT src.riken.jp)
http://kumasakanatsuhiko.jp/projects/popstr/