Comparison of cost efficiency of SNP chips

Li C, Li M, Long J-R, Cai Q, Zheng W (2007) Evaluating cost efficiency of SNP chips in genome-wide association studies. (revision submitted)

Abstract: Genome-wide association (GWA) studies have recently emerged as a major approach to gene discovery for many complex diseases. Since GWA scans are expensive, cost efficiency is an important factor to consider in study design. However, it often requires extensive and time-consuming computer simulations to compare cost efficiency across different SNP chips. Here we propose two simulation-free approaches to cost efficiency comparisons across SNP chips. In the first method, the overall power under a given disease model is calculated for each SNP chip under various sample sizes. Then SNP chips can be compared with respect to the sample sizes required to achieve the same level of power. In the second method, for a desired level of genomic coverage, the effective r² threshold values are calculated for each SNP chip. Since r² is inversely proportional to the sample size required to achieve the same power, the required sample sizes can then be compared among SNP chips. These methods are complementary. The first approach provides direct power comparisons, but it requires information on the disease model and may not be reliable for SNP chips that contain many non-HapMap SNPs. The second approach allows sample size comparisons based on the coverage of SNP chips, and it can be modified for SNP chips that contain non-HapMap SNPs. These methods are particularly relevant for large epidemiological studies in which enough subjects are available for GWA screening and follow-up stages. We illustrate these approaches using five currently available whole genome SNP chips.
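The second method in the abstract rests on the required sample size being inversely proportional to r² at equal power. A minimal Python sketch of that comparison follows; the function name and the two r² values are hypothetical and chosen only for illustration, not taken from the paper or from the files on this page.

```python
def relative_sample_size(r2_chip: float, r2_reference: float) -> float:
    """Sample size needed with one chip, as a multiple of that needed
    with a reference chip, at the same power and genomic coverage.

    Assumes required sample size is proportional to 1 / r2.
    """
    for value in (r2_chip, r2_reference):
        if not 0.0 < value <= 1.0:
            raise ValueError("r2 must lie in (0, 1]")
    return r2_reference / r2_chip


# Hypothetical effective r2 thresholds reached at the same coverage level.
ratio = relative_sample_size(r2_chip=0.64, r2_reference=0.80)
print(f"relative sample size: {ratio:.2f}")
```

Under these made-up values, the chip with the lower effective r² threshold would need about 1.25 times as many subjects as the reference chip to reach the same power.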

Power for SNP chips: CEU, CHB, JPT, YRI (one package per HapMap population). Each package has nine files, one for each combination of three prevalences and three genotype relative risks, with names like CEUpower_##_$$.txt, where ## is 01, 05, or 10 for prevalence 0.01, 0.05, or 0.10, and $$ is the genotype relative risk (1.25, 1.50, or 1.75). Note that the power for SNP Array 6.0 and Human1M may be underestimated because each chip contains about 10% non-HapMap SNPs.

Genomic coverage for SNP chips at various r² thresholds: CEU, CHB, JPT, YRI. Columns whose names end in "b" give the alternative coverage estimates that take into account the non-HapMap SNPs on the SNP Array 6.0 and Human1M.

HapMap SNP tabulation for a fine grid of maximum r² and MAF: CEU, CHB, JPT, YRI. Each package has five files, one for each of the five SNP chips (SNP Array 5.0, SNP Array 6.0, HumanHap300, HumanHap550, Human1M). Each file has 100 rows and 50 columns. The number in cell (i,j) is the number of HapMap common SNPs (MAF ≥ 0.05) that have maximum r² in ((i-1)/100, i/100] and MAF in ((j-1)/100, j/100]. Note that the maximum r² for SNP Array 6.0 and Human1M may be underestimated because each chip contains about 10% non-HapMap SNPs.
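A tabulation laid out this way can be collapsed into a coverage estimate by summing the rows above a chosen r² cutoff. The Python sketch below is one possible way to do that, under several assumptions that are not stated on this page: the files are taken to be whitespace-delimited with no header row, the table total is used as the denominator, and coverage is taken as the fraction of SNPs with maximum r² strictly greater than k/100 (the bins are half-open, so a cutoff at a bin edge can only be resolved this way). The function names are hypothetical.

```python
def load_table(path):
    """Read a count table from a text file.

    Assumption: whitespace-delimited numbers, one table row per line,
    no header row. Adjust if the actual files differ.
    """
    with open(path) as handle:
        return [[float(x) for x in line.split()] for line in handle if line.strip()]


def coverage_above(table, k):
    """Fraction of tabulated SNPs with maximum r2 > k/100.

    Row i (1-based) holds SNPs with maximum r2 in ((i-1)/100, i/100],
    so rows k+1..100 (0-based indices k..99) are exactly those above k/100.
    """
    if len(table) != 100 or any(len(row) != 50 for row in table):
        raise ValueError("expected a 100 x 50 table")
    if not 0 <= k <= 100:
        raise ValueError("k must be between 0 and 100")
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("table contains no SNPs")
    above = sum(sum(row) for row in table[k:])
    return above / total


# Tiny synthetic table, for illustration only: 3 SNPs in the top r2 bin
# and 1 SNP in the bin (0.49, 0.50].
demo = [[0] * 50 for _ in range(100)]
demo[99][9] = 3
demo[49][19] = 1
print(coverage_above(demo, 80))
```

Restricting the inner sums to a subset of columns would give the same estimate for a particular MAF range.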

R function for power calculation

Other papers of relevance

Please contact Chun Li (chun.li@vanderbilt.edu) if you have any questions.

Topic revision: r7 - 20 Dec 2007, ChunLi
 
