TTest Algorithm

The test is performed on N non-negative values, each paired with a label. Two labels are required: group 1 must be labeled 1 and group 2 must be labeled 2. Values with any other label are ignored.

T value

  1. Find the count, sum, and sum of squares of each group, where $X_i$ are the values in group 1 and $Y_i$ are the values in group 2.
    • Na: number of values in group 1
    • Nb: number of values in group 2
    • Sa: $\sum X_i$
    • Sb: $\sum Y_i$
    • SSa: $\sum {X_i}^2$
    • SSb: $\sum {Y_i}^2$
  2. Compute the sample mean of each group.
    • Ma: $\frac{Sa}{Na}$
    • Mb: $\frac{Sb}{Nb}$
  3. Compute the sum of squared deviations from the mean for each group.
    • varA: $SSa - \frac{Sa^2}{Na}$
    • varB: $SSb - \frac{Sb^2}{Nb}$
  4. Estimate the pooled variance.
    • Var: $\frac{varA + varB}{Na + Nb - 2}$
  5. Estimate the standard error of the difference between the means (stDev): $\sqrt{\frac{Var}{Na} + \frac{Var}{Nb}}$
  6. t: $\frac{Ma - Mb}{stDev}$
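
The six steps above can be sketched in Python as follows. This is a minimal illustration rather than the original implementation: the function name `pooled_t` and the sample data are hypothetical, and the sketch does not guard against a zero pooled variance or an empty group.

```python
import math

def pooled_t(values, labels):
    """Return (t, degrees of freedom) for the groups labeled 1 and 2.

    Values with any other label are ignored, as described in the text.
    """
    xs = [v for v, g in zip(values, labels) if g == 1]
    ys = [v for v, g in zip(values, labels) if g == 2]

    # Step 1: counts, sums, and sums of squares.
    na, nb = len(xs), len(ys)
    sa, sb = sum(xs), sum(ys)
    ssa = sum(x * x for x in xs)
    ssb = sum(y * y for y in ys)

    # Step 2: sample means.
    ma, mb = sa / na, sb / nb

    # Steps 3-4: per-group sums of squared deviations, then pooled variance.
    var_a = ssa - sa * sa / na
    var_b = ssb - sb * sb / nb
    var = (var_a + var_b) / (na + nb - 2)

    # Steps 5-6: stDev as defined in the text, then t.
    st_dev = math.sqrt(var / na + var / nb)
    return (ma - mb) / st_dev, na + nb - 2

# Hypothetical data: the value labeled 3 is ignored.
values = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0, 9.0]
labels = [1, 1, 1, 2, 2, 2, 3]
t, df = pooled_t(values, labels)
print(t, df)
```

For this data, Ma = 2, Mb = 4, Var = 2.5, and t is about -1.549 with 4 degrees of freedom.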

T probability

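  • P-value: $2 * (1 - probt(|t|, Na + Nb - 2))$, where probt(x, df) is the cumulative distribution function of Student's t distribution with df degrees of freedom.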
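
As a rough sketch of the two-sided P-value, the block below stands in for `probt` by integrating the Student's t density numerically with the composite Simpson rule. The names `t_pdf` and `two_sided_p` and the `steps` parameter are assumptions made for this illustration; a statistics library's t-distribution CDF would normally be used instead, and the fixed step count may lose accuracy for very large |t|.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Approximate 2 * (1 - CDF(|t|)) for Student's t.

    CDF(|t|) = 0.5 + area under the density on [0, |t|], so the
    P-value equals 1 - 2 * area. `steps` must be even (Simpson's rule).
    """
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)

print(two_sided_p(1.0, 1))            # df = 1 has a closed form: exactly 0.5
print(two_sided_p(-1.5491933, 4))
```

With one degree of freedom the distribution is the Cauchy distribution, whose CDF at 1 is 0.75, which gives a convenient check of the approximation.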

Permutation T probability I (BH Step-Up Procedure without adjustment)

  1. Calculate the raw P value of each gene from its original grouping.
    N = totalGene
    $P_0$ = $2 * (1 - probt(|t|, Na + Nb - 2))$
  2. Randomly regroup the samples of each gene.
    Each group keeps its original size, each sample is assigned to exactly one group, and the number of regroupings (NumRegrouping) is chosen by the user.
  3. For each regrouping, recalculate the P value of each gene.
    $P_i$ = $2 * (1 - probt(|t|, Na + Nb - 2))$
  4. For each gene, count the regroupings in which $P_i$ is less than or equal to $P_0$.
    count = number of $P_i$ <= $P_0$
  5. Calculate the permutation P value.
    PP-value = $\frac{count}{NumRegrouping}$
  6. Rank the PP-values in ascending order, then calculate the unadjusted T probability of each gene, working from the gene with the largest rank down to the gene with the smallest rank.
    P-value = $\frac{N}{rank}$ * PP-value
    if $\text{P-value}_{rank\,N}$ >= 1, then set $\text{P-value}_{rank\,N}$ = 1
    if $\text{P-value}_{rank\,i}$ > $\text{P-value}_{rank\,(i+1)}$, then set $\text{P-value}_{rank\,i}$ = $\text{P-value}_{rank\,(i+1)}$
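
Step 6 can be sketched as below, assuming the PP-values have already been computed for all N genes. The function name `step_up_adjust` and the sample PP-values are hypothetical. Starting the running minimum at 1 applies the cap at the largest rank, and carrying it downward enforces the rule that a value at rank i never exceeds the value at rank i+1.

```python
def step_up_adjust(pp_values):
    """Apply the rank-based step-up rule to a list of PP-values.

    Returns the P-values in the same order as the input.
    """
    n = len(pp_values)
    # order[0] is the index of the smallest PP-value (rank 1).
    order = sorted(range(n), key=lambda i: pp_values[i])
    adjusted = [0.0] * n
    running_min = 1.0  # cap of 1 at the largest rank, carried downward
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        candidate = n / rank * pp_values[i]
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted

# Hypothetical PP-values for four genes.
pp = [0.01, 0.04, 0.03, 0.5]
print(step_up_adjust(pp))
```

In this example the gene with PP-value 0.03 (rank 2) would get 4/2 * 0.03 = 0.06 on its own, but it is lowered to 0.0533, the value of the rank 3 gene, by the last rule.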

Permutation T probability II (BH Step-Up Procedure with adjustment)

  1. Calculate the raw T value of each gene from its original grouping.
    N = totalGene
    $T_0$ : $\frac{Ma - Mb}{stDev}$
  2. Randomly regroup the samples of each gene.
    Each group keeps its original size, each sample is assigned to exactly one group, and the number of regroupings (NumRegrouping) is chosen by the user.
  3. For each regrouping, recalculate the T value of each gene.
    $T_i$ : $\frac{Ma - Mb}{stDev}$
  4. For each $gene_i$, count the permuted T values, pooled over all genes and all regroupings, that are greater than or equal to its $T_0$.
    count = number of permuted T values of all genes >= $T_0$
  5. Estimate the P-value of $gene_i$.
    estP-value = $\frac{count}{NumRegrouping*N}$
  6. Rank the estP-values in ascending order, then calculate the adjusted T probability of each gene, working from the gene with the largest rank down to the gene with the smallest rank.
    P-value = $\frac{N}{rank}$ * estP-value
    if $\text{P-value}_{rank\,N}$ >= 1, then set $\text{P-value}_{rank\,N}$ = 1
    if $\text{P-value}_{rank\,i}$ > $\text{P-value}_{rank\,(i+1)}$, then set $\text{P-value}_{rank\,i}$ = $\text{P-value}_{rank\,(i+1)}$
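
Steps 1 to 5 of this second procedure might look like the sketch below. Several choices here are assumptions that the text does not settle: T values are compared by absolute value (a two-sided comparison), every gene shares the same shuffled labels within one regrouping, regroupings are drawn by random shuffling rather than enumerated, and the labels contain only 1 and 2. The names `t_stat` and `pooled_permutation_p` and the data are hypothetical.

```python
import bisect
import math
import random

def t_stat(values, labels):
    """Pooled two-sample t for the groups labeled 1 and 2."""
    xs = [v for v, g in zip(values, labels) if g == 1]
    ys = [v for v, g in zip(values, labels) if g == 2]
    na, nb = len(xs), len(ys)
    sa, sb = sum(xs), sum(ys)
    var_a = sum(x * x for x in xs) - sa * sa / na
    var_b = sum(y * y for y in ys) - sb * sb / nb
    var = (var_a + var_b) / (na + nb - 2)
    return (sa / na - sb / nb) / math.sqrt(var / na + var / nb)

def pooled_permutation_p(genes, labels, num_regrouping, seed=0):
    """Return one estP-value per gene (one row of `genes` per gene)."""
    rng = random.Random(seed)
    # Step 1: T0 of every gene (absolute value is an assumption).
    t0 = [abs(t_stat(row, labels)) for row in genes]
    # Steps 2-3: shuffle the labels, keeping group sizes, and recompute T.
    shuffled = list(labels)
    permuted = []
    for _ in range(num_regrouping):
        rng.shuffle(shuffled)
        permuted.extend(abs(t_stat(row, shuffled)) for row in genes)
    permuted.sort()
    total = len(permuted)  # equals NumRegrouping * N
    # Steps 4-5: count pooled permuted values >= T0, divide by the total.
    return [(total - bisect.bisect_left(permuted, t)) / total for t in t0]

# Hypothetical data: two genes, six samples, groups of 3 vs. 3.
genes = [[1, 2, 3, 10, 11, 12], [1, 5, 3, 4, 2, 6]]
labels = [1, 1, 1, 2, 2, 2]
est_p = pooled_permutation_p(genes, labels, num_regrouping=200, seed=1)
print(est_p)
```

Sorting the pooled permuted values once and using a binary search keeps the counting step cheap when N and NumRegrouping are large. Step 6 would then apply the rank-based rule from the text to the returned estP-values.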

Example: number of genes = 100, number of permutations = 999; p1-p7 are in one group and p8-p14 are in the other group.

Summary of running time

| sample size | group size | number of permutations | running time |
| 10 | 5 vs. 5 | 252 | 0.01 |
| 10 | 5 vs. 5 | 5,000 | 0.406 |
| 20 | 10 vs. 10 | 10,000 | 0.56 |
| 20 | 10 vs. 10 | 1,847,560 | 87.746 |
| 50 | 25 vs. 25 | 10,000 | 1.261 |
| 50 | 25 vs. 25 | 50,000 | 14.703 |

-- WillGray - 23 Jul 2004

Topic revision: r5 - 15 May 2009, JoanZhang