Miscellaneous Statistics Problems to Avoid

Descriptive Statistics

  • Mean ± S.D. virtually assumes normality
  • Mean ± S.E. assumes normality unless N is large; best to use a confidence interval instead of using one S.E.
  • Median is always representative of a continuous variable, although it is a less precise estimate of central tendency than the mean if the distribution is truly normal
  • Use percentiles instead of "outside 3 s.d." for lab parameters
  • Some software runs a test of normality and, depending on the result, chooses a parametric or a nonparametric test. The problems with this approach include:
    • The test of normality may not have adequate power to detect non-normality
    • Even if the data come from a truly normal distribution, nonparametric tests are almost as efficient as parametric tests (the Wilcoxon-Mann-Whitney test has asymptotic relative efficiency 3/π ≈ 0.95 compared with the t-test).
    • If the data are non-normal, the nonparametric test can be much more powerful than the parametric counterpart.

Parametric Test    Nonparametric Counterpart
ANOVA              Kruskal-Wallis test
Paired t-test      Wilcoxon signed rank
Pearson r          Spearman rho rank correlation
t-test             Wilcoxon-Mann-Whitney
(r = product-moment linear correlation coefficient)
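
The Pearson r and Spearman rho pairing listed above can be illustrated with a small sketch. It uses only the Python standard library; the helper names (pearson, midranks, spearman) and the data are invented for illustration and do not come from any particular package. Spearman rho is computed here as Pearson r applied to ranks, with tied values given their average rank.

```python
import math

def pearson(x, y):
    """Product-moment linear correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def midranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho: Pearson r computed on the ranks."""
    return pearson(midranks(x), midranks(y))

# Hypothetical data: y increases with x, but exponentially rather than linearly.
x = list(range(1, 11))
y = [math.exp(v) for v in x]

r = pearson(x, y)
rho = spearman(x, y)
print(f"Pearson r    = {r:.2f}")
print(f"Spearman rho = {rho:.2f}")
```

Because the relationship is perfectly monotone, the rank correlation is 1, while Pearson r is noticeably lower since it measures only linear association.
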
  • Geometric means are typically not good for describing central tendency of skewed data. Geometric means are greatly affected by low outliers and may be difficult to interpret.
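
Two points from the list above, the normality assumption behind mean ± S.D. and the sensitivity of geometric means to low outliers, can be checked numerically. The following is a minimal sketch using Python's standard statistics module; both data sets are hypothetical values invented for illustration.

```python
import statistics

# Hypothetical right-skewed, strictly positive lab values.
values = [1, 1, 2, 2, 2, 3, 3, 4, 5, 40]

mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample standard deviation
# Quartiles interpolated within the observed range of the data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# mean - S.D. falls below zero even though no value can be negative.
print(f"mean ± S.D.: {mean:.1f} ± {sd:.1f} (mean - S.D. = {mean - sd:.1f})")
print(f"median (quartiles): {median} ({q1}, {q3})")

# Geometric mean with and without a single low outlier.
clean = [9, 10, 10, 11, 12]
with_low_outlier = clean + [0.001]

gm_clean = statistics.geometric_mean(clean)
gm_outlier = statistics.geometric_mean(with_low_outlier)

print(f"geometric mean: {gm_clean:.2f} -> {gm_outlier:.2f}")
print(f"median: {statistics.median(clean)} -> {statistics.median(with_low_outlier)}")
```

In this sketch the interval implied by mean ± S.D. extends to impossible negative values, while the quartiles stay inside the observed range. A single value near zero pulls the geometric mean down several-fold but leaves the median unchanged.
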

Interpretation of P-Values

  • P-values only provide evidence against a hypothesis, never evidence in favor of it.
  • P = 0.8 implies a lack of evidence for an effect, i.e., either:
    1. There is little or no effect, or
    2. There is insufficient information in the sample due to small N or high variability: "Absence of evidence is not evidence of absence" (Altman and Bland, BMJ 311:485; 1995)
  • P = 0.01 implies there is evidence for an effect, but this effect may be clinically insignificant
  • P = 0.05 in many cases provides very little evidence against the null hypothesis
  • Confidence intervals convey much more information than P-values, especially when P is large
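
The contrast between P-values and confidence intervals can be made concrete with a small sketch. It assumes a normally distributed estimate with a known standard error (a simple z-test); the function name z_test_summary and all estimates and standard errors are hypothetical, chosen only to reproduce the P-values discussed above.

```python
from statistics import NormalDist

def z_test_summary(estimate, std_error, level=0.95):
    """Two-sided P-value and confidence interval under a normal approximation."""
    z = estimate / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    critical = NormalDist().inv_cdf(1 - (1 - level) / 2)
    interval = (estimate - critical * std_error, estimate + critical * std_error)
    return p_value, interval

# Hypothetical studies: (estimated effect, standard error).
studies = {
    "small study, large P": (0.5, 2.0),
    "large study, large P": (0.05, 0.2),
    "tiny effect, small P": (0.1, 0.0388),
}

results = {name: z_test_summary(est, se) for name, (est, se) in studies.items()}

for name, (p_value, (low, high)) in results.items():
    print(f"{name}: P = {p_value:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The first two hypothetical studies share P of about 0.8, yet the first interval is compatible with large effects in either direction while the second rules them out. The third has P of about 0.01 with an interval that contains only small effects, which may or may not matter clinically.
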

-- FrankHarrell - 26 Jun 2004
 
