`arm` package for the calculations. This package actually uses a rough approximation to full Bayes analysis by assuming that regression coefficients from an ordinary logistic model fit have a multivariate normal distribution. For logistic regression, such approximations are poor. The authors should have used exact Bayesian methods (e.g., the R package `MCMCpack`).
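To give a feel for how a normal approximation to the coefficients can mislead, here is a minimal toy sketch. It is not the `arm` or `MCMCpack` computation and says nothing about the paper's data: it assumes an intercept-only logistic model, a flat prior on the log odds, hypothetical data of 2 events in 10 patients, and brute-force numerical integration in place of MCMC. All function names are illustrative.

```python
import math

def log_post(beta, events, n):
    """Unnormalized log posterior of the log odds under a flat prior
    (equal to the binomial log likelihood up to a constant)."""
    return events * beta - n * math.log1p(math.exp(beta))

def exact_tail(events, n, cut=0.0, lo=-30.0, hi=30.0, steps=60000):
    """P(log odds > cut | data) by midpoint-rule integration on a grid."""
    h = (hi - lo) / steps
    total = tail = 0.0
    for i in range(steps):
        b = lo + (i + 0.5) * h
        w = math.exp(log_post(b, events, n))
        total += w
        if b > cut:
            tail += w
    return tail / total

def normal_tail(events, n, cut=0.0):
    """Same probability from a normal distribution centred at the maximum
    likelihood estimate with the usual large-sample standard error."""
    mle = math.log(events / (n - events))
    se = math.sqrt(1 / events + 1 / (n - events))
    z = (cut - mle) / se
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical data: 2 events among 10 patients (an assumption for illustration).
events, n = 2, 10
p_exact = exact_tail(events, n)
p_normal = normal_tail(events, n)
print(f"P(log odds > 0): exact {p_exact:.4f}, normal approximation {p_normal:.4f}")
```

In this deliberately small example the normal approximation roughly doubles the tail probability (about 0.04 versus about 0.02), because the true posterior of the log odds is skewed. With larger samples the two agree more closely, so the sketch only illustrates the direction of the concern.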
The paper devoted an amazing amount of space to showing effects of all the predictors on the log odds ratio scale. For continuous variables such as age, odds ratios require arbitrary settings of age (2 values) and have difficulty placing the continuous variable's effect on the same scale as a categorical variable's effect. Odds ratios do not convey the needed impression because a risk factor that has a very low prevalence can have a large odds ratio and still account for little of the variation in prescribing. More to the point would be to compute some measures of how much of the variation in prescription patterns is explained by each patient or practice characteristic. This can be done quite concisely, and some characteristics can be pooled into classes. For example, one could show the variation in prescribing a specific drug that is due to race and then the variation due to ethnicity. Then a combined race/ethnicity effect could be displayed. Likewise, geographical variables can be combined to display, using a single number, the variation due to location. These kinds of "chunk tests" allow one to see what's going on when collinearities cause risk factors to compete, knocking down each other's effects. The combined statistics allow them to sum their effects instead.
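The chunk-test idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis from the paper or this review: it uses an ordinary least-squares fit on made-up data (for a logistic model the same Wald formula would be applied to the fitted coefficients and their covariance matrix), and the helpers `solve`, `ols_fit`, and `wald_chunk` are hypothetical names. For a chunk of coefficients $b_S$ with estimated covariance $V_{SS}$, the pooled Wald statistic is $b_S^\top V_{SS}^{-1} b_S$ on as many degrees of freedom as there are coefficients in the chunk.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_fit(X, y):
    """Return least-squares coefficients and their estimated covariance."""
    n, p = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - p)
    # Columns of (X'X)^-1; the matrix is symmetric, so rows equal columns.
    inv = [solve(xtx, [float(i == j) for i in range(p)]) for j in range(p)]
    return beta, [[s2 * v for v in row] for row in inv]

def wald_chunk(beta, cov, idx):
    """Pooled Wald chi-square and degrees of freedom for coefficients idx."""
    b = [beta[i] for i in idx]
    v = [[cov[i][j] for j in idx] for i in idx]
    chi2 = sum(bi * wi for bi, wi in zip(b, solve(v, b)))
    return chi2, len(idx)

# Toy data (an assumption for illustration): x1 and x2 are nearly collinear.
random.seed(1)
X, y = [], []
for _ in range(200):
    x1 = random.gauss(0, 1)
    x2 = x1 + random.gauss(0, 0.2)
    x3 = random.gauss(0, 1)
    X.append([1.0, x1, x2, x3])
    y.append(1 + 0.5 * x1 + 0.5 * x2 + 0.5 * x3 + random.gauss(0, 1))

beta, cov = ols_fit(X, y)
chunks = {"x1": [1], "x2": [2], "x3": [3], "x1+x2": [1, 2]}
results = {name: wald_chunk(beta, cov, idx) for name, idx in chunks.items()}
for name, (chi2, df) in results.items():
    print(f"{name:6s} chi2={chi2:8.2f} df={df} chi2-df={chi2 - df:8.2f}")
```

The pooled statistic for a chunk can never be smaller than the statistic for any single coefficient inside it, which is why two collinear predictors that look weak one at a time can look strong together. Subtracting the degrees of freedom is one way to compare chunks of different sizes.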
An example display is found below. Here what is being predicted is hemoglobin A1c, and the effects of various body dimensions (partial/adjusted effects) are shown individually as well as all pooled together. The `size` entry sums the influence of leg length, subscapular skinfold thickness, triceps skinfold thickness, and waist circumference; `re` is a combined race + ethnicity multiple degree of freedom effect. Here the measure of explained variation is the Wald chi-square statistic minus the number of degrees of freedom required to achieve that chi-square. The subtraction is to level the playing field so that risk factors having many categories do not get more chances to explain variation in the outcome variable.

Type | Attachment | Size | Date | Who | Comment |
---|---|---|---|---|---|
png | anova.png | 17 K | 18 Oct 2015 - 10:12 | FrankHarrell | Variation in glycohemoglobin explained by various risk factors |
html | pendletonJohnsonReview.html | 33 K | 21 Oct 2015 - 06:45 | FrankHarrell | Review of Am J Psychiatry March 2015 by Pendleton and Johnson |