Review of Bakris et al, Kidney International 2004 65:1991-2002
Biostatistical Reviewer: Frank Harrell
This is a meta-analysis of randomized clinical trials with at least 6 months of treatment in hypertensive patients with proteinuria, with treatments including either dihydropyridine calcium antagonists (DCAs) or non-DCAs (NDCAs). Blood pressure response data were available on 1338 patients and proteinuria data on 510. The authors did not have access to individual patient data, a drawback for any analysis. There was wide variation in the durations of the studies (though not up to 1715 months, as claimed in Table 2 for one study).
Minor Problems
- The authors relied on often non-descriptive means and SDs for describing characteristics of patients randomized in the various studies (because presumably the original authors did).
- P < 0.05 was arbitrarily chosen to determine "significance".
- There are two arithmetic errors in Table 4, last column. The second 18 should be 23 and the -54 should be -53.
Moderate Problems
- The meta-analysis treated each study as a sample size of one. The authors adjusted for sample size as a covariate in analysis of covariance instead of weighting individual studies by their sample sizes as is usually preferred.
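To illustrate why the choice between adjusting for sample size and weighting by it matters, here is a minimal sketch. The effect values and sample sizes are hypothetical, not taken from the paper, and the function names are made up for illustration. Simple sample-size weights are used because those are the weights discussed above; formal meta-analyses often use inverse-variance weights instead.

```python
def unweighted_mean(effects):
    """Average that treats every study as a sample size of one."""
    return sum(effects) / len(effects)


def weighted_mean(effects, sizes):
    """Average in which each study counts in proportion to its sample size."""
    return sum(e * n for e, n in zip(effects, sizes)) / sum(sizes)


# Hypothetical study-level effects (e.g., log ratios) and sample sizes.
# Two small studies show a large effect; one large study shows none.
effects = [-0.6, -0.5, 0.0]
sizes = [10, 20, 300]

unweighted = unweighted_mean(effects)     # dominated by the small studies
weighted = weighted_mean(effects, sizes)  # dominated by the large study
```

With these made-up numbers the unweighted average is roughly -0.37 while the weighted average is roughly -0.05, so the apparent effect depends heavily on how study size enters the analysis.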
Severe Problems
- The authors compared treatment groups using percent change from baseline values. No attempt was made to determine whether percent change was the correct metric, e.g., whether it resulted in values that could be interpreted out of context (i.e., that did not depend on initial values). See ReviewTepel04 for more on this point.
- If percent change is to be used, it is not at all clear that it should be computed on group means for the studies rather than on an individual patient basis. The percent change of means does not equal the mean percent change over patients.
- Percent changes can seldom be analyzed as raw data because they are asymmetric, e.g., if something falls by 50% it has to rise by 100% to make it back to the starting point. Percent changes cannot be added as was done by the authors, because the result cannot be interpreted at all easily. If proteinuria indeed operates on a relative (ratio) basis, studies should be summarized by log ratios. The log ratio is a symmetric measure in which positive and negative values can properly cancel each other.
- Related to the previous point, the authors performed analyses in which percents were added. The only percents that may be added are percents that equal 100 times a simple proportion of a whole in which the categories being added are mutually exclusive. For example, the percent of cancer and non-cancer deaths may be added. It makes no sense to add percents that are from general ratios, especially when both negative and positive percents are possible. If an overall percent change is needed, it can only be obtained by converting individual percents to log ratios, averaging these, and converting back to a percent.
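The points above about percent change can be checked with a few lines of arithmetic. The sketch below uses hypothetical values only (none are from the paper), and the helper names are invented for illustration. It shows that the percent change of means differs from the mean percent change, that a 50% fall followed by a 100% rise does not average to zero on the percent scale, and that averaging on the log-ratio scale does return zero.

```python
import math


def pct_change(before, after):
    """Percent change from `before` to `after`."""
    return 100.0 * (after - before) / before


def mean(xs):
    return sum(xs) / len(xs)


def pooled_pct_change(pcts):
    """Convert percents to log ratios, average them, convert back to a percent.

    Assumes every percent change is greater than -100.
    """
    log_ratios = [math.log(1.0 + p / 100.0) for p in pcts]
    return 100.0 * (math.exp(mean(log_ratios)) - 1.0)


# Hypothetical baseline and follow-up values for two patients.
baseline = [1.0, 4.0]
followup = [0.5, 4.0]

# Percent change computed on group means versus averaged over patients.
pct_of_means = pct_change(mean(baseline), mean(followup))
mean_of_pcts = mean([pct_change(b, a) for b, a in zip(baseline, followup)])

# A fall of 50% followed by a rise of 100% returns to the starting point.
naive_average = mean([-50.0, 100.0])                # misleading on the percent scale
log_scale_average = pooled_pct_change([-50.0, 100.0])  # cancels on the log scale
```

In this made-up example the percent change of means is -10% while the mean percent change is -25%, and the naive average of -50% and +100% is +25% even though the net change is nil.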
Potentially Fatal Problem
The pre-post design is one of the worst designs used in research because it is contaminated by general time trends, natural history, placebo effect, use of concomitant therapies after baseline, and the Hawthorne effect. The authors use a pre-post design. They ignored control arm results in the randomized trials and used percent change from baseline to end of study (and what happened to patients who died before the end of the study?). Only by subtracting out the time trends observed in control arms can one obtain a treatment effect that is corrected for the above effects. If the DCA studies and the NDCA studies had different natural history or regression to the mean due to different entry criteria, qualification periods (e.g., placebo run-ins), etiology, or concomitant therapies, the two types of treatments cannot be compared. Meta-analyses need to include control arms and deal with "double differences" rather than single differences between treatments given in different studies. Also, correcting for study duration as if it were any other baseline variable is problematic. Natural history may not be linear or additive.
The placebo-controlled parallel group randomized trial design has enormous advantages. The parallel design should be used in all analyses whenever possible.
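The "double difference" idea can be sketched as follows. This is only an illustration under stated assumptions: the outcome is assumed to act on a ratio scale (so changes are expressed as log ratios, as suggested earlier in this review), the arm-level values are hypothetical, and the function names are invented here rather than drawn from the paper.

```python
import math


def log_ratio(before, after):
    """Within-arm change on the log-ratio scale."""
    return math.log(after / before)


def double_difference(trt_before, trt_after, ctl_before, ctl_after):
    """Change in the treated arm minus change in the control arm."""
    return log_ratio(trt_before, trt_after) - log_ratio(ctl_before, ctl_after)


def as_percent(log_ratio_value):
    """Convert a log ratio back to a percent change."""
    return 100.0 * (math.exp(log_ratio_value) - 1.0)


# Hypothetical trial: both arms fall by 20%, so a pre-post analysis of the
# treated arm alone would report a 20% reduction although the control-
# corrected effect is zero.
no_effect = double_difference(2.0, 1.6, 2.0, 1.6)

# Hypothetical trial: treated arm falls by 50%, control arm by 20%.
effect = double_difference(2.0, 1.0, 2.0, 1.6)
effect_pct = as_percent(effect)
```

In the second made-up trial the control-corrected effect is a 37.5% reduction rather than the 50% that the treated arm's pre-post change alone would suggest.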
One other key issue is whether variations in sample size were adequately adjusted for. It seems that the authors adjusted for sample size as a covariate in an analysis of covariance rather than as case (frequency or sampling) weights. The following bubble plot has symbols sized proportional to the number of patients used in each study, and demonstrates that treatment effects in NDCA were related both to baseline proteinuria and sample size. The two largest studies showed no effects.