### Department of Biostatistics Seminar/Workshop Series

# Response Feature (Two-Stage) Analysis of Longitudinal Data

# A simple approach to analyzing complicated data

## William D. Dupont, PhD

### Professor of Biostatistics and Preventive Medicine, Vanderbilt University School of Medicine

### Wednesday, April 21, 1:30-2:30pm, MRBIII Conference Room 1220

### Intended Audience: People interested in analyzing data with multiple observations per subject, statistical graphics, Stata users or potential users, dose escalation studies, racial differences in isoproterenol-mediated vasodilation.

The analysis of repeated-measures data is complicated by the fact that multiple observations on the same patient are usually correlated. There are several sophisticated approaches to analyzing these data. A simple approach, which is easy to explain to clinical scientists, is response feature analysis (also known as two-stage analysis). The idea behind this approach is to reduce the observations on each patient to a univariate statistic (response feature) that captures the most important aspect of her/his response. These response features are then analyzed using standard fixed-effects methods. This approach is illustrated by analyzing the dose-escalation data of Lang et al. (*N Engl J Med* 1995; **333**:155–60). These authors hypothesize a racial difference in the attenuation of isoproterenol-mediated vasodilatation. Forearm blood flow was measured at baseline and at six escalating doses of isoproterenol in normotensive, healthy African-American and Caucasian men. Exploratory analyses suggest a log-linear relationship between change from baseline blood flow (ΔBF) and log dose of isoproterenol. This suggests regressing ΔBF against log isoproterenol dose in each individual patient and using each patient’s slope estimate as a response feature. These response features can then be compared in African-American and Caucasian men using a Wilcoxon rank-sum test.
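The two stages described above can be sketched in a few lines of code. The seminar itself uses Stata; the block below is only an illustrative Python sketch that relies on the standard library. The doses, group labels, and ΔBF values are synthetic placeholders rather than the data of Lang et al., and the helper names (`ols_slope`, `midranks`, `rank_sum_test`, `simulate`) are hypothetical. The rank-sum test uses the normal approximation without a tie or continuity correction.

```python
from math import erf, log, sqrt

def ols_slope(x, y):
    """Stage 1: least-squares slope of y on x for a single subject."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(a, b):
    """Stage 2: Wilcoxon rank-sum test via the normal approximation.

    Returns (rank sum of group a, z statistic, two-sided p-value).
    No tie or continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    ranks = midranks(list(a) + list(b))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - mean_w) / sqrt(var_w)
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return w, z, p

# Synthetic illustration only: hypothetical doses and made-up responses.
doses = [10, 20, 60, 150, 300, 400]
log_dose = [log(d) for d in doses]
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1]  # fixed perturbation, not real data

def simulate(true_slope):
    """Fabricated delta-BF values that are roughly linear in log dose."""
    return [true_slope * x + e for x, e in zip(log_dose, noise)]

group_a = [simulate(s) for s in (5.0, 6.5, 7.2, 8.1)]
group_b = [simulate(s) for s in (2.0, 3.1, 3.9, 4.4)]

# Stage 1: one slope (response feature) per subject.
slopes_a = [ols_slope(log_dose, y) for y in group_a]
slopes_b = [ols_slope(log_dose, y) for y in group_b]

# Stage 2: compare the per-subject slopes between the two groups.
w, z, p = rank_sum_test(slopes_a, slopes_b)
print(f"rank sum = {w}, z = {z:.3f}, two-sided p = {p:.4f}")
```

Because each subject contributes a single number to the second stage, the within-subject correlation no longer has to be modelled explicitly. With samples this small, an exact rank-sum test would normally be preferable to the normal approximation used in this sketch.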

This approach is contrasted with generalized estimating equation (GEE) analysis, with a repeated-measures analysis of variance using a random-intercept model, and with a repeated-measures analysis with both random slopes and random intercepts. This data set is challenging because the dispersion in ΔBF at any given dose is greater for whites than for blacks and the dispersion of ΔBF in men increases with increasing dose. GEE analysis has the advantage that the Huber-White sandwich estimator can correct the variance-covariance matrix of the parameter estimates regardless of the choice of working covariance matrix. This can lead to robust inferences in situations in which the variance-covariance matrix of the repeated measures is complicated. Other repeated-measures analyses can lead to misleading inferences when the model does not adequately capture the variance-covariance structure of the data. In this particular example, the response feature analysis, the GEE analysis, and the mixed-effects model with both random slopes and intercepts all give similar results. The random-intercept model, however, leads to misleading conclusions. These analyses are demonstrated using Stata.
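One way to see why a random-intercept model can struggle with data of this kind is to write the two mixed models side by side. The notation below is an assumption made for illustration and is not taken from the seminar materials: $y_{ij}$ is ΔBF for patient $i$ at dose $j$, $x_{ij}$ is log dose, $g_i$ is a group indicator, $a_i$ and $b_i$ are patient-level random effects, and $\varepsilon_{ij}$ is independent residual error with variance $\sigma^2$.

$$
\begin{aligned}
\text{Random intercept:}\quad & y_{ij} = \beta_0 + \beta_1 g_i + (\beta_2 + \beta_3 g_i)\,x_{ij} + a_i + \varepsilon_{ij}, \\
& \operatorname{Var}(y_{ij}) = \sigma_a^2 + \sigma^2, \\[4pt]
\text{Random intercept and slope:}\quad & y_{ij} = \beta_0 + \beta_1 g_i + (\beta_2 + \beta_3 g_i)\,x_{ij} + a_i + b_i x_{ij} + \varepsilon_{ij}, \\
& \operatorname{Var}(y_{ij}) = \sigma_a^2 + 2 x_{ij}\,\sigma_{ab} + x_{ij}^2\,\sigma_b^2 + \sigma^2 .
\end{aligned}
$$

Under these assumptions, the random-intercept model forces the variance of ΔBF to be the same at every dose, whereas adding a random slope lets the variance change with log dose. That is at least consistent with the observation that dispersion increases with increasing dose.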

Response-feature analyses can sometimes perform as well as more advanced techniques, and are much easier to explain. They can be valuable as a cross-check against more sophisticated methods.