# Rafe's Dataset Reshape Problem

Rafe posed the following problem. The original data set had three measurements for each 'specimen' (spec). The goal was was to turn a wider data set into one that was three times as tall. There were two rows for each specimen, one for the left hand and one for the right hand. At the end, there is a variable for each hand and three readings for each of those variables. So the result is three observations for each specimen, since specimens were nested within gender. The SAS code follows.
```proc sort data=measurements out=measurements; by gender spec hand;
run;

data measurements;
set measurements(drop=ra_ fcr_ pl_ mn_ fcu_ unvb_);
obnumber = _n_;
percentra = ra/width;
percentfcr = fcr/width;
percentpl = pl/width;
percentmn = mn/width;
percentfcu = fcu/width;
percentunvb = unvb/width;
run;

proc print data=measurements u; run;

proc format;
value \$loc 'percentra'='r'
'percentmn'='m'
'percentunvb'='u';
run;

proc transpose data=measurements
out=talldata(rename=(col1=hand1 col2=hand2)) name=location;
by gender spec;
var percentra percentmn percentunvb;
run;

proc print u data=talldata; run;

proc plot data=talldata;
format location \$loc.;
plot hand1*hand2=location; run;
```
There are at least four solutions in `R`: customized programming, the builtin `reshape` function, the `reShape` function in the `Hmisc` package, and the `reshape` package. The following solution is based on the last approach. A simulated dataset is used that is similar in design to Rafe's but not exactly the same.

```library(Hmisc)
library(reshape)
set.seed(1)
d <- expand.grid(gender=c('f','m'), spec=c('a','b','c'), hand=c('L','R'),
loc=LETTERS[1:5])
n <- nrow(d)
d <- transform(d, p1=runif(n), p2=runif(n), p3=runif(n))
options(digits=2)
```
```  gender spec hand loc   p1   p2   p3
1      f    a    L   A 0.27 0.91 0.99
2      m    a    L   A 0.37 0.29 0.50
3      f    b    L   A 0.57 0.46 0.48
4      m    b    L   A 0.91 0.33 0.17
5      f    c    L   A 0.20 0.65 0.75
6      m    c    L   A 0.90 0.26 0.45
```

```m <- melt(d, measure.var=c('p1','p2','p3'), preserve.na=FALSE) # no NAs here but in general ...
```
```  gender spec hand loc variable value
1      f    a    L   A       p1  0.27
2      m    a    L   A       p1  0.37
3      f    b    L   A       p1  0.57
4      m    b    L   A       p1  0.91
5      f    c    L   A       p1  0.20
6      m    c    L   A       p1  0.90
```

```head(cast(m, ... ~ hand + variable))
head(cast(m, ... ~ variable + hand))
```
```  gender spec loc L_p1 L_p2  L_p3 R_p1  R_p2 R_p3
1      f    a   A 0.27 0.91 0.992 0.94 0.479 0.51
2      f    a   B 0.69 0.35 0.036 0.38 0.777 0.99
3      f    a   C 0.27 0.76 0.729 0.48 0.240 0.61
4      f    a   D 0.79 0.46 0.530 0.78 0.270 0.45
5      f    a   E 0.73 0.92 0.724 0.07 0.148 0.38
6      f    b   A 0.57 0.46 0.484 0.63 0.084 0.23

gender spec loc p1_L p1_R p2_L  p2_R  p3_L p3_R
1      f    a   A 0.27 0.94 0.91 0.479 0.992 0.51
2      f    a   B 0.69 0.38 0.35 0.777 0.036 0.99
3      f    a   C 0.27 0.48 0.76 0.240 0.729 0.61
4      f    a   D 0.79 0.78 0.46 0.270 0.530 0.45
5      f    a   E 0.73 0.07 0.92 0.148 0.724 0.38
6      f    b   A 0.57 0.63 0.46 0.084 0.484 0.23
```
For more information about `reshape` see http://had.co.nz/reshape .
Topic revision: r1 - 15 Mar 2007, FrankHarrell

• Biostatistics Webs Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Vanderbilt Biostatistics Wiki? Send feedback