You are here:
Vanderbilt Biostatistics Wiki
>
Main Web
>
Seminars
>
RClinic
>
DMEx4
(15 Mar 2007,
FrankHarrell
)
(raw view)
E
dit
A
ttach
---+ Rafe's Dataset Reshape Problem Rafe posed the following problem. The original data set had three measurements for each 'specimen' (spec). The goal was was to turn a wider data set into one that was three times as tall. There were two rows for each specimen, one for the left hand and one for the right hand. At the end, there is a variable for each hand and three readings for each of those variables. So the result is three observations for each specimen, since specimens were nested within gender. The SAS code follows. <verbatim> proc sort data=measurements out=measurements; by gender spec hand; run; data measurements; set measurements(drop=ra_ fcr_ pl_ mn_ fcu_ unvb_); obnumber = _n_; percentra = ra/width; percentfcr = fcr/width; percentpl = pl/width; percentmn = mn/width; percentfcu = fcu/width; percentunvb = unvb/width; run; proc print data=measurements u; run; proc format; value $loc 'percentra'='r' 'percentmn'='m' 'percentunvb'='u'; run; proc transpose data=measurements out=talldata(rename=(col1=hand1 col2=hand2)) name=location; by gender spec; var percentra percentmn percentunvb; run; proc print u data=talldata; run; proc plot data=talldata; format location $loc.; plot hand1*hand2=location; run; </verbatim> There are at least four solutions in =R=: customized programming, the builtin =reshape= function, the =reShape= function in the =Hmisc= package, and the =reshape= *package*. The following solution is based on the last approach. A simulated dataset is used that is similar in design to Rafe's but not exactly the same. <highlight> library(Hmisc) library(reshape) set.seed(1) d <- expand.grid(gender=c('f','m'), spec=c('a','b','c'), hand=c('L','R'), loc=LETTERS[1:5]) n <- nrow(d) d <- transform(d, p1=runif(n), p2=runif(n), p3=runif(n)) options(digits=2) head(d) </highlight> <verbatim> gender spec hand loc p1 p2 p3 1 f a L A 0.27 0.91 0.99 2 m a L A 0.37 0.29 0.50 3 f b L A 0.57 0.46 0.48 4 m b L A 0.91 0.33 0.17 5 f c L A 0.20 0.65 0.75 6 m c L A 0.90 0.26 0.45 </verbatim> <highlight> m <- melt(d, measure.var=c('p1','p2','p3'), preserve.na=FALSE) # no NAs here but in general ... head(m) </highlight> <verbatim> gender spec hand loc variable value 1 f a L A p1 0.27 2 m a L A p1 0.37 3 f b L A p1 0.57 4 m b L A p1 0.91 5 f c L A p1 0.20 6 m c L A p1 0.90 </verbatim> <highlight> head(cast(m, ... ~ hand + variable)) head(cast(m, ... ~ variable + hand)) </highlight> <verbatim> gender spec loc L_p1 L_p2 L_p3 R_p1 R_p2 R_p3 1 f a A 0.27 0.91 0.992 0.94 0.479 0.51 2 f a B 0.69 0.35 0.036 0.38 0.777 0.99 3 f a C 0.27 0.76 0.729 0.48 0.240 0.61 4 f a D 0.79 0.46 0.530 0.78 0.270 0.45 5 f a E 0.73 0.92 0.724 0.07 0.148 0.38 6 f b A 0.57 0.46 0.484 0.63 0.084 0.23 gender spec loc p1_L p1_R p2_L p2_R p3_L p3_R 1 f a A 0.27 0.94 0.91 0.479 0.992 0.51 2 f a B 0.69 0.38 0.35 0.777 0.036 0.99 3 f a C 0.27 0.48 0.76 0.240 0.729 0.61 4 f a D 0.79 0.78 0.46 0.270 0.530 0.45 5 f a E 0.73 0.07 0.92 0.148 0.724 0.38 6 f b A 0.57 0.63 0.46 0.084 0.484 0.23 </verbatim> For more information about =reshape= see http://had.co.nz/reshape .
E
dit
|
A
ttach
|
P
rint version
|
H
istory
: r1
|
B
acklinks
|
V
iew topic
|
Edit
w
iki text
|
M
ore topic actions
Topic revision: r1 - 15 Mar 2007,
FrankHarrell
Main
Department Home Page
Biostatistics Graduate Program
Vanderbilt University Medical Center
Main Web
Main Web Home
Search
Recent Changes
Changes
Topic list
Biostatistics Webs
Archive
Main
Sandbox
System
Register
|
Log In
Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Vanderbilt Biostatistics Wiki?
Send feedback