You are here:
Vanderbilt Biostatistics Wiki
>
Main Web
>
FrankHarrellrmanip
>
ManyDataFrames
(10 Jul 2008,
JohnHarrell
)
(raw view)
E
dit
A
ttach
---+ Examples of Dealing with Multiple Data Frames Stored in a List ---+++ Common Application: Used sasxport.get to import multiple SAS datasets into one S list object <verbatim> > a <- data.frame(x1=1:3, x2=c('a','b','c'), x3=2:4) > a <- upData(a, labels=c(x1='Label for x1',x3='Label for x3'), + units=c(x1='mmHg', x3='minutes')) > b <- data.frame(x1=3:5, x4=5:7) > b <- upData(b, labels=c(x1='Label for x1'), units=c(x1='cm')) > d <- data.frame(x5=1:3, x6=2:4) > w <- llist(a,b,d) # llist in Hmisc remembers argument names > contents(w) Obs Var Var.NA a 3 3 0 b 3 2 0 d 3 2 0 > for(u in names(w)) print(describe(w[[u]], descript=u) a 3 Variables 3 Observations --------------------------------------------------------------------------- x1 : Label for x1 [mmHg] n missing unique Mean 3 0 3 2 1 (1, 33%), 2 (1, 33%), 3 (1, 33%) --------------------------------------------------------------------------- x2 n missing unique 3 0 3 a (1, 33%), b (1, 33%), c (1, 33%) --------------------------------------------------------------------------- x3 : Label for x3 [minutes] n missing unique Mean 3 0 3 3 2 (1, 33%), 3 (1, 33%), 4 (1, 33%) --------------------------------------------------------------------------- b 2 Variables 3 Observations . . . > n <- unlist(lapply(w, names)) > datadict <- + data.frame(dataset=rep(names(w), sapply(w,length)), + variable=n, + label=unlist(lapply(w, function(x) sapply(x, label))), + units=unlist(lapply(w, function(x) sapply(x, units))), + row.names=NULL) > datadict dataset variable label units 1 a x1 Label for x1 mmHg 2 a x2 3 a x3 Label for x3 minutes 4 b x1 Label for x1 cm 5 b x4 6 d x5 7 d x6 > ## print in order of variable names > i <- order(datadict$variable) > datadict[i,] dataset variable label units 1 a x1 Label for x1 mmHg 4 b x1 Label for x1 cm 2 a x2 3 a x3 Label for x3 minutes 5 b x4 6 d x5 7 d x6 > ## check for inconsistencies in labels or units (when non-blank) > chka <- function(atr) { > w <- tapply(datadict[[atr]], datadict$variable, + function(x) length(unique(x[x != ""]))) + if(any(w > 1)) + cat('\nVariables with inconsistent ', atr, ' across datasets:\n', + paste(names(w[w > 1]),collapse=' '),'\n', sep='') + invisible() + } > chka('label') > chka('units') Variables with inconsistent units across datasets: x1 </verbatim> -- Main.FrankHarrell - 13 Jun 2004
E
dit
|
A
ttach
|
P
rint version
|
H
istory
: r2
<
r1
|
B
acklinks
|
V
iew topic
|
Edit
w
iki text
|
M
ore topic actions
Topic revision: r2 - 10 Jul 2008,
JohnHarrell
Main
Department Home Page
Biostatistics Graduate Program
Vanderbilt University Medical Center
Main Web
Main Web Home
Search
Recent Changes
Changes
Topic list
Biostatistics Webs
Archive
Main
Sandbox
System
Register
|
Log In
Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Vanderbilt Biostatistics Wiki?
Send feedback