(14 Nov 2006, TheresaScott)
---+ The generic datadensity() function

The =datadensity()= function is a generic function used to show data densities in more complex situations. There are two class-specific methods of the generic =datadensity()= function:

   1 The =Hmisc= package's =datadensity.data.frame()= function, which displays the variables in a data frame.
   2 The =Design= package's =datadensity.plot.Design()= function, which can be used in conjunction with the =Design= package's =plot.Design()= function. The =plot.Design()= function plots the results of a regression model fit with one of the =Design= package's regression functions (e.g., =ols()=, =lrm()=, and =cph()=).

<highlight>
library(Hmisc)
library(Design)
# List the class-specific methods available for the generic
methods(datadensity)
</highlight>

To illustrate the differences between the two methods, let's use the =samplefile.txt= data file.

   * [[%ATTACHURL%/samplefile.txt][samplefile.txt]]

<highlight>
x <- read.table("samplefile.txt", header=TRUE)
# Attach labels, factor levels, and units to the variables
x <- upData(x,
   labels=c(age="Age", race="Race", sex="Sex", weight="Weight",
      visits="No. of Visits", tx="Treatment"),
   levels=list(sex=c("Female", "Male"),
      race=c("Black", "Caucasian", "Other"),
      tx=c("Drug X", "Placebo")),
   units=c(age="years", weight="lbs."))
contents(x)
</highlight>

Let's first illustrate the =Hmisc= package's =datadensity.data.frame()= method. As mentioned, this method displays the variables of a data frame. More specifically, rug plots are used to display continuous variables and, by default, bar plots are used to display the frequencies of categorical, character, or discrete numeric variables.

   * By default, the =datadensity.data.frame()= function constructs one axis (i.e., one strip) per variable in the data frame.
   * Variable names appear to the left of the axes, and the number of missing values (if greater than zero) appears to the right of the axes.
   * For categorical or character variables, only the first few characters of each level are used when the total length of the value labels exceeds 200.
   * An optional =group== variable can be used for stratification, where the different strata are depicted using different colors.
   * If the =q== argument is specified, the desired quantiles (over all groups) are displayed with solid triangles below each axis.

Here are some specific examples.

<highlight>
datadensity(x)
datadensity(x, which = "continuous")
datadensity(x, group = x$race)
datadensity(x, group = x$tx)
datadensity(x, ranges = list(age = c(5, 100)))
datadensity(x, q = c(0.25, 0.5, 0.75)) # tiny triangles
datadensity(x, labels = as.character(contents(x)$contents[,"Labels"]))
</highlight>

Now let's illustrate the =Design= package's =datadensity.plot.Design()= method. For this, let's first add some variables to our =x= data frame in order to fit a logistic regression model using the =Design= package's =lrm()= function.

<highlight>
n <- nrow(x)   # number of observations; needed by runif(n) below
x <- upData(x,
   # Specify population model for log odds that Y=1
   L = .4*(sex=='Male') + .045*(age-50) +
      (log(weight - 10)-5.2)*(-2*(sex=='Female') + 2*(sex=='Male')),
   # Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
   y = ifelse(runif(n) < plogis(L), 1, 0))
ddist <- datadist(x); options(datadist='ddist')
mfit <- lrm(y ~ visits + sex * (age + rcs(weight,4)), data = x, x=TRUE, y=TRUE)
anova(mfit)
# Plot the fitted effect of age, then add the data density of age to that plot
z <- plot(mfit, age=NA)
with(x, datadensity(z, age))
</highlight>
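The simulation step in the last example depends on =samplefile.txt=. For readers without that file, here is a minimal base-R sketch of the same idea: build a log odds value =L= from the covariates, convert it to a probability with =plogis()=, and draw a binary =y=. The data frame =sim=, its sample size, the seed, and the covariate ranges are all made up for illustration and are not taken from the sample file.

```r
# Hypothetical stand-in for samplefile.txt: every value below is simulated
set.seed(1)   # arbitrary seed, used only so the sketch is reproducible
n <- 200      # assumed sample size
sim <- data.frame(
  age    = round(runif(n, 20, 80)),     # assumed age range, in years
  weight = round(runif(n, 100, 250)),   # assumed weight range, in lbs.
  sex    = factor(sample(c("Female", "Male"), n, replace = TRUE))
)

# Log odds that y = 1, using the same form as the population model above
sim$L <- with(sim,
  .4 * (sex == "Male") + .045 * (age - 50) +
  (log(weight - 10) - 5.2) * (-2 * (sex == "Female") + 2 * (sex == "Male")))

# plogis(L) equals 1 / (1 + exp(-L)), i.e. Prob(y = 1)
sim$p <- plogis(sim$L)

# Set y = 1 whenever a uniform draw falls below that probability
sim$y <- ifelse(runif(n) < sim$p, 1, 0)

# Frequencies of the simulated outcome
table(sim$y)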