Common R Programming Techniques That Can Be Improved
Attaching Data Frames
Attaching a data frame makes it easy to reference its variables, but things get confusing when two data frames are attached at once, and users frequently forget to detach an attached object. It is better to use the data= argument (when calling functions that use the statistical modeling language), with, or within. within is an alternative to transform and upData. Unlike with, within lets you change or add variables in the referenced data frame, provided the data frame is a storable object (e.g., not subscripted). within returns the modified data frame rather than changing it in place, so assign the result back to keep the changes.
library(lattice) # xyplot is provided by the lattice package
xyplot(y ~ x | g, data=mine)
with(mine,
{
  plot(x, y)
  plot(x, z)
})
mine <- within(mine,
{
  y <- 2*y
  x <- x-1
  new <- x + y
})
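As a rough illustration of how with and within differ, the sketch below uses a small made-up data frame. The values and the names total and mine2 are assumptions for illustration; only mine, x, y, and new come from the example above.

```r
# Hypothetical toy data frame; the values are invented for illustration.
mine <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# with() evaluates an expression using the columns and returns its value.
# The data frame itself is left untouched.
total <- with(mine, sum(x * y))

# within() returns a modified copy, so the result must be assigned
# (here to a new name, mine2, so the original can be compared).
mine2 <- within(mine, {
  y   <- 2 * y
  x   <- x - 1
  new <- x + y   # uses the already-updated x and y
})

print(total)
print(mine)    # original columns are unchanged
print(mine2)   # x and y modified, new added
```

Assigning the result to the original name instead (mine <- within(mine, ...)) would overwrite the data frame with the modified copy.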
Logical Operations
R can subscript an object either with an integer vector holding the positions of the elements that meet a condition, or with a logical TRUE/FALSE vector whose length equals that of the object tested. The latter usually leads to more readable and reliable code. Instead of
male <- which(sex=='male')
al <- which(state=='AL')
mean(x[intersect(male,al)]) # mean of male Alabamians
use
mean(x[sex=='male' & state=='AL']) #or:
maleal <- sex=='male' & state=='AL'
mean(x[maleal])
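The two approaches can be compared on a small made-up data set. This is only a sketch: the values and the names m1, m2, n, sex_na, and sel are assumptions for illustration. It also shows one difference worth knowing: which drops missing values, whereas a logical subscript containing NA yields an NA element.

```r
# Hypothetical toy vectors; the values are invented for illustration.
sex   <- c("male", "female", "male", "male", "female")
state <- c("AL", "AL", "TN", "AL", "TN")
x     <- c(10, 20, 30, 40, 50)

# Integer-subscript approach
male <- which(sex == "male")
al   <- which(state == "AL")
m1   <- mean(x[intersect(male, al)])

# Logical-vector approach
maleal <- sex == "male" & state == "AL"
m2     <- mean(x[maleal])

# The logical vector can be reused directly, e.g. to count matches
n <- sum(maleal)

# With a missing value, the two kinds of subscript behave differently
sex_na    <- sex
sex_na[1] <- NA
sel <- sex_na == "male" & state == "AL"  # first element is NA
print(x[sel])          # includes an NA element
print(x[which(sel)])   # which() drops the NA
```

When the tested variables may contain missing values, wrapping the logical vector in which, or passing na.rm=TRUE to the summary function, is one way to avoid an NA result.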