for
loop Often, we wish to generate multiple regression models that are all similar, but all different. For example, all of the models have the same outcome and main covariate, but each has a different second covariate.
We could assign each individual model, copying and pasting the model expression and appropriately change the name of the second covariate, but this could take a lot of time and typing.
Instead, consider assigning the models using afor
loop, where the loop iterates over the values of the second covariate.
For this example, let's use our old standby -- the samplefile.txt
data file.
Let's first load the necessary packages, read in our data file, and make some changes to our data frame -- add some variable labels, level labels, and units.
lrm(gt10visits ~ tx + covariate2, data = x)
, where covariate2
can have the value age
, race
, sex
, and weight
. We would also like to keep the model separate by incorporating the value of covariate2
into the name.
We can do this using the following code:
"age"
). Because of this, we can use the paste()
function to past together our formula, which is in the form of a character string. Unfortunately, we can not supply this character representation into the Design
package's lrm()
function; it won't recognize it. Instead, we use the as.formula()
function to coerce the pasted together formula character string (e.g., "gt10visits ~ tx + age"
) to a formula class, which the lrm()
function will recognize.
Another problem with character strings arises when we try to assign the resulting model to a name. Normally, we assign a name to an object using the assignment operator <-
. For example, m1 <- lrm(gt10visits ~ tx + age, data = x)
. However, in this example, we want to incorporate the value of the second covariate into the name of the object (e.g., m.age
). We can generate these unique model names using paste("m", i, sep = ".")
, but if you tried to evaluate the following code, you would get an error:
paste("m", i, sep = ".") <- lrm(as.formula(modelformula), data = x)
The reason you would get an error is because the assignment operator <-
will not accept a character string on its left hand side. An alternative is to use the assign()
function, which the assignment operator is a shortcut for. With the assign()
function, the name of the object is given as a character string.
Now that we've figured out how to correctly generate and assign each of the models, let's think ahead to what we might want to do with them. Most likely, we'll want to generate a summary and/or plot of each model. Based on the for
loop we used above, we now have N defined lrm
objects (model fits), where N is the number of variables in varnames
. We can see this using the ls()
function. So, if we want to manipulate all of the defined lrm
objects, we'll need to reference them all, most likely, in another for
loop.
An alternative would be to generate an object that contains each of the model fits. In R, the easiest way to do this would be to generate a list, where each component of the list would be one of the fitted models --- that is, we would have a list of lists (the result of the lrm
function is a list). Having a list of lists would allow us to use one the apply
functions to generate a summary and/or plot of each component of the list (i.e., each fitted model).
To do this, we can use the following code as an alternative to the code above:
vector()
function. We then name each component of the list, so we can assign the corect model fit to each slot in the for
loop. We can use the names()
function and str()
(structure) function to look at what modelfits
contains --- BE AWARE, the str()
function returns a lot of output.