The `rms` Package for R : Regression Modeling Strategies

Documentation | CRAN

Massive update for version 5.0-0 including interactive plotly graphics and advanced html tables useful with RStudio rmarkdown/knitr
Non-downward-compatible change: if using latex methods for models put this at the top of the code: options(prType='latex'). To use html put options(prType='html').
Examples in an R markdown/knitr html
- Source code
Video demonstrating survplotp interactive survival curves
Online help with executable examples
Nicely formatted documentation
Github repository and its Changelog
Overview of the rms Package
Manual
Graphics output of help file examples (note that columns of anova tables inside graphics do not line up correctly because of a font family not being available when running R CMD check)
Reference card | For regular printing

Evolution

rms is an R package that is a replacement for the Design package. The package accompanies FE Harrell's book Regression Modeling Strategies. rms does not use any C level interfaces to other packages as Design did to the survival package. Thus rms will be easier to maintain.

rms has cleaned up graphics routines to make them more modular, to use lattice graphics, and to make it easier to use ggplot2 graphics. Defaults for confidence bands are now gray scale-shaded polygons. Starting with version 5.0-0 plotly interactive graphics and HTML output are implemented for many of the functions.

User-Visible Changes

The most visible change to the user is the replacement of the plot.Design function with the Predict, plot.Predict, and bplot functions. plot.Predict is used for bivariate graphics (using lattice), and bplot is used for 3-d graphics; bplot originally used the base graphics functions image, contour, and persp, and was rewritten in version 2.2-0 to use lattice graphics. Multi-panel lattice graphics are usually better than 3-d graphics for showing the effects of multiple predictors varying simultaneously. The output of Predict is suitable for direct use by lattice (e.g., the xyplot function) and ggplot2 if you don't want to use plot.Predict. There is also a ggplot method.
As of version 2.2-0, the user no longer specifies predictor=. or predictor=NA when a predictor is varying over its default range in Predict, summary, nomogram, survplot, or gendata. Instead, just list the predictor name in the call.

rms also implements a high-level interface to the quantreg package through the new Rq function for quantile regression.
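
As a rough illustration of the Rq interface, the sketch below fits a median regression to simulated data and then passes the fit to Predict like any other rms fit. The data frame and the variable names age and sbp are made up for this example; only Rq, datadist, rcs, Predict, and plot come from rms.

```r
library(rms)

# Simulated data; variable names are hypothetical and for illustration only
set.seed(1)
n <- 300
d <- data.frame(age = runif(n, 20, 80))
d$sbp <- 100 + 0.5 * d$age + rnorm(n, sd = 10)

# rms plotting and summary functions need the predictor distributions
dd <- datadist(d)
options(datadist = 'dd')

# Median regression (tau = 0.5) with a restricted cubic spline in age
fit <- Rq(sbp ~ rcs(age, 4), tau = 0.5, data = d)
print(fit)

# The Rq fit can be used with the usual rms tools
p <- Predict(fit, age)
print(plot(p))
```

Other quantiles can be requested by changing tau, for example tau = 0.9.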

With version 3.1-0 a significant change is the addition of a latex argument to print methods for fit objects. This causes the fit object to be typeset using LaTeX.
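
According to the note at the top of this page, from version 5.0-0 onward the output format is selected with options(prType=...) rather than per call. A minimal sketch of that usage, assuming a simple ols fit on simulated data (the variables x and y are made up):

```r
library(rms)

# Simulated data for illustration only
set.seed(2)
d <- data.frame(x = rnorm(100))
d$y <- d$x + rnorm(100)
fit <- ols(y ~ x, data = d)

# Select LaTeX output for rms print methods; 'html' is the other documented choice
old <- options(prType = 'latex')
print(fit)      # intended to emit LaTeX markup instead of plain text
options(old)    # restore the previous setting
```

Inside an R markdown or knitr document the option would normally be set once at the top of the code, as the note above describes.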

Unlike Design, beginning with version 3.2-1 the rms validate and calibrate functions do not print summaries of which variables were selected if bw is TRUE. This is left up to the print methods.
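
The following sketch shows where the variable-selection summary now appears: validate runs silently and the print method reports the selected variables. The data are simulated and the variable names are hypothetical; the B argument of the print method is the one described under version 3.3-0 below.

```r
library(rms)

# Simulated binary outcome; x3 is pure noise
set.seed(3)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- rbinom(n, 1, plogis(d$x1 - 0.5 * d$x2))

# x=TRUE, y=TRUE store the design matrix and response needed for resampling
fit <- lrm(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)

# Bootstrap validation, repeating fast backward step-down in every resample
v <- validate(fit, B = 40, bw = TRUE)

# The print method summarizes which variables were retained;
# B limits how many resamples are listed
print(v, B = 10)
```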

The linear predictor stored in lrm fit objects now uses the first intercept for proportional odds models and not the middle one.

Other new features include bootstrap nonparametric confidence bands for predicted values, simultaneous confidence bands, and simultaneous contrasts.
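
A minimal sketch of bootstrap nonparametric confidence bands, assuming simulated data with made-up variable names: the fit is passed through bootcov with coef.reps=TRUE, and Predict then uses the stored bootstrap coefficients. Simultaneous bands would instead use conf.type='simultaneous', which relies on the multcomp package and is not run here.

```r
library(rms)

# Simulated data for illustration only
set.seed(4)
n <- 200
d <- data.frame(x1 = rnorm(n),
                x2 = factor(sample(c('a', 'b'), n, replace = TRUE)))
d$y <- d$x1 + 0.5 * (d$x2 == 'b') + rnorm(n)
dd <- datadist(d)
options(datadist = 'dd')

# bootcov needs the design matrix and response stored in the fit
fit <- ols(y ~ rcs(x1, 4) + x2, data = d, x = TRUE, y = TRUE)
b   <- bootcov(fit, B = 200, coef.reps = TRUE)

# Percentile bootstrap confidence bands for predicted values
p_boot <- Predict(b, x1, boot.type = 'percentile')
print(plot(p_boot))
```

The number of resamples B = 200 is kept small here only so the example runs quickly.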

Example Changes Needed for Plotting Predicted Values

plot(fit, x1=NA, x2=NA, ...) changed to

    p <- Predict(fit, x1, x2, ...)
    plot(p)                      # ?plot.Predict for details; produces a lattice object
    print(plot(p))               # needed if using Sweave or are inside { }
    plot(p, ~ x1 | x2)
    plot(p, ~ x1, groups='x2')
    ggplot(p)

plot(fit, ..., method='image' or 'contour' or 'persp') changed to

    p <- Predict(..., np=50)     # type ?Predict for details
    bplot(p, ...)                # ?bplot for details; uses lattice graphics; can also give a formula
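
The two conversions above can be tried end to end with the sketch below. It uses simulated data, and the names x1, x2, and grp are placeholders chosen for this example rather than anything from the package.

```r
library(rms)

# Simulated data for illustration only
set.seed(5)
n <- 300
d <- data.frame(x1  = rnorm(n),
                x2  = rnorm(n),
                grp = factor(sample(c('a', 'b'), n, replace = TRUE)))
d$y <- d$x1 + 0.5 * d$x2 + 0.5 * (d$grp == 'b') + rnorm(n)
dd <- datadist(d)
options(datadist = 'dd')

fit <- ols(y ~ rcs(x1, 4) + x2 + grp, data = d)

# New style: name the predictors to vary, then plot the Predict object
p <- Predict(fit, x1, grp)
print(plot(p, ~ x1 | grp))             # one panel per level of grp
print(plot(p, ~ x1, groups = 'grp'))   # superposed curves in one panel

# New style for 3-d displays: predict over a grid, then call bplot
p2 <- Predict(fit, x1, x2, np = 50)
print(bplot(p2))
```

The explicit print() calls are only needed in non-interactive contexts, as noted in the comments above.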

datadensity is no longer a separate function to use after the predictions are plotted; the user specifies data= to plot.Predict to get a rug plot, for example.
Legend was changed to iLegend
nomogram does not plot by default. You must use plot(nomogram result) to plot, and the plotting arguments are separated into the arguments for plot.nomogram.
survplot defaults to conf='bands' and this now produces shaded confidence bands instead of bands made by pairs of lines
Varcov is changed to the more R-base concordant vcov
glmD is renamed Glm
glsD is renamed Gls
Rq allows fitting of quantile regression models with the full complement of rms capabilities including restricted cubic splines for covariates, restricted interaction surfaces, effect plots, bootstrap covariance estimation, and nomograms
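
Several of the renamed or restructured functions in this list can be seen together in a short sketch. The data are simulated and the variable names age and sex are made up; the calls themselves (nomogram, plot, vcov, Glm) are the ones named above.

```r
library(rms)

# Simulated data for illustration only
set.seed(6)
n <- 200
d <- data.frame(age = runif(n, 30, 70),
                sex = factor(sample(c('f', 'm'), n, replace = TRUE)))
d$y <- 0.05 * d$age + 0.5 * (d$sex == 'm') + rnorm(n)
dd <- datadist(d)
options(datadist = 'dd')

fit <- ols(y ~ age + sex, data = d)

# nomogram() only builds the object; plotting is a separate step
nom <- nomogram(fit)
plot(nom)

# vcov() replaces the old Varcov()
V <- vcov(fit)

# Glm() is the renamed glmD()
gfit <- Glm(y ~ age + sex, family = gaussian(), data = d)
```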

Other Changes

Less visible to the user are the following changes:

Removed all uses of single() or storage.mode 'single'
Removed all use of .newSurvival
predictDesign changed to predictrms, added ref.zero argument
There is no longer a generic nomogram function, so nomogram.Design was renamed nomogram

To Do

When bug fixed in survfit.coxph.null remove n.all stuff
Make Function.rms and latex.rms work when the model contains terms like rcs()*rcs() (thanks: Rob James 30Aug10)

Substantial Correction to Hmisc package `fit.mult.impute` function

In May 2013 a correction was made for an upcoming release of the Hmisc package that fixes a major error in covariance matrix calculation when fit.mult.impute was used with glm . The covariance matrix returned from vcov and used in a variety of other rms functions was incorrect, not referring to imputation-corrected estimates

Recent Changes

See Changelog

Changes in version 4.0-0 (2013-07-10)

Cleaned up label logic in Surv, made it work with interval2 (thanks:Chris Andrews)
Fixed bug in val.prob - wrong denominator for Brier score if obs removed for logistic calibration
Fixed inconsistency in predictrms where predict() for Cox models used a design matrix that was centered on medians and modes rather than means (thanks: David van Klaveren <d.vanklaveren.1@erasmusmc.nl>)
Added mean absolute prediction error to Rq output
Made pr argument passed to predab.resample more encompassing
Fixed logLik method for ols
Made contrast.rms and summary.rms automatically compute bootstrap nonparametric confidence limits if fit was run through bootcov
Fixed bug in Predict where conf.type='simultaneous' was being ignored if bootstrap coefficients were present
For plot.Predict made default gray scale shaded confidence bands darker
For bootcov exposed eps argument to fitters and default to lower value
Fixed bug in plot.pentrace regarding effective.df plotting
Added setPb function for pop-up progress bars for simulations; turn off using options(showprogress=FALSE) or options(showprogress='console')
Added progress bars for predab.resample (for validate, calibrate) and bootcov
Added bootBCa function
Added seed to bootcov object
Added boot.type='bca' to Predict, contrast.rms, summary.rms
Improved summary.rms to use t critical values if df.residual defined
Added simultaneous contrasts to summary.rms
Fixed calculation of Brier score, g, gp in lrm.fit by handling special case of computing linear predictor when there are no predictors in the model
Fixed bug in prModFit preventing successful latex'ing of penalized lrms
Removed \synopsis from two Rd files
Added prmodsel argument to predab.resample
Correct Rd files to change Design to rms
Restricted NAMESPACE to functions expected to be called by users
Improved Fortran code to use better dimensions for array declarations
Added the basic bootstrap for confidence limits for bootBCa, contrast, Predict, summary
Fixed bug in latex.pphsm, neatened pphsm code
Neatened code in rms.s
Improved code for bootstrapping ranks of variables in anova.rms help file
Fixed bug in Function.rms - undefined Nami if strat. Thanks: douglaswilkins@yahoo.com
Made quantreg be loaded at end of search list in Rq so it doesn't override latex generic in Hmisc
Improved plot.summary.rms to use blue of varying transparency instead of polygons to show confidence intervals, and to use only three confidence levels by default: 0.9 0.95 0.99
Changed Surv to Srv; use of Surv in fitting functions will result in lack of time labels and assumption of Day as time unit; no longer override Surv in survival
Changed calculation of Dxy (and c-index) to use survival package survConcordance service function when analyzing (censored) survival time; very fast
Changed default dxy to TRUE in validate.cph, validate.psm
Dxy is now negated if correlating Cox model log relative hazard with survival time
Removed dxy argument from validate.bj as it always computed
Added Dxy to standard output of cph, psm
Added help file for Srv
Removed reference to ps.slide from survplot help page
Added the general ordinal regression fitting function orm (and orm.fit) which efficiently handles thousands of intercepts because of sparse matrix representation of the information matrix; implements 5 distribution families
Added associated functions print.orm, vcov.orm, predict.orm, Mean.orm, Quantile.orm, latex.orm, validate.orm
Changed predab.resample to allow the number of intercepts to vary from resample to resample
Fixed bug in Mean.cph (thanks: Komal Kapoor <komal.bitsgoa@gmail.com>)
Removed incl.non.slopes and non.slopes arguments from all predict methods
Changed all functions to expect predict(..., type='x') to not return intercept columns, and all fitting functions to not store column of ones if x=TRUE
Changed nomogram argument intercept to kint, used default as fit$interceptRef
Made bootcov behave in a special way for orm, to use linear interpolation to select a single intercept targeted at median Y
Revamped all of rms to never store intercepts in design matrices in fit objects and to add intercepts on demand inside predictrms
Added new function generator ExProb to compute exceedance probabilities from orm fits
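
A small sketch of orm and its function generators follows, assuming a simulated continuous response so that the model has many intercepts. The variable names are placeholders. Mean, Quantile, and ExProb each return a function that is evaluated at linear predictor values.

```r
library(rms)

# Simulated continuous response: nearly every value of y is distinct
set.seed(7)
x1 <- runif(200)
y  <- x1 + runif(200)

f  <- orm(y ~ x1)    # default family is logistic (proportional odds)

# Linear predictors at two covariate settings
lp <- predict(f, newdata = data.frame(x1 = c(0.2, 0.8)))

M  <- Mean(f)        # generator for the estimated mean of y
qu <- Quantile(f)    # generator for estimated quantiles of y
ex <- ExProb(f)      # generator for exceedance probabilities

M(lp)                # estimated means at x1 = 0.2 and 0.8
qu(0.5, lp)          # estimated medians at the same settings
w <- ex(lp)          # exceedance probabilities at the same settings
```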

Changes in version 3.7-0 (2013-03-18)

Cleaned up label logic in Surv, made it work with interval2 (thanks:Chris Andrews)
Fixed bug in val.prob - wrong denominator for Brier score if obs removed for logistic calibration
Fixed inconsistency in predictrms where predict() for Cox models used a design matrix that was centered on medians and modes rather than means (thanks: David van Klaveren <d.vanklaveren.1@erasmusmc.nl>)
Added mean absolute prediction error to Rq output
Made pr argument passed to predab.resample more encompassing
Fixed logLik method for ols
Made contrast.rms and summary.rms automatically compute bootstrap nonparametric confidence limits if fit was run through bootcov
Fixed bug in Predict where conf.type='simultaneous' was being ignored if bootstrap coefficients were present
For plot.Predict made default gray scale shaded confidence bands darker
For bootcov exposed eps argument to fitters and default to lower value
Fixed bug in plot.pentrace regarding effective.df plotting
Added setPb function for pop-up progress bars for simulations
Added progress bars for predab.resample (for validate, calibrate) and bootcov
Added bootBCa function
Added seed to bootcov object
Added boot.type='bca' to Predict, contrast.rms, summary.rms
Improved summary.rms to use t critical values if df.residual defined
Added simultaneous contrasts to summary.rms
Fixed calculation of Brier score, g, gp in lrm.fit by handling special case of computing linear predictor when there are no predictors in the model
Fixed bug in prModFit preventing successful latex'ing of penalized lrms
Removed \synopsis from two Rd files
Added prmodsel argument to predab.resample
Correct Rd files to change Design to rms
Restricted NAMESPACE to functions expected to be called by users
Improved Fortran code to use better dimensions for array declarations
Added the basic bootstrap for confidence limits for bootBCa, contrast, Predict, summary

Changes in version 3.6-3 (2013-01-11)

Added Li-Shepherd residuals in residuals.lrm.s, become new default (same as ordinary residuals for binary models)
Remove glm null fit usage as this is no longer in R

Changes for version 3.6-2 (10 Dec 12)

bootcov, predab.resample: captured errors in all fits (to ignore bootstrap rep) using tryCatch. Thanks: Max Gordon <max@gforge.se>
predab.resample: made as.matrix(y) conditional to handle change in the survival package whereby the "type" attribute did not exist for a matrix
anova.rms: added new parameter vnames to allow use of variable labels instead of names in anova table; added vinfo attribute
residuals.lrm: removed intercept from partial residuals for binary models
moved comprehensive examples in rmsOverview to ~/rms/demo/all.R; greatly speeds up package checking but demo needs to be run separately for better checking, using demo(all, 'rms')
Fixed survfit.formula to not use .Global environment

Changes for version 3.6-1 (5 Nov 12)

bootcov: set loglik to default to FALSE and added code to fill in missing intercepts in coef vector for prop. odds model when levels of Y not resampled; set coef.reps to default to TRUE
Predict: implemented fun='mean' to get proper penalty for estimating the mean function for proportional odds models
Added usebootcov argument to Predict to allow the user to force the use of bootstrap covariance matrix even when coef.reps=TRUE was in effect for bootcov

Changes for version 3.6-0 (27 Oct 12)

Gls: Updated optimization calls - had become inconsistent with gls and failed if > 1 correlation parameter (thanks: Mark Seeto <markseeto@gmail.com>); removed opmeth argument
print.fastbw: added argument: estimates
survplot.survfit: handled fact that survival:::summary.survfit may not preserve order of strata levels. Also fixed survfit.cph and cph; Thanks: William.Fulp@moffitt.org
plot.Predict: added example showing how to rename variables in plot
print(fit object, latex=TRUE): added latex.naprint.delete, used new Hmisc latexDotchart function to make a dot chart of number of NAs due to each model variable if at least 4 variables have NAs
added trans argument to plot.anova.rms to allow transformed scales
Corrected cph to use model.offset(); thanks: Simon Thornley <s.thornley@auckland.ac.nz>
Changed latex.anova.rms to use REGRESSION instead of TOTAL label
Changed gendata, contrast.rms to allow expand=FALSE to prevent expand.grid from being called to generate all combinations of predictors
Added type= to plot.Predict to allow user to specify a different default line/point type (especially useful when x is categorical)
Corrected bug in offset in psm - made default offset the length of X
Corrected bug in calibrate.psm (fixed -> parms)
predab.resample, calibrate.cph, calibrate.default, calibrate.psm: stopped putting results from overall initial fit into .Global and instead had predab.resample put them in attribute keepinfo, obtained from measure()

Changes in version 3.5-0 (2012-03-24)

contrast.rms: saved conf.type and conf.int in returned object, added to print method
Added debug= to predab.resample so user can see all the training and test sample subscripts
Added validate.Rq function
Fixed bug in Rq that caused 2 copies of fitted.values to be in fit object, which caused fit.mult.impute to double fitted.values
Added how to reorder predictors if using plot(Predict(fit))
Added new function perlcode written by Jeremy Stephens and Thomas Dupont; converts result of Function to Perl code
Fixed partial argument matches in many functions to pass new R checks
Changed matrx and DesignAssign to allow validate.Rq to consider null models; neatened code

Changes for version 3.4-0 (17 Jan 12)

psm: fixed logcorrect logic (thanks: Rob Kushler)
Added suggested package multcomp (required for simultaneous CLs)
Implemented simultaneous confidence intervals in Predict, predictrms, contrast.rms, all specific model predict methods
Add multiplicity adjustment for individual confidence limits computed by contrast.rms, to preserve family-wise coverage using multcomp package
Improved rbind.Predict to preserve order of groups as presented, as levels of .set.
Added example for plot.Predict showing how to suppress predictions for certain intervals/groups from being plotted
Added example in plot.Predict help file for graphing multiple types of confidence bands simultaneously
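
To make the contrast.rms items above more concrete, here is a minimal sketch of several contrasts computed in one call. The data are simulated and the names age and trt are made up. Simultaneous intervals would be requested with conf.type='simultaneous', which needs the multcomp package and is left out of the code.

```r
library(rms)

# Simulated data for illustration only
set.seed(8)
n <- 200
d <- data.frame(age = runif(n, 30, 70),
                trt = factor(sample(c('a', 'b'), n, replace = TRUE)))
d$y <- 0.02 * d$age + 0.7 * (d$trt == 'b') + rnorm(n)
dd <- datadist(d)
options(datadist = 'dd')

fit <- ols(y ~ age * trt, data = d)

# Treatment b versus a at three ages; each age gives one contrast
k <- contrast(fit,
              list(trt = 'b', age = c(40, 50, 60)),
              list(trt = 'a', age = c(40, 50, 60)))
print(k)
```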

Changes for version 3.3-3 (6 Dec 11)

robcov: used vcov to get var-cov matrix
vcov.Glm: gave precedence to $var object in fit
Added residuals.Glm to force call to residuals.glm, and make robcov fail as type="score" is not implemented for glm
Fixed bootcov for Glm to sense NA in coefficients and skip that iteration
Fixed digit -> digits error in latex.rms
Fixed f$coef error in pentrace; thanks christopher.hane@optum.com
Added new feature for Predict() to plot bootstrap nonparametric confidence limits if fit was run through bootcov with coef.reps=TRUE
Added ylim argument to plot.residuals.lrm

Changes for version 3.3-2 (9 Nov 11)

calibrate.default: add var-cov matrix to ols objects
print.lrtest: discarded two formula attributes before printing
Added digits, size, and after arguments for latex methods for model fits, made before argument work with inline=TRUE, changed \needspace to \Needspace in latex.validate and prModFit
latex: fixed to consider digits for main effects
plot.xmean.ordinaly: added new argument cex.points
print.lrm: improved printing of -2 LL overall penalty
plot.calibrate.default: invisibly return prediction errors
plot.Predict: added cex.axis argument to pass to x scales; added subdata
print.pentrace: neatened up output
added title as an argument to all high-level function print methods
prModFit: fixed bug where Score chi2 was not translated to LaTeX
prModFit: changed to use LaTeX longtable style for coefficients etc.
prModFit: added arguments long and needspace
prModFit: suppressed title if title=""
rmsMisc: added nobs.rms and added nobs to object returned by logLik.rms
Added new argument cex.points to plot.xmean.ordinaly
Changed example in anova.rms to use reorder instead of reorder.factor

Changes for version 3.3-1 (1 Jun 11)

Added new example for anova.rms for making dot plots of partial R^2 of predictors
Defined logLik.ols (calls logLik.lm)
Fixed and cleaned up logLik.rms, AIC.rms
Fixed residuals.psm to allow other type= values used by residuals.survreg
Fixed Predict and survplot.rms to allow for case where no covariates present
Fixed bug in val.prob where Eavg wasn't being defined if pl=FALSE (thanks: Ben Haller)
Fixed bug in Predict so that it could get a list or vector from predictrms
Fixed latex.rms to not treat * as a wild card in various contexts (may be interaction)
Fixed predictrms to temporarily get std.err if conf.int is requested even if std.err is not; omitted std.err in returned object if not wanted
Enhanced plot.Predict to allow plots for different predictors to be combined, after running rbind.Predict (varypred argument)
Also enhanced to allow groups= and cond= when varying the predictors
Corrected bug where sometimes would try to plot confidence limits when conf.int=FALSE was given to Predict
Added india, indnl arguments to anova.rms to suppress printing individual tests of interaction/nonlinearity
Changed anova.rms so that if all non-summary terms have (Factor+Higher Order Factor) in their labels, this part of the labels is suppressed (useful with india and indnl)
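
The india and indnl arguments can be compared against the default table with a sketch like the following, which assumes simulated data and placeholder variable names.

```r
library(rms)

# Simulated data with a nonlinear effect of x1
set.seed(9)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- d$x1^2 + d$x2 + rnorm(n)

fit <- ols(y ~ rcs(x1, 4) * x2, data = d)

# Default table, including individual interaction and nonlinearity tests
a_full <- anova(fit)
print(a_full)

# Shorter table with the individual tests suppressed
a_short <- anova(fit, india = FALSE, indnl = FALSE)
print(a_short)
```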

Changes for version 3.3-0 (28 Feb 11)

In survplot.rms, fixed bug (curves were undefined if conf='bands' and labelc was FALSE)
In survfit.cph, fixed bug by which n wasn't always defined
In cph, put survival::: on exact fit call
Quit ignoring zlim argument in bplot; added xlabrot argument
Added caption argument for latex.anova.rms
Changed predab to not print summaries of variables selected if bw=TRUE
Changed predab to pass force argument to fastbw
fastbw: implemented force argument
Added force argument to validate.lrm, validate.bj, calibrate.default, calibrate.cph, calibrate.psm, validate.cph, validate.ols
print.validate: added B argument to limit how many resamples are printed summarizing variables selected if BW=TRUE
print.calibrate, print.calibrate.default: added B argument
Added latex method for results produced by validate functions
Fixed survest.cph to convert summary.survfit std.err to log S(t) scale
Fixed val.surv by pulling surv object from survest result
Clarified in predict.lrm help file that doesn't always use the first intercept
lrm.fit, lrm: linear predictor stored in fit object now uses first intercept and not middle one (NOT DOWNWARD COMPATIBLE but makes predict work when using stored linear.predictors)
Fixed argument consistency with validate methods

Changes for version 3.2-0 (14 Feb 11)

Changed to be compatible with survival 2.36-3 which is now required
Added logLik.rms and AIC.rms functions to be compatible with standard R
Fixed oos.loglik.Glm
Fixed bootcov related to nfit='Glm'
Fixed (probably) old bug in latexrms with strat predictors
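
With logLik.rms and AIC.rms in place, the standard R generics can be called on rms fits. A short sketch on simulated data (variable names are made up):

```r
library(rms)

# Simulated binary outcome; x2 has no true effect
set.seed(10)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(d$x1))

f1 <- lrm(y ~ x1, data = d)
f2 <- lrm(y ~ x1 + x2, data = d)

# Standard generics dispatch to the rms methods
logLik(f1)
AIC(f1)
AIC(f2)
```

Each model is passed to AIC separately here; comparing the two values is left to the reader.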

Changes for Version 3.1-0 (12 Sep10)

Fixed gIndex to not use scale for labeling unless character
Changed default na.action in Gls to na.omit and added a note in the help file that na.delete does not work with Gls
Added terms component to Gls fit object (latex was not working)
Added examples in robcov help file testing sandwich covariance estimator
Added reference related to the effects package under help file for plot.Predict
Added more examples and simulations to gIndex
Fixed ancient bug in lrm.fit Fortran code to handle case where initial estimates are nearly perfect (was trying to step halve); thanks: Dan Hogan
Changed survdiffplot to use gray(.85) for bands instead of gray(.95)
Fixed formatting problem in print.psm
Added prStats and reVector functions to rmsMisc.s
Changed formatting of all print.* functions for model fits to use new prStats function
Added latex=TRUE option to all model fit print methods; requires LaTeX package needspace
Re-wrote printing routines to make use of more general model
Removed long and scale options from cph printing-related routines
Prepare for version 2.36-1 of survival package by adding censor=FALSE argument to survfit.coxph
Added type="ccterms" to various predict methods
Made type="ccterms" the default for partial g-indexes in gIndex, i.e., combine all indirectly related (through interactions) terms
Added Spiegelhalter calibration test to val.prob
Added a check in cph to trigger an error if strata() is used in formula
Fixed drawing of polygon for shaded confidence bands for survplot.survfit (thanks to Patrick Breheny <patrick.breheny@uky.edu>)
Changed default adjust.subtitle in bplot to depend on ref.zero, thanks to David Winsemius <dwinsemius@comcast.net>
Used a namespace and simplified references to a few survival package functions that survival actually exports

Changes for Version 3.0-0 (16May10)

Note: This is a major release because of the addition of new functions gIndex and GiniMd and of the addition of the g-index to all of the regression fitters, in addition to adding type="cterms" to predict.rms

Made Gls not store data label() in residuals object, instead storing a label of 'Residuals'
Fixed handling of na.action and check for presence of offsets in Glm
Added type="cterms" to predict methods; computes combined terms for main effects + any interaction terms involving that main effect; in preparation for new geffects function
Added GiniMd and gIndex functions
Change lrm (lrm.fit) to use the middle intercept in computing Brier score
Added 3 g-indexes to lrm fits
Added 1 g-index to ols, Rq, Glm, Gls fits
Added 2 g-indexes to cph, psm fits
Added g to validate.ols, .lrm, .cph, .psm, but not to validate.bj
Added print.validate to set default digits to 4
Changed validate.lrm to compute 3 indexes even on ordinal response data
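
The g-index is based on Gini's mean difference, the mean absolute difference between all pairs of values, applied to the linear predictor. The sketch below first applies GiniMd to a tiny vector and then looks at the g-index stored in an ols fit; the simulated data and variable names are assumptions made for this example.

```r
library(rms)

# Gini's mean difference of a small vector:
# the average of |1-2|, |1-3|, and |2-3|
GiniMd(c(1, 2, 3))

# Simulated data for illustration only
set.seed(11)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- 2 * d$x + rnorm(n)

fit <- ols(y ~ x, data = d)

# g-index stored with the fit statistics
fit$stats['g']

# Gini's mean difference of the stored linear predictor, for comparison
GiniMd(fit$linear.predictors)
```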

Changes for Version 2.2-0 (23Feb10)

Added levels.only option to survplot.* to remove variablename= from curve labels
Added digits argument to calibrate.default
Added new ref in val.prob help page
Corrected location of dataset in residuals.lrm help page (thanks frederic.holzwarth@bgc-jena.mpg.de)
Fixed latex.rms to latex-escape percent signs inside value labels
Added scat1d.opts to plot.Predict
Changed method of specifying variables to vary by not requiring an equals sign and a dot after the variable name, for Predict, summary, nomogram, gendata, survplot.rms
Added factors argument to Predict to handle the above for survplot
Made gendata a non-generic function, changed the order of its arguments, removed editor options, relying on R de function always
Made latex.summary.rms and latex.anova.rms respect the table.env argument (thanks: Kevin Thorpe <kevin.thorpe@utoronto.ca>)
Fixed bug in calibrate.default related to digits argument
Re-wrote bplot to use lattice graphics (e.g., levelplot contourplot wireframe), allowing for multiple panels for 3-d plots

Changes for Version 2.1-0

Made Predict not return invisibly if predictors not specified
New option nlines for plot.Predict for getting line plots with 2 categorical predictors
Added rename option to rbind.Predict to handle case where predictor name has changed between models
Added ties=mean to approx( ) calls that did not have ties= specified
Added nlevels argument to bplot to pass to contour
Added par argument to iLegend - list to pass to par().
Redirected ... argument to iLegend to image( ).
Fixed groupkm - was printing warning messages wrongly
Added new semiparametric survival prediction calibration curve method in val.surv for external validation; this is the first implementation of smooth calibration curves for survival probability validation with right-censored data
Fixed calibrate confidence limits from groupkm
Added smooth calibration curve using hare (polspline package) for calibrate.cph and calibrate.psm
Added display of predicted risks for cph and psm models even for the stratified KM method (old default)

Full Regression Modeling Strategies course featuring the rms package
Syllabus for a 3-day short course on the rms package and associated statistical methodology.
Datasets for use in learning how to use Hmisc, rms, logistic regression, survival analysis, ordinary regression, penalized estimation, missing value imputation, data reduction, etc. (R save() formats)
An Introduction to S and the Hmisc and Design Packages; CF Alzola and FE Harrell (308 pages, PDF, 2004). For a list of recent changes to this document click here.
Examples of use of rms functions and of adapting them for special uses

Go to the home page for the text REGRESSION MODELING STRATEGIES

Bug Reports

Please see the bug reporting page

Topic attachments
Attachment	Size	Date	Who	Comment
refcard.pdf	59.0 K	13 Sep 2009 - 09:03	FrankHarrell	Reference card for R rms package
refcardo.pdf	59.2 K	13 Sep 2009 - 09:03	FrankHarrell	Reference card for rms with pages in usual order (not suitable for booklets)
rms-Ex.pdf	1694.6 K	29 Dec 2011 - 22:22	FrankHarrell	Graphics output for all help file examples


Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.