Changes to document since 1 August 1997


Added section 3.1.1 - Adjustments to Variables after Input

15 Aug 97: Added section 4.1.7 - Accessing Data Frames in S-Plus Version 4.0

3 Sep 97: Chapter 1 - described how to set up a shortcut to S-Plus 4.0; the target is, e.g., c:\splus4\cmd\splus.exe S_PROJ=directoryname. Mentioned that you get a warning the first time S-Plus 4.0 is invoked, but that S-Plus will create _Data and _Prefs correctly.

8 Oct 97:
  • Chapter 4 - added more recode examples
  • Chapter 6 - added a few background facts about summary.formula relating to FUN vs fun, na.rm, marginal stats
  • Chapter 2 - added earlier mention of .First to automatically do library(Hmisc,T)

12 Oct 97: Chapter 2 - added list of system overrides done by Hmisc, esp. [.factor

13 Oct 97:
  • Added new subsection in Chapter 2 telling how to move between vectors and matrices for holding serial data on each subject
  • Added text on using x[] to remove unused levels of a factor variable x

31 Oct 97: Added new examples in Sections 5.2 and 5.4.2

8 Nov 97:
  • Added new examples of lists using states of U.S.
  • Added new section in Chapter 2 defining how functions and arguments work; moved Getting Help section to early in chapter
  • Added example of trellis group= and panel.superpose
  • Added trellis example with key=

20 Nov 97: Added densityplot examples in Section 9.3

22 Nov 97: Mentioned that you can specify S_PROJ=. on the command line if you fill in 'Start in'


28 Jan 98: Added alternative commands for UNIX batch processing

7 Feb 98:
  • Added subsection on how to add variables to a data frame
  • Added panel.plsmo, mgp.axis.labels, and trellis.strip.blank functions

17 Mar 98:
  • Added more complex example for reshaping serial (longitudinal) data
  • Added new chapter on use of lm (from supplemental notes for biostat modeling course)

20 Mar 98: Added stuff about the merge function and moved related examples to that location in chapter 4

29 Jun 98:
  • MAJOR REVISION including a chapter on areg.boot.
  • Updated for S-PLUS 4.5.

14 Jul 98: Corrections for setting up for 4.5 usage

17 Jul 98:
  • Converted to Y&Y LaTeX, produced .pdf file with hyperlinks and bookmarks
  • Added section on transcan for multiple imputation

19 Jul 98: Clarified why S-Plus needs shortcuts where Word and Excel don't

26 Jul 98: Added need to go through Filtering ... dialog after setting up the S+ shortcut

4 Aug 98: Corrected names(df) <- casefold(names(df))

10Aug 98: Finished MAJOR REVISION, printed copies for bookstore
28Aug Added examples of builtin function na.pattern in chapters 2,3

  • Updated rm.boot description to include cor.pattern, rho.
  • Added review section for creating data frames near end of Chapter 4. Updated trellis description to reflect new Hmisc functions. Fixed a few typos.

14Dec98: Added example in Chap. 4 of using substring to manipulate character variables, especially dates and times.

15Dec Added trellis example where reorder.factor is used.

16Dec Added example of computing and plotting row percents from a contingency table, in Chapter 6

24Dec Added description of new features in setps, setpdf, setTrellis

28Dec Added examples of use of reShape function, references to rm.impute function


4Jan99: Added histSpike function, new options to ecdf

7Jan Added example of using bootstrap to get confidence limits for ranks of department means, and plotting these using Dotplot (Chapter 4)

3Feb Added info on tweakUI for Win 98 (Xmouse capability) - Chap. 1

15Feb Added example of graphing 3 quartiles of height for each age and sex group, at end of Chapter 10. Also showed how to do this with the new method='quantiles' option to xYplot.

  • Added to Design chapter some mention of Hmisc dataRep function to describe how well a new subject was represented in a dataset used to develop a predictive model.
  • Added text about anova for ols fits now printing F stats. Need to change all anova output for ols fits to reflect this, in the future.

4Mar Added example in Chapter 4 of computing differences with the previous observation within subject, added Lag function

12Mar Made new version for online viewing, using pdfscreen LaTeX style

27Jul Many small revisions, improved System Tools section (version used for Fall '99)

8Dec Added comparison of various methods for aggregating data for plotting (end of Chap. 10)


  • Chapter 1: updated comparison with SAS for SAS V7/8
  • Chapter 4: added material on S+ 2000 object explorer, workspace. Added example for grouping large numbers of categories of a categorical variable.
  • Added brief chapter containing suggestions for making good statistical graphics, with many bibliographic references.
  • Updated xYplot for method=function definition.

  • Added new section on batch processing in Windows
  • Added section on the make utility and reproducible analysis (both of these in new chapter 13)

27Jul Added some stuff about Linux in Chapter 1

22Sep Added limited info on graphical perception, Chapter 10

  • Added hierarchy of graphical perception tasks Chap. 10
  • Added Perl example for program control, Chap. 13

1Nov Added section 12.2.4 on inserting S+ graphics in MS Office

15Nov Added more general guidelines in new Chapter 10

27Nov Added more info about important trellis options in section 11.4. These include strip.default, layout, skip, banking.

  • Added new sections in Chap. 10 on choosing the best graph type and methods for conditioning (stratifying).
  • Added paragraph at end of table making chapter on using image() and hist2d().


9Jan01 Added more pointers for effectively adding S-Plus graphics to Word documents (thanks to Julian Wells).

18Jan Added pointer to pstoedit in section about getting graphics into Word.

10May Added information about the Hmisc upData function in Chapters 3 and 4, and stated that this is the preferred method for updating/changing data frames instead of attaching in search position 1.

22May Added Tufte's views on graphical excellence

2Jun Corrected error in sort example in 4.2.1 and missing {} in 4.2.3

10Aug Added graphviz and TeXmacs in 1.9, R in 1.1

30Nov Added example of new usage of reShape to reshape serial plus baseline variables (in the same data frame) in 4.2.7


  • Added description and examples of subset function in 4.1.2
  • Changed subsection 3.2.3 to be a section
  • Inserted a new section before it on the contents function.
  • Replaced discussion of transcan in Chapter 4 with a discussion of aregImpute
18Jun Fixed a few typos, updated some information about S-Plus 6, removed references to S-Plus 4

8Jul Added material on NoteTab editor and its use with R at end of Chapter 1.

  • Added material in Chapter 2 on installing and accessing Hmisc and Design in R, deleted material on model.frame.default.
  • Added mention of getHdata in Chapter 4.
  • Added subsection in Chapter 3 on reading S-Plus data into R.
  • Added subsection in Chapter 4 on managing R objects (load, save)
  • Improved tools section in Chapter 1, brought hardware requirements a bit more up to date, removed some specific references in Chapter 1 to Windows 95/98/NT, added Xemacs, new address for ESS.
  • Changed several references to S-Plus to the generic S in all chapters.
  • Contributed to "Contributed Documentation" section


2Jan Added new section in Chapter 3 on setting up R date variables

3Jan Added comment about R dates for inputting 4-digit years

18Mar Added example of reshaping serial data in which an additional layer of "stringing out" into long format is done by specifying a character variable containing the name of a variable whose value is in the numeric variable "value".

22Mar Added example in Chapter 4 for getting the last non-NA value and the date of that value for 2 variables in a data frame; used mApply or by

29May Interchanged definitions of split= argument in Section 11.4


Added mention in Chapter 1 of SAS class. trees and GAM. Thanks: Andrew Yu (
Changed U. Virginia URLs to Vanderbilt addresses. Changed several references to S-Plus to S.
Added new Section 4.6: Dealing with lists of data frames.
Fixed typo b -> mydata in example at start of section 4.2
Added example of new option in reShape function to handle multiple id variables and create a data frame, in section 4.3.8
Added example in section 11.4.1 on using R Lattice function xyplot to do error bars using only builtin features.
In section 4.1 added mention of R with and transform functions, and recommendations not to attach in search position one in S-Plus and in many cases not to attach at all (in R).
Added example of putting confidence interval for 2 group means and for difference on same plot, in chapter 10


Added better method for R lattice graphics when needing to use black and white with transparent strip backgrounds, in Section 11.4


Added new data manipulation subsection in Chapter 4 for subsetting on qualified observations from a set of repeated observations per subject
Used shorter twiki URLs in several files

-- FrankHarrell - 11 Feb 2004, Modified 5 May, 13 Jun, 10 Sep, 16 Nov 2004, 2 Jan 2005, 24 Sep 2006
Topic revision: r7 - 24 Sep 2006, FrankHarrell
