---+++ Methods of Retrieving Datasets

Most of the datasets on this site are in the S =dumpdata= format (file suffix =.sdd=) and the R compressed =save()= file format (suffix =.sav=). Some datasets are available in Excel or ASCII formats.

   * To download and install a dataset manually, right-click on a file and save it to a temporary disk location, e.g., a directory such as =\windows\temp= or =/tmp=.
   * In S-Plus 2000 or 6.x you can import the datasets using the =File ... Import ... S-Plus Transport File= dialog.
   * Alternatively, use the S-Plus or R command =data.restore('/mydir/file.sdd')=.
   * In R, =data.restore= is found in the =foreign= package, but the =save= files are easier to use.
   * If you have version 1.4-1 or later of the =Hmisc= package for R, you can download and =load()= a dataset by typing =getHdata(dataset name)=. To list the available dataset names, type =getHdata()=. Type =?getHdata= to see other options, including ones to browse a dataset's =html(contents())= file or its description file (if available) on our web site. Here's an example:
<verbatim>
getHdata(prostate)
attach(prostate)
...
</verbatim>
   * =getHdata= is available for recent releases of S-Plus in =Hmisc=. If your Windows system does not have the =wget= executable, you must install it for =getHdata= or =download.file= to work. You may obtain =wget.exe= [[ftp://sunsite.dk/projects/wget/windows here]]. Download it to a temporary file, unzip it, and put =wget.exe= in the same directory in which Windows stores =ftp.exe=.
   * In S-Plus 5.x or 6.x, if you are not using =getHdata=, you will need to run imported data frames through the =Hmisc= library's =cleanup.import= function, e.g., =pbc <- cleanup.import(pbc)=. This removes =class= attributes that are not allowed in Version 4 of the S language, which cannot handle multiple inheritance. If you use =.sdd= files in R, you may also want to run the imported data frames through =cleanup.import= to store them more efficiently (=save= files are already stored efficiently). When you use =getHdata= in S-Plus, it runs =cleanup.import= for you automatically.
   * It is best to use the datasets with the =Hmisc= library in effect. Among other things, this allows you to use the =Hmisc= =describe= and =contents= functions to obtain documentation about the variables.

-- Main.FrankHarrell - 24 Jan 2004
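
As a minimal sketch of what happens to a =.sav= file once it is on disk, the base R code below writes a small made-up data frame with =save()= and reads it back with =load()=. The data frame =toy=, the file name =toy.sav=, and the use of =tempdir()= are assumptions for illustration only; for a real dataset, substitute the path of the file you downloaded.

```r
# Toy stand-in for a downloaded dataset (hypothetical data, not from this site).
toy <- data.frame(id = 1:3, age = c(61, 58, 70))

# Write it in R's compressed save() format, using the .sav suffix.
sav_path <- file.path(tempdir(), "toy.sav")
save(toy, file = sav_path, compress = TRUE)

# load() restores objects under the names they were saved with.
# Loading into a separate environment avoids overwriting workspace objects.
env <- new.env()
restored_names <- load(sav_path, envir = env)

print(restored_names)  # names of the objects stored in the file
str(env$toy)           # inspect the restored data frame
```

Because =load()= restores objects under their saved names, checking its return value is a quick way to find out what a =.sav= file contains before attaching anything.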
Topic revision: 24 Jan 2004, WikiGuest