(05 Mar 2010, JeremyStephens)
---+ R Binary File Format

R's =save()= function converts data in your R workspace into a binary format that can later be restored with the =load()= function. This page has a little information about the format =save()= uses when writing files.

%TOC%

---++ Number conversion

R uses the [[http://en.wikipedia.org/wiki/External_Data_Representation][XDR]] format to save numeric data. The GNU C Library (libc) includes functions for writing this format. See the [[http://www.manpagez.com/man/3/xdr/][XDR man page]] for more information.

---++ Example Program

Here's an example C program that writes out the variable =x=, which is a vector of reals: =c(1,2,3)=. See =main()= to get started. It is a simplified version of what R does when you run:

<verbatim>
x <- c(1,2,3)
save(x, file="test.rda")
</verbatim>

Nearly all of the relevant code for this can be found in [[http://svn.r-project.org/R/trunk/src/main/saveload.c][saveload.c]] (see the =do_save= function) and [[http://svn.r-project.org/R/trunk/src/main/serialize.c][serialize.c]] (see the =R_serialize= function) in the R source tree.
<highlight lang=c>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>
#include <rpc/xdr.h>

#define R_XDR_INTEGER_SIZE 4
#define R_XDR_DOUBLE_SIZE 8

// Pack an object's type and flag bits into the single integer
// that precedes each object in the file.
int pack_flags(int type, int levels, int is_object, int has_attr, int has_tag)
{
  int flags;

  if (type == 9) {
    // scalar string type, used for symbol names
    levels &= (~((1 << 5) | 1));
  }
  flags = type | (levels << 12);
  if (is_object) flags |= (1 << 8);
  if (has_attr) flags |= (1 << 9);
  if (has_tag) flags |= (1 << 10);
  return flags;
}

// XDR-encode an integer into buf (needs R_XDR_INTEGER_SIZE bytes)
void encode_integer(int i, char *buf)
{
  XDR xdrs;
  int success;

  xdrmem_create(&xdrs, buf, R_XDR_INTEGER_SIZE, XDR_ENCODE);
  success = xdr_int(&xdrs, &i);
  xdr_destroy(&xdrs);
  if (!success) {
    fprintf(stderr, "encode_integer failed\n");
    exit(1);
  }
}

// XDR-encode a double into buf (needs R_XDR_DOUBLE_SIZE bytes)
void encode_double(double d, char *buf)
{
  XDR xdrs;
  int success;

  xdrmem_create(&xdrs, buf, R_XDR_DOUBLE_SIZE, XDR_ENCODE);
  success = xdr_double(&xdrs, &d);
  xdr_destroy(&xdrs);
  if (!success) {
    fprintf(stderr, "encode_double failed\n");
    exit(1);
  }
}

// Write len raw bytes to the file, exiting on a short write
void write_data(const char *buf, size_t len, FILE *fp)
{
  size_t res;

  res = fwrite(buf, sizeof(char), len, fp);
  if (res != len) {
    fprintf(stderr, "Write failed\n");
    exit(1);
  }
}

void write_integer(int i, char *buf, FILE *fp)
{
  encode_integer(i, buf);
  write_data(buf, R_XDR_INTEGER_SIZE, fp);
}

void write_double(double d, char *buf, FILE *fp)
{
  encode_double(d, buf);
  write_data(buf, R_XDR_DOUBLE_SIZE, fp);
}

int main(void)
{
  FILE *fp;
  char buf[128];

  // Open in binary mode so "\n" is never translated
  fp = fopen("test.rda", "wb");
  if (fp == NULL) {
    fprintf(stderr, "Couldn't open file for writing\n");
    return 1;
  }

  // Write magic: RDX2 (XDR format, version 2)
  write_data("RDX2\n", 5, fp);

  // Write format
  write_data("X\n", 2, fp);

  // Write R version information
  write_integer(2, buf, fp);       // serialization version: 2
  write_integer(133633, buf, fp);  // Current R version (2.10.1 in this case)
  write_integer(131840, buf, fp);  // R 2.3.0: minimum R version that can read this format

  // The saved R objects are wrapped in a list of dotted pairs before saving.
  // Next we write out flags needed for this list.
  write_integer(pack_flags(2, 0, 0, 0, 1), buf, fp);

  // Write the name of the variable we're storing
  write_integer(1, buf, fp);                           // symbol type
  write_integer(pack_flags(9, 33, 0, 0, 0), buf, fp);  // symbol flags
  write_integer(1, buf, fp);                           // length of name
  write_data("x", 1, fp);                              // actual name

  // Now write the actual variable data
  write_integer(pack_flags(14, 0, 0, 0, 0), buf, fp);  // vector of reals
  write_integer(3, buf, fp);                           // length of vector
  write_double(1.0, buf, fp);                          // first value
  write_double(2.0, buf, fp);                          // second value
  write_double(3.0, buf, fp);                          // third value

  // Tell R we're done: 254 is the NULL marker that ends the list
  write_integer(254, buf, fp);

  fclose(fp);
  return 0;
}
</highlight>

---+++ Compiling

To compile this example program, you only need to run:

<verbatim>
gcc -o r-save <filename>
</verbatim>

You _DO NOT_ need to link to R for this program. In Ubuntu, you need the =libc6-dev= package installed for the XDR headers.

---+++ Running

Simply run:

<verbatim>
./r-save
</verbatim>

This will create a file called =test.rda= which you can =load()= in R.

---++ Tips

The =ascii= parameter in R's =save()= function is useful for figuring out the binary file format:

<verbatim>
x <- c(1,2,3)
save(x, file="ascii.rda", ascii=TRUE)
</verbatim>

This writes the following to =ascii.rda=:

<verbatim>
RDA2
A
2
133633
131840
1026
1
9
1
x
14
3
1
2
3
254
</verbatim>

Each line corresponds to one write call in the binary format. There are a few differences. The first two lines in binary mode are =RDX2= and =X=. Also, only those first two lines end with newlines in binary mode; the remaining values are written back to back with no separators. In addition, the numbers in the ASCII format are *not* XDR encoded.
Here's a hexdump of the binary version of the same data:

<verbatim>
00000000  52 44 58 32 0a 58 0a 00  00 00 02 00 02 0a 01 00  |RDX2.X..........|
00000010  02 03 00 00 00 04 02 00  00 00 01 00 00 00 09 00  |................|
00000020  00 00 01 78 00 00 00 0e  00 00 00 03 3f f0 00 00  |...x........?...|
00000030  00 00 00 00 40 00 00 00  00 00 00 00 40 08 00 00  |....@.......@...|
00000040  00 00 00 00 00 00 00 fe                           |........|
00000048
</verbatim>

---++ Caveats

By default, R's =save()= function compresses the resulting file. If you want to compare your file to R's file, you may need to decompress it first by using =gunzip=, or you can call =save()= with <code>compress = FALSE</code>.

-- Main.JeremyStephens - 05 Mar 2010
Topic revision: r2 - 05 Mar 2010, JeremyStephens
Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.