
Writing Extensions in R

You can write extensions for R in FORTRAN or C to help speed things up. This is useful if your R code involves lots of loops or other things that can be better handled in an external language. Most of this information and more can be found in the Writing R Extensions manual.

Hello world!

Let's start with the simplest extension in C: the obligatory "Hello world!" extension. For simple computations, all you have to do is create a C source file with the function you want to call. In this example, we have a hello_world() function.

#include <stdio.h>

void hello_world() {
    printf("Hello world!\n");
}

That's it. Easy, eh? If you want to send arguments to any external functions (which you will probably want to do), you'll need to add some parameters with types specific to what you're expecting from the user. More on that with the next example.

To use this function in R, you first have to compile it. The easiest way to do this is by using R CMD SHLIB like so:
R CMD SHLIB hello_world.c

This creates a shared library named after the source file, with a platform-dependent extension (and also hello_world.o, which you can safely delete). You can load this library into R with the dyn.load() function, then call the hello_world C function through an R function called .C().

Your output should look similar to this:
> dyn.load("")
> .C("hello_world")
Hello world!

R executes the hello_world C function and then returns a list object. This list contains the arguments you sent to the C function, including any changes the function made to them. Since our hello_world function takes no arguments, the list is empty. With .C(), the only way to return information to R is through the parameters, because R ignores whatever the C function returns.
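To make the "return through parameters" idea concrete, here is a minimal sketch. The function name double_values is hypothetical and is not one of the attached sources. It assumes the caller passes an integer vector and its length, both of which arrive in C as pointers.

```c
/* Hypothetical example, not part of the attached sources.
   Doubles each element of an integer vector in place.
   With .C(), every argument arrives as a pointer, even the length n. */
void double_values(int *x, int *n) {
    for (int i = 0; i < *n; i++) {
        x[i] *= 2;   /* the change is written into R's copy of x */
    }
}
```

After compiling and loading this the same way as hello_world, a call such as .C("double_values", as.integer(c(1, 2, 3)), as.integer(3)) should return a list whose first element holds the doubled values.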

You can find the hello_world source below.

Hello, you!

Now on to a very slightly more complicated external function. In this function, we'll accept an argument from the user (their name) and use it to print out a statement. To do this, we'll include some additional headers so that we can tell R how this function should be called. (Note: you can still write R extensions that accept arguments without these headers, but they let us declare the argument types we expect so that we don't get anything unexpected. Since that's a good idea, this example contains the extra steps necessary to do this.)

Put this in the top of your source file along with your other includes:

#include "Rdefines.h"
#include "R_ext/Rdynload.h"

Here's an example of a function that can take one string argument:

void hello_you(char **name) {
    printf("Hello, %s!\n", name[0]);
}

Next you need to register this function with R and tell it how it should be called. Here's the code needed for our example:

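/* One entry per parameter: hello_you takes a single character vector */
R_NativePrimitiveArgType hello_youArgs[1] = {STRSXP};
R_CMethodDef cMethods[] = {
    {"hello_you", (DL_FUNC)&hello_you, 1, hello_youArgs},
    {NULL, NULL, 0}
};

/* Called automatically by R when the shared library is loaded */
void R_init_hello_you(DllInfo *dll) {
    R_registerRoutines(dll, cMethods, NULL, NULL, NULL);
}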

This code needs a bit of explaining. The R_NativePrimitiveArgType array is really just a plain old unsigned int array (typedef'd in the Rdynload.h file, presumably for readability and consistency) and is of length 1 since we have one parameter. If you had 5 parameters, this array would be of length 5. The array lists the expected type of each argument (the type constants are also just integers internally). STRSXP tells R that we expect a character vector for our first argument. If you want your function to accept other types of R objects, the Writing R Extensions manual lists the constants you should use.

The cMethods array is a list of R_CMethodDef structs, which are pretty simple. For each function you want to register, you need a definition in the cMethods array. The first element of each struct is the string by which you call the function from R. The second is a pointer to the function (DL_FUNC is a generic function-pointer type, so the cast lets one array hold functions with different signatures). The third is the number of arguments the function has, and the fourth is the array of data types (hello_youArgs). The array ends with a NULL entry so that R knows where the list stops.

Finally, there is the R_init_<lib> function, where <lib> is the name of the shared library you want to create (in this case, "hello_you"). This function does the actual registering of our function, and R calls it automatically when it loads the shared library that is created when we compile the code.

Now just follow the same steps as in the first example to compile the code and load it (with an additional argument passed to .C()), and your R output should look like this:
> dyn.load("")
> .C("hello_you", "Penpen the penguin")
Hello, Penpen the penguin!
[1] "Penpen the penguin"

This time we get the argument we passed to hello_you back in a list. The source code for this example can be found below.
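The name[0] in hello_you hints at how a character vector arrives in C: as an array of C strings, one per element. Here is a minimal sketch of walking such an array. The function name_lengths and its parameters are hypothetical (not one of the attached sources), and it assumes the caller passes the vector's length and an integer vector to receive the results.

```c
#include <string.h>

/* Hypothetical example, not part of the attached sources.
   names   : a character vector, seen from C as an array of strings
   n       : pointer to the number of strings in names
   lengths : integer vector of the same length, filled in with results */
void name_lengths(char **names, int *n, int *lengths) {
    for (int i = 0; i < *n; i++) {
        lengths[i] = (int) strlen(names[i]);
    }
}
```

If you registered this function like hello_you, the argument type array would presumably need three entries (STRSXP, INTSXP, INTSXP). From R, the third element of the list returned by .C() would hold the lengths.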

Pretty matrix

There is an aspect of extensions in R that is very important to note. There are two different ways to write extensions. With .C(), R copies the arguments before passing them along to your external function. This is fine when you're passing arguments that are small-ish in size, but it isn't really the best idea if you're passing a huge data frame, for example. If you want to handle large R objects in your external functions, you'll want to use the .Call() method, which passes arguments by reference. The next example uses .Call() with a matrix passed as an argument.

After you include the necessary headers, your C function header should look like this:

SEXP pretty_matrix(SEXP s_matrix) {

SEXP is the data type that C code uses to handle R objects when using .Call(). Unlike with .C(), the return value of an external function called with .Call() is returned as-is to the caller, so our function returns a SEXP instead of void like before.

Before you can manipulate the raw data in an SEXP object, you need to test it for its R type (character, double, integer, etc.) yourself and assign it to a type you can work with (int, for example). Here's an example of how to do this in your C function:

int *matrix, height, width;

if (isMatrix(s_matrix) && isInteger(s_matrix)) {
    matrix = INTEGER(s_matrix);
    width  = INTEGER(GET_DIM(s_matrix))[1];
    height = INTEGER(GET_DIM(s_matrix))[0];
} else {
    printf("invalid matrix.\n");
    return R_NilValue;
}

This checks to see if the argument is an integer matrix. If it is, it assigns the relevant information using macros defined in the R headers (INTEGER, GET_DIM, etc.). If it's not an integer matrix, the function prints out a message and returns R_NilValue, which in R is NULL. Once you get past this section of code, you can treat the matrix variable as a normal int array. Matrices in R are stored internally as single-dimensional column-major arrays (all of the first column, then all of the second, and so on), so you'll have to do some arithmetic in order to access "rows" and "columns" like R does (e.g., the value in the third row, first column can be accessed in this case by referring to matrix[2 + 0*height]. Remember, C indexing starts at 0 instead of 1!).
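R stores a matrix column by column, so the element at 0-based (row, col) of a matrix with nrow rows sits at offset col*nrow + row. Here is a minimal sketch of that arithmetic in plain C. The helpers col_major_index and row_sum are hypothetical names (they are not taken from pretty_matrix.c), and the sketch assumes you already have the int pointer and the dimensions in hand.

```c
/* Hypothetical helper, not part of the attached sources.
   Offset of the element at (row, col), both 0-based, in a
   column-major array with nrow rows. */
int col_major_index(int row, int col, int nrow) {
    return col * nrow + row;
}

/* Hypothetical helper: adds up the values in one row of a
   column-major integer matrix with nrow rows and ncol columns. */
int row_sum(const int *matrix, int nrow, int ncol, int row) {
    int total = 0;
    for (int col = 0; col < ncol; col++) {
        total += matrix[col_major_index(row, col, nrow)];
    }
    return total;
}
```

For example, matrix(1:6, nrow = 2) is laid out in memory as 1, 2, 3, 4, 5, 6, so its first row is 1, 3, 5 and its second row is 2, 4, 6.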
Topic attachments
Attachment       Size   Date                 Who             Comment
hello_world.c    0.3 K  06 Jan 2006 - 17:38  JeremyStephens  hello world extension
hello_you.c      0.4 K  06 Jan 2006 - 17:38  JeremyStephens  hello you extension
pretty_matrix.c  1.7 K  06 Jan 2006 - 17:39  JeremyStephens  pretty matrix extension
Topic revision: r2 - 09 Jan 2006, JeremyStephens
