Writing Extensions in R
A small tutorial for writing extensions in R.
Integrating R with External Languages
You can write extensions for R in C or Fortran to help speed things up. This is useful if your R code involves lots of loops or other work that a compiled language handles better. Most of this information, and much more, can be found in the Writing R Extensions manual.
Hello world!
Let's start with the simplest extension in C: the obligatory "Hello world!" extension. For simple computations, all you have to do is create a C source file with the function you want to call. In this example, we have a hello_world() function.
#include <stdio.h>

void hello_world() {
    printf("Hello world!\n");
}
That's it. Easy, eh? If you want to send arguments to any external functions (which you will probably want to do), you'll need to add some parameters with types specific to what you're expecting from the user. More on that with the next example.
To use this function in R, you first have to compile it. The easiest way to do this is by using R CMD SHLIB like so:
R CMD SHLIB hello_world.c
This creates a shared library named hello_world.so (and also hello_world.o, which you can safely delete), and you can load this library into R by using the dyn.load() function like so:
dyn.load("hello_world.so")
Now you can call the hello_world C function through an R function called .C(), like this:
.C("hello_world")
Your output should look similar to this:
> dyn.load("hello_world.so")
> .C("hello_world")
Hello world!
list()
>
R executes the hello_world C function, then returns a list object. This list contains the arguments (and any changes made to them) that you sent to the C function, and since our hello_world function didn't take any arguments, this list is empty. With .C(), if you want to return information back to R, you must do so through the parameters, since R ignores whatever the C function returns (.Call() works differently; see the third example).
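To make the return-through-parameters idea concrete, here is a minimal sketch. The function vec_sum and its argument names are hypothetical and not part of this tutorial's sources; the only assumption about R is that .C() hands each vector over as a pointer and gives the (possibly modified) arguments back in its result list.

```c
/* Hypothetical example, not one of the tutorial's source files.
 * With .C(), every argument arrives as a pointer:
 *   x      -> the data of a numeric vector
 *   n      -> an integer vector of length 1 holding length(x)
 *   result -> a numeric vector of length 1 that receives the answer
 * The function is void because .C() ignores C return values. */
void vec_sum(double *x, int *n, double *result) {
    double total = 0.0;
    for (int i = 0; i < *n; i++) {
        total += x[i];
    }
    result[0] = total;  /* written into the argument, which .C() hands back */
}
```

After compiling and loading this as before, a call along the lines of .C("vec_sum", as.double(x), as.integer(length(x)), result = double(1))$result should pick the sum out of the returned list.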
You can find the hello_world source below.
Hello, you!
Now on to a slightly more complicated external function. In this function, we'll accept an argument from the user (their name) and use it to print out a statement. To do this, we'll need some additional headers in order to tell R how this function should be called. (You can still write R extensions that accept arguments without these headers, but they let us declare the argument types we expect so that we don't get anything unexpected. Since that's a good idea, this example contains the extra steps necessary to do this.)
Put this at the top of your source file along with your other includes:
#include "Rdefines.h"
#include "R_ext/Rdynload.h"
Here's an example of a function that can take one string argument:
void hello_you(char **name) {
    printf("Hello, %s!\n", name[0]);
}
Next you need to register this function with R and tell it how it should be called. Here's the code needed for our example:
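A minimal sketch of that registration code might look like the following. The names hello_youArgs, cMethods and R_init_hello_you follow the explanation in this section; the exact layout, the static qualifiers and the NULL-terminated table are assumptions rather than the original listing.

```c
#include <stdio.h>
#include <Rdefines.h>
#include <R_ext/Rdynload.h>

/* The function being registered (same as shown earlier). */
void hello_you(char **name) {
    printf("Hello, %s!\n", name[0]);
}

/* One entry per parameter: we expect a single character vector. */
static R_NativePrimitiveArgType hello_youArgs[] = { STRSXP };

/* Table of routines callable through .C(); the all-NULL entry ends it. */
static const R_CMethodDef cMethods[] = {
    {"hello_you", (DL_FUNC) &hello_you, 1, hello_youArgs},
    {NULL, NULL, 0, NULL}
};

/* R calls this automatically when it loads hello_you.so. */
void R_init_hello_you(DllInfo *info) {
    R_registerRoutines(info, cMethods, NULL, NULL, NULL);
}
```

Like the other examples, this only builds through R CMD SHLIB, since it depends on the R headers.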
This code needs a bit of explaining. The R_NativePrimitiveArgType array is really just a plain old integer array (typedef'd in the Rdynload.h file for readability and consistency, I suppose) and is of length 1 since we have one parameter. If you had 5 parameters, this array would be of length 5. The array lists the expected argument types, which are themselves just integer codes internally. STRSXP tells R that we expect a character vector for our first argument. If you want your function to accept other types of R objects, you can find the constants you should use here.
The cMethods array is a list of R_CMethodDef structs, which are pretty simple. For each function you want to register, you need an entry in the cMethods array. The first element of each struct is the name by which you can call the function from R. The second is a pointer to the function (DL_FUNC is a generic function-pointer type that the pointer is cast to). The third is the number of arguments the function takes, and the fourth is the array of argument types (hello_youArgs).
Finally, there is the R_init_<lib> function, where <lib> is the name of the shared library you want to create (in this case, "hello_you"). This function does the actual registering and is called automatically when R loads the shared library that is created when we compile the code.
Now just follow the same steps as in the first example to compile the code and load it (with an additional argument passed to .C()), and your R output should look like this:
> dyn.load("hello_you.so")
> .C("hello_you", "Penpen the penguin")
Hello, Penpen the penguin!
[[1]]
[1] "Penpen the penguin"
>
This time we get the argument we passed to hello_you back in a list. The source code for this example can be found below.
Pretty matrix
There is an aspect of extensions in R that is very important to note. There are two different ways to write extensions. With .C(), R copies arguments before passing them along to your external function. This is fine when you're passing arguments that are small-ish in size, but it isn't really the best idea if you're passing a huge data frame, for example. If you want to handle large R objects in your external functions, you'll want to use .Call(), which passes arguments by reference. The next example uses .Call() with a matrix passed as an argument.
After you include the necessary headers, your C function header should look like this:
SEXP pretty_matrix(SEXP s_matrix) {
    ...
}
SEXP is the data type that C code uses to handle R objects when using .Call(). Unlike with .C(), the return value of an external function called with .Call() is returned as-is to the caller, so our function will return a SEXP instead of void like before.
Before you can manipulate the raw data in an SEXP object, you need to test its R type (character, double, integer, etc.) yourself and assign its data to a type you can work with (int, for example). Here's an example of how to do this in your C function:
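A minimal sketch of such a check could look like this. The variable names height and s_dim, the wording of the message and the use of Rprintf are assumptions; matrix, width, INTEGER and GET_DIM are the names used in the surrounding text.

```c
#include <R.h>          /* Rprintf */
#include <Rdefines.h>   /* SEXP, INTEGER, GET_DIM, isMatrix, isInteger */

SEXP pretty_matrix(SEXP s_matrix) {
    /* Refuse anything that is not an integer matrix. */
    if (!isMatrix(s_matrix) || !isInteger(s_matrix)) {
        Rprintf("pretty_matrix: expected an integer matrix\n");
        return R_NilValue;              /* shows up as NULL in R */
    }

    int *matrix = INTEGER(s_matrix);    /* pointer to the raw int data */
    SEXP s_dim  = GET_DIM(s_matrix);    /* the dim attribute: c(nrow, ncol) */
    int height  = INTEGER(s_dim)[0];    /* number of rows */
    int width   = INTEGER(s_dim)[1];    /* number of columns */

    /* ... the real work with matrix, height and width goes here ... */
    (void) matrix; (void) height; (void) width;  /* silence unused warnings */
    return R_NilValue;
}
```

Note that an R matrix built from whole numbers is not always of integer type; wrapping the data in as.integer() or using values such as 1:99 keeps it integer.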
This checks to see if the argument is an integer matrix. If it is, it extracts the relevant information using macros defined in the R headers (INTEGER, GET_DIM, etc). If it's not an integer matrix, the function prints out a message and returns R_NilValue, which in R is NULL. Once you get past this section of code, you can treat the matrix variable as a normal int array. Matrices in R are stored internally as single-dimensional column-major arrays, so you'll have to do some arithmetic in order to access "rows" and "columns" like R does: the value in row i, column j (both counted from 0) is at matrix[i + j*nrow], where nrow is the number of rows. The value in the third row, first column, for example, is matrix[2 + 0*nrow]. Remember, C indexing starts at 0 instead of 1!
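The index arithmetic can be wrapped in a small helper. This is a sketch in plain C; matrix_at and its parameter names are hypothetical and not part of the tutorial's sources.

```c
/* Hypothetical helper, not one of the tutorial's source files.
 * R stores a matrix column by column, so element (row, col), both
 * counted from 0, sits at offset row + col * nrow in the flat array. */
int matrix_at(const int *matrix, int nrow, int row, int col) {
    return matrix[row + col * nrow];
}
```

For matrix(1:6, nrow = 3), whose storage is 1, 2, 3, 4, 5, 6, matrix_at(m, 3, 2, 0) is 3 (third row, first column) and matrix_at(m, 3, 0, 1) is 4 (first row, second column).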
When you're done doing calculations and want to return something, there are a couple of things you need to do first. R does something called garbage collection. Loosely, garbage collection involves freeing memory associated with unused objects. R can't see the pointers held by your C code, so any R object you create there (including the one you want to return) needs to be protected from garbage collection for as long as you are still working with it.
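A minimal sketch of creating, protecting and returning a value follows. The helper name wrap_line_count and the variable names are hypothetical; allocVector, REAL, PROTECT and UNPROTECT are the standard calls from the R headers.

```c
#include <R.h>
#include <Rdefines.h>

/* Hypothetical helper: wraps a C int in a length-1 numeric R vector. */
SEXP wrap_line_count(int lines) {
    /* Allocate a numeric vector of length 1 and shield it from the
     * garbage collector while we are still working with it. */
    SEXP s_lines = PROTECT(allocVector(REALSXP, 1));
    REAL(s_lines)[0] = (double) lines;

    /* Balance every PROTECT with an UNPROTECT before returning. */
    UNPROTECT(1);
    return s_lines;
}
```

In a function this short nothing else is allocated after s_lines, so the protection is mostly a habit; it becomes essential as soon as further R allocations happen before the return.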
This creates a new numeric R object of length 1, gives it a value (in this case, the number of lines printed out; see code below for details), and returns it back to R.
Registering functions for use in R's .Call() is a little different from registering .C() functions, but not by much.
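A minimal sketch of the .Call() registration could look like this. The callMethods name follows the text; the static qualifier, the NULL-terminated table and the forward declaration are assumptions rather than the original listing.

```c
#include <Rdefines.h>
#include <R_ext/Rdynload.h>

/* Defined elsewhere in the same source file. */
SEXP pretty_matrix(SEXP s_matrix);

/* Table of routines callable through .Call(): name, pointer, argument
 * count. No argument-type array is needed here. */
static const R_CallMethodDef callMethods[] = {
    {"pretty_matrix", (DL_FUNC) &pretty_matrix, 1},
    {NULL, NULL, 0}
};

/* R calls this automatically when it loads pretty_matrix.so. */
void R_init_pretty_matrix(DllInfo *info) {
    /* .C table (none), .Call table, .Fortran table, .External table */
    R_registerRoutines(info, NULL, callMethods, NULL, NULL);
}
```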
Notice that you must use a different struct type than with .C(). Instead of R_CMethodDef, you use R_CallMethodDef. Also, telling R what types to expect for your external function is not necessary, since R expects you to check those on your own. The R_registerRoutines call is also a little different. The callMethods array needs to be the 3rd argument instead of the 2nd (cMethods goes in the 2nd slot). The other two arguments are reserved for functions to be called with .Fortran() and .External() (not covered here). If you have some functions for use with .C() and others for use with .Call(), you can send both the cMethods and callMethods arrays to R_registerRoutines.
Compiling this source code is the same as before: run R CMD SHLIB pretty_matrix.c. You load it into R the same way, too. The only difference is that you use .Call() instead of .C(). Here's the output:
> dyn.load("pretty_matrix.so")
> .Call("pretty_matrix", matrix(sample(1:99, 99), nrow=9))
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 38 | 73 | 59 | 50 | 46 | 51 | 86 | 37 | 4 | 47 | 21 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 94 | 17 | 60 | 35 | 72 | 18 | 39 | 93 | 53 | 43 | 91 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 88 | 58 | 68 | 75 | 44 | 23 | 87 | 45 | 9 | 55 | 62 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 25 | 64 | 61 | 33 | 31 | 74 | 71 | 76 | 13 | 92 | 99 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 30 | 15 | 63 | 36 | 52 | 65 | 89 | 82 | 41 | 28 | 7 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 42 | 57 | 32 | 77 | 96 | 69 | 81 | 78 | 40 | 12 | 22 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 26 | 29 | 34 | 97 | 66 | 19 | 48 | 67 | 3 | 84 | 90 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 49 | 83 | 20 | 79 | 95 | 1 | 6 | 24 | 80 | 98 | 10 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
| | | | | | | | | | | |
| 70 | 56 | 2 | 14 | 27 | 5 | 11 | 16 | 54 | 85 | 8 |
| | | | | | | | | | | |
+----+----+----+----+----+----+----+----+----+----+----+
[1] 37
>
Finité
So there you have it. There are lots of other things you can do with extensions, and if you want to learn more, check out the R documentation here.