LinuxWorkshopRProgramingTipsAndGotchas (29 Jun 2009, WillGray)

- NA is commonly referred to as 'Not Available' or missing. It is best thought of as a stand-in for all possible values. For this reason almost any operation applied to an NA will return NA.
- NA + 5 is equivalent to all possible values + 5, which should equal all possible values.
    NA + 5
    [1] NA

- A[c(1,NA),] asks for row 1 and for an unknown row. Because the unknown row could be any row, R returns a row of NAs for it, representing all possible values for those columns.
    A <- matrix(1:25, ncol=5)
    A[c(1,NA),]
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    6   11   16   21
    [2,]   NA   NA   NA   NA   NA
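The same behaviour appears when a logical condition containing NA is used as an index. The following is a minimal sketch (the names `x`, `keep`, `with_na` and `without_na` are illustrative, not from the original page) showing how which() can be used to drop the unknown positions:

```r
x <- c(3, NA, 8, 10)

# The comparison is NA for the missing element, and indexing by NA yields NA
keep <- x > 5
with_na <- x[keep]
print(with_na)

# which() ignores NA in the condition and returns only the TRUE positions
without_na <- x[which(keep)]
print(without_na)
```

Whether the NA should be kept or dropped depends on the analysis; the point is only that the choice should be made deliberately.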

- Two operations that do not always return NA when applied to an NA are the AND ("&") and the OR ("|") operations.
- A & FALSE must be false. There is no possible value for A which would make this statement true. Therefore NA & FALSE is equal to FALSE.
    NA & FALSE
    [1] FALSE

- A | TRUE must be true. There is no possible value for A which would make this statement false. Therefore NA | TRUE is equal to TRUE.
    NA | TRUE
    [1] TRUE
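These two exceptions are part of a three-valued logic. As a small sketch (the names `vals`, `and_table` and `or_table` are only illustrative), the full truth tables can be built with outer():

```r
# Three-valued logic: TRUE, FALSE and NA (unknown)
vals <- c(TRUE, FALSE, NA)
labels <- c("TRUE", "FALSE", "NA")

# Truth table for AND: FALSE wins over NA, otherwise NA propagates
and_table <- outer(vals, vals, "&")
dimnames(and_table) <- list(labels, labels)
print(and_table)

# Truth table for OR: TRUE wins over NA, otherwise NA propagates
or_table <- outer(vals, vals, "|")
dimnames(or_table) <- list(labels, labels)
print(or_table)
```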

- It is impossible to directly compare an NA to anything. To check whether a value is NA, the is.na() function must be used. Otherwise we are asking whether a value is equal to all possible values.
    5 == NA
    [1] NA
    is.na(NA)
    [1] TRUE

    sum(c(1,5,10,NA))
    [1] NA
    sum(c(1,5,10,NA), na.rm=TRUE)
    [1] 16
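Putting is.na() and na.rm together, a typical way of handling missing values might look like the sketch below. The variable names are illustrative only.

```r
x <- c(1, 5, 10, NA)

# == cannot find NAs: the comparison itself is NA for every element
cmp <- x == NA
print(cmp)

# is.na() returns TRUE or FALSE for every element
missing <- is.na(x)

n_missing  <- sum(missing)          # number of missing values
x_complete <- x[!missing]           # the values with NAs dropped
x_mean     <- mean(x, na.rm = TRUE) # many summary functions accept na.rm

print(n_missing)
print(x_complete)
print(x_mean)
```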

- NULL is a special value meaning essentially 'has no value'.
- Comparing NULL to anything is an invalid question. To check whether a value is NULL, the is.null() function must be used.
- The A == B statement asks whether the value A is the same as the value B.
- The A == NULL statement asks whether the value A is the same as the value which has no value. As there is no value to compare against, the operation returns a logical vector of length 0.
    A <- 5
    A == NULL
    logical(0)
    is.null(A)
    [1] FALSE
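The zero-length result matters most inside if(), which needs exactly one TRUE or FALSE. A minimal sketch, using a hypothetical helper named `describe`:

```r
A <- 5

# A == NULL has length zero, so it cannot serve as an if() condition:
# if (A == NULL) ...  would stop with "argument is of length zero"
cmp <- A == NULL
print(length(cmp))

# is.null() always returns a single TRUE or FALSE
describe <- function(x) {
  if (is.null(x)) "no value" else "has a value"
}

print(describe(A))
print(describe(NULL))
```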

- An element of a list that is equal to NULL means that this element contains no data.
- Assigning the NULL value to an element of a list indicates that the data presently residing in that location should be forgotten about.

    l <- list()
    l[[1]] <- 4
    l$x
    NULL
    l[[2]]
    Error in l[[2]] : subscript out of bounds
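As an illustration of the two points about NULL in lists, the sketch below removes an element by assigning NULL, then stores an actual NULL by assigning list(NULL) with single brackets. The list contents are made up for the example.

```r
l <- list(a = 1, b = 2, c = 3)

# Assigning NULL removes the element entirely
l$b <- NULL
print(names(l))
print(length(l))

# To keep an element whose value is NULL, assign list(NULL) with single brackets
l["b"] <- list(NULL)
print(names(l))
print(is.null(l$b))

# A missing name and a stored NULL both give NULL from $,
# so check names() to tell the two cases apart
has_b <- "b" %in% names(l)
has_z <- "z" %in% names(l)
```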

- Inf, -Inf, and NaN are special numeric values.
- Inf and -Inf represent positive and negative infinity and behave accordingly.
    A <- 1/0
    B <- -1/0
    A
    [1] Inf
    B
    [1] -Inf
    5 * A
    [1] Inf

- NaN is short for Not a Number. It is the result of any undefined mathematical operation.
    0/0
    [1] NaN
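Base R provides predicates for telling these special values apart. The following sketch (variable names are illustrative) shows how they classify NA, NaN and the infinities:

```r
x <- c(1, NA, NaN, Inf, -Inf)

na_flags     <- is.na(x)      # TRUE for both NA and NaN
nan_flags    <- is.nan(x)     # TRUE only for NaN
finite_flags <- is.finite(x)  # FALSE for NA, NaN, Inf and -Inf

print(rbind(x, na_flags, nan_flags, finite_flags))

# Other undefined operations also give NaN
print(Inf - Inf)
print(Inf / Inf)
```

Note that is.na() alone cannot distinguish NA from NaN; is.nan() is needed for that.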

- Environments are the locations where objects are stored.
- All environments (except the empty environment) have a parent environment.
- Each function call is executed in its own environment, whose parent is the environment in which the function was defined. For a function defined and called at the top level, this is the global environment.
- Values stored in a parent environment are visible from the child environment.
    a <- 1
    test <- function() {
      print(a)
      invisible(NULL)
    }
    test()
    [1] 1

- However, an assignment made in the child environment creates a local object that masks the original value. Modifications made in the child environment do not propagate to the parent environment.
    a <- 1
    test <- function() {
      print(a)
      a <- 2
      print(a)
      invisible(NULL)
    }
    test()
    [1] 1
    [1] 2
    print(a)
    [1] 1

- A parent environment cannot access any values from the child environment.
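The rules above can be checked directly. This sketch uses two hypothetical functions, `set_local` and `set_parent`; the `<<-` operator is not mentioned in the original text and is shown only as the standard way to assign in an enclosing environment (returning a value from the function is usually the cleaner choice).

```r
a <- 1

set_local <- function() {
  a <- 2            # creates a new local 'a'; the outer 'a' is untouched
  local_only <- 10  # exists only while the call is running
  invisible(NULL)
}

set_parent <- function() {
  a <<- 3           # assigns to the existing 'a' in the enclosing environment
  invisible(NULL)
}

set_local()
after_local <- a
local_visible <- exists("local_only", inherits = FALSE)

set_parent()
after_parent <- a

print(after_local)
print(local_visible)
print(after_parent)
```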

- This means that typos can lead to functions that run without error but use the wrong values from the parent environment.
- This is the function we want to write:
    test1 <- function(cat) {
      5 + cat
    }
    test1(5)
    [1] 10

- This is the function we actually wrote:
    test1 <- function(cat) {
      5 + car ## Typo should be cat
    }
    test1(5)
    Error in test1(5) : object 'car' not found

- What happens if the object 'car' exists in your working environment? The function silently uses it.
    car <- 2
    test1(5)
    [1] 7
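One rough way to catch this kind of typo is to list the names a function uses that are not among its arguments. The helper `free_names` below is a hypothetical sketch built on base R's all.vars(); it is crude, since it also reports variables assigned inside the body.

```r
test1 <- function(cat) {
  5 + car   # typo: should be cat
}

# Names used in the body that are not arguments of the function
free_names <- function(f) {
  setdiff(all.vars(body(f)), names(formals(f)))
}

print(free_names(test1))
```

The codetools package distributed with R offers findGlobals(), which performs a more thorough version of this check.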

- If the attach(), with() or within() functions are used, it can be unclear which objects are being referenced.
- Let's say we want to add the object 'mod' to column 'a' in the data frame 'junk'.
    mod <- 15
    junk <- data.frame(a = 1:10)
    with(junk, a + mod)
     [1] 16 17 18 19 20 21 22 23 24 25

- What happens if there is a column in 'junk' named 'mod'?
    mod <- 15
    junk <- data.frame(a = 1:10, mod = 6:15)
    with(junk, a + mod)
     [1]  7  9 11 13 15 17 19 21 23 25
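One way to avoid the ambiguity is to spell out where each name comes from. A minimal sketch, reusing the objects from the example (the result names are illustrative):

```r
mod <- 15
junk <- data.frame(a = 1:10, mod = 6:15)

# Inside with(), the column 'mod' is found before the object 'mod'
masked <- with(junk, a + mod)

# Naming the data frame explicitly removes the ambiguity
explicit_object <- junk$a + mod        # uses the object mod (15)
explicit_column <- junk$a + junk$mod   # uses the column mod

print(masked)
print(explicit_object)
print(explicit_column)
```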


- Always use 'TRUE' and 'FALSE'. 'T' and 'F' are ordinary variables that can be assigned other values, whereas 'TRUE' and 'FALSE' are reserved words.
    T <- 0
    TRUE == T
    [1] FALSE
    TRUE <- 0
    Error in TRUE <- 0 : invalid (do_set) left-hand side to assignment
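A short sketch of how a reassigned T can silently change program flow, and how removing the user-defined T makes the built-in value visible again (the result names are illustrative):

```r
T <- 0

# Code that relies on T now takes the wrong branch without any warning
result_T    <- if (T) "yes" else "no"
result_TRUE <- if (TRUE) "yes" else "no"

print(result_T)
print(result_TRUE)

# Removing the user-defined T uncovers the built-in value again
rm(T)
restored <- isTRUE(T)
print(restored)
```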

- Logical operations come in two forms: vectorized and non-vectorized. The single-character version ('&') is the vectorized version and the double-character version ('&&') is the non-vectorized version.
- The vectorized operation will operate element by element, returning a result for the set.
    a <- c(1:10)
    b <- c(1:10)
    a < 7 & b > 3
     [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE

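- The non-vectorized operation returns a single value. In the version of R this page was written for, it compares only the first elements of its arguments, as shown here. Since R 4.3.0, passing a vector of length greater than one to && or || is an error.
    a <- c(1:10)
    b <- c(1:10)
    a < 7 && b > 3
    [1] FALSE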
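In practice, the vectorized form is reduced with any() or all() before being used in if(), and && is reserved for scalar conditions. The following is a sketch with illustrative names:

```r
a <- 1:10
b <- 1:10

both <- a < 7 & b > 3   # element-wise, length 10

# if() needs a single TRUE or FALSE, so reduce the vector first
msg_any <- if (any(both)) "some elements match" else "no elements match"
all_ok  <- all(both)

# && short-circuits: the right-hand side is not evaluated
# when the left-hand side is already FALSE
x <- NULL
safe <- !is.null(x) && x > 0

print(msg_any)
print(all_ok)
print(safe)
```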


- Partial argument matching can cause difficulties when attempting to pass arguments through the '...' argument.
    test1 <- function(x=5, b=2) {
      b*x-5
    }
    test2 <- function(f, bob="Hi", ...) {
      print(bob)
      f(...)
    }
    test2(test1)
    [1] "Hi"
    [1] 5
    test2(test1, b=8)
    [1] 8
    [1] 5

- We would expect b=8 to alter the value returned by the function. Looking closer at the arguments of test2() reveals the answer. test2() has an argument 'bob' which comes before the '...' argument. Partial argument matching maps b=8 to bob=8. After that, no unmatched arguments remain to be assigned to the '...' argument.
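Arguments that come after '...' in a function definition can only be matched by their full name, so one possible fix is to move 'bob' behind '...'. A minimal sketch of that change (the result names are illustrative):

```r
test1 <- function(x = 5, b = 2) {
  b * x - 5
}

# 'bob' now follows '...', so b = 8 can no longer be partially matched to it
test2 <- function(f, ..., bob = "Hi") {
  print(bob)
  f(...)
}

res_default <- test2(test1)                       # bob keeps its default
res_b       <- test2(test1, b = 8)                # b = 8 is passed on to test1
res_bob     <- test2(test1, b = 8, bob = "Bye")   # bob must be spelled out

print(res_default)
print(res_b)
print(res_bob)
```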

- Before R 4.0.0, the read.table() and data.frame() functions by default converted string vectors into factor vectors. Since R 4.0.0 the default is stringsAsFactors=FALSE, so the examples below request the conversion explicitly.
- NA value conversion
- By default read.table() converts the string value "NA" to the R NA value.
- For non-character vectors the zero-length string value "" is converted to the R NA value.
- For character vectors the zero-length string value is kept as is.
    cat('A,B\n1,a\n,b\n5,\nNA,"NA"\n6,NA\n', file="tmp.csv")
    read.table(file="tmp.csv", sep=',', header=TRUE, stringsAsFactors=TRUE)
       A    B
    1  1    a
    2 NA    b
    3  5     
    4 NA <NA>
    5  6 <NA>
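If empty strings should also count as missing in character columns, the na.strings argument of read.table() can be extended. The sketch below reads the same data through the text argument instead of a file; the names `csv_text` and `d` are illustrative.

```r
csv_text <- 'A,B\n1,a\n,b\n5,\nNA,"NA"\n6,NA\n'

# na.strings = c("NA", "") treats empty fields as missing in every column;
# stringsAsFactors = FALSE keeps B as a character vector
d <- read.table(text = csv_text, sep = ",", header = TRUE,
                na.strings = c("NA", ""), stringsAsFactors = FALSE)

str(d)
print(colSums(is.na(d)))
```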

- By default read.table() treats "#" as a comment character. It ignores all text from the "#" to the end of the line.
    cat('A,B\n7,Patient #5\n8,Patient #8\n', file='tmp.csv')
    read.table(file="tmp.csv", sep=',', header=TRUE, stringsAsFactors=TRUE)
      A        B
    1 7 Patient 
    2 8 Patient 
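Comment handling can be switched off with the comment.char argument. A minimal sketch, reading the same data through the text argument (the names `csv_text` and `d` are illustrative):

```r
csv_text <- 'A,B\n7,Patient #5\n8,Patient #8\n'

# comment.char = "" turns off comment handling, so '#' is read as data
d <- read.table(text = csv_text, sep = ",", header = TRUE,
                comment.char = "", stringsAsFactors = FALSE)

print(d$B)
```

read.csv() uses comment.char = "" by default, which is one reason the two functions can give different results on the same file.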


Topic revision: r3 - 29 Jun 2009, WillGray

Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
