### R Notes for Classes

Some of the R code used in classes is listed below. [EMS] refers to Essential Medical Statistics.

#### [EMS] Chapter 3 Displaying the data

Shapes of frequency distributions R code:
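
The code linked above is not reproduced on this page. As a stand-in, here is a minimal sketch that simulates four commonly discussed shapes (symmetrical, positively skewed, negatively skewed, bimodal) and plots their histograms. The data, the sample size `n.obs`, and all distribution parameters are hypothetical choices for illustration, not values from [EMS].

```r
## Sketch: shapes of frequency distributions using simulated (hypothetical) data
set.seed(1)
n.obs <- 1000                                   # hypothetical number of observations

y.sym   <- rnorm(n.obs, mean=50, sd=10)         # symmetrical, bell-shaped
y.right <- rexp(n.obs, rate=1/10)               # positively (right) skewed
y.left  <- 100 - rexp(n.obs, rate=1/10)         # negatively (left) skewed
y.bimod <- c(rnorm(n.obs/2, mean=35, sd=5),
             rnorm(n.obs/2, mean=65, sd=5))     # bimodal: mixture of two normals

op <- par(mfrow=c(2,2))                         # 2 x 2 grid of plots
hist(y.sym,   main="Symmetrical",       xlab="")
hist(y.right, main="Positively skewed", xlab="")
hist(y.left,  main="Negatively skewed", xlab="")
hist(y.bimod, main="Bimodal",           xlab="")
par(op)                                         # restore the previous layout

## in a skewed distribution the mean is pulled towards the long tail
c(mean=mean(y.right), median=median(y.right))
c(mean=mean(y.left),  median=median(y.left))
```

Comparing the mean and the median is a quick numerical check of skewness: the mean sits above the median for a positive skew and below it for a negative skew.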

#### [EMS] Chapter 4 Means, standard deviations and standard errors

• [EMS] Example 4.4 on page 39
```
## population mean and s.d.
set.seed(20)
mu <- 78.2 # the mean of population dbp
sigma <- 9.4 # the s.d. of population dbp
N <- 250
Y <- round(rnorm(N, mean=mu, sd=sigma), 2)
Y[1:10] # 10 values of the population

## distribution of population
hist(Y, xlim=c(40, 120))

## sampling from the population
n <- 10 # the number of observations in each sample
y.sam <- sample(Y, size=n, replace=FALSE)
y.sam # this is what you got

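## population vs. sample mean and s.d.
mean(Y)     # mean of the population values
mean(y.sam) # sample mean
sd(Y)       # s.d. of the population values
sd(y.sam)   # sample s.d.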
## comparison of distributions between population and one sample
par(mfrow=c(1,2))
hist(Y, xlim=c(40, 120))
hist(y.sam, xlim=c(40, 120))

## sampling variation
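sampling.var <- function(n, n.trial, sigma, mu, N){
  ## draw a fresh population of size N, then take n.trial samples of size n
  Y <- rnorm(N, mean=mu, sd=sigma)
  sam.mean <- numeric(n.trial) # preallocate the vector of sample means
  for(i in seq_len(n.trial)){
    y.sam <- sample(Y, size=n, replace=FALSE)
    sam.mean[i] <- mean(y.sam)
  }
  return(sam.mean)
}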

n <- 10
n.trial <- 100 # the number of trials
y.sam <- sample(Y, size=n, replace=FALSE)
y.sam # observe what you got; try another sample

set.seed(20)
sam.mean <- sampling.var(n = n, n.trial=n.trial, sigma=sigma, mu=mu, N=N)
## standard error (s.e.)
se <- sigma/sqrt(n) #theoretical s.e.
se
sd(sam.mean)

y.sam <- sample(Y, size=n, replace=FALSE) # you only have one sample
s <- sd(y.sam)
s/sqrt(n)

par(mfrow=c(1,1))
plot(50.5, mean(Y), xlab="Identification number of trial", ylab="Sample mean", col=2, pch=15, cex=1.3, xlim=c(1,100), ylim=c(65, 95))
abline(h=mu, col=3, lty=2)
points(1:n.trial, sam.mean)
abline(h=mu+1.96*se, col=4, lty=3)
abline(h=mu-1.96*se, col=4, lty=3)

## sampling distribution as function of n and sigma
bin <- 10
hist(sam.mean, bin, xlim=c(40, 120))

sam.mean.10.s10 <- sampling.var(n = 10, n.trial=n.trial, sigma=10, mu=mu, N=N) # n=10, sigma=10
sam.mean.30.s10 <- sampling.var(n = 30, n.trial=n.trial, sigma=10, mu=mu, N=N) # n=30, sigma=10
sam.mean.10.s20 <- sampling.var(n = 10, n.trial=n.trial, sigma=20, mu=mu, N=N) # n=10, sigma=20
sam.mean.30.s20 <- sampling.var(n = 30, n.trial=n.trial, sigma=20, mu=mu, N=N) # n=30, sigma=20

par(mfrow=c(2,2))
hist(sam.mean.10.s10, bin, xlim=c(40, 120))
hist(sam.mean.30.s10, bin, xlim=c(40, 120))
hist(sam.mean.10.s20, bin, xlim=c(40, 120))
hist(sam.mean.30.s20, bin, xlim=c(40, 120))
```
• Underlying distribution is non-normal: R code:
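
The code linked in the last bullet is not reproduced on this page. The sketch below illustrates the same idea under stated assumptions: the population is drawn from an exponential distribution (a hypothetical choice of a right-skewed population), and the names `Y.skew`, `mean.n5` and `mean.n50` are introduced only for this example. It suggests that the distribution of sample means becomes narrower and more symmetrical as the sample size grows, even though the population is clearly non-normal.

```r
## Sketch: sampling distribution of the mean when the population is non-normal
set.seed(20)
N <- 250                                  # population size, as in Example 4.4
Y.skew <- rexp(N, rate=1/10)              # hypothetical right-skewed population
n.trial <- 1000                           # number of repeated samples

## sample means for two sample sizes (sampling without replacement)
mean.n5  <- replicate(n.trial, mean(sample(Y.skew, size=5)))
mean.n50 <- replicate(n.trial, mean(sample(Y.skew, size=50)))

op <- par(mfrow=c(1,3))
hist(Y.skew,   main="Population",            xlab="")
hist(mean.n5,  main="Sample means, n = 5",   xlab="", xlim=range(Y.skew))
hist(mean.n50, main="Sample means, n = 50",  xlab="", xlim=range(Y.skew))
par(op)

## spread of the sample means shrinks as n increases
sd(mean.n5)
sd(mean.n50)
```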

#### [EMS] Chapter 5 The normal distribution

R code for learning normal distributions and standard normal distributions
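
The linked code is not reproduced on this page. A minimal sketch of the basic operations follows, assuming the population values from Example 4.4 (`mu.dbp` and `sigma.dbp` are names introduced here) and a hypothetical observation `y` of 90. It shows how a value is standardised to a z-score and that areas under the normal curve agree with areas under the standard normal curve.

```r
## Sketch: normal and standard normal distributions
mu.dbp    <- 78.2                      # population mean, borrowed from Example 4.4
sigma.dbp <- 9.4                       # population s.d., borrowed from Example 4.4
y <- 90                                # hypothetical observation

z <- (y - mu.dbp)/sigma.dbp            # standard normal deviate (z-score)
p.normal <- pnorm(y, mean=mu.dbp, sd=sigma.dbp)  # P(Y <= 90) on the original scale
p.std    <- pnorm(z)                             # same area on the standard scale
c(z=z, p.normal=p.normal, p.std=p.std)

qnorm(0.975)                           # 97.5th percentile of N(0,1), close to 1.96
pnorm(1.96) - pnorm(-1.96)             # area within +/- 1.96 s.d., close to 0.95

## density curve with the area below y marked by a vertical line
curve(dnorm(x, mean=mu.dbp, sd=sigma.dbp),
      from=mu.dbp - 4*sigma.dbp, to=mu.dbp + 4*sigma.dbp,
      xlab="y", ylab="density")
abline(v=y, col=2, lty=2)
```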

#### [EMS] Chapter 6 Confidence interval for a mean

• Interpretation of confidence interval: R code for Fig. 6.2
• Confidence interval using t distributions:
• normal vs. t distribution:
```
mu.mean <- 0
mu <- seq(mu.mean-5, mu.mean+5, 0.001)
sigma <- 1
t.plot <- function(df){
  plot(mu, dnorm(mu, mu.mean, sigma), col=1, lty=2, ylim=c(0.002, 0.6), type="l",
       xlab="", ylab=" ", xaxs="i", yaxs="i", yaxt="n", bty="n")
  lines(mu, dt(mu, df=df), col=2, lty=1)
  text(2.2, .4, "normal distribution")
  text(3, .3, paste("t distribution with", df, "d.f."), col=2)
}
t.plot(df=5) # try another df
```
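
The code for the first two bullets above is not reproduced on this page. The following sketch, which is not the original code, ties them together: it draws repeated samples from a normal population, computes a 95% confidence interval from each using the t distribution, and counts how many intervals contain the true mean. The population values are borrowed from Example 4.4; the names `mu.true`, `sigma.true`, `ci` and `covered` are assumptions made for this example.

```r
## Sketch: 95% confidence intervals based on the t distribution, and their coverage
set.seed(20)
mu.true    <- 78.2                     # true mean, borrowed from Example 4.4
sigma.true <- 9.4                      # true s.d., borrowed from Example 4.4
n <- 10                                # observations per sample
n.trial <- 100                         # number of repeated samples

t.mult <- qt(0.975, df=n-1)            # t multiplier, larger than qnorm(0.975)

## one row per sample: lower and upper confidence limits
ci <- t(replicate(n.trial, {
  y <- rnorm(n, mean=mu.true, sd=sigma.true)
  mean(y) + c(-1, 1) * t.mult * sd(y)/sqrt(n)
}))

covered <- ci[,1] <= mu.true & mu.true <= ci[,2]
mean(covered)                          # proportion of intervals containing mu.true

## intervals that miss the true mean are drawn in red
plot(NA, xlim=c(1, n.trial), ylim=range(ci),
     xlab="Identification number of sample", ylab="95% confidence interval")
segments(x0=1:n.trial, y0=ci[,1], x1=1:n.trial, y1=ci[,2],
         col=ifelse(covered, 1, 2))
abline(h=mu.true, col=3, lty=2)
```

Over many repetitions, about 95% of such intervals are expected to contain the true mean; any single interval either contains it or does not.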
Topic revision: r7 - 14 Jan 2007, LeenaChoi
