LeenaRNotes < Main < Vanderbilt Biostatistics Wiki

<font size="3"><span style="font-family: times new roman,times,serif;">
---+++!! R Notes for Classes
Some of the R code used in class is listed below. [EMS] refers to [[http://www.blackwellpublishing.com/essentialmedstats][Essential Medical Statistics]].
%TOC%

---++++ [EMS] Chapter 3 Displaying the data
*Shapes of frequency distributions:* [[%ATTACHURL%/distributionShape.R][R code]]
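
The attached script is not reproduced on this page. As a minimal sketch of the idea, the code below simulates four common shapes (symmetric, right-skewed, left-skewed, bimodal) and plots their histograms. The distributions, parameters and variable names are arbitrary choices made for illustration and are not taken from the attachment.

<verbatim>
## simulated data for four shapes (all parameters are illustrative assumptions)
set.seed(1)
n <- 5000
symmetric    <- rnorm(n, mean=80, sd=10)                  # bell-shaped
right.skewed <- rexp(n, rate=1/10)                        # long right tail
left.skewed  <- 100 - rexp(n, rate=1/10)                  # mirror image: long left tail
bimodal      <- c(rnorm(n/2, mean=60, sd=5),
                  rnorm(n/2, mean=90, sd=5))              # two peaks

par(mfrow=c(2,2))
hist(symmetric, main="Symmetric")
hist(right.skewed, main="Right-skewed")
hist(left.skewed, main="Left-skewed")
hist(bimodal, main="Bimodal")
par(mfrow=c(1,1))

## the mean is pulled towards the long tail, the median much less so
c(mean=mean(right.skewed), median=median(right.skewed))
c(mean=mean(left.skewed), median=median(left.skewed))
</verbatim>

For the skewed samples the mean and the median differ noticeably, which is one quick numerical check of skewness to go along with the histogram.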

---++++ [EMS] Chapter 4 Means, standard deviations and standard errors
   * Underlying distribution is normal
<verbatim>
## population mean and s.d.
set.seed(20)
mu <- 80 # the mean of population dbp
sigma <- 10 # the s.d. of population dbp
N <- 20000
Y <- rnorm(N, mean=mu, sd=sigma)

## distribution of population
hist(Y, xlim=c(40, 120))

## sampling from the population
n <- 30 # the number of observations in each sample
y.sam <- sample(Y, size=n, replace=FALSE)

## population vs. sample mean and s.d.
mean(Y)
mean(y.sam)
sd(Y)
sd(y.sam)

## comparison of distributions between population and one sample
bin <- 10 # suggested number of histogram cells (passed to hist() as breaks)
par(mfrow=c(1,2))
hist(Y, xlim=c(40, 120))
hist(y.sam, bin, xlim=c(40, 120))

## sampling variation
n <- 30
n.trial <- 100 # the number of trials

sampling.var <- function(n, n.trial, sigma, mu, N){
  ## generate a fresh population, then record the mean of each of n.trial samples
  Y <- rnorm(N, mean=mu, sd=sigma)
  sam.mean <- numeric(n.trial)
  for(i in seq_len(n.trial)){
    y.sam <- sample(Y, size=n, replace=FALSE)
    sam.mean[i] <- mean(y.sam)
  }
  return(sam.mean)
}

sam.mean <- sampling.var(n = n, n.trial=n.trial, sigma=sigma, mu=mu, N=N)
## standard error
y.sam <- sample(Y, size=n, replace=FALSE)
s <- sd(y.sam)
s/sqrt(n) # s.e. estimated from one sample
sigma/sqrt(n) # s.e. based on the population s.d.

par(mfrow=c(1,1))
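## filled square: population mean; dashed line: mu; open circles: the n.trial sample means
plot((n.trial + 1)/2, mean(Y), xlab="", col=2, pch=15, cex=1.3, xlim=c(1, n.trial), ylim=c(65, 95))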
abline(h=mu, col=3, lty=2)
points(1:n.trial, sam.mean)

## sampling distribution as function of n and sigma
hist(sam.mean, bin, xlim=c(40, 120))

sam.mean.10.s10 <- sampling.var(n = 10, n.trial=n.trial, sigma=10, mu=mu, N=N) # n=10, sigma=10
sam.mean.30.s10 <- sampling.var(n = 30, n.trial=n.trial, sigma=10, mu=mu, N=N) # n=30, sigma=10
sam.mean.10.s20 <- sampling.var(n = 10, n.trial=n.trial, sigma=20, mu=mu, N=N) # n=10, sigma=20
sam.mean.30.s20 <- sampling.var(n = 30, n.trial=n.trial, sigma=20, mu=mu, N=N) # n=30, sigma=20

par(mfrow=c(2,2))
hist(sam.mean.10.s10, bin, xlim=c(40, 120))
hist(sam.mean.30.s10, bin, xlim=c(40, 120))
hist(sam.mean.10.s20, bin, xlim=c(40, 120))
hist(sam.mean.30.s20, bin, xlim=c(40, 120))
</verbatim>
   * Underlying distribution is non-normal: [[%ATTACHURL%/samplingVariation.R][R code]]
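
The attached script is not shown on this page. As a rough sketch of the non-normal case, the code below repeats the sampling experiment with a right-skewed population. The exponential population, the helper =sample.means= and the sample sizes are assumptions chosen for illustration, not the contents of the attachment.

<verbatim>
## skewed population (assumed example): exponential with mean 10
set.seed(20)
N <- 20000
Y <- rexp(N, rate=1/10)
n.trial <- 1000

## hypothetical helper: means of n.trial samples of size n drawn from Y
sample.means <- function(Y, n, n.trial){
  replicate(n.trial, mean(sample(Y, size=n, replace=FALSE)))
}

sam.mean.5  <- sample.means(Y, n=5,  n.trial=n.trial)
sam.mean.50 <- sample.means(Y, n=50, n.trial=n.trial)

## the population is skewed; the sample means look more symmetric as n grows
par(mfrow=c(1,3))
hist(Y, main="Population")
hist(sam.mean.5, main="Sample means, n=5")
hist(sam.mean.50, main="Sample means, n=50")
par(mfrow=c(1,1))

## observed spread of the sample means vs. sd(Y)/sqrt(n)
c(observed=sd(sam.mean.5),  expected=sd(Y)/sqrt(5))
c(observed=sd(sam.mean.50), expected=sd(Y)/sqrt(50))
</verbatim>

The spread of the sample means shrinks roughly like 1/sqrt(n) even though the population itself is far from normal.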
---++++ [EMS] Chapter 5 The normal distribution
[[%ATTACHURL%/normalDistribution.R][R code for learning normal distributions and standard normal distributions]]
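
The attached script is not reproduced here. A short sketch of the basic calculations, reusing the mean of 80 and s.d. of 10 from the Chapter 4 example, might look like the following; the cut-off of 95 is an arbitrary value chosen for illustration.

<verbatim>
mu <- 80     # population mean (as in the Chapter 4 example)
sigma <- 10  # population s.d.

## probability that an observation is below 95
pnorm(95, mean=mu, sd=sigma)

## the same probability via the standard normal distribution
z <- (95 - mu)/sigma # 1.5
pnorm(z)

## proportion within 1.96 s.d. of the mean: about 0.95
pnorm(1.96) - pnorm(-1.96)

## quantiles: the 97.5th percentile
qnorm(0.975)                      # about 1.96 on the standard normal scale
qnorm(0.975, mean=mu, sd=sigma)   # mu + 1.96*sigma on the original scale
</verbatim>

Standardising with =z = (y - mu)/sigma= gives the same probabilities as working on the original scale, which is why one table of the standard normal distribution is enough.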
---++++ [EMS] Chapter 6 Confidence interval for a mean
   * Interpretation of confidence interval: 
   * Confidence interval using _t_ distributions:
      * normal vs. _t_ distribution: 
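
The points above can be illustrated with a small simulation. This is only a sketch under assumed settings (normal population with mean 80 and s.d. 10, samples of size 10, 1000 repetitions). It computes one 95% confidence interval by hand, compares the normal and _t_ critical values, and estimates how often each kind of interval covers the true mean.

<verbatim>
set.seed(20)
mu <- 80; sigma <- 10  # assumed population values
n <- 10                # small sample, where normal and t differ most
n.trial <- 1000

z.crit <- qnorm(0.975)
t.crit <- qt(0.975, df=n-1)
c(z=z.crit, t=t.crit)  # the t value is larger, so the t interval is wider

## 95% confidence interval from one sample, by hand and with t.test()
y <- rnorm(n, mean=mu, sd=sigma)
ci.hand <- mean(y) + c(-1, 1) * t.crit * sd(y)/sqrt(n)
ci.hand
t.test(y)$conf.int

## interpretation: repeat the experiment many times and count coverage;
## the interval mean(y) +/- crit*s.e. covers mu exactly when |t.stat| <= crit
t.stat <- replicate(n.trial, {
  y.sam <- rnorm(n, mean=mu, sd=sigma)
  (mean(y.sam) - mu) / (sd(y.sam)/sqrt(n))
})
cover.t <- mean(abs(t.stat) <= t.crit) # close to 0.95
cover.z <- mean(abs(t.stat) <= z.crit) # somewhat below 0.95 for small n
c(t=cover.t, z=cover.z)

## normal vs. t density
curve(dnorm(x), from=-4, to=4, ylab="density")
curve(dt(x, df=n-1), add=TRUE, lty=2)
</verbatim>

The 95% refers to the long-run proportion of intervals that contain the population mean, not to the probability that any single computed interval does.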

---++++ [EMS] Chapter 7 Comparison of two means: confidence intervals, hypothesis tests and p-values 



Topic revision: revision 6 (raw view)

Copyright © 2013-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.