Problem 1

A random sample of 170 plants (Tulipa armena) was taken to examine the association between stem length and flower color. The following data were collected:
  • 40 Short and Red
  • 40 Tall and Red
  • 30 Short and Yellow
  • 60 Tall and Yellow

  1. What is the probability of being
    • Yellow given short
    • Yellow given tall
    • Yellow
    • Short given yellow
  2. Test the null hypothesis that there is no association between flower color and stem length. Provide the name of the test used, the test statistic, degrees of freedom (if appropriate), p-value, and decision to reject or fail to reject the null hypothesis using a significance level of 0.05.
  3. Calculate the odds ratio for a plant being red given short relative to red given tall. Also provide a 95% confidence interval for the odds ratio and an interpretation of your findings. What similar information is provided by the confidence interval for the odds ratio and the significance test conducted in (2)?
  4. If two additional flower colors are also observed (e.g. blue and green), what test would you use to determine if there is any association between color and stem length? What would be the degrees of freedom of this test?
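The quantities asked for in Problem 1 can all be computed directly from the four cell counts. The following is a minimal Python sketch, not a prescribed method: it assumes the Pearson chi-square test without continuity correction for question 2 and a Wald interval on the log odds ratio scale for question 3, which may differ from the formulas in your notes. The function names are hypothetical and chosen only for illustration.

```python
import math

# Counts from the problem statement, keyed by (stem length, color).
counts = {
    ("short", "red"): 40,
    ("tall", "red"): 40,
    ("short", "yellow"): 30,
    ("tall", "yellow"): 60,
}


def conditional_probability(counts, color, stem):
    """P(color | stem): the share of plants with this stem length that have this color."""
    stem_total = sum(n for (s, _), n in counts.items() if s == stem)
    return counts[(stem, color)] / stem_total


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Sample odds ratio ad/(bc) with a Wald interval built on the log scale."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Table layout assumed here: rows = short, tall; columns = red, yellow.
a, b = counts[("short", "red")], counts[("short", "yellow")]
c, d = counts[("tall", "red")], counts[("tall", "yellow")]

print("P(yellow | short):", conditional_probability(counts, "yellow", "short"))
print("chi-square, p-value:", pearson_chi_square_2x2(a, b, c, d))
print("odds ratio, 95% CI:", odds_ratio_with_ci(a, b, c, d))
```

Working on the log scale keeps the interval asymmetric around the odds ratio, which matters when cell counts are modest. Whether the interval contains 1 can then be compared with the decision from the test in question 2.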

Problem 2

It is believed that the proportion of smokers among teenage boys is 0.30 (or 30%). Six teenage boys are randomly sampled from the population.
  1. What is the probability that we find 0 smokers? What about 4 or more smokers?
  2. Plot the binomial probability distribution for $\pi = 0.3$ and $n = 6$. You can either do this by hand or use a computer program (see notes below).
  3. Suppose that we observe 0 smokers in our sample of 6 teenage boys. What is the probability of observing data "as extreme or more extreme" than we collected in this experiment (a p-value)? What is your definition of "as extreme or more extreme"?
  4. Did we include enough subjects in the study to adequately test $H_0: \pi = 0.30$? Why or why not?
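For those who prefer to compute the binomial probabilities in Problem 2 programmatically, here is a minimal Python sketch using only the standard library. The function names are hypothetical. The p-value function implements just one possible convention for "as extreme or more extreme" (summing the probabilities of all outcomes that are no more likely than the observed one); other definitions are defensible, and question 3 asks you to state your own.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


def small_probability_p_value(observed, n, p):
    """One convention for a two-sided p-value (an assumption, not the only choice):
    sum the probabilities of every outcome that is no more likely than the
    observed count. The small relative tolerance guards against rounding."""
    threshold = binomial_pmf(observed, n, p) * (1 + 1e-7)
    probs = (binomial_pmf(j, n, p) for j in range(n + 1))
    return sum(q for q in probs if q <= threshold)


def text_plot(n, p, width=50):
    """Crude text rendering of the probability mass function, one bar per outcome."""
    lines = []
    for k in range(n + 1):
        prob = binomial_pmf(k, n, p)
        lines.append(f"{k:2d} | {'#' * round(prob * width)} {prob:.4f}")
    return "\n".join(lines)


n, pi = 6, 0.3
print("P(X = 0): ", binomial_pmf(0, n, pi))
print("P(X >= 4):", upper_tail(4, n, pi))
print(text_plot(n, pi))
```

The text plot is only a stand-in for a proper bar chart; a hand-drawn plot or the R Commander menu described in the notes serves the same purpose.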

Problem 3

We wish to examine the association between race and low birth weight.
  • The outcome (y) is low birth weight, with y = 1 if weight is <= 2500 grams and y = 0 if weight is > 2500 grams
  • Race takes on one of three levels: White, Black, and Other. The model is fit by defining two indicator variables
    • x1 = 1 if race is 'Black' and x1 = 0 otherwise
    • x2 = 1 if race is 'Other' and x2 = 0 otherwise
Consider the following output from a logistic regression model:

| | Coef | S.E. | Wald Z | P |
|---|---|---|---|---|
| Intercept | -1.1550 | 0.2391 | -4.83 | 0.0000 |
| x1 | 0.8448 | 0.4634 | 1.82 | 0.0683 |
| x2 | 0.6362 | 0.3478 | 1.83 | 0.0674 |

  1. What is the odds ratio of low birth weight comparing 'Black' to 'White'? Also give an approximate 95% confidence interval for the odds ratio.
  2. Find the predicted probability of low birth weight for 'White' race and 'Black' race.
  3. Show how the predicted probabilities found in part 2 can be used to find the odds ratio estimate obtained in part 1.
  4. Extra credit: What is the odds ratio estimate comparing 'Other' race to 'Black' race?
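The link between logistic regression coefficients, odds ratios, and predicted probabilities in Problem 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions: the coefficients and standard errors are copied from the output above, the interval is a Wald interval using 1.96 as the normal quantile, and the function names are hypothetical.

```python
import math

# Estimates copied from the logistic regression output in this problem.
coef = {"intercept": -1.1550, "x1": 0.8448, "x2": 0.6362}
std_err = {"intercept": 0.2391, "x1": 0.4634, "x2": 0.3478}


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio exp(beta) with an approximate Wald interval exp(beta -/+ z * se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def predicted_probability(x1, x2):
    """Inverse logit of the linear predictor for the given indicator values."""
    lp = coef["intercept"] + coef["x1"] * x1 + coef["x2"] * x2
    return 1 / (1 + math.exp(-lp))


def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


p_white = predicted_probability(x1=0, x2=0)  # reference level: both indicators are 0
p_black = predicted_probability(x1=1, x2=0)

print("OR (Black vs White), 95% CI:", odds_ratio_ci(coef["x1"], std_err["x1"]))
print("P(low birth weight | White):", p_white)
print("P(low birth weight | Black):", p_black)
print("OR recovered from probabilities:", odds(p_black) / odds(p_white))
```

The last line shows that the ratio of the two predicted odds reproduces exp(coefficient of x1), because the intercept cancels when the odds are divided.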

Notes

  • To create a plot of the probability distribution (e.g. Fig 15.1 in the EMS book) using R Commander, use the menus: Distributions... Discrete distributions... Binomial distribution... Plot binomial distribution. Enter the correct number of trials and the probability of success. You want the default, the 'probability mass function'; the other option plots the cumulative distribution function.

  • In R Commander, the output from Fisher's exact test automatically provides an estimate of a type of odds ratio with a 95% confidence interval. However, just like Fisher's exact test itself, this odds ratio is calculated under the strange, restrictive assumption that the marginal totals are fixed. When calculating the odds ratio and confidence interval for Problem 1, question 3, do not use the odds ratio from Fisher's exact test. Instead, calculate it by hand using the formulas in your notes or in the EMS book.
Topic revision: r7 - 06 May 2009, WikiGuest
 
