CepiNotes < Main < Vanderbilt Biostatistics Wiki


Threshold Results for CES-D

Ok, thanks. Try to get in touch with me by 2pm if you can.

I've finished the parts of the threshold tasks for CES-D. I copied the results below. (The actual document has two rows for each variable, but this option isn't available without using latex().)

Descriptive Statistics by Study group (experimental condition)

+----------------+--------------------+--------------------+--------------------+--------------------+------------------------------+
|                |1 VUBook            |2 CommercialBook    |3 NoBookGiven       |Combined            |  Test                        |
|                |(N=62)              |(N=66)              |(N=70)              |(N=198)             |Statistic                     |
+----------------+--------------------+--------------------+--------------------+--------------------+------------------------------+
|awcess >= 8     |           63% ( 39)|           71% ( 47)|           63% ( 44)|           66% (130)|Chi-square=1.36 d.f.=2 P=0.508|
+----------------+--------------------+--------------------+--------------------+--------------------+------------------------------+
|depress.increase|           19% ( 12)|           24% ( 16)|           23% ( 16)|           22% ( 44)|Chi-square=0.47 d.f.=2 P=0.792|
+----------------+--------------------+--------------------+--------------------+--------------------+------------------------------+
|depress.consec  |           37% ( 23)|           30% ( 20)|           26% ( 18)|           31% ( 61)|Chi-square=2.01 d.f.=2 P=0.366|
+----------------+--------------------+--------------------+--------------------+--------------------+------------------------------+
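As a cross-check, the test statistics in this table can be recomputed from the printed counts alone. The sketch below is not the original analysis code: it rebuilds each 2 by 3 table from the "yes" counts and group sizes shown above and runs an uncorrected Pearson chi-square test. The helper name threshold_test is made up for illustration.

```r
# Group sizes and "yes" counts copied from the table above.
group_n <- c("1 VUBook" = 62, "2 CommercialBook" = 66, "3 NoBookGiven" = 70)
yes_counts <- list(
  "awcess >= 8"      = c(39, 47, 44),
  "depress.increase" = c(12, 16, 16),
  "depress.consec"   = c(23, 20, 18)
)

# Hypothetical helper: build the yes/no by group table and test it.
threshold_test <- function(yes, n) {
  tab <- rbind(yes = yes, no = n - yes)
  colnames(tab) <- names(n)
  chisq.test(tab, correct = FALSE)
}

results <- lapply(yes_counts, threshold_test, n = group_n)

# One row per variable: chi-square statistic, degrees of freedom, p-value.
summary_tab <- t(sapply(results, function(r) {
  c(chisq = unname(r$statistic), df = unname(r$parameter), p = r$p.value)
}))
print(round(summary_tab, 3))
```

If the rounded values agree with the Test Statistic column, the counts and percentages in the table are at least internally consistent.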

Earlier we had said to look only at the threshold items on the aims page that deal with baseline (wave 1) data. Only one item on the aims page uses wave 1 data alone, so I went ahead and did the calculations for more items until we have a chance to meet. I was going to grab some lunch now, but should we touch base sometime this afternoon about the threshold scores (or anything else)? 3/9/09

Document ready to post

I think I've made all the changes we talked about Monday. Will you take a look here and let me know if I should go ahead and post it?

Here's the document, with the changes you indicated yesterday. - 03/05/09

All looks good, with a couple minor edits:

Replace <> with the following: This low p-value might be due just to chance, as opposed to a true relationship between income and group, because we have computed a large number of tests.

Before <> add the following (or something similar): We also noted that the NoBookGroup tended to have a greater proportion of whites and mothers with higher education compared to the other groups (although statistically non-significant), which would also agree with a tendency for higher income mothers in the NoBookGroup.

Chi-Square test warnings in baseline comparison doc

There are only two warnings, now that I removed the two-way frequency tables with expected counts. (The expected cell counts use the chisq.test function and gave warnings.) The two variables that give warnings are marital and race. I would expect race to give a warning because of the low expected counts, but marital does not have this problem:

> chisq.test(bigdat$marital[bigdat$awwave == 1], bigdat$studygrp[bigdat$awwave == 1])$expected
                                  bigdat$studygrp[bigdat$awwave == 1]
bigdat$marital[bigdat$awwave == 1]  1 VUBook 2 CommercialBook 3 NoBookGiven
                           Single  48.467005        51.593909     53.939086
                           Married 10.385787        11.055838     11.558376
                           Other    3.147208         3.350254      3.502538
Warning message:
In chisq.test(bigdat$marital[bigdat$awwave == 1], bigdat$studygrp[bigdat$awwave ==  :
  Chi-squared approximation may be incorrect
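One possible explanation for the marital warning: chisq.test() issues it whenever any expected count is below 5, and in the output above the "Other" row has expected counts between 3.1 and 3.5. The sketch below recomputes the expected counts from the table margins and flags the cells under 5. The margins are back-calculated from the printed expected counts (row totals 154, 33, 10 and column totals 62, 66, 69), so they are an inference from the output, not values read from bigdat.

```r
# Margins back-calculated from the expected counts printed above
# (an inference from the output, not taken from the raw data).
row_totals <- c(Single = 154, Married = 33, Other = 10)
col_totals <- c("1 VUBook" = 62, "2 CommercialBook" = 66, "3 NoBookGiven" = 69)
stopifnot(sum(row_totals) == sum(col_totals))  # both margins sum to 197

# Expected count under independence: row total * column total / grand total.
expected <- outer(row_totals, col_totals) / sum(row_totals)
print(round(expected, 3))

# Cells with an expected count below 5, the condition that triggers the warning.
low_cells <- expected < 5
print(low_cells)
```

With the real data, the same check is any(chisq.test(x, y)$expected < 5); if it is TRUE, a Fisher's exact test or a simulated p-value may be worth comparing against the Pearson result.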

03/02/09

I looked up chisq.test() in the help, and under "Details" it says that the continuity correction is only used for 2 by 2 tables. I looked on Wikipedia, which confirmed that the correction should only be used on 2 by 2 tables. This explains why we've been getting the same answer for chi-square tests with correct = TRUE and correct = FALSE.
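A small illustration of this behavior, using the awcess >= 8 counts from the CES-D threshold table on this page. It is only a sketch: the 2 by 2 table is formed by keeping the first two study groups, which is done here purely to show the correction taking effect and is not an analysis we planned.

```r
# Counts of awcess >= 8 (yes/no) by study group, from the threshold table.
counts <- matrix(c(39, 47, 44,
                   23, 19, 26),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(awcess8 = c("yes", "no"),
                                 group = c("1 VUBook", "2 CommercialBook",
                                           "3 NoBookGiven")))

# 2 x 3 table: the correct argument is ignored, so both calls agree.
stat_2x3 <- c(
  corrected   = unname(chisq.test(counts, correct = TRUE)$statistic),
  uncorrected = unname(chisq.test(counts, correct = FALSE)$statistic)
)

# 2 x 2 table (first two groups only): the continuity correction now applies.
sub <- counts[, 1:2]
stat_2x2 <- c(
  corrected   = unname(chisq.test(sub, correct = TRUE)$statistic),
  uncorrected = unname(chisq.test(sub, correct = FALSE)$statistic)
)

print(rbind(stat_2x3, stat_2x2))
```

The two entries in the stat_2x3 row are identical, while in the stat_2x2 row the corrected statistic is smaller than the uncorrected one.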

Nice catch. I've never used the continuity correction in real life, but that does seem to explain things. So I think it's a matter of Pearson vs. Fisher's for small expected cell counts.

02/27/09

Instead of trying to get a CMH row mean score, it makes the most sense to use either the built-in proportional odds p-value (if you make the variable ordinal) or the Kruskal-Wallis test (if you remove the factor status so that it's numeric), which is the same as the MH row mean score test with rank scores. I don't know how to get standardized ranks, but I think these other ones will work just fine.

The baseline tests look great. The only one that is significant is income. Out of curiosity, what happens to income if we use a Kruskal-Wallis test (i.e. remove the factor status on the variable)? I'm not trying to data snoop - just still somewhat suspicious of the proportional odds p-value.

For the income variable, I'm a little unclear: we don't have their actual incomes. Are you thinking of something like: as.numeric(income)? I will try it now. I think it would give something like 1, 2, 3, 4, 5, or 6.



Descriptive Statistics by Study group (experimental condition)

+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|                           |N  |1 VUBook           |2 CommercialBook   |3 NoBookGiven      |  Test                       |
|                           |   |(N=62)             |(N=66)             |(N=70)             |Statistic                    |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|income : 1 Less than $8,000|120|           29% (13)|           29% (10)|           15% ( 6)|Chi-square=7.2 d.f.=2 P=0.027|
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    2 $8,000 - $12,000     |   |           22% (10)|           15% ( 5)|           20% ( 8)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    3 $12,001 - $16,000    |   |           11% ( 5)|           18% ( 6)|            2% ( 1)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    4 $16,001 - $21,000    |   |            4% ( 2)|           15% ( 5)|           12% ( 5)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    5 $21,001 - $26,000    |   |           13% ( 6)|            6% ( 2)|           12% ( 5)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    6 $26,001 - $30,000    |   |            4% ( 2)|            3% ( 1)|            5% ( 2)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    7 $30,001 - $40,000    |   |            4% ( 2)|            9% ( 3)|           10% ( 4)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    8 $40,001 - $50,000    |   |            7% ( 3)|            0% ( 0)|            7% ( 3)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|    over $50,000           |   |            4% ( 2)|            6% ( 2)|           17% ( 7)|                             |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
|as.numeric(income)         |120|            1/2/5  |            1/3/4  |            2/5/7  |   F=3.6 d.f.=2,117 P=0.031  |
+---------------------------+---+-------------------+-------------------+-------------------+-----------------------------+
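For anyone who wants to rerun the numeric version without bigdat, the sketch below rebuilds the 120 income responses from the counts in the table above, scores the nine categories 1 to 9 (which is what as.numeric() would return for a factor with these levels), and runs a plain Kruskal-Wallis test. Note that the table reports this test in its F-statistic form, so the chi-square that kruskal.test() prints is on a different scale. The variable names here are made up for the example.

```r
# Counts per income category (rows 1-9) and study group, copied from the table.
income_counts <- cbind(
  "1 VUBook"         = c(13, 10, 5, 2, 6, 2, 2, 3, 2),
  "2 CommercialBook" = c(10,  5, 6, 5, 2, 1, 3, 0, 2),
  "3 NoBookGiven"    = c( 6,  8, 1, 5, 5, 2, 4, 3, 7)
)
stopifnot(sum(income_counts) == 120)

# Expand the table to one value per mother; as.vector() runs down each column,
# so the category codes and group labels are repeated in the same order.
cell_n <- as.vector(income_counts)
score  <- rep(rep(1:9, times = ncol(income_counts)), times = cell_n)
group  <- factor(rep(rep(colnames(income_counts), each = 9), times = cell_n),
                 levels = colnames(income_counts))

# Quartiles by group, for comparison with the as.numeric(income) row.
quartiles <- tapply(score, group, quantile, probs = c(0.25, 0.5, 0.75))
print(quartiles)

# Kruskal-Wallis test on the numeric scores.
kw <- kruskal.test(score ~ group)
print(kw)
```

The quartiles come out as 1/2/5, 1/3/4 and 2/5/7, matching the as.numeric(income) row, which suggests the reconstruction agrees with the data behind the table.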

Did we group "other" for awbq09 as "not working"? Maybe we should make them working since maternity leave would imply they have a job.

Originally we did say "other" should be with unemployed, I think, but after seeing that they were on maternity leave, I put them in with full time.

02/26/09

For the employment status variable, awbq09, in wave 1 all 8 mothers who said "other" specified "maternity leave."

02/25/09

I copied this from the SAS online documentation:

ANOVA (Row Mean Scores) Statistic

The ANOVA statistic can be used only when the column variable Y lies on an ordinal (or interval) scale so that the mean score of Y is meaningful. For the ANOVA statistic, the mean score is computed for each row of the table, and the alternative hypothesis is that, for at least one stratum, the mean scores of the R rows are unequal. In other words, the statistic is sensitive to location differences among the R distributions of Y.

The matrix of column scores C_h has dimension 1 × C, and the column scores are determined by the SCORES= option.

The matrix of row scores R_h has dimension (R-1) × R and is created internally by PROC FREQ as

R_h = [ I_{R-1} , -J_{R-1} ]

where I_{R-1} is an identity matrix of rank R-1, and J_{R-1} is an (R-1) × 1 vector of ones. This matrix has the effect of forming R-1 independent contrasts of the R mean scores.
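To make the row score matrix concrete, here is a small R sketch that builds R_h for R = 3 rows (e.g. our three study groups) and applies it to a vector of row mean scores. The mean scores are made-up numbers chosen only to show what the contrasts look like; nothing here comes from PROC FREQ itself.

```r
R <- 3  # number of rows; three is chosen for illustration

# R_h = [ I_{R-1} , -J_{R-1} ]: identity block next to a column of -1s.
R_h <- cbind(diag(R - 1), -rep(1, R - 1))
print(R_h)

# Applying R_h to the row mean scores gives R-1 differences, each row
# compared against the last row. These mean scores are made up.
mean_scores <- c(2.0, 3.0, 5.0)
contrasts <- R_h %*% mean_scores
print(contrasts)
```

Each row of R_h sums to zero, so the contrasts are all zero exactly when the R mean scores are equal, which is the null hypothesis described above.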

When there is only one stratum, this CMH statistic is essentially an analysis of variance (ANOVA) statistic in the sense that it is a function of the variance ratio F statistic that would be obtained from a one-way ANOVA on the dependent variable Y. If nonparametric scores are specified in this case, then the ANOVA statistic is a Kruskal-Wallis test.
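The single-stratum statement can be checked numerically in R. The sketch below assumes the usual form of the row mean score statistic for one stratum, Q = (N - 1) * SS_between / SS_total, computes it on the ranks of the income scores (rebuilt from the counts in the baseline income table on this page), and compares it with kruskal.test(). It also rewrites Q as a function of the one-way ANOVA F statistic on the ranks. This is an illustration under that assumption, not output from PROC FREQ.

```r
# Income counts by category (1-9) and study group, from the baseline table.
income_counts <- cbind(
  "1 VUBook"         = c(13, 10, 5, 2, 6, 2, 2, 3, 2),
  "2 CommercialBook" = c(10,  5, 6, 5, 2, 1, 3, 0, 2),
  "3 NoBookGiven"    = c( 6,  8, 1, 5, 5, 2, 4, 3, 7)
)
cell_n <- as.vector(income_counts)
score  <- rep(rep(1:9, times = 3), times = cell_n)
group  <- factor(rep(rep(colnames(income_counts), each = 9), times = cell_n),
                 levels = colnames(income_counts))

N <- length(score)
k <- nlevels(group)
r <- rank(score)  # midranks, so ties share the average rank

# One-way ANOVA on the ranks.
fit        <- anova(lm(r ~ group))
ss_between <- fit[["Sum Sq"]][1]
ss_within  <- fit[["Sum Sq"]][2]
f_stat     <- fit[["F value"]][1]

# Assumed single-stratum row mean score statistic, using rank scores.
q_ranks <- (N - 1) * ss_between / (ss_between + ss_within)

# The same quantity written as a function of the ANOVA F statistic.
q_from_f <- (N - 1) * (k - 1) * f_stat / ((k - 1) * f_stat + (N - k))

kw <- kruskal.test(score ~ group)
print(c(q_ranks = q_ranks, q_from_f = q_from_f, kruskal = unname(kw$statistic)))
```

All three printed values agree, which is the sense in which the rank-score version of the statistic is a Kruskal-Wallis test and a function of F.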

If there is more than one stratum, then this CMH statistic corresponds to a stratum-adjusted ANOVA or Kruskal-Wallis test. In the special case where there is one subject per row and one subject per column in the contingency table of each stratum, this CMH statistic is identical to Friedman's chi-square. See Example 2.8 for an illustration.

Topic revision: r14 - 09 Mar 2009, BenSaville
