Chat Room for Q and A
- The following are from 2006
Q1
In discussing the homework with some members of our group, we have noticed that there is something slightly different about the SUPPORT dataset that was updated on the website yesterday (Feb. 7), and the one that existed before that. You might want to mention to everyone in class that the dataset has been updated, and the current version is the one that should be downloaded. It makes a slight difference in the results of the linear regression models involving the transformed total cost variable.
A1
Thanks for letting me know. I am not sure why this happened, because I only uploaded it once, so there should not be two different versions... Anyway, I will let the rest of the class know.
Q2
In the updated dataset, when transforming totcst to ln_cost in SPSS, we now get a "warning" message in the Output file that says: "The argument for the natural log function is less than or equal to zero. The result has been set to the system-missing value." Is this okay? Some people are getting slightly different numbers for the results of the confidence intervals and B values, and I believe it may be due to the issue of these missing values--which must be different between the new and old datasets.
A2
This is probably because some patients in the database have zero hospital cost (which sounds odd, though). The log of zero is undefined, so the transformation returns a missing value. When a variable in the model is missing, that person's entire observation is dropped from the analysis, so it is better not to leave it as missing. Try replacing the zero costs with $1 before the log transformation; $1 converts to zero on the log scale. Another solution is to use a power transformation, for example the cube root: totcst**(1/3)
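As a rough illustration of the two options in A2, here is a minimal Python sketch (the class itself uses SPSS). The cost values and the helper names ln_cost_with_floor and cube_root_cost are made up for this example; only totcst, ln_cost, the $1 replacement, and the cube root come from the discussion above.

```python
import math

# Hypothetical total costs in dollars; the zero mimics a patient with no recorded cost.
totcst = [0.0, 1500.0, 27000.0, 390.5]

def ln_cost_with_floor(cost, floor=1.0):
    """Replace a non-positive cost with `floor` dollars, then take the natural log."""
    return math.log(cost if cost > 0 else floor)

def cube_root_cost(cost):
    """Cube-root transform; defined at zero, so no replacement is needed.
    Assumes cost is non-negative."""
    return cost ** (1.0 / 3.0)

ln_cost = [ln_cost_with_floor(c) for c in totcst]
cbrt_cost = [cube_root_cost(c) for c in totcst]

for raw, ln_c, cb in zip(totcst, ln_cost, cbrt_cost):
    print(f"totcst={raw:>8.1f}  ln_cost={ln_c:7.3f}  cube_root={cb:7.3f}")
```

With either approach the zero-cost patient keeps a usable value (0 on both scales) instead of being dropped as missing.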
Q3
How do you deal with ln_cost in this instance? It doesn't seem to make sense to say "There was a $.542 decrease in the natural log of hospital cost when albumin level increases by 1 g/dL," does it? Would it be best just to report the B value and 95% CI for a transformed variable, or can you still attach biological meaning to it (and try to explain what it indicates)?
A3
This is a great question! You will probably want to take the antilog (the exponential of the B value, .542 here) and put it in your paper, and do the same for the 95% CI limits. The problem is that the CI is no longer symmetric around the estimate after taking the antilog... These issues (Q2 and Q3) are well-known difficulties with using transformed variables.
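To make the antilog step in A3 concrete, here is a small Python sketch. It assumes B = -0.542 (negative, since Q3 describes a decrease); the 95% CI limits are hypothetical numbers chosen only for illustration, not results from the SUPPORT data.

```python
import math

# B for albumin from Q3, taken as negative because ln_cost decreases.
b = -0.542
# Hypothetical 95% CI limits on the log scale (symmetric around b).
ci_low_log, ci_high_log = -0.742, -0.342

# Antilog: exp(B) is the multiplicative change in cost per 1 g/dL of albumin.
ratio = math.exp(b)
ci_low = math.exp(ci_low_log)
ci_high = math.exp(ci_high_log)
pct_change = (ratio - 1.0) * 100.0

print(f"cost ratio per 1 g/dL: {ratio:.3f} ({pct_change:.1f}% change)")
print(f"95% CI for the ratio: {ci_low:.3f} to {ci_high:.3f}")
# The two distances differ, so the back-transformed CI is not symmetric.
print(f"distance below: {ratio - ci_low:.3f}, distance above: {ci_high - ratio:.3f}")
```

Under these assumptions the antilog of B is about 0.58, which reads as roughly 42% lower cost for each 1 g/dL increase in albumin, and the upper CI limit sits farther from the estimate than the lower one.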