log using 5.20.EsophagealCa.ClassVersion.log, replace
set more on
* 5.20.EsophagealCa.ClassVersion.log
*
* Regress esophageal cancers against age and dose of alcohol
* and tobacco using a multiplicative model.
*
use 5.5.EsophagealCa.dta, clear
sort tobacco
by tobacco: tabulate cancer alcohol [freq=patients] , column
*
* Combine tobacco levels 2 and 3 in a new variable called smoke
*
generate smoke = tobacco
recode smoke 3=2 4=3
label variable smoke "Smoking (gm/day)"
label define smoke 1 "0-9" 2 "10-29" 3 ">= 30"
label values smoke smoke
table smoke tobacco [freq=patients], row col
*
* Regress cancer against age, alcohol and smoke
* using a multiplicative model
*
logistic cancer i.age i.alcohol i.smoke [freq=patients]
lincom 2.alcohol + 2.smoke, or
lincom 3.alcohol + 2.smoke, or
lincom 4.alcohol + 2.smoke, or
lincom 2.alcohol + 3.smoke, or
lincom 3.alcohol + 3.smoke, or
lincom 4.alcohol + 3.smoke, or
*
* Regress cancer against age, alcohol and smoke.
* Include alcohol-smoke interaction terms.
*
logistic cancer i.age alcohol##smoke [freq=patients]
lincom 2.alcohol + 2.smoke + 2.alcohol#2.smoke, or
lincom 2.alcohol + 3.smoke + 2.alcohol#3.smoke, or
lincom 3.alcohol + 2.smoke + 3.alcohol#2.smoke, or
lincom 3.alcohol + 3.smoke + 3.alcohol#3.smoke, or
lincom 4.alcohol + 2.smoke + 4.alcohol#2.smoke, or
lincom 4.alcohol + 3.smoke + 4.alcohol#3.smoke, or
*
* Calculate delta deviance
*
display 2*( 351.96823 -349.29335 )
display chi2tail(6, 5.34976)
more
*
* Perform Pearson chi-squared and Hosmer-Lemeshow tests of
* goodness of fit.
*
lfit
lfit, group(10) table
*
* Perform residual analysis
*
predict p, p
label variable p ///
    "Estimate of {&pi} for the j{superscript:th} Covariate Pattern"
predict dx2, dx2
predict rstandard, rstandard
generate dx2_pos = dx2 if rstandard >= 0
generate dx2_neg = dx2 if rstandard < 0
label variable dx2_pos "Positive residual"
label variable dx2_neg "Negative residual"
predict dbeta, dbeta
scatter dx2_pos p [weight=dbeta] ///
    , msymbol(Oh) mlwidth(medthick) mcolor(red) ///
    || scatter dx2_neg p [weight=dbeta] ///
    , msymbol(Oh) mlwidth(medthick) mcolor(blue) ///
    ||, ylabel(0(1)8, angle(0)) ///
    ymtick(0(.5)8) yline(3.84, lwidth(medthick)) ///
    xlabel(0(.1)1) xmtick(0(.05)1) ///
    ytitle("Squared Standardized Pearson Residual") ///
    xscale(titlegap(2))
more
save temporary, replace
drop if patients == 0
generate ca_no = cancer*patients
collapse (sum) n = patients ca = ca_no, by(age alcohol smoke dbeta dx2 p)
*
* Identify covariate patterns associated with large squared residuals
*
list n ca age alcohol smoke dbeta dx2 p if dx2 > 3.84, nodisplay
*
* Rerun analysis without the covariate pattern A
*
use temporary, clear
drop if age == 4 & alcohol == 4 & smoke == 1
logistic cancer i.age i.alcohol##i.smoke [freq=patients]
lincom 2.alcohol + 2.smoke + 2.alcohol#2.smoke, or
lincom 2.alcohol + 3.smoke + 2.alcohol#3.smoke, or
lincom 3.alcohol + 2.smoke + 3.alcohol#2.smoke, or
lincom 3.alcohol + 3.smoke + 3.alcohol#3.smoke, or
lincom 4.alcohol + 2.smoke + 4.alcohol#2.smoke, or
lincom 4.alcohol + 3.smoke + 4.alcohol#3.smoke, or
*
* Rerun analysis without the covariate pattern B
*
use temporary, clear
drop if age == 4 & alcohol == 1 & smoke == 3
logistic cancer i.age i.alcohol##i.smoke [freq=patients]
lincom 2.alcohol + 2.smoke + 2.alcohol#2.smoke, or
lincom 2.alcohol + 3.smoke + 2.alcohol#3.smoke, or
lincom 3.alcohol + 2.smoke + 3.alcohol#2.smoke, or
lincom 3.alcohol + 3.smoke + 3.alcohol#3.smoke, or
lincom 4.alcohol + 2.smoke + 4.alcohol#2.smoke, or
lincom 4.alcohol + 3.smoke + 4.alcohol#3.smoke, or
log close
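*
* Optional sketch, not part of the original analysis: the delta deviance
* step in this file types the two log likelihoods in by hand. Assuming
* temporary.dta (saved earlier in this file) still holds the full data
* set with the smoke variable, the same likelihood ratio test can
* probably be obtained without copying numbers, by storing both fits
* and calling lrtest. The names multiplicative and interaction are
* illustrative labels only. This block runs after log close, so its
* output is not written to the log file.
*
use temporary, clear
quietly logistic cancer i.age i.alcohol i.smoke [freq=patients]
estimates store multiplicative
quietly logistic cancer i.age i.alcohol##i.smoke [freq=patients]
estimates store interaction
* lrtest reports the change in deviance, its degrees of freedom and
* the P value for the nested comparison
lrtest multiplicative interaction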