version 11 set more on log using 9.3.Framingham.log, replace * 9.3.Framingham.log * * Estimate the effect of age and gender on coronary heart disease (CHD) * using several Poisson regression models (Levy 1999). * use 8.12.Framingham.dta, clear * * Fit a multiplicative model of the effect of gender and age on CHD * glm chd_cnt i.age_gr male, family(poisson) link(log) /// lnoffset(pt_yrs) eform * * Tabulate patient-years of follow-up and number of * CHD events by sex and age group. * table sex, contents(sum pt_yrs sum chd_cnt) by(age_gr) * * Calculate age-sex specific incidence of CHD * collapse (sum) patients = pt_yrs chd = chd_cnt, by(age_gr male) generate rate = 1000*chd/patients generate men = rate if male == 1 generate women = rate if male == 0 graph bar men women, over(age_gr) ytitle(CHD Morbidity Rate per 1000) /// ylabel(0(5)40, angle(0)) subtitle(Age, position(6)) /// legend(order(1 "Men" 2 "Women") ring(0) position(11) col(1)) more use 8.12.Framingham.dta, clear * * Add interaction terms to the model * glm chd_cnt age_gr##male, family(poisson) link(log) lnoffset(pt_yrs) lincom 1.male, irr lincom 1.male + 50.age_gr#1.male, irr lincom 1.male + 55.age_gr#1.male, irr lincom 1.male + 60.age_gr#1.male, irr lincom 1.male + 65.age_gr#1.male, irr lincom 1.male + 70.age_gr#1.male, irr lincom 1.male + 75.age_gr#1.male, irr lincom 1.male + 80.age_gr#1.male, irr lincom 1.male + 81.age_gr#1.male, irr display chi2tail(8, 1391.341888 - 1361.574107) * * Refit model with interaction terms using fewer parameters. * generate age_gr2 = recode(age_gr, 45,55,60,80,81) glm chd_cnt age_gr2##male /// , family(poisson) link(log) lnoffset(pt_yrs) eform lincom 1.male + 55.age_gr2#1.male, irr lincom 1.male + 60.age_gr2#1.male, irr lincom 1.male + 80.age_gr2#1.male, irr lincom 1.male + 81.age_gr2#1.male, irr table bmi_gr * * The i. syntax only works for integer variables. bmi_gr gives the * quartile boundarys to one decimal place. We multiply this variable * by 10 in order to be able to use this syntax. Since indicator * covariates are entered into the model, multiplying by 10 will * not affect our estimates * gen bmi_gr10 = bmi_gr*10 * * Adjust analysis for body mass index (BMI) * glm chd_cnt age_gr2##male i.bmi_gr10 /// , family(poisson) link(log) lnoffset(pt_yrs) display chi2tail(3,1400.582451 - 1327.64597) * * Adjust estimates for BMI and serum cholesterol * glm chd_cnt age_gr2##male i.bmi_gr10 i.scl_gr /// , family(poisson) link(log) lnoffset(pt_yrs) display chi2tail(3,1327.64597 - 1207.974985) * * Adjust estimates for BMI, serum cholesterol and * diastolic blood pressure * glm chd_cnt age_gr2##male i.bmi_gr10 i.scl_gr i.dbp_gr /// , family(poisson) link(log) lnoffset(pt_yrs) eform lincom 1.male + 55.age_gr2#1.male, irr lincom 1.male + 60.age_gr2#1.male, irr lincom 1.male + 80.age_gr2#1.male, irr lincom 1.male + 81.age_gr2#1.male, irr display chi2tail(3,1207.974985 - 1161.091086) * * Compress data set for residual plot * sort male bmi_gr10 scl_gr dbp_gr age_gr2 collapse (sum) pt_yrs=pt_yrs chd_cnt=chd_cnt /// , by (male bmi_gr10 scl_gr dbp_gr age_gr2) * * Re-analyze previous model using collapsed data set. * glm chd_cnt age_gr2##male i.bmi_gr10 i.scl_gr i.dbp_gr /// , family(poisson) link(log) lnoffset(pt_yrs) * * Estimate the expected number of CHD events and the * standardized deviance residual for each record in the data set. * predict e_chd, mu predict dev, standardized deviance generate e_rate = 1000*e_chd/pt_yrs label variable e_rate "Incidence of CHD per Thousand" * * Draw scatterplot of the standardized deviance residual versus the * estimated incidence of CHD. Include lowess regression curve on this plot. * lowess dev e_rate, bwidth(0.2) msymbol(Oh) ylabel(-3(1)4) ytick(-3(0.5)4) /// lineopts(color(red) lwidth(medthick)) yline(-2 0 2, lcolor(blue)) /// xlabel(0(10)80) xtick(5(10)75) log close