* Change directories and open a log file
cd "C:\Desktop\PBC_Analysis"
log using pbclog, replace

* Read in the data set
use pbc.dta, clear

* Explore the data set
describe
codebook
browse
list
list id drug sex
list in 1/10
list if sex == 1
list id drug if sex == 1

* Create new continuous variables
generate ageyrs = age/365.25
generate fuyrs = fudays/365.25

* Create new categorical variable
generate censored = .
replace censored = 1 if status == 0 | status == 1
replace censored = 2 if status == 2

* Explore the revised data set
describe

* Add labels to new variables
label variable ageyrs "Age (yrs)"
label variable fuyrs "Follow-up (yrs)"
label variable censored "2-level survival status"

* Overwrite some existing labels
label variable age "Age (days)"
label variable fudays "Follow-up (days)"
label variable status "3-level survival status"

* Add value labels
label define statuslabel 0 "Censored" 1 "Censored due to liver treatment" 2 "Dead"
label values status statuslabel
label define censoredlabel 1 "Censored" 2 "Dead"
label values censored censoredlabel
label define druglabel 1 "D-penicillamine" 2 "Placebo"
label values drug druglabel
label define sexlabel 1 "Female" 2 "Male"
label values sex sexlabel
label define asciteslabel 0 "No" 1 "Yes"
label values ascites asciteslabel
label define stagelabel 1 "1" 2 "2" 3 "3" 4 "4"
label values stage stagelabel

* Explore the revised data set
codebook
list in 1/10

* Summarize continuous variables
* Default output: no. of non-missing obs, mean, SD, min, and max
summarize fudays fuyrs age ageyrs bili chol album
* detail = percentiles (1, 5, 10, 25, 50, 75, 90, 95, 99), 4 smallest, 4 largest, n obs, mean, SD, variance, skewness, kurtosis
summarize fudays fuyrs age ageyrs bili chol album, detail
* meanonly displays no output; results are stored in r(), e.g. r(mean)
summarize ageyrs, meanonly
summarize ageyrs if sex == 1
summarize ageyrs if sex == 2
tabstat ageyrs, by(sex) statistics(count mean sd p25 median p75) missing
tabstat ageyrs, by(drug) statistics(count mean sd p25 median p75) missing

* Summarize categorical variables
tabulate sex
tabulate drug, missing
* sex in rows, drug in columns
tabulate sex drug, missing column
* stage in rows, sex in (nested) columns, drug in "super"-columns
table stage sex drug, missing
table stage sex drug, contents(freq mean ageyrs) missing

* Close the log file
log close