Schildcrout JS, Basford MA, Pulley JM, Masys DR, Roden DM, Wang D, Chute CG, Kullo IJ, Carrell D, Peissig P, Kho A, Denny JC. An analytical approach to characterize morbidity profile dissimilarity between distinct cohorts using electronic medical records. J Biomed Inform. 2010 Dec;43(6):914-23. doi: 10.1016/j.jbi.2010.07.011. Epub 2010 Aug 3.
- DAT: data.frame
- S: character string, site variable name
- R: character string, to identify the response (morbidity) columns. In DAT, this character string must appear in all of the response variable names and in none of the other variable names
- n.boot: number of bootstrap replicates
- print.i: if TRUE, prints the bootstrap replicate number
- seed: set the seed so as to obtain (exactly) reproducible results
- rd: number of digits printed after the decimal point
- ci: confidence limits (quantiles from the bootstrap distribution)
- MDI: matrix of MDIs
- est: estimates of pairwise site-to-site differences for individual morbidities (Dx) on the log odds ratio scale
- lci: lower confidence limit of est
- uci: upper confidence limit of est
- vcov: each element is a variance-covariance matrix for pairwise comparisons across morbidities
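The way R selects the response columns, and the way ci turns a bootstrap distribution into confidence limits, can be illustrated with a small sketch. This is not the internal code of CalcMDI: the toy data, the helper log.or, and the choice to resample patients within each site are all assumptions made for illustration only.

```r
## Hypothetical toy data: one site column, one covariate, three binary morbidities
set.seed(1)
toy <- data.frame(
  site = rep(c("A", "B"), each = 200),
  age  = round(rnorm(400, 60, 10)),
  Dx1  = rbinom(400, 1, 0.30),
  Dx2  = rbinom(400, 1, 0.10),
  Dx3  = rbinom(400, 1, 0.50)
)

## Columns whose names contain the string passed as R (here "Dx")
resp.cols <- grep("Dx", names(toy), fixed = TRUE, value = TRUE)

## Hypothetical helper: log odds ratio of one morbidity, site B versus site A
log.or <- function(d, dx) {
  p <- tapply(d[[dx]], d$site, mean)
  unname(qlogis(p["B"]) - qlogis(p["A"]))
}

## Percentile bootstrap (assumed scheme): resample patients within site,
## recompute the estimate, then take quantiles of the replicates
n.boot <- 250
boot.est <- replicate(n.boot, {
  idx <- unlist(lapply(split(seq_len(nrow(toy)), toy$site),
                       function(i) sample(i, replace = TRUE)))
  log.or(toy[idx, ], "Dx1")
})
ci.limits <- quantile(boot.est, probs = c(0.025, 0.975))
round(c(est = log.or(toy, "Dx1"), ci.limits), 3)
```

Because R is matched against every column name, a covariate whose name happened to contain "Dx" would be picked up as a response, which is why the string must appear in none of the other variable names.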
## Now run the function
## Read in the data and have a look at them
TheData <- read.table("MDIData.txt", header=TRUE, sep="\t")
## Run the MDI calculator
B <- CalcMDI(TheData,S='site',R='Dx',n.boot=250,print.i=TRUE,seed=2,rd=3,ci=c(0.025,0.975))
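When MDIData.txt is not at hand, a simulated file with the same general layout can stand in for it. The sketch below assumes the morbidity columns are coded 0/1 (the log odds ratio scale suggests binary responses, but the coding is not stated here). The site labels, column names and prevalences are hypothetical.

```r
## Hypothetical stand-in for MDIData.txt: 3 sites, 4 binary morbidity columns
set.seed(2)
n.per.site <- 100
sites <- c("site1", "site2", "site3")
SimData <- data.frame(site = rep(sites, each = n.per.site))

## Assumed prevalences; every response name contains "Dx", "site" does not
prev <- c(Dx.a = 0.40, Dx.b = 0.15, Dx.c = 0.25, Dx.d = 0.05)
for (nm in names(prev)) {
  SimData[[nm]] <- rbinom(nrow(SimData), 1, prev[[nm]])
}

## Write and re-read as tab-delimited text, mirroring the read.table call above
f <- tempfile(fileext = ".txt")
write.table(SimData, f, sep = "\t", row.names = FALSE, quote = FALSE)
TheSim <- read.table(f, header = TRUE, sep = "\t")

## Have a look at the data
str(TheSim)
table(TheSim$site)
colMeans(TheSim[grep("Dx", names(TheSim), fixed = TRUE)])
```

With CalcMDI loaded, TheSim could then be passed in place of TheData, keeping S='site' and R='Dx'.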