Simulation Results (n=1002):
Summary_simulation_results_1002_iterations_patientlevel_07282009_histograms.rtf: exactly 1002 iterations.
Simple descriptives of characteristics, to decide whether to incorporate stratified sampling or oversampling into our current simulations (7/30):
4 reviewers and an adjudicator each tag the raw text of 10 clinical documents (drawn from 20 unique documents, with only 2 documents shared by all 5) for any of a list of 55 potential clinical concepts. Any combination or amount of text span can be tagged/associated with a particular concept (of the 55 listed) in the annotation tool, Knowtator. Knowtator can export all of this information as XML (note: this is the most structured form of data Knowtator is able to export). ~Ruth sent data on 3/1
Step 1) Write a script that parses the relevant structured elements of this XML output (all XML output follows the same general schema) into the following variables for each reviewer, where an observation is a single span of text: noteid (i.e., docid), person (i.e., reviewer), classtype (i.e., concept tagged for the associated text span), start (i.e., start location of the span), end (i.e., end location of the span), slotvalue (i.e., the associated positive, negative, or neutral assertion), text (i.e., the raw text tagged). (This code was written months ago; the actual data arrived 3/1.)
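A sketch of the Step 1 parser, assuming a typical Knowtator export layout; the element names (classMention, stringSlotMention, etc.) and the inline sample are assumptions for illustration and may need adjusting to the actual schema in the real files:

```python
import xml.etree.ElementTree as ET

# Minimal sample mimicking a Knowtator export (structure is an assumption).
SAMPLE = """<annotations textSource="doc01.txt">
  <annotation>
    <mention id="m1"/>
    <annotator id="a1">reviewerB</annotator>
    <span start="10" end="20"/>
    <spannedText>chest pain</spannedText>
  </annotation>
  <classMention id="m1">
    <hasSlotMention id="s1"/>
    <mentionClass id="ChestPain">chest pain</mentionClass>
  </classMention>
  <stringSlotMention id="s1">
    <mentionSlot id="assertion"/>
    <stringSlotMentionValue value="positive"/>
  </stringSlotMention>
</annotations>"""

def parse_knowtator(xml_text):
    """Return pipe-delimited rows: noteid|person|classtype|start|end|slotvalue|text."""
    root = ET.fromstring(xml_text)
    noteid = root.get("textSource")
    # Index classMentions and slot values by mention id so each <annotation>
    # can be joined to its concept (classtype) and assertion (slotvalue).
    classes, slots, slot_of = {}, {}, {}
    for cm in root.findall("classMention"):
        classes[cm.get("id")] = cm.find("mentionClass").get("id")
        hs = cm.find("hasSlotMention")
        if hs is not None:
            slot_of[cm.get("id")] = hs.get("id")
    for sm in root.findall("stringSlotMention"):
        slots[sm.get("id")] = sm.find("stringSlotMentionValue").get("value")
    rows = []
    for ann in root.findall("annotation"):
        mid = ann.find("mention").get("id")
        span = ann.find("span")
        rows.append("|".join([
            noteid,
            ann.find("annotator").text,
            classes.get(mid, ""),
            span.get("start"),
            span.get("end"),
            slots.get(slot_of.get(mid), ""),
            ann.find("spannedText").text,
        ]))
    return rows

for row in parse_knowtator(SAMPLE):
    print(row)
```

Running this over one export per reviewer yields the 5 pipe-delimited datasets used in Step 2.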
Step 2) Use these 5 pipe-delimited datasets (a separate dataset for each reviewer, where an observation is a single span of text) to produce 2 datasets for sub-study 1:
2a) concept-level sub-study 1: complete factorial including data only from the 2 documents shared by all
variables in dataset: documentid, reviewerid, conceptid, tp, fn, fp, precision, recall
observation unique key identifier = documentid * reviewer * concept
TP for the observation at unique documentB * reviewerB * conceptB = the number of times a span of text tagged by reviewerB overlaps a span tagged by the adjudicator (by 1 or more characters), where both reviewerB and the adjudicator tagged the overlapping text with conceptB on the same documentB.
FN for the unique observation at documentB * reviewerB * conceptB = the number of spans of text tagged by the gold standard for documentB and conceptB where reviewerB either did not tag any corresponding overlapping text (by 1 or more characters) or did not tag the overlapping text with conceptB.
FP for the unique observation at documentB * reviewerB * conceptB = the number of spans of text tagged by reviewerB for conceptB on documentB where the gold standard either did not tag any corresponding overlapping text (by 1 or more characters) or did not mark conceptB for the overlapping text.
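The TP/FN/FP definitions above can be sketched as overlap counting. This sketch assumes half-open character offsets (start inclusive, end exclusive) and counts TP from the reviewer's side, which is one reasonable convention the notes leave open:

```python
def overlaps(a, b):
    """True if two (start, end, ...) spans share >= 1 character (half-open offsets)."""
    return a[0] < b[1] and b[0] < a[1]

def score_cell(gold, reviewer, concept):
    """tp, fn, fp, precision, recall for one documentB * reviewerB * conceptB cell.

    gold, reviewer: lists of (start, end, concept) spans from the same document.
    """
    g = [s for s in gold if s[2] == concept]
    r = [s for s in reviewer if s[2] == concept]
    # TP: reviewer concept-B span overlapping a gold concept-B span
    tp = sum(1 for rs in r if any(overlaps(rs, gs) for gs in g))
    # FN: gold concept-B span with no overlapping reviewer concept-B span
    fn = sum(1 for gs in g if not any(overlaps(gs, rs) for rs in r))
    # FP: reviewer concept-B span with no overlapping gold concept-B span
    fp = len(r) - tp
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return tp, fn, fp, precision, recall
```

Here precision = tp / (tp + fp) and recall = tp / (tp + fn), computed per documentid * reviewerid * conceptid cell.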
2b) class-level sub-study 1: complete factorial including data only from the 2 documents shared by all
variables in dataset: documentid, reviewerid, classid, tp, fn, fp, precision, recall
observation unique key identifier = documentid * reviewer * class
class is a "superset" of concept (concepts roll up into classes), so this dataset is created the same way as in 2a, with class substituted for concept.
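Assuming each of the 55 concepts rolls up to exactly one class, 2b is just 2a with the spans relabeled before scoring; the mapping below is made up for illustration (the real one comes from the study codebook):

```python
# Hypothetical concept -> class lookup; names are illustrative only.
CONCEPT_TO_CLASS = {
    "ChestPain": "Symptom",
    "Dyspnea": "Symptom",
    "Aspirin": "Medication",
}

def to_class_level(spans):
    """Relabel (start, end, concept) spans as (start, end, class) spans.

    Scoring then proceeds exactly as in 2a, with class in place of concept.
    """
    return [(s, e, CONCEPT_TO_CLASS[c]) for (s, e, c) in spans]

print(to_class_level([(0, 5, "ChestPain"), (9, 16, "Aspirin")]))
```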
Step 3) Produce Graphs:
Scatter plot overlaid with box plot for outcome Precision by concept, using the dataset from Step 2a.
Box plot of outcome Precision by concept, with inset statistics listed for each concept, using data from Step 2a.
Scatter plot overlaid with box plot for outcome Recall by class, using the dataset from Step 2b.
Box plot of outcome Recall by class, with inset statistics listed for each class, using data from Step 2b.
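A minimal sketch of the scatter-over-box-plot idea with matplotlib, using made-up precision values in place of the Step 2a dataset (the recall-by-class plots from Step 2b would follow the same pattern):

```python
import matplotlib
matplotlib.use("Agg")  # headless rendering, no display needed
import matplotlib.pyplot as plt

# Synthetic reviewer-level precision values per concept (illustrative only).
data = {
    "ConceptA": [0.80, 0.75, 0.90, 0.60],
    "ConceptB": [0.50, 0.65, 0.70, 0.55],
}

labels = list(data)
positions = list(range(1, len(labels) + 1))

fig, ax = plt.subplots()
# Box plot of precision by concept...
ax.boxplot([data[k] for k in labels], positions=positions)
ax.set_xticks(positions)
ax.set_xticklabels(labels)
# ...overlaid with the individual reviewer-level points at each position.
for pos, k in zip(positions, labels):
    ax.scatter([pos] * len(data[k]), data[k], alpha=0.6, zorder=3)
ax.set_ylabel("Precision")
ax.set_title("Precision by concept (scatter over box plot)")
fig.savefig("precision_by_concept.png")
```

The inset statistics could be added with `ax.annotate` next to each box once the per-concept summaries are computed.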
Previous Steps Completed 3/16