The meaning of “contribution” seems obvious: it refers to the confluence of multiple factors or causes toward a result. In a healthcare context, for instance, a patient’s known past behavior and lifestyle, plus the therapies he received, have all contributed to his observed health outcome. Looking back from the outcome to the observed causes, we wish to know how much of the patient’s condition is due to a particular factor.
Contribution Defined
In other words, calculating contributions means allocating an outcome proportionally among its causes. We propose to define contribution formally as the difference between the factual and counterfactual outcomes, which correspond to the factual and counterfactual conditions of a cause, respectively.
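One way to write this definition down is sketched here. The notation is an assumption introduced for illustration and is not taken from the seminar: Y denotes the outcome, x_i the factual condition of cause X_i, x_i^* a counterfactual condition of that cause, and x_{-i} the factual conditions of all other causes, which are held fixed.

```latex
C_i \;=\; Y(x_i,\, x_{-i}) \;-\; Y(x_i^{*},\, x_{-i})
```

Under this reading, the contribution of X_i is the change in the outcome when only that cause is moved from its factual to its counterfactual condition.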
Counterfactuals Require Causality
But how can we infer a counterfactual outcome? A causal model would allow us to compute a counterfactual outcome by simulating a counterfactual condition of a cause. Of course, definitive causal models rarely exist for any domain.
Causal Inference with Non-Causal Models?
While we still cannot use Artificial Intelligence to create a causal model, we can machine-learn a non-causal, predictive model. And, utilizing VanderWeele’s Disjunctive Cause Criterion for confounder selection, we can even employ such a non-causal, predictive model for causal inference.
Bayesian Networks to the Rescue!
At this point, Bayesian networks present themselves as a practical modeling framework. For example, we can use BayesiaLab’s learning algorithms to search for a predictive model. Furthermore, we can use BayesiaLab’s Likelihood Matching algorithms to condition on the previously identified confounders for simulating a causal intervention.
The “Causal Oracle” Computes Contributions
Given our new “causal oracle,” we can infer any outcome by setting causes to any number of counterfactual conditions. In this seminar, we demonstrate how to use the proposed Bayesian network framework to calculate contributions efficiently from synthetic data in a fictional problem domain. Because the data-generating process for this domain is known, we can examine how well this methodology recovers the true contributions.
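The idea of checking recovered contributions against a known data-generating process can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy domain (a binary confounder `z`, a binary cause `x`, a numeric outcome `y`), the coefficients, and all function names are hypothetical. It does not use BayesiaLab or Likelihood Matching. Instead, it substitutes plain stratification on the confounder, which is a simple stand-in for conditioning on confounders when simulating an intervention.

```python
import random

def simulate(n=20000, seed=0):
    """Hypothetical data-generating process: z confounds x and y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                      # confounder
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0      # cause, depends on z
        y = 2.0 * x + 3.0 * z + rng.gauss(0.0, 1.0)             # true effect of x is 2.0
        rows.append((z, x, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def adjusted_outcome(rows, x_value):
    """Estimate the mean outcome if x were set to x_value for everyone,
    by averaging the stratum means E[y | x, z] over the distribution of z."""
    total = 0.0
    for z_value in (0, 1):
        p_z = sum(1 for z, _, _ in rows if z == z_value) / len(rows)
        stratum = [y for z, x, y in rows if z == z_value and x == x_value]
        total += p_z * mean(stratum)
    return total

rows = simulate()

# Factual outcome: the observed mean of y.
factual = mean([y for _, _, y in rows])

# Counterfactual outcome: mean of y had x been 0 for everyone, adjusting for z.
counterfactual = adjusted_outcome(rows, 0)

# Contribution of x: factual minus counterfactual outcome.
contribution = factual - counterfactual

# Naive comparison that ignores the confounder, shown for contrast.
naive = factual - mean([y for _, x, y in rows if x == 0])

# Analytic value under this process: 2.0 * P(x = 1) = 2.0 * 0.5 = 1.0.
true_contribution = 1.0

print(f"adjusted contribution: {contribution:.2f}")
print(f"naive contribution:    {naive:.2f}")
print(f"true contribution:     {true_contribution:.2f}")
```

In this toy setup, the adjusted estimate should land close to the analytic value, while the naive estimate is inflated because units with `x = 0` also tend to have `z = 0`.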
MRBIII, Room 1220
11 March 2020
1:30pm