This week we review the R-learner, a two-step causal inference algorithm for estimating heterogeneous treatment effects from observational data.
Why the R-learner
Observations $(X_i, Y_i, W_i)_{i=1}^n$, with features $X_i \in \mathcal{X}$, outcome $Y_i \in \mathbb{R}$, and a binary treatment decision $W_i \in \{0, 1\}$ for each unit $i$, with assumptions:
Overlap: $\eta < e(x) := \mathbb{P}(W_i = 1 \mid X_i = x) < 1 - \eta$ for some $\eta > 0$
Unconfoundedness: the treatment decision and the potential outcomes are independent given the features, $\{Y_i(0), Y_i(1)\} \perp W_i \mid X_i$
- Quite general assumptions, much less constrained than strict experimental setups like randomized controlled trials (RCTs).
- Answers the very important question: "What is the expected effect of the treatment on unit $i$ given its features $X_i = x$?", i.e. the conditional average treatment effect $\tau(x) = \mathbb{E}[Y_i(1) - Y_i(0) \mid X_i = x]$.
$\hat{\tau}$ obtained via the R-learner achieves asymptotic error rates of the same scale as $\tilde{\tau}$ obtained by an "oracle" learner knowing perfectly the following nuisance functions: the conditional mean outcome $m(x) = \mathbb{E}[Y_i \mid X_i = x]$ and the propensity score $e(x) = \mathbb{P}(W_i = 1 \mid X_i = x)$.
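The two-step procedure can be sketched concretely. Below is a minimal numpy sketch, assuming linear models for both nuisance functions and a linear form for $\tau(x)$ (the paper's setup is more general); the function name `rlearner_linear` and all model choices here are illustrative, not the paper's implementation:

```python
import numpy as np

def rlearner_linear(X, y, w, n_folds=5, seed=0):
    """Two-step R-learner sketch with linear nuisance models and a linear CATE.

    Step 1: cross-fit m(x) = E[Y|X=x] and e(x) = P(W=1|X=x) out-of-fold.
    Step 2: minimize the R-loss, which for tau(x) = [1, x] @ beta reduces
            to an ordinary least-squares problem on residuals.
    """
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = rng.integers(0, n_folds, size=n)
    Xb = np.column_stack([np.ones(n), X])  # design matrix with intercept

    # Step 1: out-of-fold nuisance estimates (OLS outcome model,
    # linear-probability treatment model; both illustrative choices).
    m_hat = np.empty(n)
    e_hat = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        beta_m = np.linalg.lstsq(Xb[train], y[train], rcond=None)[0]
        beta_e = np.linalg.lstsq(Xb[train], w[train], rcond=None)[0]
        m_hat[test] = Xb[test] @ beta_m
        e_hat[test] = np.clip(Xb[test] @ beta_e, 0.05, 0.95)  # enforce Overlap

    # Step 2: with residuals y_res = Y - m_hat(X) and w_res = W - e_hat(X),
    # the R-loss sum_i (y_res_i - w_res_i * tau(X_i))^2 is, for linear tau,
    # a least-squares fit of y_res on the rescaled features w_res * [1, x].
    y_res = y - m_hat
    w_res = w - e_hat
    Z = w_res[:, None] * Xb
    beta_tau = np.linalg.lstsq(Z, y_res, rcond=None)[0]
    return beta_tau  # tau_hat(x) = beta_tau[0] + x @ beta_tau[1:]
```

On simulated data with a known linear CATE, this sketch recovers the CATE coefficients; cross-fitting keeps the nuisance estimation errors out of the second-step regression.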
In the paper, these error rates are obtained when the nuisance functions are approximated via penalized kernel regression.
In the raw notes below, we sketch the proof when the functions are approximated via Lasso-penalized linear regression, and derive the analogous result for that setting.
- Today, Sisu customers can use our software to identify subpopulations of their dataset that impact changes in their metric of interest.
- Tomorrow with causal inference, if a customer takes action on those subpopulations, they could come back to Sisu to estimate the "treatment effect" that their action had on each subpopulation!
R-loss: $\hat{L}_n(\tau) = \frac{1}{n} \sum_{i=1}^n \big[ (Y_i - m(X_i)) - (W_i - e(X_i))\,\tau(X_i) \big]^2$
Correspondence between the R-loss and the squared error of the approximation of $\tau$ (leverages Overlap for tightness)
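Why this correspondence holds can be sketched via the standard Robinson-decomposition argument (reconstructed here from the usual derivation, not quoted from the notes):

```latex
% Robinson's decomposition: with m(x) = E[Y | X = x] and e(x) = P(W = 1 | X = x),
Y_i - m(X_i) = \big(W_i - e(X_i)\big)\,\tau(X_i) + \varepsilon_i,
\qquad \mathbb{E}\!\left[\varepsilon_i \mid X_i, W_i\right] = 0 .
% For any candidate g, the population R-loss expands as
L(g) = \mathbb{E}\!\left[\big((Y_i - m(X_i)) - (W_i - e(X_i))\,g(X_i)\big)^2\right]
     = \mathbb{E}\!\left[e(X_i)\big(1 - e(X_i)\big)\big(g(X_i) - \tau(X_i)\big)^2\right]
       + \mathbb{E}\!\left[\varepsilon_i^2\right]
% (the cross term vanishes), so Overlap sandwiches the excess loss:
\eta(1 - \eta)\,\lVert g - \tau \rVert_2^2
  \;\le\; L(g) - L(\tau)
  \;\le\; \tfrac{1}{4}\,\lVert g - \tau \rVert_2^2 .
```

The lower bound is exactly where Overlap enters: without $\eta < e(x) < 1 - \eta$, a small R-loss would not force $g$ to be close to $\tau$ in squared error.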
"proof of concept proof" of the isomorphic projection bound in the Lasso linear regression case
If you like applying these kinds of methods to practical ML problems, join our team.