
Recent black-box methods for estimating heterogeneous treatment effects lack interpretability, limiting their practical utility. We introduce causal distillation trees (CDT), a two-stage approach that first fits any machine learning model to estimate individual-level treatment effects, then "distills" these estimates into interpretable subgroups using a simple decision tree. CDT inherits the predictive performance of black-box models while preserving interpretability. We derive theoretical consistency guarantees for the estimated subgroups and introduce stability-driven diagnostics to evaluate subgroup quality. We demonstrate CDT on a randomized controlled trial of antiretroviral HIV treatment, showing it outperforms state-of-the-art approaches in constructing stable, clinically relevant subgroups.
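
The two-stage idea can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation. It assumes a T-learner built from random forests as the black-box "teacher" (any estimator of individual-level effects could be substituted), scikit-learn's `DecisionTreeRegressor` as the "student", and a simple difference in means to estimate each subgroup's effect. The function names, the simulated trial, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text


def simulate_rct(n=2000, seed=0):
    """Simulated randomized trial (assumption): the effect is 2 when x0 > 0, else 0."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, size=(n, 3))
    T = rng.integers(0, 2, size=n)              # randomized treatment assignment
    tau = np.where(X[:, 0] > 0, 2.0, 0.0)       # true heterogeneous effect
    Y = X[:, 1] + tau * T + rng.normal(scale=0.5, size=n)
    return X, T, Y


def fit_teacher(X, T, Y, seed=0):
    """Stage 1: black-box estimate of individual-level effects (T-learner, an assumption)."""
    m1 = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=seed)
    m0 = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=seed)
    m1.fit(X[T == 1], Y[T == 1])                # outcome model for the treated arm
    m0.fit(X[T == 0], Y[T == 0])                # outcome model for the control arm
    return m1.predict(X) - m0.predict(X)


def distill(X, tau_hat, max_depth=2, seed=0):
    """Stage 2: fit a shallow tree to the teacher's estimates; leaves define subgroups."""
    student = DecisionTreeRegressor(max_depth=max_depth, min_samples_leaf=50,
                                    random_state=seed)
    student.fit(X, tau_hat)
    return student


X, T, Y = simulate_rct()
tau_hat = fit_teacher(X, T, Y)
student = distill(X, tau_hat)
subgroup = student.apply(X)                     # leaf index for each individual

# Difference-in-means effect estimate within each distilled subgroup.
subgroup_effects = {
    int(leaf): float(Y[(subgroup == leaf) & (T == 1)].mean()
                     - Y[(subgroup == leaf) & (T == 0)].mean())
    for leaf in np.unique(subgroup)
}

print(export_text(student, feature_names=["x0", "x1", "x2"]))
print(subgroup_effects)
```

The printed tree gives the subgroup definitions as readable split rules, which is where the interpretability comes from. The sketch reuses the same sample for both stages and for the subgroup estimates; a careful analysis would likely hold out data for the final estimates, and it omits the stability diagnostics entirely.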