Editorial Office of 《数量经济技术经济研究》

Article Abstract

胡诗云,江弘毅,解海天.双重机器学习的理论与应用——从“黑箱”到“工具箱”的实践指南[J].数量经济技术经济研究,2026,(3):154-179

双重机器学习的理论与应用——从“黑箱”到“工具箱”的实践指南

The Theory and Practices of Double Machine Learning: From “Black Box” to “Tool Box”

DOI:

English keywords: Double Machine Learning; Causal Inference; High-dimensional Data; Econometrics; Empirical Strategy

Funding:

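Author	Affiliation
胡诗云	National School of Development, Peking University
江弘毅	National School of Development, Peking University
解海天	Guanghua School of Management, Peking University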

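Chinese abstract (English translation):

Causal effect estimation based on linear regression struggles to capture the high-dimensional features and nonlinear relationships found in economic data. Double machine learning (DML) allows researchers to control flexibly for high-dimensional, complex covariates while obtaining robust estimates of, and inference on, causal effects. This paper aims to give empirical researchers both theoretical guidance and a practical guide to DML. It systematically reviews the theoretical development of DML and explains how it combines with classical identification strategies such as instrumental variables, difference-in-differences, and regression discontinuity. It then uses numerical simulations to examine the performance of DML and the limits of its applicability. Taking the evaluation of the effect of the “Targeted Poverty Alleviation” policy on rural residents' income as an example, it demonstrates the implementation workflow of DML in empirical research, along with points to watch in model selection and diagnostic testing. The paper offers a reference for empirical researchers seeking to understand and apply DML correctly.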

English abstract:

Double machine learning (DML) is a significant methodological advance in causal inference, particularly when high-dimensional confounders and complex nonlinear relationships are involved. Despite its growing adoption in empirical economics, misconceptions persist regarding its role and proper application. This study provides a comprehensive overview of DML, clarifying its theoretical foundations, dispelling common misunderstandings, and offering practical guidance for empirical researchers. We emphasize that DML is not a novel identification strategy but rather a sophisticated estimation framework that enhances robustness within established causal inference paradigms.

We begin by systematically reviewing the theoretical underpinnings of DML, which rest on two fundamental principles: Neyman orthogonality and sample splitting (cross-fitting). Neyman orthogonality renders the estimator of the causal parameter asymptotically insensitive to first-order estimation errors in nuisance parameters, while cross-fitting prevents overfitting bias by using separate subsamples for nuisance parameter estimation and structural parameter inference. Our theoretical exposition progresses from the partially linear model (PLM), which accommodates homogeneous treatment effects, to the interactive regression model (IRM), which allows for fully flexible treatment effect heterogeneity, and culminates in a general method of moments framework. We pay particular attention to the interpretation of the PLM coefficient, revealing that it corresponds to an overlap-weighted average treatment effect when treatment effects are heterogeneous and covariate overlap is imperfect.

We then demonstrate how DML integrates with standard identification strategies in applied econometrics, including instrumental variables, difference-in-differences, and regression discontinuity designs.
For each strategy, we review the relevant identifying assumptions, explain how DML can be employed for parameter estimation while maintaining identification, and provide detailed implementation algorithms. Through systematic comparison with conventional parametric approaches, we clarify that DML’s value lies in strengthening estimation procedures under existing identification frameworks rather than supplanting them.

To assess DML’s performance characteristics and practical limitations, we conduct Monte Carlo simulations comparing PLM and IRM against two-way fixed effects estimators in difference-in-differences settings across various data-generating processes. The simulation evidence demonstrates that DML methods deliver substantial gains in estimation accuracy when the true data-generating process features nonlinear confounding relationships and heterogeneous treatment effects. However, we document that PLM exhibits markedly superior performance relative to IRM when covariate overlap is limited, highlighting important practical considerations for method selection.

We illustrate DML’s empirical utility through a reanalysis of China’s targeted poverty alleviation policy and its effects on rural household income. We contrast DML estimates with those from conventional fixed effects specifications, using random forests for nuisance parameter estimation. The DML analysis confirms a positive and statistically significant policy effect, although the magnitude is more conservative than linear model estimates, reflecting DML’s ability to flexibly account for nonlinear confounding trends that may be imperfectly controlled in parametric specifications.
Moreover, the flexibility of DML facilitates detailed heterogeneity analysis, revealing that the policy effects were substantially higher in counties with revolutionary historical legacies and in mountainous regions.

Drawing on theoretical insights, simulation evidence, and empirical applications, we offer practical recommendations for implementing DML in applied research. These guidelines follow the natural research workflow:
(1) Prioritize identification over estimation: establish credible identifying assumptions through standard econometric reasoning before employing DML for estimation.
(2) Base the decision to use DML and the choice between IRM and PLM on sample size considerations and the extent of covariate overlap.
(3) Select machine learning algorithms (e.g., penalized regression, random forests, and neural networks) based on the dimensionality of the feature space relative to sample size.
(4) Conduct careful diagnostics of nuisance parameter estimates, examining propensity score distributions and checking boundary predictions to guide hyperparameter tuning.
(5) Implement comprehensive specification checks, including covariate balance tests and sensitivity analyses, and compare estimates across alternative specifications.
(6) Adopt transparent reporting practices that document machine learning model selection and nuisance parameter fit quality, without unnecessarily reiterating the DML theory or reporting aggregate model fit statistics that may be misleading.

In conclusion, DML is a tool for enhancing estimation precision and robustness, not a substitute for rigorous research design. It equips researchers with principled methods for controlling high-dimensional confounding and accommodating flexible functional forms.
By enabling more robust estimation and credible inference in the presence of model misspecification concerns, DML is a valuable addition to the applied econometrician’s toolkit for drawing reliable causal inferences from the increasingly complex and high-dimensional data characteristic of modern economic research.
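
As a rough illustration of the two principles named in the abstract, the following minimal sketch applies cross-fitting and the partialling-out (Neyman-orthogonal) score to a partially linear model Y = θD + g(X) + ε. It is not the authors' code. The simulated data-generating process, the function names `fit_predict_poly` and `dml_plm`, and the polynomial least-squares learner (a simple stand-in for the random forests used in the paper's application) are all assumptions made for this example.

```python
import numpy as np


def fit_predict_poly(x_train, y_train, x_test, degree=5):
    """Hypothetical stand-in nuisance learner: polynomial least squares in one covariate."""
    coefs = np.polyfit(x_train, y_train, degree)
    return np.polyval(coefs, x_test)


def dml_plm(y, d, x, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in Y = theta*D + g(X) + e."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Cross-fitting: nuisance functions E[Y|X] and E[D|X] are fit on the
        # other folds and used only to predict on the held-out fold.
        y_hat = fit_predict_poly(x[train_mask], y[train_mask], x[test_idx])
        d_hat = fit_predict_poly(x[train_mask], d[train_mask], x[test_idx])
        y_res[test_idx] = y[test_idx] - y_hat
        d_res[test_idx] = d[test_idx] - d_hat
    # Solving the orthogonal score E[(Y_res - theta*D_res) * D_res] = 0 for theta
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Heteroskedasticity-robust standard error based on the same score
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi ** 2) / np.mean(d_res ** 2) ** 2 / n)
    return theta, se


# Simulated example (assumed DGP): true effect 0.5, nonlinear confounding via x
rng = np.random.default_rng(42)
n = 2000
x = rng.uniform(-2.0, 2.0, n)
d = np.sin(x) + rng.normal(0.0, 1.0, n)
y = 0.5 * d + np.exp(x) + rng.normal(0.0, 1.0, n)

theta_hat, se = dml_plm(y, d, x)
print(f"theta_hat = {theta_hat:.3f}, se = {se:.3f}")
```

In applied work the polynomial learner would be replaced by whatever learner suits the dimensionality of the covariates, and the identifying assumptions would still need to be justified separately, since this sketch covers estimation only.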
