Quickstart
The main class in the package is dualbounds.generic.DualBounds, which computes dual bounds on a partially identified estimand of the form
\[\theta = E[f(Y(0), Y(1), X)].\]
Crucially, the confidence intervals produced by DualBounds are always valid in randomized experiments, even if the underlying machine learning model is arbitrarily misspecified.
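To see why an estimand of this form is only partially identified, consider f(y0, y1, x) = 1{y0 < y1}, the choice used in the example on this page. The sketch below is a standalone illustration using only the Python standard library, not part of the dualbounds package. It ignores the covariates X and assumes, purely for illustration, that Y(0) ~ N(0, 1) and Y(1) ~ N(1, 1). The three couplings share these marginals, which are all a randomized experiment can reveal, yet they give different values of θ.

```python
import random


def theta(pairs):
    """Monte Carlo estimate of E[f(Y(0), Y(1))] with f = 1{y0 < y1}."""
    return sum(y0 < y1 for y0, y1 in pairs) / len(pairs)


rng = random.Random(0)
n = 50_000
y0 = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Three joint distributions with identical marginals:
# Y(0) ~ N(0, 1) and Y(1) ~ N(1, 1) in every case.
comonotone = [(a, a + 1.0) for a in y0]               # Y(1) = Y(0) + 1
countermonotone = [(a, 1.0 - a) for a in y0]          # Y(1) = 1 - Y(0)
independent = [(a, rng.gauss(1.0, 1.0)) for a in y0]  # Y(1) drawn independently

theta_co = theta(comonotone)
theta_counter = theta(countermonotone)
theta_indep = theta(independent)

print(f"comonotone:      {theta_co:.3f}")
print(f"independent:     {theta_indep:.3f}")
print(f"countermonotone: {theta_counter:.3f}")
```

Because only one potential outcome is observed per unit, the data cannot distinguish between these couplings, so the most one can report is a range of values of θ compatible with the observed marginals.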
# Import packages
import sys; sys.path.insert(0, "../../")
import dualbounds as db
from dualbounds.generic import DualBounds
# Generate synthetic data from a heavy-tailed linear model
data = db.gen_data.gen_regression_data(n=900, p=30, sample_seed=123)
# Initialize dual bounds object
dbnd = DualBounds(
    f=lambda y0, y1, x: y0 < y1,
    covariates=data['X'],
    treatment=data['W'],
    outcome=data['y'],
    propensities=data['pis'],
    outcome_model='ridge',
)
# Compute dual bounds and observe output
results = dbnd.fit(alpha=0.05).results()
print(results.to_markdown())
Cross-fitting the outcome model.
Estimating optimal dual variables.
| | Lower | Upper |
|:-----------|----------:|----------:|
| Estimate | 0.6832 | 0.934563 |
| SE | 0.0210876 | 0.0125664 |
| Conf. Int. | 0.641869 | 0.959193 |
There are two estimates, a lower and an upper one, because the estimand is only partially identified: the data bound it but do not pin it down to a single value.
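The rows of the table can be related to each other with a little arithmetic. The sketch below is an observation about the printed numbers, not a description of the library's internals. It assumes a normal approximation with the 1 - alpha/2 quantile, and it reproduces the Conf. Int. row by subtracting z times the SE from the lower estimate and adding z times the SE to the upper estimate.

```python
from statistics import NormalDist

alpha = 0.05
# Assumed critical value: the standard normal 1 - alpha/2 quantile (about 1.96)
z = NormalDist().inv_cdf(1 - alpha / 2)

# Values copied from the Estimate and SE rows of the table above
lower_est, lower_se = 0.6832, 0.0210876
upper_est, upper_se = 0.934563, 0.0125664

# Widen the estimated bounds outward by z standard errors on each side
ci_lower = lower_est - z * lower_se
ci_upper = upper_est + z * upper_se

print(round(ci_lower, 4), round(ci_upper, 4))  # 0.6419 0.9592
```

Both endpoints agree with the printed Conf. Int. row to about four decimal places, so the interval covers the whole estimated range [Lower, Upper] plus a margin for sampling error on each side.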