Quickstart

The main class in the package is dualbounds.generic.DualBounds, which computes dual bounds on a partially identified estimand of the form

\[\theta = E[f(Y(0), Y(1), X)].\]

Crucially, the confidence intervals produced by DualBounds are always valid in randomized experiments, even if the underlying machine learning model is arbitrarily misspecified.

# Import packages
import sys; sys.path.insert(0, "../../")
import dualbounds as db
from dualbounds.generic import DualBounds

# Generate synthetic data from a heavy-tailed linear model
data = db.gen_data.gen_regression_data(n=900, p=30, sample_seed=123)

# Initialize dual bounds object
dbnd = DualBounds(
    f=lambda y0, y1, x: y0 < y1,
    covariates=data['X'],
    treatment=data['W'],
    outcome=data['y'],
    propensities=data['pis'],
    outcome_model='ridge',
)

# Compute dual bounds and observe output
results = dbnd.fit(alpha=0.05).results()
print(results.to_markdown())
Cross-fitting the outcome model.
Estimating optimal dual variables.
|            |     Lower |     Upper |
|:-----------|----------:|----------:|
| Estimate   | 0.6832    | 0.934563  |
| SE         | 0.0210876 | 0.0125664 |
| Conf. Int. | 0.641869  | 0.959193  |
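
The Conf. Int. row in this table can be reproduced by hand from the Estimate and SE rows. The sketch below assumes a normal approximation with the two-sided quantile for alpha=0.05, subtracted from the lower estimate and added to the upper estimate. That formula is inferred from the printed numbers, not taken from the package source, so treat it as an assumption about how the interval is formed.

```python
from statistics import NormalDist

# Values copied from the printed results table above.
alpha = 0.05
lower_est, lower_se = 0.6832, 0.0210876
upper_est, upper_se = 0.934563, 0.0125664

# Two-sided standard normal quantile (about 1.96 for alpha = 0.05).
z = NormalDist().inv_cdf(1 - alpha / 2)

# Assumed construction: widen each bound outward by z standard errors.
ci_lower = lower_est - z * lower_se
ci_upper = upper_est + z * upper_se

print(f"Conf. Int.: [{ci_lower:.6f}, {ci_upper:.6f}]")
```

Under this assumption the result agrees with the Conf. Int. row up to rounding of the printed inputs.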

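There are two estimates, a lower and an upper one, because the estimand is only partially identified: Y(0) and Y(1) are never observed together for the same unit, so the data determine a range of possible values for the estimand rather than a single value.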
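
To see why the data alone cannot pin down such an estimand, consider the choice of f used above, for which the estimand is P(Y(0) < Y(1)). The following minimal sketch uses only NumPy, not dualbounds, and simulates two hypothetical worlds (names such as `y1_a` and `y1_b` are illustrative). In both worlds Y(0) and Y(1) have exactly the same marginal distributions, so an experiment that observes only one potential outcome per unit could not tell them apart, yet the estimand differs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Y(0) ~ N(0, 1) in both hypothetical worlds.
y0 = rng.standard_normal(n)

# World A: Y(1) = Y(0) + 1, so Y(1) ~ N(1, 1) and every unit benefits.
y1_a = y0 + 1.0

# World B: Y(1) = 1 - Y(0), which is also N(1, 1) by symmetry of the normal.
y1_b = 1.0 - y0

# The estimand P(Y(0) < Y(1)) depends on the joint distribution,
# which differs between the two worlds.
theta_a = np.mean(y0 < y1_a)
theta_b = np.mean(y0 < y1_b)

print(f"World A: {theta_a:.3f}, World B: {theta_b:.3f}")
```

World A gives a value of 1, while World B gives roughly 0.69 (the probability that a standard normal falls below 0.5). Since both are compatible with the same observable marginals, any honest summary of the data has to report a range, which is what the lower and upper estimates provide.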