dualbounds.delta.DeltaDualBounds

class dualbounds.delta.DeltaDualBounds(h: callable, z1: callable, z0: callable, *args, **kwargs)

Computes generalized dual bounds via the delta method.

The estimand is

\(h(E[f(Y(0), Y(1), X)], E[z_1(Y(1), X)], E[z_0(Y(0), X)])\)

where h must be monotone increasing in its first argument.
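To make the estimand concrete, here is a minimal simulation sketch in plain Python. It assumes a simulated setting where both potential outcomes are visible for every unit, which never holds in real data. The data-generating process and the particular choices of f, z0, z1, and h are illustrative assumptions, and nothing here calls the dualbounds library. h is called with keyword arguments so that the example does not depend on positional order.

```python
import random

random.seed(0)
n = 5000

# Hypothetical data-generating process (an assumption for illustration only).
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y0 = [xi + random.gauss(0.0, 1.0) for xi in x]        # potential outcome Y(0)
y1 = [xi + 1.0 + random.gauss(0.0, 1.0) for xi in x]  # potential outcome Y(1)

# User-supplied functions, named as in the class signature.
f = lambda y0, y1, x: float(y0 <= y1)      # indicator that Y(0) <= Y(1)
z0 = lambda y0, x: y0 ** 2                 # a moment of Y(0)
z1 = lambda y1, x: y1 ** 2                 # a moment of Y(1)
h = lambda fval, z0, z1: fval / (z0 + z1)  # increasing in fval because z0 + z1 > 0


def mean(values):
    return sum(values) / len(values)


# Oracle sample analogues of the three expectations inside h.
f_bar = mean([f(a, b, c) for a, b, c in zip(y0, y1, x)])
z0_bar = mean([z0(a, c) for a, c in zip(y0, x)])
z1_bar = mean([z1(b, c) for b, c in zip(y1, x)])

estimand = h(fval=f_bar, z0=z0_bar, z1=z1_bar)
print(f"E[f]={f_bar:.3f}  E[z0]={z0_bar:.3f}  E[z1]={z1_bar:.3f}  h={estimand:.3f}")
```

In real data only one of Y(0) and Y(1) is observed per unit, so the first argument of h is only partially identified, which is why the class reports bounds rather than a point estimate.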

Parameters:
h : function

Real-valued function of fval, z0, z1, e.g., h = lambda fval, z0, z1 : fval / z0 + z1.

z0 : function

Vector-valued function of y0, x.

z1 : function

Vector-valued function of y1, x.

f : function

Function that defines the partially identified estimand. Must be a function of three arguments: y0, y1, x (in that order). E.g., f = lambda y0, y1, x : y0 <= y1.

outcome : np.array | pd.Series

n-length array of outcome measurements (Y).

treatment : np.array | pd.Series

n-length array of binary treatment (W).

covariates : np.array | pd.Series

(n, p)-shaped array of covariates (X).

propensities : np.array | pd.Series

n-length array of propensity scores \(P(W=1 | X)\). If None, they are estimated from the data.

clusters : np.array

Optional n-length array of clusters, so clusters[i] = j indicates that observation i is in cluster j.

outcome_model : str | dist_reg.DistReg | list

The model for estimating the law of \(Y | X, W\). Three options:

  • A str identifier, e.g., ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’.

  • An object inheriting from dist_reg.DistReg.

  • A list of dist_reg.DistReg objects to automatically choose between.

E.g., when outcome is continuous, the default is outcome_model=dist_reg.CtsDistReg(model_type='ridge').

propensity_model : str | sklearn classifier

How to estimate the propensity scores if they are not provided. Two options:

  • A str identifier, e.g., ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’.

  • An sklearn classifier, e.g., sklearn.linear_model.LogisticRegressionCV().

model_selector : dist_reg.ModelSelector

A ModelSelector object which can choose between several outcome models. The default performs within-fold nested cross-validation. Note: this argument is ignored unless outcome_model is a list.

discrete : bool

If True, treats the outcome as a discrete variable. Defaults to None (inferred from the data).

support : np.array

Optional support of the outcome, if known and discrete. Defaults to None (inferred from the data).

model_kwargs : dict

Additional kwargs for the outcome_model, e.g., feature_transform. See dualbounds.dist_reg.CtsDistReg or dualbounds.dist_reg.BinaryDistReg for more kwargs.
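The role of the treatment and propensities arguments can be illustrated with a small sketch. Under the standard assumption that treatment is independent of the potential outcomes given X, moments such as \(E[z_1(Y(1), X)]\) are point-identified from observed data by inverse propensity weighting. The code below is plain Python on simulated data. The data-generating process is an assumption, and this simple weighting estimator is shown only for intuition, not as the estimator the class uses.

```python
import math
import random

random.seed(1)
n = 20000


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical data-generating process (an assumption for illustration only).
x = [random.gauss(0.0, 1.0) for _ in range(n)]
# True propensity scores P(W=1 | X), bounded away from 0 and 1 by construction.
pi = [min(max(sigmoid(xi), 0.1), 0.9) for xi in x]
w = [1 if random.random() < p else 0 for p in pi]      # binary treatment W
y0 = [xi + random.gauss(0.0, 1.0) for xi in x]         # Y(0)
y1 = [xi + 1.0 + random.gauss(0.0, 1.0) for xi in x]   # Y(1)
# Only one potential outcome is observed per unit.
y = [b if wi == 1 else a for a, b, wi in zip(y0, y1, w)]

z1 = lambda y1, x: y1
z0 = lambda y0, x: y0

# Inverse-propensity-weighted estimates using observed data only.
ipw_z1 = sum(wi * z1(yi, xi) / p for wi, yi, xi, p in zip(w, y, x, pi)) / n
ipw_z0 = sum((1 - wi) * z0(yi, xi) / (1 - p) for wi, yi, xi, p in zip(w, y, x, pi)) / n

# Oracle values (available only in simulation) and a naive treated-group mean.
oracle_z1 = sum(z1(b, c) for b, c in zip(y1, x)) / n
oracle_z0 = sum(z0(a, c) for a, c in zip(y0, x)) / n
n_treated = sum(w)
naive_z1 = sum(z1(yi, xi) for wi, yi, xi in zip(w, y, x) if wi == 1) / n_treated

print(f"z1: ipw={ipw_z1:.3f} oracle={oracle_z1:.3f} naive={naive_z1:.3f}")
print(f"z0: ipw={ipw_z0:.3f} oracle={oracle_z0:.3f}")
```

Because treated units tend to have larger X in this simulation, the naive treated-group mean is biased upward, while the weighted estimate tracks the oracle value.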

Methods

compute_dual_variables([y0_dists, y0_vals, ...])

Estimates dual variables using the outcome model.

cross_fit([nfolds, suppress_warning, ...])

Cross-fits the outcome model.

diagnostics([plot, aipw])

Reports a set of technical diagnostics.

eval_outcome_model()

Thinly wraps dist_reg._evaluate_model_predictions.

eval_treatment_model()

Thinly wraps dist_reg._evaluate_model_predictions.

fit([nfolds, aipw, alpha, y0_dists, ...])

Main function which (1) performs cross-fitting, (2) computes optimal dual variables, and (3) computes final dual bounds.

fit_propensity_scores(nfolds[, clip, verbose])

Cross-fits the propensity scores.

plot_dual_variables([i])

Plots the estimated dual variables for the ith data-point.

results([minval, maxval])

Returns a dataframe of key inferential results.

summary([minval, maxval])

Prints a summary of main results from the class.
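As background on the delta method named in the class description, the sketch below computes a standard error for a smooth function of sample means, using a numerical gradient and the sample covariance matrix. It is a generic textbook illustration in plain Python on assumed simulated data with a hypothetical choice of h. It is not the library's implementation, and it ignores partial identification, cross-fitting, and clustering.

```python
import math
import random

random.seed(2)
n = 4000

# Simulated i.i.d. draws of a correlated pair (a, b); an assumed toy model.
a = [random.gauss(2.0, 1.0) for _ in range(n)]
b = [0.5 * ai + random.gauss(3.0, 1.0) for ai in a]


def h(m):
    """Hypothetical smooth function of the mean vector: a ratio of means."""
    return m[0] / m[1]


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Unbiased sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)


def num_grad(func, m, eps=1e-6):
    """Central-difference gradient of func at the point m."""
    grad = []
    for j in range(len(m)):
        up, dn = list(m), list(m)
        up[j] += eps
        dn[j] -= eps
        grad.append((func(up) - func(dn)) / (2 * eps))
    return grad


m_hat = [mean(a), mean(b)]
sigma = [[cov(a, a), cov(a, b)], [cov(b, a), cov(b, b)]]
g = num_grad(h, m_hat)

# Delta method: Var(h(m_hat)) is approximately g' Sigma g / n.
var_h = sum(g[i] * sigma[i][j] * g[j] for i in range(2) for j in range(2)) / n
se = math.sqrt(var_h)
estimate = h(m_hat)
ci = (estimate - 1.96 * se, estimate + 1.96 * se)  # nominal 95% interval

print(f"estimate={estimate:.4f}  se={se:.4f}  ci=({ci[0]:.4f}, {ci[1]:.4f})")
```

The same recipe extends to any differentiable h of several means: estimate the means, estimate their joint covariance, and propagate it through the gradient of h.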