dualbounds.iv.DualIVBounds

class dualbounds.iv.DualIVBounds(exposure: array | Series, instrument: array | Series, exposure_model: str | DistReg = 'ridge', suppress_iv_warning: bool = False, *args, **kwargs)[source]

Beta version. Computes dual bounds on \(E[f(W(0), W(1), Y(0), Y(1), X)]\) in the instrumental variables context.

Here, \(X\) are covariates, \(Y(0), Y(1)\) are potential outcomes, and \(W(0), W(1)\) are the potential values of a binary exposure/treatment under each value of the instrument.

Parameters:
f : function

Function which defines the partially identified estimand. Must be a function of five arguments: w0, w1, y0, y1, x (in that order). E.g., f = lambda w0, w1, y0, y1, x : (y0 <= y1) * (w0 <= w1)
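As a minimal sketch, the example estimand above can be written as a named function and evaluated elementwise on NumPy arrays. Only the argument order comes from the description above; the vectorized calling convention and the toy input values are assumptions made for illustration.

```python
import numpy as np

def f(w0, w1, y0, y1, x):
    """Indicator that both the outcome and the exposure weakly increase."""
    return (y0 <= y1) * (w0 <= w1)

# Toy potential exposures and outcomes for three hypothetical units.
w0 = np.array([0, 0, 1])
w1 = np.array([1, 0, 0])
y0 = np.array([0.5, 2.0, 1.0])
y1 = np.array([1.5, 1.0, 3.0])
x = np.zeros((3, 2))  # covariates are unused by this particular f

vals = f(w0, w1, y0, y1, x)
```

Here only the first unit satisfies both conditions, so the estimand averages an indicator that equals 1 for that unit and 0 for the others.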

outcome : np.array | pd.Series

n-length array of outcome measurements (Y).

instrument : np.array | pd.Series

n-length array of binary instrument (Z).

exposure : np.array | pd.Series

n-length array of binary exposure (W).

covariates : np.array | pd.Series

(n, p)-shaped array of covariates (X).

propensities : np.array | pd.Series

n-length array of propensity scores \(P(Z=1 | X)\). If None, will be estimated from the data.

clusters : np.array | pd.Series

Optional n-length array of clusters, so clusters[i] = j indicates that observation i is in cluster j.
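One way to build such an array, sketched under the assumption that cluster membership is stored as arbitrary labels (the `household` labels below are hypothetical), is to map the labels to integer codes with NumPy:

```python
import numpy as np

# Hypothetical group labels, one per observation.
household = np.array(["a", "b", "a", "c", "b"])

# return_inverse gives, for each observation, the index of its label
# among the sorted unique labels, so clusters[i] = j as described above.
_, clusters = np.unique(household, return_inverse=True)
```

Observations 0 and 2 share cluster 0, observations 1 and 4 share cluster 1, and observation 3 is alone in cluster 2.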

outcome_model : str | dist_reg.DistReg | list

The model for estimating the law of \(Y | X, W, Z\). Three options:

  • A str identifier, e.g., ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’.

  • An object inheriting from dist_reg.DistReg.

  • A list of dist_reg.DistReg objects to automatically choose between.

E.g., when outcome is continuous, the default is outcome_model=dist_reg.CtsDistReg(model_type='ridge').

exposure_model : str | dist_reg.DistReg

The model for estimating the law of \(W | X, Z\). Two options:

  • A str identifier, e.g., ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’.

  • An object inheriting from dist_reg.DistReg.

The default is exposure_model=dist_reg.BinaryDistReg(model_type='ridge').

propensity_model : str | sklearn classifier

How to estimate the propensity scores if they are not provided. Two options:

  • A str identifier, e.g., ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’.

  • An sklearn classifier, e.g., sklearn.linear_model.LogisticRegressionCV().
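For intuition about what estimating \(P(Z=1 | X)\) involves, the sketch below fits the sklearn classifier named above on simulated data. It is not the library's internal cross-fitting procedure: the data-generating process is made up, and the model is fit and evaluated in-sample purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(0)
n, p = 500, 3

# Simulated covariates and a binary instrument whose assignment
# probability depends on the first covariate (an assumed toy model).
X = rng.normal(size=(n, p))
true_pis = 1 / (1 + np.exp(-X[:, 0]))
Z = rng.binomial(1, true_pis)

# Fit the classifier and read off the estimated P(Z = 1 | X).
clf = LogisticRegressionCV()
clf.fit(X, Z)
pis_hat = clf.predict_proba(X)[:, 1]
```

An array like `pis_hat` has the n-length shape expected by the propensities argument, although in a randomized experiment the known assignment probabilities would normally be supplied instead.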

model_selector : dist_reg.ModelSelector

A ModelSelector object which can choose between several outcome models. The default performs within-fold nested cross-validation. Note: this argument is ignored unless outcome_model is a list.

discrete : bool

If True, treats the outcome as a discrete variable. Defaults to None (inferred from the data).

support : np.array

Optional support of the outcome, if known and discrete. Defaults to None (inferred from the data).

support_restriction : function

Boolean-valued function of w0, w1, y0, y1, x where support_restriction(w0, w1, y0, y1, x) = False asserts that w0, w1, y0, y1, x is not in the support of \(W(0), W(1), Y(0), Y(1), X\). Defaults to None (no a-priori support restrictions). See the user guide for important usage tips.
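As an illustration only, the sketch below encodes a monotonicity ("no defiers") restriction, which rules out units with W(0) = 1 and W(1) = 0. Whether this assumption is appropriate depends on the application, and the choice of restriction here is an assumption, not a recommendation from the documentation.

```python
def support_restriction(w0, w1, y0, y1, x):
    """Return False for exposure pairs that are assumed to be impossible."""
    return w0 <= w1

# Enumerate the four possible (w0, w1) pairs and keep the allowed ones.
pairs = [(w0, w1) for w0 in (0, 1) for w1 in (0, 1)]
allowed = [
    pair for pair in pairs
    if support_restriction(pair[0], pair[1], 0.0, 0.0, None)
]
```

The pair (1, 0) is excluded, leaving never-takers (0, 0), compliers (0, 1), and always-takers (1, 1).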

model_kwargs : dict

Additional kwargs for the outcome_model, e.g., feature_transform. See dualbounds.dist_reg.CtsDistReg or dualbounds.dist_reg.BinaryDistReg for more kwargs.

suppress_iv_warning : bool

If True, suppresses the beta warning for DualIVBounds.
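The parameters above might be combined as in the following sketch. The simulated data and both helper functions (`simulate_iv_data`, `fit_dual_iv_bounds`) are hypothetical; the keyword names are taken from the parameter list on this page, and the call has not been checked against a particular release of the package.

```python
import numpy as np

def simulate_iv_data(n=400, p=2, seed=0):
    """Simulate a toy IV dataset with a randomized binary instrument."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    Z = rng.binomial(1, 0.5, size=n)         # instrument, P(Z=1) = 0.5
    complier = rng.binomial(1, 0.6, size=n)  # who responds to the instrument
    W = Z * complier                         # exposure taken only if Z = 1
    Y = X[:, 0] + W + rng.normal(size=n)     # outcome
    return dict(X=X, Z=Z, W=W, Y=Y, pis=np.full(n, 0.5))

def fit_dual_iv_bounds(data):
    """Fit DualIVBounds on the simulated data (requires dualbounds)."""
    # Imported lazily so the simulation runs without dualbounds installed.
    from dualbounds.iv import DualIVBounds
    dbnd = DualIVBounds(
        f=lambda w0, w1, y0, y1, x: (y0 <= y1) * (w0 <= w1),
        outcome=data["Y"],
        instrument=data["Z"],
        exposure=data["W"],
        covariates=data["X"],
        propensities=data["pis"],
        outcome_model="ridge",
        exposure_model="ridge",
        suppress_iv_warning=True,
    )
    dbnd.fit(nfolds=5)
    return dbnd

data = simulate_iv_data()
```

With the package installed, `fit_dual_iv_bounds(data).summary()` would then print the main results using the summary method listed under Methods.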

Notes

This is currently slower than the DualBounds class.

Methods

compute_dual_variables(wprobs, ydists[, ...])

Same signature as generic.DualBounds.compute_dual_variables, with some exceptions (see the method's own documentation).

cross_fit([nfolds, suppress_warning, verbose])

Cross-fits the outcome model.

diagnostics([plot, aipw])

Reports a set of technical diagnostics.

eval_exposure_model()

Thinly wraps dist_reg._evaluate_model_predictions.

eval_outcome_model()

Thinly wraps dist_reg._evaluate_model_predictions.

eval_treatment_model()

Thinly wraps dist_reg._evaluate_model_predictions.

fit([nfolds, aipw, alpha, wprobs, ydists, ...])

Main function which (1) performs cross-fitting, (2) computes optimal dual variables, and (3) computes final dual bounds.

fit_propensity_scores(nfolds[, clip, verbose])

Cross-fits the propensity scores.

plot_dual_variables([i])

Plots the estimated dual variables for the ith data-point.

results([minval, maxval])

Returns a dataframe of key inferential results.

summary([minval, maxval])

Prints a summary of main results from the class.