dualbounds.lee.LeeDualBounds

class dualbounds.lee.LeeDualBounds(selections: array | Series, *args, selection_model: str | BinaryDistReg | None = None, **kwargs)

Computes dual bounds on the ATE under selection bias.

Precisely, this class bounds

\(E[Y(1) - Y(0) | S(0) = S(1) = 1]\)

where \(Y(1), Y(0)\) are potential outcomes and \(S(1), S(0)\) are post-treatment selection events. These bounds assume monotonicity, i.e., \(S(1) \geq S(0)\) a.s. (see Lee 2009).
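To make the estimand concrete, the sketch below simulates potential outcomes and selection events that satisfy monotonicity, then evaluates the target average over the units selected under both arms. It uses only NumPy; the data-generating process and every variable name are illustrative assumptions and are not part of dualbounds.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Latent trait driving both selection and outcomes (illustrative assumption).
u = rng.normal(size=n)

# Potential selection events. S(1) >= S(0) holds by construction because
# treatment only lowers the selection threshold.
s0 = (u > 0.5).astype(int)
s1 = (u > -0.5).astype(int)
assert np.all(s1 >= s0)  # monotonicity

# Potential outcomes with a constant treatment effect of 1.
y0 = u + rng.normal(size=n)
y1 = y0 + 1.0

# Target estimand: mean effect among units with S(0) = S(1) = 1.
always_selected = (s0 == 1) & (s1 == 1)
target = np.mean(y1[always_selected] - y0[always_selected])

# Observed data: one arm per unit, and the comparison uses selected units only.
w = rng.binomial(1, 0.5, size=n)
s = np.where(w == 1, s1, s0)
y = np.where(w == 1, y1, y0)
naive = y[(w == 1) & (s == 1)].mean() - y[(w == 0) & (s == 1)].mean()

print(f"target effect: {target:.3f}, naive difference: {naive:.3f}")
```

In this simulation the naive difference among selected units falls well below the target, because the treated and control groups that pass selection come from different subpopulations. That gap is the selection bias the bounds are designed to account for.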

Parameters:
selections : np.array | pd.Series

n-length array-like of binary selection indicators (S).

outcome : np.array | pd.Series

n-length array of outcome measurements (Y).

treatment : np.array | pd.Series

n-length array of binary treatment (W).

covariates : np.array | pd.Series

(n, p)-shaped array of covariates (X).

propensities : np.array | pd.Series

n-length array-like of propensity scores \(P(W=1 | X)\). If None, the propensity scores are estimated from the data.

outcome_model : str | dist_reg.DistReg

The model for estimating the law of \(Y | X, W\). Two options:

  • A str identifier, e.g., ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’.

  • An object inheriting from dist_reg.DistReg.

For example, when the outcome is continuous, the default is outcome_model=dist_reg.CtsDistReg(model_type='ridge').

propensity_model : str | sklearn classifier

How to estimate the propensity scores if they are not provided. Two options:

  • A str identifier, e.g., ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’.

  • An sklearn classifier, e.g., sklearn.linear_model.LogisticRegressionCV().

selection_model : str | dist_reg.BinaryDistReg

How to estimate the selection probabilities \(P(S =1 | W, X)\). Two options:

  • A str identifier, i.e., ‘monotone_logistic’, ‘ridge’, ‘lasso’.

  • An object inheriting from dist_reg.BinaryDistReg.

The default is ‘monotone_logistic’.

support : np.array

Optional support of the outcome, if known and discrete. Defaults to None (inferred from the data).

model_kwargs : dict

Additional kwargs for the outcome_model, e.g., feature_transform. See dualbounds.dist_reg.CtsDistReg or dualbounds.dist_reg.BinaryDistReg for more kwargs.
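The parameters above can be assembled as in the following sketch. `make_inputs` and `fit_lee_bounds` are hypothetical helpers written for this illustration. The keyword names passed to the constructor are taken from the parameter list above, and the values of `nfolds` and `alpha` are arbitrary choices. The simulated data are an assumption, as is the choice to leave the outcomes of unselected units at their simulated values, since the expected encoding is not stated here. The dualbounds import is deferred so that the simulation runs on its own.

```python
import numpy as np

def make_inputs(n=500, p=3, seed=0):
    """Simulate arrays with the shapes described in the parameter list."""
    rng = np.random.default_rng(seed)
    covariates = rng.normal(size=(n, p))        # (n, p) covariates X
    treatment = rng.binomial(1, 0.5, size=n)    # binary treatment W
    propensities = np.full(n, 0.5)              # known P(W=1 | X) in a randomized trial
    # A shared uniform draw per unit makes S(1) >= S(0) hold by construction,
    # because the selection probability is higher under treatment.
    u = rng.uniform(size=n)
    prob_s = 1.0 / (1.0 + np.exp(-(covariates[:, 0] + treatment)))
    selections = (u < prob_s).astype(int)       # binary selection indicators S
    outcome = covariates[:, 0] + treatment + rng.normal(size=n)  # outcome Y
    return dict(
        selections=selections,
        outcome=outcome,
        treatment=treatment,
        covariates=covariates,
        propensities=propensities,
    )

def fit_lee_bounds(inputs):
    """Hypothetical wrapper: build the class, fit it, and return the results."""
    # Imported lazily so make_inputs can be used without dualbounds installed.
    from dualbounds.lee import LeeDualBounds

    ldb = LeeDualBounds(
        selections=inputs["selections"],
        outcome=inputs["outcome"],
        treatment=inputs["treatment"],
        covariates=inputs["covariates"],
        propensities=inputs["propensities"],
        outcome_model="ridge",
        selection_model="monotone_logistic",
    )
    ldb.fit(nfolds=5, alpha=0.05)  # illustrative values
    ldb.summary()
    return ldb.results()

inputs = make_inputs()
```

Calling `fit_lee_bounds(inputs)` requires dualbounds to be installed. Passing the propensity scores explicitly, as here, avoids estimating them from the data.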

Methods

compute_dual_variables(s0_probs, s1_probs[, ...])

Estimates dual variables using the outcome model.

cross_fit([nfolds, suppress_warning, verbose])

Cross-fits the outcome and selection models.

diagnostics([plot, aipw])

Reports a set of technical diagnostics.

eval_outcome_model()

Thinly wraps dist_reg._evaluate_model_predictions.

eval_treatment_model()

Thinly wraps dist_reg._evaluate_model_predictions.

fit([nfolds, alpha, aipw, s0_probs, ...])

Main function which (1) performs cross-fitting, (2) computes optimal dual variables, and (3) computes final dual bounds.

fit_propensity_scores(nfolds[, clip, verbose])

Cross-fits the propensity scores.

plot_dual_variables([i])

Plots the estimated dual variables for the ith data-point.

results([minval, maxval])

Returns a dataframe of key inferential results.

summary([minval, maxval])

Prints a summary of main results from the class.
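Several of the methods above rely on cross-fitting, in which the prediction for each observation comes from a model trained only on the other folds. The sketch below illustrates that general idea with NumPy alone, using ordinary least squares as a stand-in model. It is not the library's implementation, and `cross_fit_predictions` is a hypothetical helper.

```python
import numpy as np

def cross_fit_predictions(X, y, nfolds=5, seed=0):
    """Return out-of-fold least-squares predictions for every observation."""
    n = len(y)
    rng = np.random.default_rng(seed)
    # Randomly partition the indices into nfolds disjoint folds.
    folds = np.array_split(rng.permutation(n), nfolds)
    design = np.column_stack([np.ones(n), X])  # add an intercept column
    preds = np.empty(n)
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        # Fit on the training folds only.
        coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        # Predict on the held-out fold.
        preds[held_out] = design[held_out] @ coef
    return preds

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=200)
preds = cross_fit_predictions(X, y)
```

Because no observation is used to fit the model that predicts it, the resulting nuisance estimates are independent of the held-out data within each fold.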