dualbounds.dist_reg.cross_fit_predictions

class dualbounds.dist_reg.cross_fit_predictions(W: array, X: array, y: array, Z: array | None = None, S: array | None = None, propensities: array | None = None, sample_weight: array | None = None, nfolds: int = 5, train_on_selections: bool = True, model: list | DistReg | None = None, probs_only: bool = False, verbose: bool = False, model_selector: ModelSelector | None = None)

Performs cross-fitting for a model inheriting from dist_reg.DistReg.

Parameters:
W : np.array

n-length array of binary treatment indicators.

X : np.array

(n, p)-shaped array of covariates.

y : np.array

n-length array of outcome measurements.

Z : np.array

Optional n-length array of binary instrument values for the instrumental variables setting.

S : np.array

Optional n-length array of selection indicators.

propensities : np.array

Optional n-length array of propensity scores. This argument is only used when model_selector is provided.

sample_weight : np.array

Optional n-length array of weights to use when fitting the underlying model.

nfolds : int

Number of cross-fitting folds to use.

model : DistReg

Instance of a dist_reg.DistReg class. The instance will be copied. E.g., model=dist_reg.CtsDistReg(model_type='ridge', eps_dist="empirical"). Alternatively, one may provide a list of dist_reg.DistReg models together with a model_selector, which adaptively chooses between them.

train_on_selections : bool

If True, trains model only on data where S[i] == 1.

probs_only : bool

For binary data, returns P(Y = 1 | X, W) instead of a distribution. Defaults to False.

verbose : bool

If True, provides progress reports.

Returns:

  • y0_dists (list) – List of batched scipy distributions whose shapes sum to n. The i-th distribution is the out-of-sample estimate of the conditional law of Y_i(0) | X[i].

  • y1_dists (list) – List of batched scipy distributions whose shapes sum to n. The i-th distribution is the out-of-sample estimate of the conditional law of Y_i(1) | X[i].
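
To make the "out-of-sample" wording concrete, here is a minimal sketch of the general cross-fitting idea, written with NumPy only. It is not the dualbounds implementation. The helper names (cross_fit_sketch, _design), the ridge penalty, and the simulated data are all assumptions made for illustration, and the sketch returns point predictions of E[Y(0) | X] and E[Y(1) | X] rather than full conditional distributions.

```python
import numpy as np


def _design(X, W):
    """Stack an intercept, the covariates, and the treatment indicator."""
    return np.column_stack([np.ones(len(X)), X, W])


def cross_fit_sketch(W, X, y, nfolds=5, lam=1e-3, seed=0):
    """Toy cross-fitting (hypothetical helper, not part of dualbounds).

    Each observation is predicted by a ridge regression fitted only on
    the folds that do not contain it.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    # Randomly partition the indices 0..n-1 into nfolds disjoint folds.
    folds = np.array_split(rng.permutation(n), nfolds)
    mu0 = np.empty(n)
    mu1 = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Fit ridge regression on the training folds only.
        A = _design(X[train], W[train])
        beta = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y[train])
        # Predict both potential outcomes for the held-out fold.
        m = len(test_idx)
        mu0[test_idx] = _design(X[test_idx], np.zeros(m)) @ beta
        mu1[test_idx] = _design(X[test_idx], np.ones(m)) @ beta
    return mu0, mu1, folds


# Simulated data (an assumption): linear outcome with treatment effect 2.
rng = np.random.default_rng(1)
n, p = 200, 3
X = rng.normal(size=(n, p))
W = rng.integers(0, 2, size=n)
y = X @ np.array([1.0, -0.5, 0.25]) + 2.0 * W + rng.normal(scale=0.1, size=n)

mu0, mu1, folds = cross_fit_sketch(W, X, y, nfolds=5)
```

Going by the signature and the parameter descriptions above, the corresponding library call would presumably take the form cross_fit_predictions(W=W, X=X, y=y, nfolds=5, model=dist_reg.CtsDistReg(model_type='ridge', eps_dist="empirical")), yielding y0_dists and y1_dists in place of the point predictions mu0 and mu1.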