dualbounds.dist_reg.CtsDistReg

class dualbounds.dist_reg.CtsDistReg(model_type: str | BaseEstimator = 'ridge', how_transform: str = 'interactions', eps_dist: str = 'empirical', eps_kwargs: dict | None = None, heterosked_model: str = 'none', heterosked_kwargs: dict | None = None, **model_kwargs)

Distributional regression for continuous outcomes.

- Parameters:
- model_type : str or sklearn class
Str specifying a sklearn model class to use; options include ‘ridge’, ‘lasso’, ‘elasticnet’, ‘randomforest’, ‘knn’. One can also directly pass an sklearn class, e.g., model_type=sklearn.neighbors.KNeighborsRegressor.
- how_transform : str
Str specifying how to transform the features before fitting the underlying model. One of several options:
‘identity’: does not transform the features
‘intercept’: adds an intercept
‘interactions’: adds treatment-covariate interactions
The default is ‘interactions’.
- eps_dist : str
Str specifying the distribution of the residuals. Options include [‘empirical’, ‘gaussian’, ‘laplace’, ‘expon’, ‘tdist’, ‘skewnorm’]. Defaults to ‘empirical’, which uses the empirical law of the residuals of the training data.
- eps_kwargs : dict
kwargs passed to utilities.parse_dist for the residual scipy distribution.
- heterosked_model : str or sklearn class
Str specifying a sklearn model class to use to estimate Var(Y | X) as a function of X. Options are the same as for model_type. Defaults to heterosked_model='none', in which case homoskedasticity is assumed (although the final bounds will still be valid in the presence of heteroskedasticity).
- heterosked_kwargs : dict
kwargs for the heterosked model. E.g., if heterosked_model='knn', heterosked_kwargs could include n_neighbors.
- **model_kwargs : dict
kwargs for the sklearn base model. E.g., if model_type='knn', model_kwargs could include n_neighbors.
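To make the how_transform options concrete, here is a minimal NumPy sketch of what each option could produce. It is not the dualbounds implementation: the function name sketch_feature_transform, the column ordering (treatment first, then covariates), and the choice to omit an intercept from the ‘interactions’ design are all assumptions made for illustration.

```python
import numpy as np


def sketch_feature_transform(W, X, how_transform="interactions"):
    """Hypothetical illustration of the three how_transform options.

    Not the library's code; the column layout is an assumption.
    """
    W = np.asarray(W, dtype=float).reshape(-1, 1)  # (n, 1) treatment column
    X = np.asarray(X, dtype=float)                 # (n, p) covariates
    base = np.hstack([W, X])                       # [W, X]
    if how_transform == "identity":
        return base
    if how_transform == "intercept":
        # Prepend a column of ones: [1, W, X]
        return np.hstack([np.ones((X.shape[0], 1)), base])
    if how_transform == "interactions":
        # Append treatment-covariate products: [W, X, W * X]
        return np.hstack([base, W * X])
    raise ValueError(f"Unknown how_transform: {how_transform!r}")


rng = np.random.default_rng(0)
n, p = 5, 3
W = rng.binomial(1, 0.5, n)
X = rng.standard_normal((n, p))

shapes = {
    how: sketch_feature_transform(W, X, how).shape
    for how in ("identity", "intercept", "interactions")
}
feats = sketch_feature_transform(W, X, "interactions")
```

With interactions, the base model can fit a different covariate slope for treated and control units, since the W * X columns are zero whenever W = 0.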
Examples
Here we instantiate a model which assumes Gaussianity, uses a ridge to make predictions and a lasso to estimate the heteroskedasticity pattern:
    import numpy as np
    import dualbounds
    import sklearn.linear_model

    # Instantiate dist_reg
    cdreg = dualbounds.dist_reg.CtsDistReg(
        # Arguments for main model
        model_type=sklearn.linear_model.RidgeCV,
        fit_intercept=True,
        gcv_mode='auto',
        # How to estimate the law of the residuals
        eps_dist='gaussian',
        # How to estimate Var(Y | X)
        heterosked_model=sklearn.linear_model.LassoCV,
        heterosked_kwargs=dict(cv=5),
    )

    # Fit
    n, p = 300, 20
    W = np.random.binomial(1, 0.5, n)
    X = np.random.randn(n, p)
    y = np.random.randn(n)
    cdreg.fit(W=W, X=X, y=y)

    # Predict on new X
    m = 10
    Xnew = np.random.randn(m, p)
    y0_preds = cdreg.predict(X=Xnew, W=np.zeros(m))

Methods
feature_transform(W, X[, Z]): Transforms the features before feeding them to the base model.
features_to_WX(features): Inverse of feature_transform.
fit(W, X, y[, Z, sample_weight]): Fits model on the data.
predict(X, W[, Z]): Predicts the conditional law of the outcome.
Predicts counterfactual distributions of Y (outcome).
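As a rough illustration of what "the conditional law of the outcome" can mean when eps_dist=‘empirical’, the sketch below fits a ridge regression in closed form with NumPy and represents the predicted law at each new point as the point prediction shifted by every training residual. This is one plausible reading of the description above, not the dualbounds implementation: the helper sketch_predict_law, the fixed penalty lam, the absence of an intercept, and the (support, probs) return format are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.standard_normal((n, p))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.standard_normal(n)

# Closed-form ridge fit (no intercept; penalty chosen arbitrarily)
lam = 1.0
coef = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Training residuals define the empirical law of the noise
resid = y - X @ coef


def sketch_predict_law(Xnew):
    """Hypothetical helper: discrete predictive law for each row of Xnew.

    Returns support of shape (m, n), where row i holds mu_i + resid_j,
    and probs of shape (n,), uniform weights over the training residuals.
    """
    mu = Xnew @ coef
    support = mu[:, None] + resid[None, :]
    probs = np.full(resid.shape[0], 1.0 / resid.shape[0])
    return support, probs


Xnew = rng.standard_normal((4, p))
support, probs = sketch_predict_law(Xnew)
```

Under a homoskedastic model every new point shares the same residual law, shifted by its own prediction. A heteroskedastic variant would additionally rescale the residuals by an estimate of the conditional standard deviation.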