| Title: | Tidy Estimation of Heterogeneous Treatment Effects |
|---|---|
| Description: | Estimates heterogeneous treatment effects using tidy semantics on experimental or observational data. Methods are based on the doubly-robust learner of Kennedy (2023) <doi:10.1214/23-EJS2157>. You provide a simple recipe for what machine learning algorithms to use in estimating the nuisance functions and 'tidyhte' will take care of cross-validation, estimation, model selection, diagnostics and construction of relevant quantities of interest about the variability of treatment effects. |
| Authors: | Drew Dimmery [aut, cre, cph] (ORCID: <https://orcid.org/0000-0001-9602-6325>) |
| Maintainer: | Drew Dimmery <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.4 |
| Built: | 2026-05-09 08:48:12 UTC |
| Source: | https://github.com/ddimmery/tidyhte |
This adds a diagnostic to the effect model.
add_effect_diagnostic(hte_cfg, diag)
| Argument | Description |
|---|---|
| hte_cfg | |
| diag | Character indicating the name of the diagnostic to include. Possible values are |
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
    add_effect_diagnostic("RROC") -> hte_cfg
This adds a learner to the ensemble used for estimating a model of the conditional expectation of the pseudo-outcome.
add_effect_model(hte_cfg, model_name, ...)
| Argument | Description |
|---|---|
| hte_cfg | |
| model_name | Character indicating the name of the model to incorporate into the joint effect ensemble. Possible values use |
| ... | Parameters over which to grid-search for this model class. |
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
    add_effect_model("SL.glm.interaction") -> hte_cfg
This replaces the propensity score model with a known value of the propensity score.
add_known_propensity_score(hte_cfg, covariate_name)
| Argument | Description |
|---|---|
| hte_cfg | |
| covariate_name | Character indicating the name of the column in the dataframe corresponding to the known propensity score. |
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
    add_known_propensity_score("ps") -> hte_cfg
This adds a definition of how to display a moderator to the MCATE config. A moderator is any variable with respect to which you want to view information about CATEs.
add_moderator(hte_cfg, model_type, ..., .model_arguments = NULL)
| Argument | Description |
|---|---|
| hte_cfg | |
| model_type | Character indicating the model type for these moderators. Currently two model types are supported: |
| ... | The (unquoted) names of the moderator variables. |
| .model_arguments | A named list from argument name to value to pass into the constructor for the model. See |
Updated HTE_cfg object
For moderators with many levels and limited sample per level, estimates may be noisy. Consider whether other encodings would be more appropriate.
library("dplyr")
basic_config() %>%
    add_moderator("Stratified", x2, x3) %>%
    add_moderator("KernelSmooth", x1, x4, x5) -> hte_cfg
This adds a diagnostic to the outcome model.
add_outcome_diagnostic(hte_cfg, diag)
| Argument | Description |
|---|---|
| hte_cfg | |
| diag | Character indicating the name of the diagnostic to include. Possible values are |
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
    add_outcome_diagnostic("RROC") -> hte_cfg
This adds a learner to the ensemble used for estimating a model of the conditional expectation of the outcome.
add_outcome_model(hte_cfg, model_name, ...)
| Argument | Description |
|---|---|
| hte_cfg | |
| model_name | Character indicating the name of the model to incorporate into the outcome ensemble. Possible values use |
| ... | Parameters over which to grid-search for this model class. |
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
    add_outcome_model("SL.glm.interaction") -> hte_cfg
This adds a diagnostic to the propensity score.
add_propensity_diagnostic(hte_cfg, diag)
| Argument | Description |
|---|---|
| hte_cfg | |
| diag | Character indicating the name of the diagnostic to include. Possible values are |
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
    add_propensity_diagnostic(c("AUC", "MSE")) -> hte_cfg
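As a rough illustration of what diagnostics named "AUC" and "MSE" conventionally measure for a propensity score model, the base R sketch below computes both from predicted probabilities and observed treatment labels. It assumes the usual textbook definitions; how tidyhte computes them internally is not shown in this reference, and the helper names `auc` and `mse` are hypothetical.

```r
# Mean squared error between predicted probabilities and observed 0/1 labels.
mse <- function(pred, label) mean((label - pred)^2)

# AUC via the rank (Mann-Whitney) formulation; ties receive average ranks.
auc <- function(pred, label) {
  r <- rank(pred)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

label <- c(0, 0, 1, 1)            # observed treatment indicators
pred <- c(0.1, 0.4, 0.35, 0.8)    # predicted propensity scores
auc_value <- auc(pred, label)
mse_value <- mse(pred, label)
print(c(AUC = auc_value, MSE = mse_value))
```

An AUC near 0.5 in a randomized experiment is expected, since covariates should not predict treatment.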
This adds a learner to the ensemble used for estimating propensity scores.
add_propensity_score_model(hte_cfg, model_name, ...)
| Argument | Description |
|---|---|
| hte_cfg | |
| model_name | Character indicating the name of the model to incorporate into the propensity score ensemble. Possible values use |
| ... | Parameters over which to grid-search for this model class. |
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
    add_propensity_score_model("SL.glmnet", alpha = c(0, 0.5, 1)) -> hte_cfg
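To make the grid-search semantics concrete, the sketch below enumerates hyperparameter combinations with base R's expand.grid(). It is an illustration only: the second hyperparameter (`nfolds`), the label format, and the way tidyhte expands the grid internally are assumptions, not taken from the package source.

```r
# Hypothetical hyperparameter values as they might be passed through `...`.
# `alpha` mirrors the example above; `nfolds` is added purely for illustration.
hyperparams <- list(alpha = c(0, 0.5, 1), nfolds = c(5, 10))

# One row per combination of hyperparameter values.
grid <- expand.grid(hyperparams, stringsAsFactors = FALSE)

# A readable (hypothetical) label for each candidate learner.
labels <- paste0("SL.glmnet_alpha", grid$alpha, "_nfolds", grid$nfolds)
print(grid)
```

Each row of `grid` would correspond to one candidate learner in the ensemble, so the number of candidates grows multiplicatively with the number of values per hyperparameter.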
This adds a variable importance quantity of interest to the outputs.
add_vimp(hte_cfg, sample_splitting = TRUE, linear_only = FALSE)
| Argument | Description |
|---|---|
| hte_cfg | |
| sample_splitting | Logical indicating whether to use sample splitting or not. Choosing not to use sample splitting means that inference will only be valid for moderators with non-null importance. |
| linear_only | Logical indicating whether the variable importance should use only a single linear-only model. Variable importance measure will only be consistent for the population quantity if the true model of pseudo-outcomes is linear. |
Updated HTE_cfg object
Williamson, B. D., Gilbert, P. B., Carone, M., & Simon, N. (2021). Nonparametric variable importance assessment using machine learning techniques. Biometrics, 77(1), 9-22.
Williamson, B. D., Gilbert, P. B., Simon, N. R., & Carone, M. (2021). A general framework for inference on algorithm-agnostic variable importance. Journal of the American Statistical Association, 1-14.
library("dplyr")
basic_config() %>%
    add_vimp(sample_splitting = FALSE) -> hte_cfg
This adds a configuration attribute to a dataframe for HTE estimation. This configuration details the full analysis of HTE that should be performed.
attach_config(data, .HTE_cfg)
| Argument | Description |
|---|---|
| data | dataframe |
| .HTE_cfg | |
For information about how to set up an HTE_cfg object, see the Recipe API
documentation basic_config().
To see an example analysis, read vignette("experimental_analysis") in the context
of an experiment, vignette("observational_analysis") for an observational study, or
vignette("methodological_details") for a deeper dive under the hood.
basic_config(), make_splits(), produce_plugin_estimates(),
construct_pseudo_outcomes(), estimate_QoI()
library("dplyr")
if(require("palmerpenguins")) {
    data(package = 'palmerpenguins')
    penguins$unitid = seq_len(nrow(penguins))
    penguins$propensity = rep(0.5, nrow(penguins))
    penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity)
    cfg <- basic_config() %>%
        add_known_propensity_score("propensity") %>%
        add_outcome_model("SL.glm.interaction") %>%
        remove_vimp()
    attach_config(penguins, cfg) %>%
        make_splits(unitid, .num_splits = 4) %>%
        produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>%
        construct_pseudo_outcomes(body_mass_g, treatment) %>%
        estimate_QoI(species, sex)
}
This provides a basic recipe for HTE estimation that can be extended by providing additional information about models to be estimated and what quantities of interest should be returned based on those models. This basic model includes only linear models for nuisance function estimation, and basic diagnostics.
basic_config()
Additional models, diagnostics and quantities of interest should be added using their respective helper functions provided as part of the Recipe API.
To see an example analysis, read vignette("experimental_analysis") in the context
of an experiment, vignette("observational_analysis") for an observational study, or
vignette("methodological_details") for a deeper dive under the hood.
HTE_cfg object
add_propensity_score_model(), add_known_propensity_score(),
add_propensity_diagnostic(), add_outcome_model(), add_outcome_diagnostic(),
add_effect_model(), add_effect_diagnostic(), add_moderator(), add_vimp()
library("dplyr")
basic_config() %>%
    add_known_propensity_score("ps") %>%
    add_outcome_model("SL.glm.interaction") %>%
    add_outcome_model("SL.glmnet", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>%
    add_outcome_model("SL.glmnet.interaction", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>%
    add_outcome_diagnostic("RROC") %>%
    add_effect_model("SL.glm.interaction") %>%
    add_effect_model("SL.glmnet", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>%
    add_effect_model("SL.glmnet.interaction", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>%
    add_effect_diagnostic("RROC") %>%
    add_moderator("Stratified", x2, x3) %>%
    add_moderator("KernelSmooth", x1, x4, x5) %>%
    add_vimp(sample_splitting = FALSE) -> hte_cfg
Constant_cfg is a configuration class for estimating a constant model.
That is, the model is a simple, one-parameter mean model.
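A one-parameter mean model can be pictured as an intercept-only regression. The sketch below uses base R's lm() to show the idea; it is an illustration, not a statement about how Constant_cfg is implemented inside tidyhte.

```r
# Toy response values (made up for illustration).
y <- c(2, 4, 6, 8)

# Intercept-only regression: the single fitted parameter is the sample mean.
fit <- lm(y ~ 1)
constant_estimate <- unname(coef(fit))
print(constant_estimate)
```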
tidyhte::Model_cfg -> Constant_cfg
model_class: The class of the model, required for all classes
which inherit from Model_cfg.
new()
Create a new Constant_cfg object.
Constant_cfg$new()
A new Constant_cfg object.
Constant_cfg$new()
clone()
The objects of this class are cloneable with this method.
Constant_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
construct_pseudo_outcomes takes a dataset which has been prepared
with plugin estimators of nuisance parameters and transforms these into
a "pseudo-outcome": an unbiased estimator of the conditional average
treatment effect under exogeneity.
construct_pseudo_outcomes(data, outcome, treatment, type = "dr")
| Argument | Description |
|---|---|
| data | dataframe (already prepared with |
| outcome | Unquoted name of outcome variable. |
| treatment | Unquoted name of treatment variable. |
| type | String representing how to construct the pseudo-outcome. Valid values are "dr" (the default), "ipw" and "plugin". See "Details" for more discussion of these options. |
Taking averages of these pseudo-outcomes (or fitting a model to them) will approximate averages (or models) of the underlying treatment effect.
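The following base R sketch shows the standard textbook forms of the three pseudo-outcome types for a binary treatment. It assumes that "dr", "ipw" and "plugin" correspond to the usual doubly-robust (AIPW), inverse-propensity-weighted and plug-in constructions; the function name `pseudo_outcome` and its argument names are hypothetical and are not part of tidyhte.

```r
# y: outcome, a: treatment (0/1), pi_hat: propensity score,
# mu0_hat / mu1_hat: outcome regressions under control / treatment.
pseudo_outcome <- function(y, a, pi_hat, mu0_hat, mu1_hat,
                           type = c("dr", "ipw", "plugin")) {
  type <- match.arg(type)
  # Plug-in: difference of the two outcome regressions.
  plugin <- mu1_hat - mu0_hat
  # IPW: outcomes reweighted by the inverse probability of the observed arm.
  ipw <- a * y / pi_hat - (1 - a) * y / (1 - pi_hat)
  # Doubly-robust: plug-in contrast plus an inverse-weighted residual.
  mu_a <- ifelse(a == 1, mu1_hat, mu0_hat)
  dr <- plugin + (a - pi_hat) / (pi_hat * (1 - pi_hat)) * (y - mu_a)
  switch(type, dr = dr, ipw = ipw, plugin = plugin)
}

# Simulated experiment with a constant treatment effect of 2 and
# known (true) nuisance functions, so every average should be near 2.
set.seed(1)
n <- 5000
x <- rnorm(n)
a <- rbinom(n, 1, 0.5)
y <- x + 2 * a + rnorm(n)
avg_dr <- mean(pseudo_outcome(y, a, 0.5, x, x + 2, type = "dr"))
avg_ipw <- mean(pseudo_outcome(y, a, 0.5, x, x + 2, type = "ipw"))
avg_plugin <- mean(pseudo_outcome(y, a, 0.5, x, x + 2, type = "plugin"))
print(c(dr = avg_dr, ipw = avg_ipw, plugin = avg_plugin))
```

In this simulation the doubly-robust average is typically less noisy than the IPW average because the outcome regressions absorb most of the variation in `y`.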
attach_config(), make_splits(), produce_plugin_estimates(), estimate_QoI()
Diagnostics_cfg is a configuration class for estimating a variety of
diagnostics for the models trained in the course of HTE estimation.
ps: Model diagnostics for the propensity score model.
outcome: Model diagnostics for the outcome models.
effect: Model diagnostics for the joint effect model.
params: Parameters for any requested diagnostics.
new()
Create a new Diagnostics_cfg object with specified diagnostics to estimate.
Diagnostics_cfg$new(ps = NULL, outcome = NULL, effect = NULL, params = NULL)
ps: Model diagnostics for the propensity score model.
outcome: Model diagnostics for the outcome models.
effect: Model diagnostics for the joint effect model.
params: List providing values for parameters to any requested diagnostics.
A new Diagnostics_cfg object.
Diagnostics_cfg$new(
outcome = c("SL_risk", "SL_coefs", "MSE", "RROC"),
ps = c("SL_risk", "SL_coefs", "AUC")
)
add()
Add diagnostics to the Diagnostics_cfg object.
Diagnostics_cfg$add(ps = NULL, outcome = NULL, effect = NULL)
ps: Model diagnostics for the propensity score model.
outcome: Model diagnostics for the outcome models.
effect: Model diagnostics for the joint effect model.
An updated Diagnostics_cfg object.
cfg <- Diagnostics_cfg$new(
outcome = c("SL_risk", "SL_coefs", "MSE", "RROC"),
ps = c("SL_risk", "SL_coefs")
)
cfg <- cfg$add(ps = "AUC")
clone()
The objects of this class are cloneable with this method.
Diagnostics_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
estimate_QoI takes a dataframe already prepared with split IDs,
plugin estimates and pseudo-outcomes and calculates the requested
quantities of interest (QoIs).
estimate_QoI(data, ...)
| Argument | Description |
|---|---|
| data | data frame (already prepared with |
| ... | Unquoted names of moderators to calculate QoIs for. |
To see an example analysis, read vignette("experimental_analysis") in the context
of an experiment, vignette("observational_analysis") for an observational study, or
vignette("methodological_details") for a deeper dive under the hood.
attach_config(), make_splits(), produce_plugin_estimates(),
construct_pseudo_outcomes()
library("dplyr")
if(require("palmerpenguins")) {
    data(package = 'palmerpenguins')
    penguins$unitid = seq_len(nrow(penguins))
    penguins$propensity = rep(0.5, nrow(penguins))
    penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity)
    cfg <- basic_config() %>%
        add_known_propensity_score("propensity") %>%
        add_outcome_model("SL.glm.interaction") %>%
        remove_vimp()
    attach_config(penguins, cfg) %>%
        make_splits(unitid, .num_splits = 4) %>%
        produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>%
        construct_pseudo_outcomes(body_mass_g, treatment) %>%
        estimate_QoI(species, sex)
}
HTE_cfg is a configuration class that pulls everything together, indicating
the full configuration for a given HTE analysis. This includes how to estimate
models and what Quantities of Interest to calculate based on those underlying models.
outcome: Model_cfg object indicating how outcome models should be estimated.
treatment: Model_cfg object indicating how the propensity score model should be estimated.
effect: Model_cfg object indicating how the joint effect model should be estimated.
qoi: QoI_cfg object indicating what the Quantities of Interest are and providing all necessary detail on how they should be estimated.
verbose: Logical indicating whether to print debugging information.
new()
Create a new HTE_cfg object with all necessary information about how
to carry out an HTE analysis.
HTE_cfg$new(outcome = NULL, treatment = NULL, effect = NULL, qoi = NULL, verbose = FALSE)
outcome: Model_cfg object indicating how outcome models should be estimated.
treatment: Model_cfg object indicating how the propensity score model should be estimated.
effect: Model_cfg object indicating how the joint effect model should be estimated.
qoi: QoI_cfg object indicating what the Quantities of Interest are and providing all necessary detail on how they should be estimated.
verbose: Logical indicating whether to print debugging information.
mcate_cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
pcate_cfg <- PCATE_cfg$new(
cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)),
model_covariates = c("x1", "x2", "x3"),
num_mc_samples = list(x1 = 100)
)
vimp_cfg <- VIMP_cfg$new()
diag_cfg <- Diagnostics_cfg$new(
outcome = c("SL_risk", "SL_coefs", "MSE"),
ps = c("SL_risk", "SL_coefs", "AUC")
)
qoi_cfg <- QoI_cfg$new(
mcate = mcate_cfg,
pcate = pcate_cfg,
vimp = vimp_cfg,
diag = diag_cfg
)
ps_cfg <- SLEnsemble_cfg$new(
learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
y_cfg <- SLEnsemble_cfg$new(
learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
fx_cfg <- SLEnsemble_cfg$new(
learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
HTE_cfg$new(outcome = y_cfg, treatment = ps_cfg, effect = fx_cfg, qoi = qoi_cfg)
clone()
The objects of this class are cloneable with this method.
HTE_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
KernelSmooth_cfg is a configuration class for non-parametric local-linear
regression to construct a smooth representation of the relationship between
two variables. This is typically used for displaying a surface of the conditional
average treatment effect over a continuous covariate.
Kernel smoothing is handled by the nprobust package.
tidyhte::Model_cfg -> KernelSmooth_cfg
model_class: The class of the model, required for all classes which inherit from Model_cfg.
neval: The number of points at which to evaluate the local regression. More points will provide a smoother line at the cost of somewhat higher computation.
eval_min_quantile: Minimum quantile at which to evaluate the smoother.
new()
Create a new KernelSmooth_cfg object with specified number of evaluation points.
KernelSmooth_cfg$new(neval = 100, eval_min_quantile = 0.05)
neval: The number of points at which to evaluate the local regression. More points will provide a smoother line at the cost of somewhat higher computation.
eval_min_quantile: Minimum quantile at which to evaluate the smoother. A value of zero will do no clipping. Clipping is performed from both the top and the bottom of the empirical distribution. A value of alpha would evaluate over [alpha, 1 - alpha].
A new KernelSmooth_cfg object.
KernelSmooth_cfg$new(neval = 100)
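The interaction between neval and eval_min_quantile can be sketched in base R as follows. This assumes evaluation points are spaced evenly between the clipped quantiles; the helper `eval_grid` is hypothetical, and the package may place its evaluation points differently.

```r
# Build an evaluation grid clipped to the [alpha, 1 - alpha] quantile range.
eval_grid <- function(x, neval = 100, eval_min_quantile = 0.05) {
  bounds <- stats::quantile(
    x,
    probs = c(eval_min_quantile, 1 - eval_min_quantile),
    names = FALSE
  )
  seq(bounds[1], bounds[2], length.out = neval)
}

x <- c(1:99, 1000)  # a moderator with one extreme value in the upper tail
clipped <- eval_grid(x, neval = 100, eval_min_quantile = 0.05)
unclipped <- eval_grid(x, neval = 10, eval_min_quantile = 0)
print(range(clipped))
print(range(unclipped))
```

Clipping keeps the smoother from being evaluated in sparse tails, where local regression estimates tend to be unstable.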
clone()
The objects of this class are cloneable with this method.
KernelSmooth_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
Known_cfg is a configuration class for when a particular model is known
a priori. The prototypical usage of this class is when heterogeneous
treatment effects are estimated in the context of a randomized controlled
trial with known propensity scores.
tidyhte::Model_cfg -> Known_cfg
covariate_name: The name of the column in the dataset which corresponds to the known model score.
model_class: The class of the model, required for all classes which inherit from Model_cfg.
new()
Create a new Known_cfg object with specified covariate column.
Known_cfg$new(covariate_name)
covariate_name: The name of the column, a string, in the dataset corresponding to the known model score (i.e. the true conditional expectation).
A new Known_cfg object.
Known_cfg$new("propensity_score")
clone()
The objects of this class are cloneable with this method.
Known_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
This takes a dataset, a column with a unique identifier and an
arbitrary number of covariates on which to stratify the splits.
It returns the original dataset with an additional column .split_id
corresponding to an identifier for the split.
make_splits(data, identifier, ..., .num_splits)
| Argument | Description |
|---|---|
| data | dataframe |
| identifier | Unquoted name of unique identifier column |
| ... | variables on which to stratify (requires that |
| .num_splits | number of splits to create. If VIMP is requested in |
To see an example analysis, read vignette("experimental_analysis") in the context
of an experiment, vignette("observational_analysis") for an observational study, or
vignette("methodological_details") for a deeper dive under the hood.
original dataframe with additional .split_id column
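One way to picture stratified splitting is to shuffle units within each stratum and deal fold labels out in turn, so every fold receives a similar share of each stratum. The base R sketch below does this; the helper `assign_splits` is hypothetical, and the algorithm make_splits() actually uses is not described in this reference.

```r
# Assign each unit to one of `num_splits` folds, balancing folds within strata.
assign_splits <- function(ids, strata, num_splits) {
  stopifnot(!anyDuplicated(ids), length(ids) == length(strata))
  split_id <- integer(length(ids))
  for (s in unique(strata)) {
    idx <- which(strata == s)
    # Shuffle within the stratum, then deal fold labels out round-robin.
    idx <- idx[sample.int(length(idx))]
    split_id[idx] <- rep_len(seq_len(num_splits), length(idx))
  }
  split_id
}

set.seed(42)
df <- data.frame(uid = 1:20, group = rep(c("a", "b"), each = 10))
df$.split_id <- assign_splits(df$uid, df$group, num_splits = 5)
split_counts <- table(df$group, df$.split_id)
print(split_counts)
```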
attach_config(), produce_plugin_estimates(), construct_pseudo_outcomes(),
estimate_QoI()
library("dplyr")
if(require("palmerpenguins")) {
    data(package = 'palmerpenguins')
    penguins$unitid = seq_len(nrow(penguins))
    penguins$propensity = rep(0.5, nrow(penguins))
    penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity)
    cfg <- basic_config() %>%
        add_known_propensity_score("propensity") %>%
        add_outcome_model("SL.glm.interaction") %>%
        remove_vimp()
    attach_config(penguins, cfg) %>%
        make_splits(unitid, .num_splits = 4) %>%
        produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>%
        construct_pseudo_outcomes(body_mass_g, treatment) %>%
        estimate_QoI(species, sex)
}
MCATE_cfg is a configuration class for estimating marginal response
surfaces based on heterogeneous treatment effect estimates. "Marginal"
in this context implies that all other covariates are marginalized.
Thus, if two covariates are highly correlated, it is likely that their
MCATE surfaces will be extremely similar.
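The point about correlated covariates can be illustrated with a minimal base R sketch for a discrete moderator. It assumes a marginal CATE can be approximated by averaging pseudo-outcomes within each level of the moderator; the helper `marginal_cate` and the simulated data are hypothetical and do not reproduce the package's estimators.

```r
# Average pseudo-outcomes within each level of one moderator, ignoring all
# other covariates; standard errors use the within-level standard deviation.
marginal_cate <- function(pseudo, moderator) {
  est <- tapply(pseudo, moderator, mean)
  se <- tapply(pseudo, moderator, function(v) stats::sd(v) / sqrt(length(v)))
  data.frame(
    level = names(est),
    estimate = as.numeric(est),
    std_error = as.numeric(se)
  )
}

set.seed(7)
x1 <- rep(c("low", "high"), each = 500)
x2 <- x1  # a second moderator that is perfectly correlated with x1
pseudo <- ifelse(x1 == "high", 3, 1) + rnorm(1000)
surface_x1 <- marginal_cate(pseudo, x1)
surface_x2 <- marginal_cate(pseudo, x2)
print(surface_x1)
```

Because `x2` carries the same information as `x1`, the two marginal surfaces coincide even if only one of them drives the effect.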
cfgs: Named list mapping covariate names to a Model_cfg object defining how to present that covariate's CATE surface (while marginalizing over all other covariates).
std_errors: Boolean indicating whether the results should be returned with standard errors or not.
estimand: String indicating the estimand to target.
new()
Create a new MCATE_cfg object with specified model name and hyperparameters.
MCATE_cfg$new(cfgs, std_errors = TRUE)
cfgs: Named list from moderator name to a Model_cfg object defining how to present that covariate's CATE surface (while marginalizing over all other covariates).
std_errors: Boolean indicating whether the results should be returned with standard errors or not.
A new MCATE_cfg object.
MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
add_moderator()
Add a moderator to the MCATE_cfg object. This entails defining a configuration
for displaying the effect surface for that moderator.
MCATE_cfg$add_moderator(var_name, cfg)
var_name: The name of the moderator to add (and the name of the column in the dataset).
cfg: A Model_cfg defining how to display the selected moderator's effect surface.
An updated MCATE_cfg object.
cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
cfg <- cfg$add_moderator("x2", KernelSmooth_cfg$new(neval = 100))
clone()
The objects of this class are cloneable with this method.
MCATE_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
Model_cfg is the base class from which all other model configurations
inherit.
model_class: The class of the model, required for all classes
which inherit from Model_cfg.
new()
Create a new Model_cfg object with any necessary parameters.
Model_cfg$new()
A new Model_cfg object.
clone()
The objects of this class are cloneable with this method.
Model_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
R6 class to represent data to be used in estimating a model
This class provides consistent names and interfaces to data which will be used in a supervised regression / classification model.
label: The labels for the eventual model as a vector.
features: The matrix representation of the data to be used for model fitting. Constructed using stats::model.matrix.
model_frame: The data-frame representation of the data as constructed by stats::model.frame.
split_id: The split identifiers as a vector.
num_splits: The integer number of splits in the data.
cluster: A cluster ID as a vector, constructed using the unit identifiers.
weights: The case-weights as a vector.
new()
Creates an R6 object to represent data to be used in a prediction model.
Model_data$new(data, label_col, ..., .weight_col = NULL)
data: The full dataset to populate the class with.
label_col: The unquoted name of the column to use as the label in supervised learning models.
...: The unquoted names of features to use in the model.
.weight_col: The unquoted name of the column to use as case-weights in subsequent models.
A Model_data object.
library("dplyr")
df <- dplyr::tibble(
uid = 1:100,
x1 = rnorm(100),
x2 = rnorm(100),
x3 = sample(4, 100, replace = TRUE)
) %>% dplyr::mutate(
y = x1 + x2 + x3 + rnorm(100),
x3 = factor(x3)
)
df <- make_splits(df, uid, .num_splits = 5)
data <- Model_data$new(df, y, x1, x2, x3)
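For readers unfamiliar with the two base R functions named in the field descriptions, the sketch below shows how stats::model.frame and stats::model.matrix turn a data frame with a factor column into a label vector and a numeric feature matrix. It is a standalone illustration with made-up data; the exact formula Model_data builds (for example, whether an intercept column is kept) is an assumption.

```r
toy <- data.frame(
  y = c(1.2, 0.7, 3.1, 2.4),
  x1 = c(0.5, -1.0, 2.0, 0.0),
  x3 = factor(c("a", "b", "c", "a"))
)

# Data-frame representation: selected columns plus formula metadata.
mf <- stats::model.frame(y ~ x1 + x3, data = toy)

# Labels and numeric feature matrix; the factor x3 is expanded into
# indicator columns, with the first level as the reference category.
label <- stats::model.response(mf)
features <- stats::model.matrix(y ~ x1 + x3, data = mf)
print(colnames(features))
```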
SL_cv_control()
A helper function to create the cross-validation options to be used by SuperLearner.
Model_data$SL_cv_control()
clone()
The objects of this class are cloneable with this method.
Model_data$clone(deep = FALSE)
deep: Whether to make a deep clone.
SuperLearner::SuperLearner.CV.control
Prediction for the glmnet wrapper.
## S3 method for class 'SL.glmnet.interaction'
predict(
  object,
  newdata,
  remove_extra_cols = TRUE,
  add_missing_cols = TRUE,
  ...
)
object |
Result object from SL.glmnet |
newdata |
Dataframe or matrix that will generate predictions. |
remove_extra_cols |
Remove any extra columns in the new data that were not part of the original model. |
add_missing_cols |
Add any columns from original data that do not exist in the new data, and set values to 0. |
... |
Any additional arguments (not used). |
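The effect of the two column-handling flags can be illustrated with a minimal base R sketch. The helper `align_columns()` and the column names below are hypothetical and are not part of tidyhte or SuperLearner; the sketch only mirrors the behavior described in the argument table (drop columns the model never saw, add expected columns filled with 0).

```r
# Hypothetical helper (not from tidyhte or SuperLearner): align the columns of
# `newdata` with the columns that were present when the model was fitted.
align_columns <- function(newdata, train_cols,
                          remove_extra_cols = TRUE,
                          add_missing_cols = TRUE) {
  newdata <- as.data.frame(newdata)
  if (remove_extra_cols) {
    # Drop any column that was not part of the original model.
    keep <- intersect(colnames(newdata), train_cols)
    newdata <- newdata[, keep, drop = FALSE]
  }
  if (add_missing_cols) {
    # Add any column the model expects but the new data lacks, set to 0.
    for (col in setdiff(train_cols, colnames(newdata))) {
      newdata[[col]] <- 0
    }
  }
  newdata
}

train_cols <- c("x1", "x2", "x3")                    # columns seen at fit time
newdata <- data.frame(x1 = c(1, 2), x4 = c(9, 9))    # x4 is extra; x2, x3 missing
aligned <- align_columns(newdata, train_cols)
print(aligned)
```

Filling missing columns with 0 keeps prediction from failing, but it silently treats the absent covariate as zero for every row, so it is worth checking which columns were added before trusting the predictions.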
This takes a dataset with an identified outcome and treatment column along
with any number of covariates and appends three columns to the dataset corresponding
to an estimate of the conditional expectation of treatment (.pi_hat), along with the
conditional expectation of the control and treatment potential outcome surfaces
(.mu0_hat and .mu1_hat respectively).
produce_plugin_estimates(data, outcome, treatment, ..., .weights = NULL)
data |
dataframe (already prepared with |
outcome |
Unquoted name of the outcome variable. |
treatment |
Unquoted name of the treatment variable. |
... |
Unquoted names of covariates to include in the models of the nuisance functions. |
.weights |
Unquoted name of weights column. If NULL, all analysis will assume weights are all equal to one and sample-based quantities will be returned. |
To see an example analysis, read vignette("experimental_analysis") in the context
of an experiment, vignette("observational_analysis") for an observational study, or
vignette("methodological_details") for a deeper dive under the hood.
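To make the three appended columns concrete, here is a minimal base R sketch of plug-in nuisance estimates on simulated data. It is an illustration under strong assumptions, not the package's implementation: it uses plain `glm()` and `lm()` fits instead of a SuperLearner ensemble, fits the outcome model separately in each treatment arm, and omits the cross-fitting across splits that tidyhte performs. The variable names `x`, `a` and `y` are made up for the example.

```r
set.seed(1)
n <- 500
df <- data.frame(x = rnorm(n))
df$a <- rbinom(n, 1, plogis(0.5 * df$x))   # treatment, depends on x
df$y <- 1 + df$x + 2 * df$a + rnorm(n)     # outcome, true effect is 2

# Conditional expectation of treatment given covariates (propensity score).
ps_fit <- glm(a ~ x, data = df, family = binomial())
df$.pi_hat <- predict(ps_fit, newdata = df, type = "response")

# Conditional expectation of the outcome under control and under treatment,
# each fitted on the corresponding arm and predicted for every unit.
mu0_fit <- lm(y ~ x, data = df[df$a == 0, ])
mu1_fit <- lm(y ~ x, data = df[df$a == 1, ])
df$.mu0_hat <- predict(mu0_fit, newdata = df)
df$.mu1_hat <- predict(mu1_fit, newdata = df)

head(df)
```

In this sketch every unit is predicted by a model that also saw that unit during fitting; predicting each split from models trained on the other splits avoids that overlap.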
attach_config(), make_splits(), construct_pseudo_outcomes(), estimate_QoI()
library("dplyr")
if (require("palmerpenguins")) {
  data(package = 'palmerpenguins')
  penguins$unitid = seq_len(nrow(penguins))
  penguins$propensity = rep(0.5, nrow(penguins))
  penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity)
  cfg <- basic_config() %>%
    add_known_propensity_score("propensity") %>%
    add_outcome_model("SL.glm.interaction") %>%
    remove_vimp()
  attach_config(penguins, cfg) %>%
    make_splits(unitid, .num_splits = 4) %>%
    produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>%
    construct_pseudo_outcomes(body_mass_g, treatment) %>%
    estimate_QoI(species, sex)
}
QoI_cfg is a configuration class for the Quantities of Interest to be
generated by the HTE analysis.
mcate: A configuration object of type MCATE_cfg of marginal effects to calculate.
pcate: A configuration object of type PCATE_cfg of partial effects to calculate.
vimp: A configuration object of type VIMP_cfg of variable importance to calculate.
diag: A configuration object of type Diagnostics_cfg of model diagnostics to calculate.
ate: Logical flag indicating whether an estimate of the ATE should be returned.
predictions: Logical flag indicating whether estimates of the CATE for every unit should be returned.
new()
Create a new QoI_cfg object with specified Quantities of Interest
to estimate.
QoI_cfg$new( mcate = NULL, pcate = NULL, vimp = NULL, diag = NULL, ate = TRUE, predictions = FALSE )
mcate: A configuration object of type MCATE_cfg of marginal effects to calculate.
pcate: A configuration object of type PCATE_cfg of partial effects to calculate.
vimp: A configuration object of type VIMP_cfg of variable importance to calculate.
diag: A configuration object of type Diagnostics_cfg of model diagnostics to calculate.
ate: A logical flag for whether to calculate the Average Treatment Effect (ATE) or not.
predictions: A logical flag for whether to return predictions of the CATE for every unit or not.
A new QoI_cfg object.
mcate_cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
pcate_cfg <- PCATE_cfg$new(
cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)),
model_covariates = c("x1", "x2", "x3"),
num_mc_samples = list(x1 = 100)
)
vimp_cfg <- VIMP_cfg$new()
diag_cfg <- Diagnostics_cfg$new(
outcome = c("SL_risk", "SL_coefs", "MSE"),
ps = c("SL_risk", "SL_coefs", "AUC")
)
QoI_cfg$new(
mcate = mcate_cfg,
pcate = pcate_cfg,
vimp = vimp_cfg,
diag = diag_cfg
)
clone()
The objects of this class are cloneable with this method.
QoI_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
This removes the variable importance quantity of interest
from an HTE_cfg.
remove_vimp(hte_cfg)
hte_cfg |
|
Updated HTE_cfg object
library("dplyr")
basic_config() %>%
  remove_vimp() -> hte_cfg
Penalized regression using elastic net. Alpha = 0 corresponds to ridge regression and alpha = 1 corresponds to Lasso. Included in the model are pairwise interactions between covariates.
See vignette("glmnet_beta", package = "glmnet") for a nice tutorial on
glmnet.
SL.glmnet.interaction(
  Y,
  X,
  newX,
  family,
  obsWeights,
  id,
  alpha = 1,
  nfolds = 10,
  nlambda = 100,
  useMin = TRUE,
  loss = "deviance",
  ...
)
Y |
Outcome variable |
X |
Covariate dataframe |
newX |
Dataframe to predict the outcome |
family |
"gaussian" for regression, "binomial" for binary classification. Untested options: "multinomial" for multiple classification or "mgaussian" for multiple response, "poisson" for non-negative outcome with proportional mean and variance, "cox". |
obsWeights |
Optional observation-level weights |
id |
Optional id to group observations from the same unit (not used currently). |
alpha |
Elastic net mixing parameter, range [0, 1]. 0 = ridge regression and 1 = lasso. |
nfolds |
Number of folds for internal cross-validation to optimize lambda. |
nlambda |
Number of lambda values to check, recommended to be 100 or more. |
useMin |
If TRUE use lambda that minimizes risk, otherwise use 1 standard-error rule which chooses a higher penalty with performance within one standard error of the minimum (see Breiman et al. 1984 on CART for background). |
loss |
Loss function, can be "deviance", "mse", or "mae". If family = binomial can also be "auc" or "class" (misclassification error). |
... |
Any additional arguments are passed through to cv.glmnet. |
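The phrase "pairwise interactions between covariates" can be made concrete with base R's formula interface. The sketch below is only an illustration of what such a design matrix looks like; whether the wrapper builds its matrix in exactly this way is an assumption, and the toy covariates are made up.

```r
# Toy covariate data frame (hypothetical values).
X <- data.frame(
  x1 = c(1, 2, 3),
  x2 = c(0, 1, 0),
  x3 = c(2, 2, 5)
)

# `.^2` expands to all main effects plus all two-way interactions;
# `- 1` drops the intercept column, since penalized regression routines
# typically add their own intercept.
design <- model.matrix(~ .^2 - 1, data = X)

colnames(design)
dim(design)
```

With p covariates this yields p main-effect columns plus p(p - 1)/2 interaction columns, so the design grows quadratically; the elastic net penalty is what keeps such an expanded model estimable.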
SLEnsemble_cfg is a configuration class for estimating a model
with an ensemble of learners fitted by SuperLearner.
tidyhte::Model_cfg -> SLEnsemble_cfg
cvControl: A list of parameters for controlling the cross-validation used in SuperLearner.
SL.library: A vector of the names of learners to include in the SuperLearner ensemble.
SL.env: An environment containing all of the programmatically generated learners to be included in the SuperLearner ensemble.
family: stats::family object to determine how SuperLearner should be fitted.
model_class: The class of the model, required for all classes which inherit from Model_cfg.
new()
Create a new SLEnsemble_cfg object with specified settings.
SLEnsemble_cfg$new( cvControl = NULL, learner_cfgs = NULL, family = stats::gaussian() )
cvControl: A list of parameters for controlling the cross-validation used in SuperLearner. For more details, see SuperLearner::SuperLearner.CV.control.
learner_cfgs: A list of SLLearner_cfg objects.
family: stats::family object to determine how SuperLearner should be fitted.
A new SLEnsemble_cfg object.
SLEnsemble_cfg$new(
learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
add_sublearner()
Adds a model (or class of models) to the SuperLearner ensemble. If hyperparameter values are specified, this method will add a learner for every element in the cross-product of provided hyperparameter values.
SLEnsemble_cfg$add_sublearner(learner_name, hps = NULL)
learner_name: Possible values use SuperLearner naming conventions. A full list is available with SuperLearner::listWrappers("SL").
hps: A named list of hyper-parameters. Every element of the cross-product of these hyper-parameters will be included in the ensemble.
cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"))
)
cfg <- cfg$add_sublearner("SL.gam", list(deg.gam = c(2, 3)))
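The "cross-product" of hyperparameter values can be previewed with base R before handing the list to the configuration. This sketch does not call tidyhte; it only shows how many candidate learners a given `hps` list implies. The second hyperparameter, `cts.num`, is included purely for illustration.

```r
# A named list of hyperparameter values, in the same shape as `hps`.
hps <- list(
  deg.gam = c(2, 3),
  cts.num = c(4, 6)
)

# expand.grid() enumerates every combination: one row per candidate learner.
grid <- expand.grid(hps, stringsAsFactors = FALSE)
print(grid)

# Number of learners this hyperparameter list would contribute.
nrow(grid)
```

Because the count is the product of the lengths of all value vectors, a few long vectors can add a large number of learners to the ensemble and lengthen fitting time accordingly.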
clone()
The objects of this class are cloneable with this method.
SLEnsemble_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
SLLearner_cfg is a configuration class for a single
sublearner to be included in SuperLearner. By constructing with a named list
of hyperparameters, this configuration allows distinct submodels
for each unique combination of hyperparameters. To understand what models
and hyperparameters are available, examine the methods listed in
SuperLearner::listWrappers("SL").
model_name: The name of the model as passed to SuperLearner through the SL.library parameter.
hyperparameters: Named list from hyperparameter name to a vector of values that should be swept over.
new()
Create a new SLLearner_cfg object with specified model name and hyperparameters.
SLLearner_cfg$new(model_name, hp = NULL)
model_name: The name of the model as passed to SuperLearner through the SL.library parameter.
hp: Named list from hyperparameter name to a vector of values that should be swept over. Hyperparameters not included in this list are left at their SuperLearner default values.
A new SLLearner_cfg object.
SLLearner_cfg$new("SL.glm")
SLLearner_cfg$new("SL.gam", list(deg.gam = c(2, 3)))
clone()
The objects of this class are cloneable with this method.
SLLearner_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
Stratified_cfg is a configuration class for stratifying a covariate
and calculating statistics within each cell.
tidyhte::Model_cfg -> Stratified_cfg
model_class: The class of the model, required for all classes which inherit from Model_cfg.
covariate: The name of the column in the dataset which corresponds to the covariate on which to stratify.
new()
Create a new Stratified_cfg object for the specified stratification covariate.
Stratified_cfg$new(covariate)
covariate: The name of the column in the dataset which corresponds to the covariate on which to stratify.
A new Stratified_cfg object.
Stratified_cfg$new(covariate = "test_covariate")
clone()
The objects of this class are cloneable with this method.
Stratified_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
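As a rough picture of what "stratifying a covariate and calculating statistics within each cell" means, the base R sketch below groups simulated values by a discrete covariate and reports the mean, standard error and count per cell. The column names `group` and `pseudo_outcome` are hypothetical, and the exact statistics tidyhte reports are not assumed here.

```r
set.seed(2)
df <- data.frame(
  group = sample(c("a", "b", "c"), 300, replace = TRUE),  # stratifying covariate
  pseudo_outcome = rnorm(300)                             # quantity to summarize
)

# Compute several statistics within each level of the stratifying covariate.
cell_stats <- aggregate(
  pseudo_outcome ~ group,
  data = df,
  FUN = function(v) c(mean = mean(v), se = sd(v) / sqrt(length(v)), n = length(v))
)

# aggregate() returns the statistics as a matrix column; flatten it so that
# each statistic becomes its own column.
cell_stats <- do.call(data.frame, cell_stats)
print(cell_stats)
```

Stratification suits discrete covariates with a modest number of levels; each cell is estimated only from its own observations, so sparse cells produce wide standard errors.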
VIMP_cfg is a configuration class for estimating a variable importance measure
across all moderators. This provides a meaningful measure of which moderators
explain the most variation in the CATE surface.
estimand: String indicating the estimand to target.
sample_splitting: Logical indicating whether to use sample splitting in the calculation of variable importance.
linear: Logical indicating whether to estimate variable importance under the assumption of a linear model.
new()
Create a new VIMP_cfg object with specified model configuration.
VIMP_cfg$new(sample_splitting = TRUE, linear_only = FALSE)
sample_splitting: Logical indicating whether to use sample splitting in the calculation of variable importance. Choosing not to use sample splitting means that inference will only be valid for moderators with non-null importance.
linear_only: Logical indicating whether the variable importance should use only a single linear-only model. Variable importance measure will only be consistent for the population quantity if the true model of pseudo-outcomes is linear.
A new VIMP_cfg object.
VIMP_cfg$new()
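One common way to think about a moderator's importance is the drop in explained variance when that moderator is left out of the model. The base R sketch below illustrates this idea with linear models on simulated data. It is a simplified, in-sample illustration only: it does not use sample splitting or flexible learners, and it is not claimed to reproduce the estimator tidyhte computes. The names `pseudo`, `x1`, `x2` and `x3` are made up.

```r
set.seed(3)
n <- 1000
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# Simulated pseudo-outcome: x1 matters most, x2 a little, x3 not at all.
df$pseudo <- 2 * df$x1 + 0.5 * df$x2 + rnorm(n)

# R-squared of a linear model for a given formula.
r2 <- function(formula, data) summary(lm(formula, data = data))$r.squared

covs <- c("x1", "x2", "x3")
full_r2 <- r2(reformulate(covs, response = "pseudo"), df)

# Importance of each covariate: full-model R-squared minus the R-squared of
# the model refitted without that covariate.
vim <- sapply(covs, function(v) {
  full_r2 - r2(reformulate(setdiff(covs, v), response = "pseudo"), df)
})
print(round(vim, 3))
```

Reusing the same observations to fit and evaluate both models is what makes inference unreliable for moderators whose true importance is zero, which is the situation the `sample_splitting` option is meant to address.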
clone()
The objects of this class are cloneable with this method.
VIMP_cfg$clone(deep = FALSE)
deep: Whether to make a deep clone.
Williamson, B. D., Gilbert, P. B., Carone, M., & Simon, N. (2021). Nonparametric variable importance assessment using machine learning techniques. Biometrics, 77(1), 9-22.
Williamson, B. D., Gilbert, P. B., Simon, N. R., & Carone, M. (2021). A general framework for inference on algorithm-agnostic variable importance. Journal of the American Statistical Association, 1-14.