Title: Tidy Estimation of Heterogeneous Treatment Effects
Description: Estimates heterogeneous treatment effects using tidy semantics on experimental or observational data. Methods are based on the doubly-robust learner of Kennedy (n.d.) <arXiv:2004.14497>. You provide a simple recipe for which machine learning algorithms to use in estimating the nuisance functions, and 'tidyhte' takes care of cross-validation, estimation, model selection, diagnostics and the construction of relevant quantities of interest about the variability of treatment effects.
Authors: Drew Dimmery [aut, cre, cph]
Maintainer: Drew Dimmery <[email protected]>
License: MIT + file LICENSE
Version: 1.0.2
Built: 2024-11-07 03:45:38 UTC
Source: https://github.com/ddimmery/tidyhte
This adds a diagnostic to the effect model.
add_effect_diagnostic(hte_cfg, diag)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
diag: Character indicating the name of the diagnostic to include. Possible values include "SL_risk", "SL_coefs", "MSE" and "RROC".
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_effect_diagnostic("RROC") -> hte_cfg
library("dplyr") basic_config() %>% add_effect_diagnostic("RROC") -> hte_cfg
This adds a learner to the ensemble used for estimating a model of the conditional expectation of the pseudo-outcome.
add_effect_model(hte_cfg, model_name, ...)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
model_name: Character indicating the name of the model to incorporate into the joint effect ensemble. Possible values use SuperLearner naming conventions; a full list is available with SuperLearner::listWrappers("SL").
...: Parameters over which to grid-search for this model class.
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_effect_model("SL.glm.interaction") -> hte_cfg
library("dplyr") basic_config() %>% add_effect_model("SL.glm.interaction") -> hte_cfg
This replaces the propensity score model with a known value of the propensity score.
add_known_propensity_score(hte_cfg, covariate_name)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
covariate_name: Character indicating the name of the column in the dataframe corresponding to the known propensity score.
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_known_propensity_score("ps") -> hte_cfg
library("dplyr") basic_config() %>% add_known_propensity_score("ps") -> hte_cfg
This adds a definition of how to display a moderator to the MCATE config. A moderator is any variable with respect to which you want to view information about CATEs.
add_moderator(hte_cfg, model_type, ..., .model_arguments = NULL)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
model_type: Character indicating the model type for these moderators. Currently two model types are supported: "Stratified" and "KernelSmooth".
...: The (unquoted) names of the moderator variables.
.model_arguments: A named list from argument name to value to pass into the constructor for the model. See Stratified_cfg and KernelSmooth_cfg for the available arguments.
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_moderator("Stratified", x2, x3) %>% add_moderator("KernelSmooth", x1, x4, x5) -> hte_cfg
library("dplyr") basic_config() %>% add_moderator("Stratified", x2, x3) %>% add_moderator("KernelSmooth", x1, x4, x5) -> hte_cfg
This adds a diagnostic to the outcome model.
add_outcome_diagnostic(hte_cfg, diag)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
diag: Character indicating the name of the diagnostic to include. Possible values include "SL_risk", "SL_coefs", "MSE" and "RROC".
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_outcome_diagnostic("RROC") -> hte_cfg
library("dplyr") basic_config() %>% add_outcome_diagnostic("RROC") -> hte_cfg
This adds a learner to the ensemble used for estimating a model of the conditional expectation of the outcome.
add_outcome_model(hte_cfg, model_name, ...)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
model_name: Character indicating the name of the model to incorporate into the outcome ensemble. Possible values use SuperLearner naming conventions; a full list is available with SuperLearner::listWrappers("SL").
...: Parameters over which to grid-search for this model class.
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_outcome_model("SL.glm.interaction") -> hte_cfg
library("dplyr") basic_config() %>% add_outcome_model("SL.glm.interaction") -> hte_cfg
This adds a diagnostic to the propensity score.
add_propensity_diagnostic(hte_cfg, diag)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
diag: Character indicating the name of the diagnostic to include. Possible values include "SL_risk", "SL_coefs", "AUC" and "MSE".
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_propensity_diagnostic(c("AUC", "MSE")) -> hte_cfg
library("dplyr") basic_config() %>% add_propensity_diagnostic(c("AUC", "MSE")) -> hte_cfg
This adds a learner to the ensemble used for estimating propensity scores.
add_propensity_score_model(hte_cfg, model_name, ...)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
model_name: Character indicating the name of the model to incorporate into the propensity score ensemble. Possible values use SuperLearner naming conventions; a full list is available with SuperLearner::listWrappers("SL").
...: Parameters over which to grid-search for this model class.
Updated HTE_cfg
object
library("dplyr") basic_config() %>% add_propensity_score_model("SL.glmnet", alpha = c(0, 0.5, 1)) -> hte_cfg
library("dplyr") basic_config() %>% add_propensity_score_model("SL.glmnet", alpha = c(0, 0.5, 1)) -> hte_cfg
This adds a variable importance quantity of interest to the outputs.
add_vimp(hte_cfg, sample_splitting = TRUE, linear_only = FALSE)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
sample_splitting: Logical indicating whether to use sample splitting. Choosing not to use sample splitting means that inference will only be valid for moderators with non-null importance.
linear_only: Logical indicating whether the variable importance should use only a single linear model. The variable importance measure will only be consistent for the population quantity if the true model of the pseudo-outcomes is linear.
Updated HTE_cfg
object
Williamson, B. D., Gilbert, P. B., Carone, M., & Simon, N. (2021). Nonparametric variable importance assessment using machine learning techniques. Biometrics, 77(1), 9-22.
Williamson, B. D., Gilbert, P. B., Simon, N. R., & Carone, M. (2021). A general framework for inference on algorithm-agnostic variable importance. Journal of the American Statistical Association, 1-14.
library("dplyr") basic_config() %>% add_vimp(sample_splitting = FALSE) -> hte_cfg
library("dplyr") basic_config() %>% add_vimp(sample_splitting = FALSE) -> hte_cfg
Attach an HTE_cfg to a dataframe: this adds a configuration attribute to a dataframe for HTE estimation. This configuration details the full analysis of HTE that should be performed.
attach_config(data, .HTE_cfg)
data: dataframe
.HTE_cfg: HTE_cfg object defining the full analysis of HTE to be performed.
For information about how to set up an HTE_cfg object, see the Recipe API documentation and basic_config().
To see an example analysis, read vignette("experimental_analysis") in the context of an experiment, vignette("observational_analysis") for an observational study, or vignette("methodological_details") for a deeper dive under the hood.
See also: basic_config(), make_splits(), produce_plugin_estimates(), construct_pseudo_outcomes(), estimate_QoI()
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
This provides a basic recipe for HTE estimation that can be extended by providing additional information about models to be estimated and what quantities of interest should be returned based on those models. This basic model includes only linear models for nuisance function estimation, and basic diagnostics.
basic_config()
Additional models, diagnostics and quantities of interest should be added using their respective helper functions provided as part of the Recipe API.
To see an example analysis, read vignette("experimental_analysis") in the context of an experiment, vignette("observational_analysis") for an observational study, or vignette("methodological_details") for a deeper dive under the hood.
HTE_cfg
object
See also: add_propensity_score_model(), add_known_propensity_score(), add_propensity_diagnostic(), add_outcome_model(), add_outcome_diagnostic(), add_effect_model(), add_effect_diagnostic(), add_moderator(), add_vimp()
library("dplyr") basic_config() %>% add_known_propensity_score("ps") %>% add_outcome_model("SL.glm.interaction") %>% add_outcome_model("SL.glmnet", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_outcome_model("SL.glmnet.interaction", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_outcome_diagnostic("RROC") %>% add_effect_model("SL.glm.interaction") %>% add_effect_model("SL.glmnet", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_effect_model("SL.glmnet.interaction", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_effect_diagnostic("RROC") %>% add_moderator("Stratified", x2, x3) %>% add_moderator("KernelSmooth", x1, x4, x5) %>% add_vimp(sample_splitting = FALSE) -> hte_cfg
library("dplyr") basic_config() %>% add_known_propensity_score("ps") %>% add_outcome_model("SL.glm.interaction") %>% add_outcome_model("SL.glmnet", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_outcome_model("SL.glmnet.interaction", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_outcome_diagnostic("RROC") %>% add_effect_model("SL.glm.interaction") %>% add_effect_model("SL.glmnet", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_effect_model("SL.glmnet.interaction", alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)) %>% add_effect_diagnostic("RROC") %>% add_moderator("Stratified", x2, x3) %>% add_moderator("KernelSmooth", x1, x4, x5) %>% add_vimp(sample_splitting = FALSE) -> hte_cfg
Constant_cfg
is a configuration class for estimating a constant model.
That is, the model is a simple, one-parameter mean model.
tidyhte::Model_cfg
-> Constant_cfg
model_class
The class of the model, required for all classes
which inherit from Model_cfg
.
new()
Create a new Constant_cfg
object.
Constant_cfg$new()
A new Constant_cfg
object.
Constant_cfg$new()
clone()
The objects of this class are cloneable with this method.
Constant_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `Constant_cfg$new`
## ------------------------------------------------
Constant_cfg$new()
construct_pseudo_outcomes
takes a dataset which has been prepared
with plugin estimators of nuisance parameters and transforms these into
a "pseudo-outcome": an unbiased estimator of the conditional average
treatment effect under exogeneity.
construct_pseudo_outcomes(data, outcome, treatment, type = "dr")
data: dataframe (already prepared with attach_config(), make_splits() and produce_plugin_estimates()).
outcome: Unquoted name of the outcome variable.
treatment: Unquoted name of the treatment variable.
type: String representing how to construct the pseudo-outcome. Valid values are "dr" (the default), "ipw" and "plugin". See "Details" for more discussion of these options.
Taking averages of these pseudo-outcomes (or fitting a model to them) will approximate averages (or models) of the underlying treatment effect.
See also: attach_config(), make_splits(), produce_plugin_estimates(), estimate_QoI()
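A minimal sketch of the three constructions (assuming a dataframe df that has already been run through the pipeline up to produce_plugin_estimates(), with hypothetical outcome column y and treatment column a):

# Doubly-robust (default), inverse-propensity-weighted, and plug-in pseudo-outcomes.
df_dr     <- construct_pseudo_outcomes(df, y, a, type = "dr")
df_ipw    <- construct_pseudo_outcomes(df, y, a, type = "ipw")
df_plugin <- construct_pseudo_outcomes(df, y, a, type = "plugin")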
Diagnostics_cfg
is a configuration class for estimating a variety of
diagnostics for the models trained in the course of HTE estimation.
ps
Model diagnostics for the propensity score model.
outcome
Model diagnostics for the outcome models.
effect
Model diagnostics for the joint effect model.
params
Parameters for any requested diagnostics.
new()
Create a new Diagnostics_cfg
object with specified diagnostics to estimate.
Diagnostics_cfg$new(ps = NULL, outcome = NULL, effect = NULL, params = NULL)
ps
Model diagnostics for the propensity score model.
outcome
Model diagnostics for the outcome models.
effect
Model diagnostics for the joint effect model.
params
List providing values for parameters to any requested diagnostics.
A new Diagnostics_cfg
object.
Diagnostics_cfg$new( outcome = c("SL_risk", "SL_coefs", "MSE", "RROC"), ps = c("SL_risk", "SL_coefs", "AUC") )
add()
Add diagnostics to the Diagnostics_cfg
object.
Diagnostics_cfg$add(ps = NULL, outcome = NULL, effect = NULL)
ps
Model diagnostics for the propensity score model.
outcome
Model diagnostics for the outcome models.
effect
Model diagnostics for the joint effect model.
An updated Diagnostics_cfg
object.
cfg <- Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE", "RROC"),
  ps = c("SL_risk", "SL_coefs")
)
cfg <- cfg$add(ps = "AUC")
clone()
The objects of this class are cloneable with this method.
Diagnostics_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE", "RROC"),
  ps = c("SL_risk", "SL_coefs", "AUC")
)

## ------------------------------------------------
## Method `Diagnostics_cfg$new`
## ------------------------------------------------
Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE", "RROC"),
  ps = c("SL_risk", "SL_coefs", "AUC")
)

## ------------------------------------------------
## Method `Diagnostics_cfg$add`
## ------------------------------------------------
cfg <- Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE", "RROC"),
  ps = c("SL_risk", "SL_coefs")
)
cfg <- cfg$add(ps = "AUC")
estimate_QoI
takes a dataframe already prepared with split IDs,
plugin estimates and pseudo-outcomes and calculates the requested
quantities of interest (QoIs).
estimate_QoI(data, ...)
data: data frame (already prepared with attach_config(), make_splits(), produce_plugin_estimates() and construct_pseudo_outcomes()).
...: Unquoted names of moderators to calculate QoIs for.
To see an example analysis, read vignette("experimental_analysis") in the context of an experiment, vignette("observational_analysis") for an observational study, or vignette("methodological_details") for a deeper dive under the hood.
See also: attach_config(), make_splits(), produce_plugin_estimates(), construct_pseudo_outcomes()
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
HTE_cfg
is a configuration class that pulls everything together, indicating
the full configuration for a given HTE analysis. This includes how to estimate
models and what Quantities of Interest to calculate based off those underlying models.
outcome
Model_cfg
object indicating how outcome models should be estimated.
treatment
Model_cfg
object indicating how the propensity score
model should be estimated.
effect
Model_cfg
object indicating how the joint effect model
should be estimated.
qoi
QoI_cfg
object indicating what the Quantities of Interest
are and providing all
necessary detail on how they should be estimated.
verbose
Logical indicating whether to print debugging information.
new()
Create a new HTE_cfg
object with all necessary information about how
to carry out an HTE analysis.
HTE_cfg$new( outcome = NULL, treatment = NULL, effect = NULL, qoi = NULL, verbose = FALSE )
outcome
Model_cfg
object indicating how outcome models should
be estimated.
treatment
Model_cfg
object indicating how the propensity score
model should be estimated.
effect
Model_cfg
object indicating how the joint effect model
should be estimated.
qoi
QoI_cfg
object indicating what the Quantities of Interest
are and providing all
necessary detail on how they should be estimated.
verbose
Logical indicating whether to print debugging information.
mcate_cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
pcate_cfg <- PCATE_cfg$new(
  cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)),
  model_covariates = c("x1", "x2", "x3"),
  num_mc_samples = list(x1 = 100)
)
vimp_cfg <- VIMP_cfg$new()
diag_cfg <- Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE"),
  ps = c("SL_risk", "SL_coefs", "AUC")
)
qoi_cfg <- QoI_cfg$new(
  mcate = mcate_cfg,
  pcate = pcate_cfg,
  vimp = vimp_cfg,
  diag = diag_cfg
)
ps_cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
y_cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
fx_cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
HTE_cfg$new(outcome = y_cfg, treatment = ps_cfg, effect = fx_cfg, qoi = qoi_cfg)
clone()
The objects of this class are cloneable with this method.
HTE_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `HTE_cfg$new`
## ------------------------------------------------
mcate_cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
pcate_cfg <- PCATE_cfg$new(
  cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)),
  model_covariates = c("x1", "x2", "x3"),
  num_mc_samples = list(x1 = 100)
)
vimp_cfg <- VIMP_cfg$new()
diag_cfg <- Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE"),
  ps = c("SL_risk", "SL_coefs", "AUC")
)
qoi_cfg <- QoI_cfg$new(
  mcate = mcate_cfg,
  pcate = pcate_cfg,
  vimp = vimp_cfg,
  diag = diag_cfg
)
ps_cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
y_cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
fx_cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
HTE_cfg$new(outcome = y_cfg, treatment = ps_cfg, effect = fx_cfg, qoi = qoi_cfg)
KernelSmooth_cfg
is a configuration class for non-parametric local-linear
regression to construct a smooth representation of the relationship between
two variables. This is typically used for displaying a surface of the conditional
average treatment effect over a continuous covariate.
Kernel smoothing is handled by the nprobust
package.
tidyhte::Model_cfg
-> KernelSmooth_cfg
model_class
The class of the model, required for all classes
which inherit from Model_cfg
.
neval
The number of points at which to evaluate the local regression. More points will provide a smoother line at the cost of somewhat higher computation.
eval_min_quantile
Minimum quantile at which to evaluate the smoother.
new()
Create a new KernelSmooth_cfg
object with specified number of evaluation points.
KernelSmooth_cfg$new(neval = 100, eval_min_quantile = 0.05)
neval
The number of points at which to evaluate the local regression. More points will provide a smoother line at the cost of somewhat higher computation.
eval_min_quantile
Minimum quantile at which to evaluate the smoother. A value of zero will do no clipping. Clipping is performed from both the top and the bottom of the empirical distribution. A value of alpha would evaluate over [alpha, 1 - alpha].
A new KernelSmooth_cfg
object.
KernelSmooth_cfg$new(neval = 100)
clone()
The objects of this class are cloneable with this method.
KernelSmooth_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `KernelSmooth_cfg$new`
## ------------------------------------------------
KernelSmooth_cfg$new(neval = 100)
Known_cfg
is a configuration class for when a particular model is known a priori. The prototypical usage of this class is when heterogeneous treatment effects are estimated in the context of a randomized controlled trial with known propensity scores.
tidyhte::Model_cfg
-> Known_cfg
covariate_name
The name of the column in the dataset which corresponds to the known model score.
model_class
The class of the model, required for all classes
which inherit from Model_cfg
.
new()
Create a new Known_cfg
object with specified covariate column.
Known_cfg$new(covariate_name)
covariate_name
The name of the column, a string, in the dataset corresponding to the known model score (i.e. the true conditional expectation).
A new Known_cfg
object.
Known_cfg$new("propensity_score")
clone()
The objects of this class are cloneable with this method.
Known_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `Known_cfg$new`
## ------------------------------------------------
Known_cfg$new("propensity_score")
This takes a dataset, a column with a unique identifier and an
arbitrary number of covariates on which to stratify the splits.
It returns the original dataset with an additional column .split_id
corresponding to an identifier for the split.
make_splits(data, identifier, ..., .num_splits)
data: dataframe
identifier: Unquoted name of the unique identifier column.
...: Unquoted names of variables on which to stratify the splits.
.num_splits: Number of splits to create. If VIMP is requested in the HTE_cfg, an even number of splits is required so that sample splitting can be performed.
To see an example analysis, read vignette("experimental_analysis") in the context of an experiment, vignette("observational_analysis") for an observational study, or vignette("methodological_details") for a deeper dive under the hood.
original dataframe with additional .split_id
column
See also: attach_config(), produce_plugin_estimates(), construct_pseudo_outcomes(), estimate_QoI()
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
MCATE_cfg
is a configuration class for estimating marginal response
surfaces based on heterogeneous treatment effect estimates. "Marginal"
in this context implies that all other covariates are marginalized.
Thus, if two covariates are highly correlated, it is likely that their
MCATE surfaces will be extremely similar.
cfgs
Named list of covariates names to a Model_cfg
object defining
how to present that covariate's CATE surface (while marginalizing
over all other covariates).
std_errors
Boolean indicating whether the results should be returned with standard errors or not.
estimand
String indicating the estimand to target.
new()
Create a new MCATE_cfg
object with the specified per-moderator display configurations.
MCATE_cfg$new(cfgs, std_errors = TRUE)
cfgs
Named list from moderator name to a Model_cfg
object
defining how to present that covariate's CATE surface (while
marginalizing over all other covariates)
std_errors
Boolean indicating whether the results should be returned with standard errors or not.
A new MCATE_cfg
object.
MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
add_moderator()
Add a moderator to the MCATE_cfg
object. This entails defining a configuration
for displaying the effect surface for that moderator.
MCATE_cfg$add_moderator(var_name, cfg)
var_name
The name of the moderator to add (and the name of the column in the dataset).
cfg
A Model_cfg
defining how to display the selected moderator's effect
surface.
An updated MCATE_cfg
object.
cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
cfg <- cfg$add_moderator("x2", KernelSmooth_cfg$new(neval = 100))
clone()
The objects of this class are cloneable with this method.
MCATE_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))

## ------------------------------------------------
## Method `MCATE_cfg$new`
## ------------------------------------------------
MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))

## ------------------------------------------------
## Method `MCATE_cfg$add_moderator`
## ------------------------------------------------
cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
cfg <- cfg$add_moderator("x2", KernelSmooth_cfg$new(neval = 100))
Model_cfg
is the base class from which all other model configurations
inherit.
model_class
The class of the model, required for all classes
which inherit from Model_cfg
.
new()
Create a new Model_cfg
object with any necessary parameters.
Model_cfg$new()
A new Model_cfg
object.
clone()
The objects of this class are cloneable with this method.
Model_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
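As a brief illustrative sketch (not from the package documentation): concrete configuration classes such as KernelSmooth_cfg and Known_cfg inherit from Model_cfg, so any of them can be supplied wherever a Model_cfg is expected, for instance in MCATE_cfg$new(cfgs = ...).

# Both objects carry the Model_cfg class through R6 inheritance.
smooth_cfg <- KernelSmooth_cfg$new(neval = 100)
known_cfg <- Known_cfg$new("propensity_score")
inherits(smooth_cfg, "Model_cfg")  # TRUE
inherits(known_cfg, "Model_cfg")   # TRUE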
R6 class to represent data to be used in estimating a model
This class provides consistent names and interfaces to data which will be used in a supervised regression / classification model.
label
The labels for the eventual model as a vector.
features
The matrix representation of the data to be used for model fitting.
Constructed using stats::model.matrix
.
model_frame
The data-frame representation of the data as constructed by
stats::model.frame
.
split_id
The split identifiers as a vector.
num_splits
The integer number of splits in the data.
cluster
A cluster ID as a vector, constructed using the unit identifiers.
weights
The case-weights as a vector.
new()
Creates an R6 object to represent data to be used in a prediction model.
Model_data$new(data, label_col, ..., .weight_col = NULL)
data
The full dataset to populate the class with.
label_col
The unquoted name of the column to use as the label in supervised learning models.
...
The unquoted names of features to use in the model.
.weight_col
The unquoted name of the column to use as case-weights in subsequent models.
A Model_data
object.
library("dplyr") df <- dplyr::tibble( uid = 1:100, x1 = rnorm(100), x2 = rnorm(100), x3 = sample(4, 100, replace = TRUE) ) %>% dplyr::mutate( y = x1 + x2 + x3 + rnorm(100), x3 = factor(x3) ) df <- make_splits(df, uid, .num_splits = 5) data <- Model_data$new(df, y, x1, x2, x3)
SL_cv_control()
A helper function to create the cross-validation options to be used by SuperLearner.
Model_data$SL_cv_control()
clone()
The objects of this class are cloneable with this method.
Model_data$clone(deep = FALSE)
deep
Whether to make a deep clone.
See also: SuperLearner::SuperLearner.CV.control
## ------------------------------------------------
## Method `Model_data$new`
## ------------------------------------------------
library("dplyr")
df <- dplyr::tibble(
  uid = 1:100,
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = sample(4, 100, replace = TRUE)
) %>%
  dplyr::mutate(
    y = x1 + x2 + x3 + rnorm(100),
    x3 = factor(x3)
  )
df <- make_splits(df, uid, .num_splits = 5)
data <- Model_data$new(df, y, x1, x2, x3)
Prediction for the glmnet wrapper.
## S3 method for class 'SL.glmnet.interaction'
predict(
  object,
  newdata,
  remove_extra_cols = TRUE,
  add_missing_cols = TRUE,
  ...
)
object: Result object from SL.glmnet.interaction.
newdata: Dataframe or matrix that will generate predictions.
remove_extra_cols: Remove any extra columns in the new data that were not part of the original model.
add_missing_cols: Add any columns from the original data that do not exist in the new data, and set their values to 0.
...: Any additional arguments (not used).
This takes a dataset with an identified outcome and treatment column along
with any number of covariates and appends three columns to the dataset corresponding
to an estimate of the conditional expectation of treatment (.pi_hat
), along with the
conditional expectation of the control and treatment potential outcome surfaces
(.mu0_hat
and .mu1_hat
respectively).
produce_plugin_estimates(data, outcome, treatment, ..., .weights = NULL)
data: dataframe (already prepared with attach_config() and make_splits()).
outcome: Unquoted name of the outcome variable.
treatment: Unquoted name of the treatment variable.
...: Unquoted names of covariates to include in the models of the nuisance functions.
.weights: Unquoted name of a weights column. If NULL, all analyses will assume weights are all equal to one and sample-based quantities will be returned.
To see an example analysis, read vignette("experimental_analysis") in the context of an experiment, vignette("observational_analysis") for an observational study, or vignette("methodological_details") for a deeper dive under the hood.
See also: attach_config(), make_splits(), construct_pseudo_outcomes(), estimate_QoI()
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
library("dplyr") if(require("palmerpenguins")) { data(package = 'palmerpenguins') penguins$unitid = seq_len(nrow(penguins)) penguins$propensity = rep(0.5, nrow(penguins)) penguins$treatment = rbinom(nrow(penguins), 1, penguins$propensity) cfg <- basic_config() %>% add_known_propensity_score("propensity") %>% add_outcome_model("SL.glm.interaction") %>% remove_vimp() attach_config(penguins, cfg) %>% make_splits(unitid, .num_splits = 4) %>% produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>% construct_pseudo_outcomes(body_mass_g, treatment) %>% estimate_QoI(species, sex) }
QoI_cfg
is a configuration class for the Quantities of Interest to be
generated by the HTE analysis.
mcate
A configuration object of type MCATE_cfg
of
marginal effects to calculate.
pcate
A configuration object of type PCATE_cfg
of
partial effects to calculate.
vimp
A configuration object of type VIMP_cfg
of
variable importance to calculate.
diag
A configuration object of type Diagnostics_cfg
of
model diagnostics to calculate.
ate
Logical flag indicating whether an estimate of the ATE should be returned.
predictions
Logical flag indicating whether estimates of the CATE for every unit should be returned.
new()
Create a new QoI_cfg
object with specified Quantities of Interest
to estimate.
QoI_cfg$new( mcate = NULL, pcate = NULL, vimp = NULL, diag = NULL, ate = TRUE, predictions = FALSE )
mcate
A configuration object of type MCATE_cfg
of marginal
effects to calculate.
pcate
A configuration object of type PCATE_cfg
of partial
effects to calculate.
vimp
A configuration object of type VIMP_cfg
of variable
importance to calculate.
diag
A configuration object of type Diagnostics_cfg
of
model diagnostics to calculate.
ate
A logical flag for whether to calculate the Average Treatment Effect (ATE) or not.
predictions
A logical flag for whether to return predictions of the CATE for every unit or not.
A new QoI_cfg
object.
mcate_cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
pcate_cfg <- PCATE_cfg$new(
  cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)),
  model_covariates = c("x1", "x2", "x3"),
  num_mc_samples = list(x1 = 100)
)
vimp_cfg <- VIMP_cfg$new()
diag_cfg <- Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE"),
  ps = c("SL_risk", "SL_coefs", "AUC")
)
QoI_cfg$new(
  mcate = mcate_cfg,
  pcate = pcate_cfg,
  vimp = vimp_cfg,
  diag = diag_cfg
)
clone()
The objects of this class are cloneable with this method.
QoI_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `QoI_cfg$new`
## ------------------------------------------------
mcate_cfg <- MCATE_cfg$new(cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)))
pcate_cfg <- PCATE_cfg$new(
  cfgs = list(x1 = KernelSmooth_cfg$new(neval = 100)),
  model_covariates = c("x1", "x2", "x3"),
  num_mc_samples = list(x1 = 100)
)
vimp_cfg <- VIMP_cfg$new()
diag_cfg <- Diagnostics_cfg$new(
  outcome = c("SL_risk", "SL_coefs", "MSE"),
  ps = c("SL_risk", "SL_coefs", "AUC")
)
QoI_cfg$new(
  mcate = mcate_cfg,
  pcate = pcate_cfg,
  vimp = vimp_cfg,
  diag = diag_cfg
)
This removes the variable importance quantity of interest
from an HTE_cfg
.
remove_vimp(hte_cfg)
hte_cfg: HTE_cfg object (e.g. as created by basic_config()).
Updated HTE_cfg
object
library("dplyr") basic_config() %>% remove_vimp() -> hte_cfg
library("dplyr") basic_config() %>% remove_vimp() -> hte_cfg
Penalized regression using elastic net. Alpha = 0 corresponds to ridge regression and alpha = 1 corresponds to Lasso. Included in the model are pairwise interactions between covariates.
See vignette("glmnet_beta", package = "glmnet")
for a nice tutorial on
glmnet.
SL.glmnet.interaction(
  Y,
  X,
  newX,
  family,
  obsWeights,
  id,
  alpha = 1,
  nfolds = 10,
  nlambda = 100,
  useMin = TRUE,
  loss = "deviance",
  ...
)
Y: Outcome variable.
X: Covariate dataframe.
newX: Dataframe for which to predict the outcome.
family: "gaussian" for regression, "binomial" for binary classification. Untested options: "multinomial" for multi-class classification, "mgaussian" for multiple responses, "poisson" for non-negative outcomes with proportional mean and variance, and "cox".
obsWeights: Optional observation-level weights.
id: Optional id to group observations from the same unit (not currently used).
alpha: Elastic net mixing parameter, in the range [0, 1]. 0 = ridge regression and 1 = lasso.
nfolds: Number of folds for the internal cross-validation used to optimize lambda.
nlambda: Number of lambda values to check; 100 or more is recommended.
useMin: If TRUE, use the lambda that minimizes risk; otherwise use the 1-standard-error rule, which chooses a higher penalty whose performance is within one standard error of the minimum (see Breiman et al. 1984 on CART for background).
loss: Loss function; can be "deviance", "mse", or "mae". If family = "binomial", it can also be "auc" or "class" (misclassification error).
...: Any additional arguments are passed through to cv.glmnet.
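In practice this learner is usually added to an ensemble through the Recipe API rather than called directly; the sketch below (with illustrative alpha values) mirrors the basic_config() example earlier in this document.

library("dplyr")
# Add the interaction elastic-net wrapper to both the outcome and effect ensembles.
basic_config() %>%
  add_outcome_model("SL.glmnet.interaction", alpha = c(0.25, 0.5, 0.75)) %>%
  add_effect_model("SL.glmnet.interaction", alpha = c(0.25, 0.5, 0.75)) -> hte_cfg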
SLEnsemble_cfg
is a configuration class for estimation of a model
using an ensemble of models using SuperLearner
.
tidyhte::Model_cfg
-> SLEnsemble_cfg
cvControl
A list of parameters for controlling the cross-validation used in SuperLearner.
SL.library
A vector of the names of learners to include in the SuperLearner ensemble.
SL.env
An environment containing all of the programmatically generated learners to be included in the SuperLearner ensemble.
family
stats::family
object to determine how SuperLearner
should be fitted.
model_class
The class of the model, required for all classes
which inherit from Model_cfg
.
new()
Create a new SLEnsemble_cfg
object with specified settings.
SLEnsemble_cfg$new( cvControl = NULL, learner_cfgs = NULL, family = stats::gaussian() )
cvControl
A list of parameters for controlling the
cross-validation used in SuperLearner.
For more details, see SuperLearner::SuperLearner.CV.control
.
learner_cfgs
A list of SLLearner_cfg
objects.
family
stats::family
object to determine how SuperLearner should be fitted.
A new SLEnsemble_cfg
object.
SLEnsemble_cfg$new( learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam")) )
add_sublearner()
Adds a model (or class of models) to the SuperLearner ensemble. If hyperparameter values are specified, this method will add a learner for every element in the cross-product of provided hyperparameter values.
SLEnsemble_cfg$add_sublearner(learner_name, hps = NULL)
learner_name: Possible values use SuperLearner naming conventions. A full list is available with SuperLearner::listWrappers("SL").
hps: A named list of hyper-parameters. Every element of the cross-product of these hyper-parameters will be included in the ensemble.

cfg <- SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"))
)
cfg <- cfg$add_sublearner("SL.gam", list(deg.gam = c(2, 3)))
clone()
The objects of this class are cloneable with this method.
SLEnsemble_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)

## ------------------------------------------------
## Method `SLEnsemble_cfg$new`
## ------------------------------------------------
SLEnsemble_cfg$new(
  learner_cfgs = list(SLLearner_cfg$new("SL.glm"), SLLearner_cfg$new("SL.gam"))
)
SLLearner_cfg
is a configuration class for a single
sublearner to be included in SuperLearner. By constructing with a named list
of hyperparameters, this configuration allows distinct submodels
for each unique combination of hyperparameters. To understand what models
and hyperparameters are available, examine the methods listed in
SuperLearner::listWrappers("SL")
.
model_name
The name of the model as passed to SuperLearner
through the SL.library
parameter.
hyperparameters
Named list from hyperparameter name to a vector of values that should be swept over.
new()
Create a new SLLearner_cfg
object with specified model name and hyperparameters.
SLLearner_cfg$new(model_name, hp = NULL)
model_name
The name of the model as passed to SuperLearner
through the SL.library
parameter.
hp
Named list from hyperparameter name to a vector of values that should be swept over. Hyperparameters not included in this list are left at their SuperLearner default values.
A new SLLearner_cfg
object.
SLLearner_cfg$new("SL.glm") SLLearner_cfg$new("SL.gam", list(deg.gam = c(2, 3)))
clone()
The objects of this class are cloneable with this method.
SLLearner_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `SLLearner_cfg$new`
## ------------------------------------------------
SLLearner_cfg$new("SL.glm")
SLLearner_cfg$new("SL.gam", list(deg.gam = c(2, 3)))
Stratified_cfg
is a configuration class for stratifying a covariate
and calculating statistics within each cell.
tidyhte::Model_cfg
-> Stratified_cfg
model_class
The class of the model, required for all classes
which inherit from Model_cfg
.
covariate
The name of the column in the dataset which corresponds to the covariate on which to stratify.
new()
Create a new Stratified_cfg
object with a specified covariate to stratify on.
Stratified_cfg$new(covariate)
covariate
The name of the column in the dataset which corresponds to the covariate on which to stratify.
A new Stratified_cfg
object.
Stratified_cfg$new(covariate = "test_covariate")
clone()
The objects of this class are cloneable with this method.
Stratified_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `Stratified_cfg$new`
## ------------------------------------------------
Stratified_cfg$new(covariate = "test_covariate")
VIMP_cfg
is a configuration class for estimating a variable importance measure
across all moderators. This provides a meaningful measure of which moderators
explain the most of the CATE surface.
estimand
String indicating the estimand to target.
sample_splitting
Logical indicating whether to use sample splitting in the calculation of variable importance.
linear
Logical indicating whether the variable importance assuming a linear model should be estimated.
new()
Create a new VIMP_cfg
object with specified model configuration.
VIMP_cfg$new(sample_splitting = TRUE, linear_only = FALSE)
sample_splitting
Logical indicating whether to use sample splitting in the calculation of variable importance. Choosing not to use sample splitting means that inference will only be valid for moderators with non-null importance.
linear_only
Logical indicating whether the variable importance should use only a single linear-only model. Variable importance measure will only be consistent for the population quantity if the true model of pseudo-outcomes is linear.
A new VIMP_cfg
object.
VIMP_cfg$new()
clone()
The objects of this class are cloneable with this method.
VIMP_cfg$clone(deep = FALSE)
deep
Whether to make a deep clone.
Williamson, B. D., Gilbert, P. B., Carone, M., & Simon, N. (2021). Nonparametric variable importance assessment using machine learning techniques. Biometrics, 77(1), 9-22.
Williamson, B. D., Gilbert, P. B., Simon, N. R., & Carone, M. (2021). A general framework for inference on algorithm-agnostic variable importance. Journal of the American Statistical Association, 1-14.
VIMP_cfg$new()

## ------------------------------------------------
## Method `VIMP_cfg$new`
## ------------------------------------------------
VIMP_cfg$new()