By default, fit_net_logit() does not standardize predictor variables. If you want numeric variables to be standardized, you can either use bag_fit_net_logit() with parameter standardize = TRUE or provide an already standardized data set as input.

Usage

fit_net_logit(
  f,
  data,
  samples,
  i = 1,
  metric = c("AUC")[1],
  metrics_evaluate = c("AUC"),
  method = c("Lasso", "Ridge", "AdaptiveLasso", "DistanceDecay-AdaptiveLasso",
    "DD-AdaptiveLasso", "OneZOI-AdaptiveLasso", "OZ-AdaptiveLasso",
    "Grouped-AdaptiveLasso", "G-AdaptiveLasso", "HypothesisDriven-AdaptiveLasso",
    "HD-AdaptiveLasso", "ElasticNet")[1],
  alpha = NULL,
  penalty.factor = NULL,
  gamma = 1,
  standardize = c("internal", "external", FALSE)[1],
  predictor_table = NULL,
  function_lasso_decay = c(log, function(x) x/1000)[[1]],
  value_lasso_decay = 1,
  factor_hypothesis = 1,
  factor_grouped_lasso = 1,
  na.action = "na.pass",
  out_dir_file = NULL,
  verbose = FALSE,
  ...
)

fit_net_rsf(
  f,
  data,
  samples,
  i = 1,
  metric = c("AUC")[1],
  metrics_evaluate = c("AUC"),
  method = c("Lasso", "Ridge", "AdaptiveLasso", "DistanceDecay-AdaptiveLasso",
    "DD-AdaptiveLasso", "OneZOI-AdaptiveLasso", "OZ-AdaptiveLasso",
    "Grouped-AdaptiveLasso", "G-AdaptiveLasso", "HypothesisDriven-AdaptiveLasso",
    "HD-AdaptiveLasso", "ElasticNet")[1],
  alpha = NULL,
  penalty.factor = NULL,
  gamma = 1,
  standardize = c("internal", "external", FALSE)[1],
  predictor_table = NULL,
  function_lasso_decay = c(log, function(x) x/1000)[[1]],
  value_lasso_decay = 1,
  factor_hypothesis = 1,
  factor_grouped_lasso = 1,
  na.action = "na.pass",
  out_dir_file = NULL,
  verbose = FALSE,
  ...
)

Arguments

f

[formula]
Formula of the model to be fitted, with all possible candidate terms.

data

[data.frame,tibble]
Complete data set to be analyzed.

samples

[list]
List of samples with at least three elements: train, test, and validate. Each element may itself contain several elements, each holding the rows of data to be sampled for one resample. Typically, this is computed by the function create_resamples().
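As a rough illustration of the structure described above, the following sketch builds a samples list by hand in base R. The split sizes, the number of resamples, and the helper make_split() are hypothetical; the object returned by create_resamples() may carry additional elements or a different layout.

```r
set.seed(42)
n_rows <- 100        # hypothetical number of rows in `data`
n_resamples <- 3     # hypothetical number of resamples

# Hypothetical helper: one random 60/20/20 split of the row indices
make_split <- function() {
  idx <- sample(n_rows)
  list(train = idx[1:60], test = idx[61:80], validate = idx[81:100])
}
splits <- lapply(seq_len(n_resamples), function(k) make_split())

# One element per set; each set holds one vector of row indices per resample
samples <- list(
  train    = lapply(splits, `[[`, "train"),
  test     = lapply(splits, `[[`, "test"),
  validate = lapply(splits, `[[`, "validate")
)

# Rows used to train the model for resample i = 1
i <- 1
train_rows <- samples$train[[i]]
```

Under this layout, the argument i would select which resample's rows are used for a given fit.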

metric

[function,character]{AUC, conditionalBoyce, conditionalSomersD, conditionalAUC}
Function representing the metric used to evaluate goodness-of-fit. One of AUC (default), conditionalBoyce, conditionalSomersD, and conditionalAUC. A user-defined function may be provided, on the condition that larger values indicate a better fit, since the metric is maximized to find the best-fitting model. It can also be a character string, in which case it must be one of the following: c("AUC", "conditionalAUC", "conditionalBoyce", "conditionalSomersD").
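To illustrate what a metric that must be maximized can look like, here is a minimal rank-based AUC written in base R. The function name my_auc and its two-argument form (observed, predicted) are assumptions made for this sketch; the arguments that fit_net_logit() actually passes to a user-defined metric are not stated here and should be checked before use.

```r
# Hypothetical user-defined metric: rank-based (Mann-Whitney) AUC.
# Larger values mean better discrimination, so it can be maximized.
my_auc <- function(observed, predicted) {
  r  <- rank(predicted)            # average ranks handle ties
  n1 <- sum(observed == 1)         # number of presences / used points
  n0 <- sum(observed == 0)         # number of absences / available points
  (sum(r[observed == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Perfectly separated predictions give the maximum value
auc_perfect  <- my_auc(c(0, 0, 1, 1), c(0.1, 0.2, 0.8, 0.9))
# Perfectly reversed predictions give the minimum value
auc_reversed <- my_auc(c(1, 1, 0, 0), c(0.1, 0.2, 0.8, 0.9))
```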

method

[character="Lasso"]
The penalized regression method used for fitting each model. Default is method = "Lasso", but it could be method = "Ridge" or different flavors of "AdaptiveLasso" (see details below).

gamma

[numeric(1)=1]{(0.5, 1, 2)}
Gamma is the exponent used to define the vector of penalty weights when method = "AdaptiveLasso". The penalties are defined as penalty.factor = 1/(coef_ridge^gamma), where coef_ridge are the coefficients of a Ridge regression. Default is gamma = 1, but values of 0.5 or 2 could also be tried, as suggested by the author (Zou 2006).
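The weighting scheme described above can be sketched directly with glmnet. This is a minimal illustration on simulated data, not the package's internal code: the data, the use of cv.glmnet() to pick the Ridge penalty, and the abs() around the Ridge coefficients (added here so that the weights stay non-negative) are assumptions of the sketch.

```r
library(glmnet)

set.seed(1)
n <- 200
x <- matrix(rnorm(n * 5), ncol = 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, 1, plogis(1.5 * x[, 1] - 1 * x[, 2]))  # only x1 and x2 matter

# Step 1: Ridge regression (alpha = 0) gives initial coefficient estimates
ridge_fit  <- cv.glmnet(x, y, family = "binomial", alpha = 0)
coef_ridge <- as.matrix(coef(ridge_fit, s = "lambda.min"))[-1, 1]  # drop intercept

# Step 2: penalty weights; small Ridge coefficients get large penalties
gamma <- 1
penalty_factor <- 1 / (abs(coef_ridge)^gamma)

# Step 3: Lasso (alpha = 1) with predictor-specific penalty weights
adaptive_fit <- glmnet(x, y, family = "binomial", alpha = 1,
                       penalty.factor = penalty_factor)
```

Larger values of gamma sharpen the contrast between weakly and strongly supported predictors, which is why trying 0.5, 1, and 2 can lead to different selected models.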

standardize

[character(1)="internal"]{"internal", "external", FALSE}
Controls predictor variable standardization prior to fitting the model sequence. One of "internal", "external", or FALSE; as shown in the usage above, the default is standardize = "internal". The coefficients are always returned on the original scale. If variables are in the same units already, you might not wish to standardize them.
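For the case where an already standardized data set is supplied as input, the following base R sketch centers and scales the numeric predictors while leaving the response and factors untouched. The data frame and its column names (case, dist_road, elevation, land_cover) are hypothetical, and keeping the means and standard deviations is only a suggestion for back-transforming coefficients later.

```r
set.seed(10)
# Hypothetical data set: binary response, two numeric predictors, one factor
dat <- data.frame(
  case       = rbinom(50, 1, 0.5),
  dist_road  = runif(50, 0, 5000),
  elevation  = rnorm(50, 800, 150),
  land_cover = factor(sample(c("forest", "bog"), 50, replace = TRUE))
)

# Numeric predictors only; the response column is excluded
num_cols <- setdiff(names(dat)[sapply(dat, is.numeric)], "case")

# Keep the original means and standard deviations for later back-transformation
means <- sapply(dat[num_cols], mean)
sds   <- sapply(dat[num_cols], sd)

# Center and scale each numeric predictor
dat_std <- dat
dat_std[num_cols] <- lapply(dat[num_cols], function(v) (v - mean(v)) / sd(v))
```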

out_dir_file

[character(1)=NULL]
String with the prefix of the file name (including the folder) where the result of each model will be saved. E.g., if out_dir_file = "output/test_", the models will be saved as RDS files named "test_i1.rds", "test_i2.rds", etc., within the folder "output".
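The naming pattern in the example above can be mimicked in base R as follows. This is only a sketch of how such file names could be assembled from the prefix and the resample index i; the placeholder object and the use of a temporary folder are assumptions, and the package's own saving code may differ.

```r
# Prefix made of a folder ("output") and a file name stem ("test_"),
# placed under a temporary directory for this illustration
out_dir_file <- file.path(tempdir(), "output", "test_")
dir.create(dirname(out_dir_file), recursive = TRUE, showWarnings = FALSE)

for (i in 1:2) {
  fitted_model <- list(resample = i)  # placeholder for a fitted model object
  saveRDS(fitted_model, file = paste0(out_dir_file, "i", i, ".rds"))
}

# Files written to the "output" folder
saved_files <- list.files(dirname(out_dir_file), pattern = "^test_i[0-9]+\\.rds$")

# A saved model can be read back with readRDS()
model_2 <- readRDS(paste0(out_dir_file, "i2.rds"))
```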

...

Options for net_logit() and glmnet::glmnet().

References

Zou, H., 2006. The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association 101, 1418–1429. https://doi.org/10.1198/016214506000000735