
This function fits a Variable Length Markov Chain (VLMC) to a discrete time series by optimizing an information criterion (BIC or AIC).

Usage

tune_vlmc(
  x,
  criterion = c("BIC", "AIC"),
  initial = c("truncated", "specific", "extended"),
  alpha_init = NULL,
  cutoff_init = NULL,
  min_size = 2L,
  max_depth = 100L,
  backend = getOption("mixvlmc.backend", "R"),
  verbose = 0,
  save = c("best", "initial", "all")
)

Arguments

x

a discrete time series; can be numeric, character, factor, or logical.

criterion

criterion used to select the best model. Either "BIC" (default) or "AIC" (see details).

initial

specifies the likelihood function, more precisely the way the first few observations, for which contexts cannot be calculated, are integrated in the likelihood. Defaults to "truncated". See loglikelihood() for details.

alpha_init

if non-NULL, used as the initial cutoff parameter (in quantile scale) to build the initial VLMC.

cutoff_init

if non-NULL, used as the initial cutoff parameter to build the initial VLMC. Takes precedence over alpha_init if specified.

min_size

integer >= 1 (default: 2). Minimum number of observations for a context in the growing phase of the initial context tree.

max_depth

integer >= 1 (default: 100). Longest context considered in the growing phase of the initial context tree (see details).

backend

backend "R" or "C++" (default: as specified by the "mixvlmc.backend" option). Specifies the implementation used to represent the context tree and to build it. See vlmc() for details.

verbose

integer >= 0 (default: 0). Verbosity level of the pruning process.

save

specifies which models are saved during the pruning process. The default value "best" asks the function to keep only the best model according to the criterion. When save="initial", the function also keeps the initial (complex) model, which is then pruned during the selection process. When save="all", the function returns all the models considered during the selection process.
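As an illustration of the backend argument, the following minimal sketch selects the implementation either globally through the "mixvlmc.backend" option or per call. The series dts and the object names are made up for the example, and the random data carry no particular structure.

```r
library(mixvlmc)

set.seed(1)
# Hypothetical toy series used only for illustration
dts <- sample(c("A", "B", "C"), 100, replace = TRUE)

# Set the backend globally through the option read by the default argument
old_opts <- options(mixvlmc.backend = "C++")
tuned_cpp <- tune_vlmc(dts)
options(old_opts) # restore the previous setting

# Or choose the backend explicitly for a single call
tuned_r <- tune_vlmc(dts, backend = "R")
```

Restoring the previous options afterwards keeps the global setting from leaking into unrelated code.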

Value

a list with the following components:

  • best_model: the optimal VLMC

  • criterion: the criterion used to select the optimal VLMC

  • initial: the likelihood function used to select the optimal VLMC

  • results: a data frame with details about the pruning process

  • saved_models: a list of intermediate VLMCs if save="initial" or save="all". It contains an initial component with the large VLMC obtained first and an all component with a list of all the other VLMCs obtained by pruning the initial one.
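The components listed above can be inspected as in the following sketch. It relies only on the component names documented here; the toy series and the choice of criterion = "AIC" are assumptions made for the example, and the columns of results are not spelled out because they are not described on this page.

```r
library(mixvlmc)

set.seed(42)
# Hypothetical toy series used only for illustration
dts <- sample(c("A", "B", "C"), 200, replace = TRUE)

# Keep every model visited during pruning, not only the best one
tuned <- tune_vlmc(dts, criterion = "AIC", save = "all")

tuned$criterion                     # criterion used for the selection
tuned$initial                       # likelihood function used
head(tuned$results)                 # details about the pruning process
initial_model <- tuned$saved_models$initial  # large VLMC fitted first
n_pruned <- length(tuned$saved_models$all)   # models obtained by pruning
```

With save = "best" (the default), saved_models should not be relied upon; only best_model and results are then of interest.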

Details

This function automates the process of fitting a large VLMC to a discrete time series with vlmc() and of pruning the tree (with cutoff() and prune()) to get an optimal model with respect to an information criterion. To avoid missing long term dependencies, the function uses the max_depth parameter as an initial guess but then relies on an automatic increase of the value to make sure the initial context tree is only limited by the min_size parameter. The initial value of the cutoff parameter of vlmc() is also set to a conservative value (depending on the criterion) to avoid prior simplification of the context tree. This default value can be overridden using the cutoff_init or alpha_init parameter.

Once the initial VLMC is obtained, the cutoff() and prune() functions are used to build all the VLMC models that could be generated using larger values of the initial cutoff parameter. The best model is selected from this collection, including the initial complex tree, as the one that minimizes the chosen information criterion.
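The procedure described above can be approximated by hand as in the sketch below. This is not the internal implementation of tune_vlmc(): the loose alpha = 0.5 used for the initial fit, the toy two-state chain, and the direct use of BIC() on each candidate are assumptions made for the example, and the sketch skips the automatic increase of max_depth and the handling of the initial parameter.

```r
library(mixvlmc)

set.seed(1)
# Hypothetical toy data: a two-state chain that keeps its state with probability 0.8
n <- 1000
x <- integer(n)
for (i in 2:n) {
  x[i] <- if (runif(1) < 0.8) x[i - 1] else 1L - x[i - 1]
}

# Step 1: fit a deliberately large VLMC (loose pruning; assumed value)
big_model <- vlmc(x, alpha = 0.5)

# Step 2: cutoff values (quantile scale by default) that each induce a pruning
cuts <- cutoff(big_model)

# Step 3: build the candidate models, keeping the initial one in the collection
candidates <- c(
  list(big_model),
  lapply(cuts, function(a) prune(big_model, alpha = a))
)

# Step 4: keep the candidate with the smallest BIC
scores <- sapply(candidates, BIC)
best_by_hand <- candidates[[which.min(scores)]]
```

In practice tune_vlmc() should be preferred, since it takes care of the depth adjustment and of evaluating all candidates with the same likelihood function.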

See also

Examples

dts <- sample(as.factor(c("A", "B", "C")), 100, replace = TRUE)
tune_result <- tune_vlmc(dts)
draw(tune_result$best_model)
#> * (0.32, 0.35, 0.33)