Cut off values for pruning the context tree of a VLMC with covariates

This function returns all the cut off values that should induce a pruning of the context tree of a VLMC with covariates.

Usage

# S3 method for class 'covlmc'
cutoff(model, raw = FALSE, tolerance = .Machine$double.eps^0.5, ...)

Arguments

model: a fitted COVLMC model.
raw: specify whether the returned values should be the limit values computed in the model or modified values that guarantee pruning (see details).
tolerance: specify the minimum separation between two consecutive cut off values in the native (quantile) scale, before any transformation. See details.
...: additional arguments for the cutoff function.

Value

a vector of cut off values, or NULL if none can be computed.

Details

Notice that the list of cut off values returned by the function is not as complete as the one computed for a VLMC without covariates. Indeed, pruning the COVLMC tree creates new pruning opportunities that are not evaluated during the construction of the initial model, while all pruning opportunities are computed during the construction of a VLMC context tree. Nevertheless, the largest value returned by the function is guaranteed to produce the least pruned tree consistent with the reference one.

For large COVLMC models, some cut off values can be almost identical, with a difference of the order of the machine epsilon value. The tolerance parameter is used to keep only values that are different enough. This is done on the quantile scale, before the transformations applied when raw is FALSE.
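The effect of tolerance can be illustrated with a minimal base R sketch. The helper thin_values() is hypothetical and is not part of the package; it assumes that the separation is measured as an absolute difference between consecutive values sorted in decreasing order, which may differ from the actual implementation.

```r
# Hypothetical helper (not part of mixvlmc): keep only values that are
# separated from the previously kept value by more than `tolerance`.
thin_values <- function(values, tolerance = .Machine$double.eps^0.5) {
  # Sort in decreasing order and drop exact duplicates
  values <- sort(unique(values), decreasing = TRUE)
  if (length(values) <= 1) {
    return(values)
  }
  kept <- values[1]
  for (v in values[-1]) {
    # Compare the candidate with the last value that was kept
    if (kept[length(kept)] - v > tolerance) {
      kept <- c(kept, v)
    }
  }
  kept
}

# Made-up quantile scale values, two pairs of which are almost identical
raw_values <- c(0.05, 0.05 + 1e-12, 0.02, 0.02 - 1e-10, 0.001)
thinned <- thin_values(raw_values)
length(thinned)
```

With the default tolerance, each near-duplicate pair collapses to a single value, so three values remain out of five.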

Notice that the log-likelihood scale is not directly useful in COVLMC as the differences in model sizes are not constant through the pruning process. As a consequence, this function does not provide a mode parameter, contrary to cutoff.vlmc().

Setting raw to TRUE removes the small perturbations that are subtracted from the log-likelihood ratio values computed from the COVLMC (in quantile scale).

As automated model selection is provided by tune_covlmc(), the direct use of cutoff() should be reserved for advanced exploration of the set of trees that can be obtained from a complex one, e.g. to implement model selection techniques that are not provided by tune_covlmc().

Examples

pc <- powerconsumption[powerconsumption$week == 5, ]
dts <- cut(pc$active_power, breaks = c(0, quantile(pc$active_power, probs = c(0.5, 1))))
m_nocovariate <- vlmc(dts)
draw(m_nocovariate)
#> * (0.5, 0.5)
#> +-- (0,1.34] (0.9264, 0.07356)
#> '-- (1.34,7.54] (0.07341, 0.9266)
dts_cov <- data.frame(day_night = (pc$hour >= 7 & pc$hour <= 17))
m_cov <- covlmc(dts, dts_cov, min_size = 5)
draw(m_cov)
#> *
#> +-- (0,1.34]
#> |   +-- (0,1.34]
#> |   |   +-- (0,1.34] (0.02329 [ -2.813 2.178 -4.449 2.418 ])
#> |   |   '-- (1.34,7.54] (0.01834 [ -19.57 18.39 ])
#> |   '-- (1.34,7.54] (0.6763 [ -1.856 ])
#> '-- (1.34,7.54] (0.8175 [ 2.535 ])
cutoff(m_cov)
#> [1] 0.02328831 0.01834317
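The cut off values can then be used to explore the pruned trees. The following sketch rebuilds the model above and assumes that prune() accepts a quantile scale value through its alpha argument and that context_number() applies to COVLMC objects; both functions belong to the same package but are not described on this page, so check their documentation before relying on this. Outputs are not shown.

```r
library(mixvlmc)

# Same data and model as in the example above
pc <- powerconsumption[powerconsumption$week == 5, ]
dts <- cut(pc$active_power, breaks = c(0, quantile(pc$active_power, probs = c(0.5, 1))))
dts_cov <- data.frame(day_night = (pc$hour >= 7 & pc$hour <= 17))
m_cov <- covlmc(dts, dts_cov, min_size = 5)

# Default values (perturbed to guarantee pruning) and raw limit values
cuts <- cutoff(m_cov)
cuts_raw <- cutoff(m_cov, raw = TRUE)
cuts_raw - cuts

# Assumption: prune(model, alpha = value) prunes in the quantile scale.
# Record the number of contexts left after pruning with each cut off value.
sizes <- sapply(cuts, function(a) context_number(prune(m_cov, alpha = a)))
sizes
```

Comparing cuts with cuts_raw shows the size of the perturbation discussed in the details, while sizes gives a quick view of how much each cut off value simplifies the tree.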