This function fits a Variable Length Markov Chain with covariates (coVLMC) to a discrete time series coupled with a time series of covariates.

## Usage

```
covlmc(
  x,
  covariate,
  alpha = 0.05,
  min_size = 5L,
  max_depth = 100L,
  keep_data = TRUE,
  control = covlmc_control(...),
  ...
)
```

## Arguments

- x
a discrete time series; can be numeric, character, factor or logical.

- covariate
a data frame of covariates.

- alpha
number in (0,1) (default: 0.05). Cut-off value used in the pruning phase (in quantile scale).

- min_size
number >= 1 (default: 5). Tunes the minimum number of observations for a context in the growing phase of the context tree (see below for details).

- max_depth
integer >= 1 (default: 100). Longest context considered in the growing phase of the context tree.

- keep_data
logical (defaults to `TRUE`). If `TRUE`, the original data are stored in the resulting object to enable post pruning (see `prune.covlmc()`).

- control
a list with control parameters, see `covlmc_control()`.

- ...
arguments passed to `covlmc_control()`.

## Details

The model is built using the algorithm described in Zanin Zambom et al. (2022). As with the `vlmc()` approach, the algorithm first builds a context tree (see `ctx_tree()`). The `min_size` parameter is used to compute the actual minimum number of observations required per context in the growing phase of the tree. This number is computed as `min_size*(1+ncol(covariate)*d)*(s-1)`, where `d` is the length of the context (a.k.a. the depth in the tree) and `s` is the number of states. This corresponds to ensuring `min_size` observations per parameter of the logistic regression during the estimation phase.
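As an illustration of the threshold formula above, the following sketch evaluates it for a few context depths. The helper `required_obs()` and its argument names are hypothetical and not part of the package; it only reproduces the arithmetic, assuming one covariate column and three states.

```r
# Hypothetical helper (not part of mixvlmc): evaluates
# min_size * (1 + ncol(covariate) * d) * (s - 1) for given inputs.
required_obs <- function(min_size, n_covariates, depth, n_states) {
  min_size * (1 + n_covariates * depth) * (n_states - 1)
}

# One covariate, three states, min_size = 15, context depths 0 to 3
thresholds <- required_obs(
  min_size = 15,
  n_covariates = 1,
  depth = 0:3,
  n_states = 3
)
thresholds
#> [1]  30  60  90 120
```

The threshold grows linearly with the depth, so deeper contexts need proportionally more observations before they are considered in the growing phase.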

Then logistic models are adjusted in the leaves of the tree: the goal of each logistic model is to estimate the conditional distribution of the next state of the time series given the context (the recent past of the time series) and delayed versions of the covariates. A pruning strategy is used to simplify the models (mainly to reduce the time window associated with the covariates) and the tree itself.

Parameters specified by `control` are used to fine tune the behaviour of the algorithm.

## Logistic models

By default, `covlmc` uses two different computing *engines* for logistic models:

- when the time series has only two states, `covlmc` uses `stats::glm()` with a binomial link (`stats::binomial()`);
- when the time series has at least three states, `covlmc` uses `VGAM::vglm()` with a multinomial link (`VGAM::multinomial()`).

Both engines are able to detect degenerate cases and lead to more robust results than using `nnet::multinom()`. It is nevertheless possible to replace `stats::glm()` and `VGAM::vglm()` with `nnet::multinom()` by setting the global option `mixvlmc.predictive` to `"multinom"` (the default value is `"glm"`). Notice that while results should be comparable, there is no guarantee that they will be identical.
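As a minimal sketch, the global option can be set and later restored with base R's `options()` and `getOption()`. The variable names `old_opts` and `engine_during` are illustrative only; the option name and its values are those described above.

```r
# Switch the engine option; options() returns the previous value(s)
old_opts <- options(mixvlmc.predictive = "multinom")

# Check the value that is active while the option is set
engine_during <- getOption("mixvlmc.predictive")
engine_during
#> [1] "multinom"

# ... models would be fitted here while the option is active ...

# Restore the previous setting (unset if it was not set before)
options(old_opts)
```

Restoring the saved value avoids leaving the session in a non-default state after the comparison is done.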

## References

Bühlmann, P. and Wyner, A. J. (1999), "Variable length Markov chains." Ann. Statist. 27 (2) 480-513 doi:10.1214/aos/1018031204

Zanin Zambom, A., Kim, S. and Lopes Garcia, N. (2022), "Variable length Markov chain with exogenous covariates." J. Time Ser. Anal., 43 (2) 312-328 doi:10.1111/jtsa.12615

## See also

`cutoff.covlmc()` and `prune.covlmc()` for post-pruning.

## Examples

```
pc <- powerconsumption[powerconsumption$week == 5, ]
dts <- cut(pc$active_power, breaks = c(0, quantile(pc$active_power, probs = c(1 / 3, 2 / 3, 1))))
dts_cov <- data.frame(day_night = (pc$hour >= 7 & pc$hour <= 17))
m_cov <- covlmc(dts, dts_cov, min_size = 15)
draw(m_cov)
#> * (merging ((0.556,1.78] and (1.78,7.54]): 1.347e-96)
#> +-- (0,0.556] (0.001385 [ -2.885 1.237
#> | -4.185 -15.4 ])
#> '-- (0.556,1.78] (0.8622 [ 2.046
#> | 0.1372 ])
#> '-- (1.78,7.54] (0.227 [ 3.714
#> 5.684 ])
withr::with_options(
  list(mixvlmc.predictive = "multinom"),
  m_cov_nnet <- covlmc(dts, dts_cov, min_size = 15)
)
draw(m_cov_nnet)
#> * (merging ((0.556,1.78] and (1.78,7.54]): 1.347e-96)
#> +-- (0,0.556] (0.001386 [ -2.885 1.237
#> | -4.185 -7.944 ])
#> '-- (0.556,1.78] (0.8622 [ 2.046
#> | 0.1372 ])
#> '-- (1.78,7.54] (0.2274 [ 3.714
#> 5.684 ])
```