This function builds a context tree for a time series.
Usage
ctx_tree(
  x,
  min_size = 2L,
  max_depth = 100L,
  keep_position = TRUE,
  backend = getOption("mixvlmc.backend", "R")
)
Arguments
- x
a discrete time series; can be numeric, character, factor or logical.
- min_size
integer >= 1 (default: 2). Minimum number of observations for a context to be included in the tree.
- max_depth
integer >= 1 (default: 100). Maximum length of a context to be included in the tree.
- keep_position
logical (default: TRUE). Should the context tree keep the positions of the contexts in the time series?
- backend
"R" or "C++" (default: as specified by the "mixvlmc.backend" option). Specifies the implementation used to represent the context tree and to build it. See details.
Details
The tree represents all the sequences of symbols/states of length at most max_depth that appear at least min_size times in the time series, and stores the frequencies of the states that follow each context. Optionally, the positions of the contexts in the time series can be stored in the tree.
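To make the counting concrete, the following base R sketch enumerates contexts and the frequencies of the states that follow them. It is only an illustration of the idea, not the algorithm used by either back end. The helper count_contexts() is hypothetical, and it assumes that only occurrences followed by a symbol are counted, which may differ from what ctx_tree() does at the end of the series.

```r
# Illustrative sketch only: count_contexts() is a hypothetical helper,
# not part of the package.
count_contexts <- function(x, min_size = 2L, max_depth = 3L) {
  x <- as.character(x)
  n <- length(x)
  states <- sort(unique(x))
  result <- list()
  for (k in seq_len(min(max_depth, n - 1L))) {
    # start positions of windows of length k that are followed by a symbol
    starts <- seq_len(n - k)
    keys <- vapply(
      starts,
      function(i) paste(x[i:(i + k - 1L)], collapse = " "),
      character(1)
    )
    # state observed right after each window
    nxt <- factor(x[starts + k], levels = states)
    tab <- table(keys, nxt)
    # keep only the contexts observed at least min_size times
    keep <- rowSums(tab) >= min_size
    for (key in rownames(tab)[keep]) {
      result[[key]] <- setNames(as.integer(tab[key, ]), states)
    }
  }
  result
}

x <- c("a", "b", "a", "b", "a", "a", "b")
freqs <- count_contexts(x, min_size = 2L, max_depth = 2L)
names(freqs)  # retained contexts, written in temporal order
freqs[["a"]]  # counts of the states observed after context "a"
```

With min_size = 2, the context "a a" is dropped because it is followed by a symbol only once in this series.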
Back ends
Two back ends are available to compute context trees:
- the "R" back end represents the tree in pure R data structures (nested lists) that can be easily processed further in pure R (C++ helper functions are used to speed up the construction).
- the "C++" back end represents the tree with C++ classes. This back end is considered experimental. The tree is built with an optimised suffix tree algorithm which speeds up the construction by a factor of at least 10 in standard settings. As the tree is kept outside R's direct reach, context trees built with the C++ back end must be restored after a saveRDS()/readRDS() sequence. This is done automatically by completely recomputing the context tree.
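As a rough sketch of how the back end choice and the save/restore cycle might look in practice, the example below builds a tree with the C++ back end and passes it through an RDS file. It assumes the mixvlmc package is installed and that context_number() is available to count the contexts of a tree. The simulated series and the temporary file are illustrative only.

```r
# Sketch under assumptions: requires the mixvlmc package; skipped otherwise.
if (requireNamespace("mixvlmc", quietly = TRUE)) {
  library(mixvlmc)
  set.seed(0)
  x <- sample(c("a", "b", "c"), 200, replace = TRUE)

  # Select the back end explicitly for this call
  tree_cpp <- ctx_tree(x, min_size = 2L, max_depth = 3L, backend = "C++")
  n_before <- context_number(tree_cpp)

  # Round trip through an RDS file; per the text above, the tree is
  # expected to be recomputed automatically when it is used again
  path <- tempfile(fileext = ".rds")
  saveRDS(tree_cpp, path)
  restored <- readRDS(path)
  n_after <- context_number(restored)
}
```

Setting options(mixvlmc.backend = "C++") beforehand changes the default for subsequent calls instead, since the backend argument reads that option.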