This function extracts all the contexts from a fitted VLMC, possibly with some associated data.
Usage
# S3 method for vlmc
contexts(
ct,
sequence = FALSE,
reverse = FALSE,
frequency = NULL,
positions = FALSE,
local = FALSE,
cutoff = NULL,
metrics = FALSE,
...
)
# S3 method for vlmc_cpp
contexts(
ct,
sequence = FALSE,
reverse = FALSE,
frequency = NULL,
positions = FALSE,
local = FALSE,
cutoff = NULL,
metrics = FALSE,
...
)
Arguments
- ct
a context tree.
- sequence
if TRUE the function returns its results as a data.frame, if FALSE (default) as a list of ctx_node objects (see details).
- reverse
logical (defaults to FALSE). See details.
- frequency
specifies the counts to be included in the result data.frame. The default value of NULL does not include anything. "total" gives the number of occurrences of each context in the original sequence. "detailed" includes in addition the breakdown of these occurrences into all the possible states.
- positions
logical (defaults to FALSE). Specifies whether the positions of each context in the time series used to build the context tree should be reported in a positions column of the result data frame. The availability of the positions depends on the way the context tree was built. See details for the definition of a position.
- local
specifies how the counts reported by frequency are computed. When local is FALSE (default value) the counts include both counts that are specific to the context (if any) and counts from the descendants of the context in the tree. When local is TRUE the counts include only the number of times the context appears without being the last part of a longer context.
- cutoff
specifies whether to include the cut off value associated to each context (see cutoff() and prune()). The default result with cutoff=NULL does not include those values. Setting cutoff to "quantile" adds the cut off values in quantile scale, while cutoff="native" adds them in the native scale. The returned values are directly based on the log likelihood ratio computed in the context tree and are not modified to ensure pruning (as when cutoff() is called with raw=TRUE).
- metrics
if TRUE, adds predictive metrics for each context (see metrics() for the definition of predictive metrics).
- ...
additional arguments for the contexts function.
Value
A list of class contexts containing the contexts represented in this tree (as ctx_node) or a data.frame.
Details
The default behaviour of the function is to return a list of all the contexts using ctx_node objects (as returned by find_sequence()). The properties of the contexts can then be explored using adapted functions such as counts(), cutoff.ctx_node(), metrics.ctx_node() and positions().
When sequence=TRUE the method returns a data.frame whose first column, named context, contains the contexts as vectors (i.e. the value returned by as_sequence() applied to a ctx_node object). Other columns contain context specific values specified by the additional parameters. Setting any of those parameters to a value that asks for reporting information will toggle the result type of the function to data.frame.
The frequency parameter is described in detail in the documentation of contexts.ctx_tree(). When cutoff is not NULL, the resulting data.frame contains a cutoff column with the cut off values, either in quantile or in native scale. See cutoff.vlmc() and prune.vlmc() for the definitions of cut off values and of the two scales.
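The link between the two scales can be checked numerically on values taken from the Examples section. The sketch below relies on an assumption that is not stated on this page: that the quantile scale is the upper-tail probability of a chi-squared distribution evaluated at twice the native value, with degrees of freedom equal to the number of states minus one (2 here, for the states A, B and C). The variable names are illustrative only; cutoff.vlmc() and prune.vlmc() remain the authoritative definitions.

```r
# Native-scale cut off values copied from the Examples output
# (rows 1, 3, 10 and 32 of the cutoff = "native" data.frame).
native <- c(0.713766468, 0.113166844, 0.693147181, 2.079441542)

# Assumption: 3 states (A, B, C), hence 2 degrees of freedom.
nb_states <- 3

# Assumed conversion from native scale to quantile scale.
quantile_scale <- pchisq(2 * native, df = nb_states - 1, lower.tail = FALSE)

# To be compared with the same rows of the cutoff = "quantile" output.
round(quantile_scale, 7)
```

Under this assumption the computed values agree with the quantile-scale column reported for the same contexts in the Examples section.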
Cut off values
The cut off values reported by contexts.vlmc can be different from the ones reported by cutoff.vlmc() for three reasons:
- cutoff.vlmc() reports only useful cut off values, i.e., cut off values that should induce a simplification of the VLMC when used in prune(). This excludes cut off values associated to simple contexts that are smaller than the ones of their descendants in the context tree. Those values are reported by contexts.vlmc.
- contexts.vlmc reports only cut off values of actual contexts, while cutoff.vlmc() reports cut off values for all nodes of the context tree.
- values are not modified to induce pruning, contrary to the default behaviour of cutoff.vlmc().
Positions
A position of a context ctx in the time series x is an index value t such that the context ends with x[t]. Thus x[t+1] is after the context. For instance if x=c(0, 0, 1, 1) and ctx=c(0, 1) (in standard state order), then the position of ctx in x is 3.
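The definition can be illustrated with a brute-force search in base R. This is a minimal sketch: context_positions() is a hypothetical helper written for this illustration and is not part of the package, whose positions() function may handle details (such as how the tree was built) that this sketch ignores.

```r
# Hypothetical helper (not part of the package): returns every index t
# such that the last length(ctx) values of x ending at x[t] equal ctx.
context_positions <- function(x, ctx) {
  k <- length(ctx)
  if (k == 0 || k > length(x)) {
    return(integer(0))
  }
  # candidate end positions: the context needs k values up to x[t]
  ends <- seq(k, length(x))
  matches <- vapply(
    ends,
    function(t) all(x[(t - k + 1):t] == ctx),
    logical(1)
  )
  ends[matches]
}

x <- c(0, 0, 1, 1)
ctx <- c(0, 1)
context_positions(x, ctx)
```

With x and ctx as in the text, the only match is x[2:3], so the helper returns 3.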
State order in a context
Notice that contexts are given by default in the temporal order and not in the "reverse" order used by many VLMC research papers: older values are on the left. For instance, the context c(0, 1) is reported if the sequence 0, then 1 appeared in the time series used to build the context tree. Set reverse to TRUE for the reverse convention, which is somewhat easier to relate to the way the context trees are represented by draw() (i.e. recent values at the top of the tree).
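The two conventions differ only in the direction in which the states are listed. The base R sketch below assumes that the reverse convention lists exactly the same states with the most recent one first; the variable names are illustrative.

```r
# Context in temporal order (the default): 0 was observed first, then 1.
ctx_temporal <- c(0, 1)

# Same context in the "reverse" convention: most recent value first.
ctx_reverse <- rev(ctx_temporal)
ctx_reverse
```

Here ctx_reverse is c(1, 0), which describes the same pair of observations as ctx_temporal.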
See also
find_sequence() and find_sequence.covlmc() for direct access to a specific context, and contexts.ctx_tree(), contexts.vlmc() and contexts.covlmc() for concrete implementations of contexts().
Examples
dts <- sample(as.factor(c("A", "B", "C")), 100, replace = TRUE)
model <- vlmc(dts, alpha = 0.5)
## direct representation with ctx_node objects
model_ctxs <- contexts(model)
model_ctxs
#> Contexts:
#> A, A, A
#> B, A, A
#> A, A
#> B, B, B, A
#> B, B, A
#> A, C, B, A
#> C, B, A
#> B, A
#> B, C, A
#> A, C, C, A
#> C, C, A
#> C, A
#> B, B, A, B
#> C, B, A, B
#> B, A, B
#> C, A, B
#> A, B
#> A, B, B
#> C, B, B
#> B, B
#> B, C, A, C, B
#> C, A, C, B
#> A, C, B
#> C, B, C, B
#> B, C, B
#> C, B, C, C, B
#> B, C, C, B
#> C, C, B
#> A, B, B, A, C
#> B, B, A, C
#> B, A, C
#> C, A, C
#> A, C
#> B, A, B, C
#> A, B, C
#> B, B, C
#> B, C
#> A, C, C
#> B, C, C
#> A, C, C, C
#> C, C, C
sapply(model_ctxs, cutoff, scale = "quantile")
#> [1] 0.4897959 0.1574344 0.8930017 0.2222222 0.7491156 0.2500000 0.1573663
#> [8] 0.9948751 0.4746094 0.5000000 0.7500000 0.6015948 0.5000000 0.5000000
#> [15] 0.5120000 0.4740741 0.3627739 0.3169333 0.4965458 0.2831756 0.4444444
#> [22] 0.6480000 0.1558617 0.5000000 0.7170617 0.4444444 0.4218750 0.7170617
#> [29] 0.4444444 0.4218750 0.5120000 0.1250000 0.4828227 0.5000000 0.8392869
#> [36] 0.3966942 0.9581648 0.1573663 0.1657851 0.4444444 0.9737040
sapply(model_ctxs, cutoff, scale = "native")
#> [1] 0.713766468 1.848746401 0.113166844 1.504077397 0.288861954 1.386294361
#> [7] 1.849179069 0.005138089 0.745263182 0.693147181 0.287682072 0.508171105
#> [13] 0.693147181 0.693147181 0.669430654 0.746391695 1.013975464 1.149063859
#> [19] 0.700079597 1.261688240 0.810930216 0.433864583 1.858785993 0.693147181
#> [25] 0.332593351 0.810930216 0.863046217 0.332593351 0.810930216 0.863046217
#> [31] 0.669430654 2.079441542 0.728105685 0.693147181 0.175202636 0.924589535
#> [37] 0.042735539 1.849179069 1.797063068 0.810930216 0.026647941
sapply(model_ctxs, function(x) metrics(x)$accuracy)
#> [1] 0.5000000 0.6666667 1.0000000 0.5000000 0.7500000 1.0000000 0.0000000
#> [8] 0.0000000 0.6666667 0.5000000 0.5000000 1.0000000 0.5000000 0.5000000
#> [15] NaN 0.7500000 0.5000000 0.6000000 0.7500000 0.6250000 1.0000000
#> [22] 0.0000000 1.0000000 0.5000000 0.5000000 1.0000000 0.0000000 0.0000000
#> [29] 1.0000000 0.0000000 0.0000000 1.0000000 0.3333333 0.5000000 0.5000000
#> [36] 0.5000000 0.4000000 0.5000000 0.7500000 0.5000000 0.0000000
## data.frame format
contexts(model, frequency = "total")
#> context freq
#> 1 A, A, A 2
#> 2 B, A, A 3
#> 3 A, A 7
#> 4 B, B, B, A 2
#> 5 B, B, A 6
#> 6 A, C, B, A 2
#> 7 C, B, A 4
#> 8 B, A 11
#> 9 B, C, A 3
#> 10 A, C, C, A 2
#> 11 C, C, A 4
#> 12 C, A 8
#> 13 B, B, A, B 2
#> 14 C, B, A, B 2
#> 15 B, A, B 4
#> 16 C, A, B 4
#> 17 A, B 10
#> 18 A, B, B 5
#> 19 C, B, B 4
#> 20 B, B 17
#> 21 B, C, A,.... 2
#> 22 C, A, C, B 3
#> 23 A, C, B 5
#> 24 C, B, C, B 2
#> 25 B, C, B 4
#> 26 C, B, C,.... 2
#> 27 B, C, C, B 3
#> 28 C, C, B 4
#> 29 A, B, B,.... 2
#> 30 B, B, A, C 3
#> 31 B, A, C 4
#> 32 C, A, C 3
#> 33 A, C 10
#> 34 B, A, B, C 2
#> 35 A, B, C 4
#> 36 B, B, C 2
#> 37 B, C 11
#> 38 A, C, C 4
#> 39 B, C, C 4
#> 40 A, C, C, C 2
#> 41 C, C, C 3
contexts(model, cutoff = "quantile")
#> context cutoff
#> 1 A, A, A 0.4897959
#> 2 B, A, A 0.1574344
#> 3 A, A 0.8930017
#> 4 B, B, B, A 0.2222222
#> 5 B, B, A 0.7491156
#> 6 A, C, B, A 0.2500000
#> 7 C, B, A 0.1573663
#> 8 B, A 0.9948751
#> 9 B, C, A 0.4746094
#> 10 A, C, C, A 0.5000000
#> 11 C, C, A 0.7500000
#> 12 C, A 0.6015948
#> 13 B, B, A, B 0.5000000
#> 14 C, B, A, B 0.5000000
#> 15 B, A, B 0.5120000
#> 16 C, A, B 0.4740741
#> 17 A, B 0.3627739
#> 18 A, B, B 0.3169333
#> 19 C, B, B 0.4965458
#> 20 B, B 0.2831756
#> 21 B, C, A,.... 0.4444444
#> 22 C, A, C, B 0.6480000
#> 23 A, C, B 0.1558617
#> 24 C, B, C, B 0.5000000
#> 25 B, C, B 0.7170617
#> 26 C, B, C,.... 0.4444444
#> 27 B, C, C, B 0.4218750
#> 28 C, C, B 0.7170617
#> 29 A, B, B,.... 0.4444444
#> 30 B, B, A, C 0.4218750
#> 31 B, A, C 0.5120000
#> 32 C, A, C 0.1250000
#> 33 A, C 0.4828227
#> 34 B, A, B, C 0.5000000
#> 35 A, B, C 0.8392869
#> 36 B, B, C 0.3966942
#> 37 B, C 0.9581648
#> 38 A, C, C 0.1573663
#> 39 B, C, C 0.1657851
#> 40 A, C, C, C 0.4444444
#> 41 C, C, C 0.9737040
contexts(model, cutoff = "native", metrics = TRUE)
#> context cutoff accuracy auc
#> 1 A, A, A 0.713766468 0.5000000 NA
#> 2 B, A, A 1.848746401 0.6666667 NA
#> 3 A, A 0.113166844 1.0000000 NA
#> 4 B, B, B, A 1.504077397 0.5000000 NA
#> 5 B, B, A 0.288861954 0.7500000 NA
#> 6 A, C, B, A 1.386294361 1.0000000 NA
#> 7 C, B, A 1.849179069 0.0000000 NA
#> 8 B, A 0.005138089 0.0000000 NA
#> 9 B, C, A 0.745263182 0.6666667 NA
#> 10 A, C, C, A 0.693147181 0.5000000 NA
#> 11 C, C, A 0.287682072 0.5000000 NA
#> 12 C, A 0.508171105 1.0000000 NA
#> 13 B, B, A, B 0.693147181 0.5000000 NA
#> 14 C, B, A, B 0.693147181 0.5000000 NA
#> 15 B, A, B 0.669430654 NaN NA
#> 16 C, A, B 0.746391695 0.7500000 NA
#> 17 A, B 1.013975464 0.5000000 NA
#> 18 A, B, B 1.149063859 0.6000000 0.5
#> 19 C, B, B 0.700079597 0.7500000 NA
#> 20 B, B 1.261688240 0.6250000 0.5
#> 21 B, C, A,.... 0.810930216 1.0000000 NA
#> 22 C, A, C, B 0.433864583 0.0000000 NA
#> 23 A, C, B 1.858785993 1.0000000 NA
#> 24 C, B, C, B 0.693147181 0.5000000 NA
#> 25 B, C, B 0.332593351 0.5000000 NA
#> 26 C, B, C,.... 0.810930216 1.0000000 NA
#> 27 B, C, C, B 0.863046217 0.0000000 NA
#> 28 C, C, B 0.332593351 0.0000000 NA
#> 29 A, B, B,.... 0.810930216 1.0000000 NA
#> 30 B, B, A, C 0.863046217 0.0000000 NA
#> 31 B, A, C 0.669430654 0.0000000 NA
#> 32 C, A, C 2.079441542 1.0000000 NA
#> 33 A, C 0.728105685 0.3333333 NA
#> 34 B, A, B, C 0.693147181 0.5000000 NA
#> 35 A, B, C 0.175202636 0.5000000 NA
#> 36 B, B, C 0.924589535 0.5000000 NA
#> 37 B, C 0.042735539 0.4000000 0.5
#> 38 A, C, C 1.849179069 0.5000000 NA
#> 39 B, C, C 1.797063068 0.7500000 NA
#> 40 A, C, C, C 0.810930216 0.5000000 NA
#> 41 C, C, C 0.026647941 0.0000000 NA