This function simulates a time series from the distribution estimated by the given vlmc object.
Arguments
- object
a fitted vlmc object.
- nsim
length of the simulated time series (defaults to 1).
- seed
an optional random seed (see the dedicated section).
- init
an optional initial sequence for the time series.
- burnin
number of initial observations to discard or
"auto"
(see the dedicated section).
- sample
specifies which implementation of
base::sample()
to use. See the dedicated section.
- ...
additional arguments.
Value
a simulated discrete time series of the same type as the one used to
build the vlmc with a seed
attribute (see the Random seed section). The
result also has the dts
class, which hides the seed
attribute when using
print
or similar functions.
Sampling method
The R backend for vlmc()
uses base::sample()
to generate samples for each
context. Internally, this function sorts the state probabilities in
decreasing order (among other things), which is not needed in our
case. The C++ backend can be used with three different implementations:
- sample="fast"
uses a dedicated C++ implementation adapted to the data structures used internally. In general, the simulated time series obtained with this implementation will differ from the one generated with the R backend, even when using the same seed.
- sample="slow"
uses another C++ implementation that mimics base::sample() in order to maximize the chance of obtaining identical simulation results regardless of the backend (when using the same random seed). This process is not perfect, as we use the C++ standard library sort algorithm, which is not guaranteed to give the same results as R's internal 'revsort'.
- sample="R"
uses direct calls to base::sample(). Results are guaranteed to be identical between the two backends, but at the price of a higher running time.
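To see why sorting matters for reproducibility, here is a minimal base R sketch of inverse-CDF sampling with and without a preliminary sort. It is not the code of either backend: the helpers sample_unsorted() and sample_sorted() are hypothetical and only illustrate that the same uniform draw can map to different states depending on the order in which the probabilities are accumulated.

```r
# Hypothetical helpers (not part of mixvlmc): pick a state index from a
# probability vector using a single uniform draw u in (0, 1).

# Accumulate probabilities in their natural order.
sample_unsorted <- function(probs, u) {
  match(TRUE, u <= cumsum(probs))
}

# Accumulate probabilities in decreasing order first, as a sort-based
# sampler would, then map back to the original state index.
sample_sorted <- function(probs, u) {
  ord <- order(probs, decreasing = TRUE)
  ord[match(TRUE, u <= cumsum(probs[ord]))]
}

probs <- c(a = 0.2, b = 0.5, c = 0.3)
u <- 0.1 # the same uniform draw is used for both samplers

names(probs)[sample_unsorted(probs, u)] # "a"
names(probs)[sample_sorted(probs, u)]   # "b"
```

Both samplers draw from the same distribution, but they consume the random stream differently, so identical seeds do not imply identical series.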
Burn in (Warm up) period
When using a VLMC for simulation purposes, we are generally interested in
the stationary distribution of the corresponding Markov chain. To reduce
the dependence of the samples on the initial values and get closer to
this stationary distribution (if it exists), it is recommended to discard
the first samples which are produced in a so-called "burn in" (or "warm
up") period. The burnin
parameter can be used to implement this approach.
The VLMC is used to produce a sample of size burnin + nsim
but the first
burnin
values are discarded. Note that these burn-in values can be
partially given by the init
parameter if it is specified.
If burnin
is set to "auto"
, the burnin
period is set to 64 * context_number(object)
, following the heuristic proposed in Mächler and
Bühlmann (2004).
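The discard step can be sketched in base R on a plain first-order Markov chain. This is an illustration under simplifying assumptions, not the package implementation: simulate_chain() and the transition matrix trans are hypothetical, and a real VLMC uses variable length contexts rather than a single previous state.

```r
# Hypothetical helper: simulate a first-order Markov chain on states 1..k,
# generating burnin + nsim values and keeping only the last nsim.
simulate_chain <- function(trans, nsim, burnin = 0, init = 1) {
  total <- burnin + nsim
  states <- integer(total)
  current <- init
  for (i in seq_len(total)) {
    # Draw the next state from the row of the current state
    current <- sample.int(nrow(trans), 1, prob = trans[current, ])
    states[i] <- current
  }
  # Discard the burn-in prefix
  states[seq_len(nsim) + burnin]
}

trans <- matrix(c(0.9, 0.1,
                  0.2, 0.8), nrow = 2, byrow = TRUE)

set.seed(42)
with_burnin <- simulate_chain(trans, nsim = 10, burnin = 50)

set.seed(42)
full <- simulate_chain(trans, nsim = 60)

# The retained values are exactly the tail of the longer run
identical(with_burnin, full[51:60]) # TRUE
```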
Random seed
This function reproduces the behaviour of stats::simulate()
. If seed
is
NULL
the function does not change the random generator state and returns
the value of .Random.seed as a seed
attribute in the return value. This
can be used to reproduce exactly the simulation results by setting
.Random.seed to this value. Notice that if the random seed has not been
initialised by R so far, the function issues a call to runif(1)
to
perform this initialisation (as is done in stats::simulate()
).
If seed
is an integer, it is used in a call to set.seed()
before the
simulation takes place. The integer is saved as a seed
attribute in the
return value. The integer seed is completed by an attribute kind
which
contains the value as.list(RNGkind())
exactly as with
stats::simulate()
. The random generator state is reset to its original
value at the end of the call.
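The seed handling described above can be sketched with base R only. The function draw_with_seed() below is a hypothetical stand-in for the simulation step. It follows the pattern used by stats::simulate() methods, but it is an illustration and not the code of this package.

```r
# Hypothetical stand-in for a simulation function with a seed argument.
draw_with_seed <- function(n, seed = NULL) {
  # Make sure the random generator has been initialised
  if (!exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE)) {
    runif(1)
  }
  if (is.null(seed)) {
    # Record the current generator state without changing it
    rng_state <- get(".Random.seed", envir = .GlobalEnv)
  } else {
    # Save the state, seed the generator, and restore the state on exit
    saved <- get(".Random.seed", envir = .GlobalEnv)
    set.seed(seed)
    rng_state <- structure(seed, kind = as.list(RNGkind()))
    on.exit(assign(".Random.seed", saved, envir = .GlobalEnv))
  }
  result <- sample(c("a", "b"), n, replace = TRUE)
  attr(result, "seed") <- rng_state
  result
}

# With an integer seed, repeated calls give the same result
first <- draw_with_seed(5, seed = 0)
second <- draw_with_seed(5, seed = 0)
identical(first, second) # TRUE

# With seed = NULL, the saved state can be used to replay the draw
unseeded <- draw_with_seed(5)
assign(".Random.seed", attr(unseeded, "seed"), envir = .GlobalEnv)
replay <- draw_with_seed(5)
identical(unseeded, replay) # TRUE
```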
Extended contexts
As explained in detail in the loglikelihood.vlmc()
documentation and in the
dedicated vignette("likelihood", package = "mixvlmc")
, the first
values of a time series do not in general have a proper context for a VLMC
with a nonzero order. In order to simulate something meaningful for those
values when init
is not provided, we rely on the notion of extended
context defined in the documents mentioned above. This follows the same
logic as using loglikelihood.vlmc()
with the parameter
initial="extended"
. All vlmc functions that need to manipulate initial
values with no proper context use the same approach.
References
Mächler, M. and Bühlmann, P. (2004) "Variable Length Markov Chains: Methodology, Computing, and Software" Journal of Computational and Graphical Statistics, 13 (2), 435-455, doi:10.1198/1061860043524
See also
stats::simulate()
for details and examples on the random number
generator settings.
Examples
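library(mixvlmc)
# Discretise one week of active power into quartile-based classes
pc <- powerconsumption[powerconsumption$week == 5, ]
dts <- cut(pc$active_power, breaks = c(0, quantile(pc$active_power, probs = c(0.25, 0.5, 0.75, 1))))
# Fit a VLMC on the discretised series
model <- vlmc(dts, min_size = 5)
# Simulate 500 values with a fixed seed
new_dts <- simulate(model, 500, seed = 0)
# Same seed, starting from the first five observed values
new_dts_2 <- simulate(model, 500, seed = 0, init = dts[1:5])
# Same seed, discarding a burn-in period of 500 values
new_dts_3 <- simulate(model, 500, seed = 0, burnin = 500)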