The StatsForecast methods for fit, predict, forecast (fast), cross-validation and plotting are:
StatsForecast.fit
StatsForecast.predict
StatsForecast.forecast
StatsForecast.cross_validation
StatsForecast.plot
*The StatsForecast class allows you to efficiently fit multiple StatsForecast models for large sets of time series. It operates on a DataFrame df with at least three columns: ids, times and targets. The class has a memory-efficient StatsForecast.forecast method that avoids storing partial model outputs, while the StatsForecast.fit and StatsForecast.predict methods, which follow the Scikit-learn interface, store the fitted models. The StatsForecast class offers parallelization utilities with Dask, Spark and Ray back-ends. See distributed computing example here.*
*Fit statistical models. Fit models to a large set of time series from DataFrame df and store fitted models for later inspection.*
Parameter | Type | Default | Details
---|---|---|---
df | Union | | DataFrame with ids, times, targets and exogenous.
prediction_intervals | Optional | None | Configuration to calibrate prediction intervals (Conformal Prediction).
id_col | str | unique_id | Column that identifies each series.
time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers.
target_col | str | y | Column that contains the target.
Returns | StatsForecast | | Returns with stored StatsForecast fitted models.
*Predict statistical models. Use stored fitted models to predict a large set of time series from DataFrame df.*
Parameter | Type | Default | Details
---|---|---|---
h | int | | Forecast horizon.
X_df | Union | None | DataFrame with ids, times and future exogenous.
level | Optional | None | Confidence levels between 0 and 100 for prediction intervals.
Returns | pandas or polars DataFrame | | DataFrame with models columns for point predictions and probabilistic predictions for all fitted models.
*Fit and Predict with statistical models. This method avoids the memory burden of object storage. It is analogous to Scikit-Learn fit_predict without storing information. It requires the forecast horizon h in advance. In contrast to StatsForecast.forecast, this method stores partial model outputs.*
Parameter | Type | Default | Details
---|---|---|---
h | int | | Forecast horizon.
df | Union | | DataFrame with ids, times, targets and exogenous.
X_df | Union | None | DataFrame with ids, times and future exogenous.
level | Optional | None | Confidence levels between 0 and 100 for prediction intervals.
prediction_intervals | Optional | None | Configuration to calibrate prediction intervals (Conformal Prediction).
id_col | str | unique_id | Column that identifies each series.
time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers.
target_col | str | y | Column that contains the target.
Returns | Union | | DataFrame with models columns for point predictions and probabilistic predictions for all fitted models.
*Memory Efficient predictions. This method avoids the memory burden of object storage. It is analogous to Scikit-Learn fit_predict without storing information. It requires the forecast horizon h in advance.*
Parameter | Type | Default | Details
---|---|---|---
h | int | | Forecast horizon.
df | Union | | DataFrame with ids, times, targets and exogenous.
X_df | Union | None | DataFrame with ids, times and future exogenous.
level | Optional | None | Confidence levels between 0 and 100 for prediction intervals.
fitted | bool | False | Store in-sample predictions.
prediction_intervals | Optional | None | Configuration to calibrate prediction intervals (Conformal Prediction).
id_col | str | unique_id | Column that identifies each series.
time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers.
target_col | str | y | Column that contains the target.
Returns | Union | | DataFrame with models columns for point predictions and probabilistic predictions for all fitted models.
*Access insample predictions. After executing StatsForecast.forecast, you can access the insample prediction values for each model. To get them, you need to pass fitted=True to the StatsForecast.forecast method and then use the StatsForecast.forecast_fitted_values method.*
*Temporal Cross-Validation. Efficiently fits a list of StatsForecast models through multiple training windows, in either chained or rolled manner. The speed of StatsForecast.models makes it possible to overcome the high computational cost of this evaluation technique. Temporal cross-validation provides better measurements of a model's generalization by increasing the length and diversity of the test set.*
Parameter | Type | Default | Details
---|---|---|---
h | int | | Forecast horizon.
df | Union | | DataFrame with ids, times, targets and exogenous.
n_windows | int | 1 | Number of windows used for cross validation.
step_size | int | 1 | Step size between each window.
test_size | Optional | None | Length of the test set. If passed, set n_windows=None.
input_size | Optional | None | Input size for each window. If not None, rolled windows are used.
level | Optional | None | Confidence levels between 0 and 100 for prediction intervals.
fitted | bool | False | Store in-sample predictions.
refit | Union | True | Whether or not to refit the model for each window. If int, train the models every refit windows.
prediction_intervals | Optional | None | Configuration to calibrate prediction intervals (Conformal Prediction).
id_col | str | unique_id | Column that identifies each series.
time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers.
target_col | str | y | Column that contains the target.
Returns | Union | | DataFrame with insample models columns for point predictions and probabilistic predictions for all fitted models.
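To make the interaction of h, n_windows, step_size and input_size concrete, here is a plain-Python sketch of how the training and test windows could be laid out over a single series. It is an illustration and not the library's implementation: the helper cv_windows is hypothetical, and it assumes that the windows are aligned to the end of the series, so that the total test span is h + step_size * (n_windows - 1).

```python
def cv_windows(n_obs, h, n_windows=1, step_size=1, input_size=None):
    """Return (train_start, train_end, test_end) index triples, end-exclusive.

    Hypothetical helper: assumes the last test window ends at the final observation.
    """
    test_size = h + step_size * (n_windows - 1)
    if test_size >= n_obs:
        raise ValueError("series too short for the requested windows")
    windows = []
    for i in range(n_windows):
        train_end = n_obs - test_size + i * step_size
        # Chained (expanding) windows start at 0; rolled windows keep the last input_size points.
        train_start = 0 if input_size is None else max(0, train_end - input_size)
        windows.append((train_start, train_end, train_end + h))
    return windows


# Chained windows: the training set grows with every window.
chained = cv_windows(n_obs=20, h=3, n_windows=3, step_size=2)
# -> [(0, 13, 16), (0, 15, 18), (0, 17, 20)]

# Rolled windows: the training set has a fixed length of input_size.
rolled = cv_windows(n_obs=20, h=3, n_windows=3, step_size=2, input_size=10)
# -> [(3, 13, 16), (5, 15, 18), (7, 17, 20)]
```

Under this assumption, passing test_size instead of n_windows describes the same layout from the other direction, since either value can be derived from the other once h and step_size are fixed.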
*Access insample cross validated predictions. After executing StatsForecast.cross_validation, you can access the insample prediction values for each model and window. To get them, you need to pass fitted=True to the StatsForecast.cross_validation method and then use the StatsForecast.cross_validation_fitted_values method.*
Plot forecasts and insample values.
Parameter | Type | Default | Details
---|---|---|---
df | Union | | DataFrame with ids, times, targets and exogenous.
forecasts_df | Union | None | DataFrame with ids, times and models.
unique_ids | Union | None | ids to plot. If None, they're selected randomly.
plot_random | bool | True | Select time series to plot randomly.
models | Optional | None | List of models to plot.
level | Optional | None | List of prediction intervals to plot, if passed.
max_insample_length | Optional | None | Max number of train/insample observations to be plotted.
plot_anomalies | bool | False | Plot anomalies for each prediction interval.
engine | str | matplotlib | Library used to plot: 'plotly', 'plotly-resampler' or 'matplotlib'.
id_col | str | unique_id | Column that identifies each series.
time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers.
target_col | str | y | Column that contains the target.
resampler_kwargs | Optional | None | Kwargs to be passed to the plotly-resampler constructor. For further customization ("show_dash"), call the method, store the plotting object and add the extra arguments to its show_dash method.
Saves the StatsForecast object with certain settings to make it reproducible.
Parameter | Type | Default | Details
---|---|---|---
path | Union | None | Path of the file to be saved. If None, one is created in the current directory using the current UTC timestamp.
max_size | Optional | None | StatsForecast object should not exceed this size. Available byte naming: ['B', 'KB', 'MB', 'GB'].
trim | bool | False | Delete any attributes not needed for inference.
Automatically loads the saved model into a ready-to-use StatsForecast object.
Parameter | Type | Details
---|---|---
path | Union | Path to saved StatsForecast file.
Returns | sf: StatsForecast | Previously saved StatsForecast.
The StatsForecast class can also receive integers as datestamps; the following example shows how to do it.
StatsForecast.forecast
method.
Pass level to the StatsForecast.forecast method to calculate prediction intervals. Not all models can calculate them at the moment, so intervals are only returned for the models that implement them.