Cross validation
In this example, we’ll implement time series cross-validation to evaluate a model’s performance.
Prerequisites
This tutorial assumes basic familiarity with StatsForecast. For a minimal example, visit the Quick Start.
Introduction
Time series cross-validation is a method for evaluating how a model would have performed in the past. It works by defining a sliding window across the historical data and predicting the period following it.
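The sliding-window idea can be sketched in plain Python. This is only an illustration of the concept, not StatsForecast's implementation: the function name sliding_windows is hypothetical, its parameters h, step_size and n_windows are named after the arguments described later in this tutorial, and the windows are assumed to be aligned so that the last one ends at the final observation.

```python
def sliding_windows(n_obs, h, step_size, n_windows):
    """Return (train_end, test_start, test_end) positions for each window.

    Positions are 1-based and inclusive. Assumption: windows are aligned so
    that the last test period ends at the final observation.
    """
    windows = []
    for i in range(n_windows):
        # The earliest window comes first; each later one shifts by step_size.
        test_end = n_obs - (n_windows - 1 - i) * step_size
        train_end = test_end - h  # the cutoff: last observation used for training
        if train_end < 1:
            raise ValueError("not enough observations for the requested windows")
        windows.append((train_end, train_end + 1, test_end))
    return windows


# Toy series of 10 observations, forecasting 2 steps ahead in 3 windows.
for train_end, test_start, test_end in sliding_windows(10, h=2, step_size=2, n_windows=3):
    print(f"train on 1..{train_end}, predict {test_start}..{test_end}")
```

Each window trains only on observations up to its cutoff and is scored on the h observations that follow, which is what makes the evaluation mimic real forecasting.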
StatsForecast has an implementation of time series cross-validation that is fast and easy to use. The implementation runs cross-validation as a distributed operation, which reduces the time it takes. In this notebook, we’ll use it on a subset of the M4 Competition hourly dataset.
Outline:
- Install libraries
- Load and explore data
- Train model
- Perform time series cross-validation
- Evaluate results
Tip
You can use Colab to run this Notebook interactively
Install libraries
We assume that you have StatsForecast already installed. If not, check this guide for instructions on how to install StatsForecast.
Install the necessary packages with pip install statsforecast
Load and explore the data
As stated in the introduction, we’ll use the M4 Competition hourly dataset. We’ll first import the data from a URL using pandas.
| | unique_id | ds | y |
|---|---|---|---|
| 0 | H1 | 1 | 605.0 |
| 1 | H1 | 2 | 586.0 |
| 2 | H1 | 3 | 586.0 |
| 3 | H1 | 4 | 559.0 |
| 4 | H1 | 5 | 511.0 |
The input to StatsForecast is a data frame in long format with three columns: unique_id, ds and y:
- The unique_id (string, int, or category) column represents an identifier for the series.
- The ds (datestamp or int) column should be either an integer indexing time or a datestamp in format YYYY-MM-DD or YYYY-MM-DD HH:MM:SS.
- The y (numeric) column represents the measurement we wish to forecast.
The data in this example already has this format, so no changes are needed.
To keep the time required to execute this notebook to a minimum, we’ll only use one time series from the data, namely the one with unique_id == 'H1'. However, you can use as many as you want, with no additional changes to the code needed.
We can plot the time series we’ll work with using the StatsForecast.plot method.
Train model
For this example, we’ll use StatsForecast’s AutoETS model. We first need to import it from statsforecast.models and then instantiate a new StatsForecast object.
The StatsForecast object has the following parameters:
- models: a list of models. Select the models you want from models and import them.
- freq: a string indicating the frequency of the data. See pandas’ available frequencies.
- n_jobs: int, number of jobs used in the parallel processing; use -1 for all cores.
Any settings are passed into the constructor. Then you call its fit method and pass in the historical data frame df.
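A minimal sketch of this step follows. The helper name build_forecaster is hypothetical, and season_length=24 is an assumption (a daily cycle in hourly data), since the text does not state which AutoETS settings were used. The imports sit inside the function so the sketch can be read and loaded even where the package is not installed.

```python
def build_forecaster(freq, season_length=24, n_jobs=-1):
    """Create a StatsForecast object holding a single AutoETS model.

    Assumption: season_length=24 (a daily cycle in hourly data); the
    tutorial text does not specify the AutoETS settings.
    """
    # Imported lazily so this sketch loads without the package installed.
    from statsforecast import StatsForecast
    from statsforecast.models import AutoETS

    models = [AutoETS(season_length=season_length)]
    # n_jobs=-1 asks for all available cores, as described in the list above.
    return StatsForecast(models=models, freq=freq, n_jobs=n_jobs)
```

With the package installed, calling build_forecaster with a suitable freq and then calling fit(df) on the result would fit the model to the historical data. The right value for freq depends on how ds is encoded and may differ between library versions.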
Perform time series cross-validation
Once the StatsForecast object has been instantiated, we can use the cross_validation method, which takes the following arguments:
- df: training data frame with StatsForecast format
- h (int): represents the h steps into the future that will be forecasted
- step_size (int): step size between each window, meaning how often you want to run the forecasting process.
- n_windows (int): number of windows used for cross-validation, meaning the number of forecasting processes in the past you want to evaluate.
For this particular example, we’ll use 3 windows of 24 hours.
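The call for this configuration might look like the sketch below. It assumes sf is an already instantiated StatsForecast object and df is the long-format data frame; run_cross_validation is a hypothetical wrapper, and step_size=24 is an assumption chosen so that the three 24-hour windows do not overlap.

```python
def run_cross_validation(sf, df, h=24, step_size=24, n_windows=3):
    """Run rolling-window cross-validation on an instantiated forecaster.

    Assumptions: `sf` exposes a cross_validation method with these keyword
    arguments, and step_size=24 keeps the 24-hour windows non-overlapping.
    """
    return sf.cross_validation(
        df=df,
        h=h,                  # forecast 24 hours ahead in each window
        step_size=step_size,  # move the cutoff forward 24 hours each time
        n_windows=n_windows,  # evaluate 3 past forecasting runs
    )
```

Setting step_size smaller than h would produce overlapping test periods, which gives more evaluation points at the cost of more model fits.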
The crossvalidation_df object is a new data frame that includes the following columns:
- unique_id: index. If you don’t like working with the index, just run crossvalidation_df.reset_index()
- ds: datestamp or temporal index
- cutoff: the last datestamp or temporal index for the n_windows.
- y: true value
- "model": columns with the model’s name and fitted value.
| unique_id | ds | cutoff | y | AutoETS |
|---|---|---|---|---|
| H1 | 677 | 676 | 691.0 | 677.761047 |
| H1 | 678 | 676 | 618.0 | 607.817871 |
| H1 | 679 | 676 | 563.0 | 569.437744 |
| H1 | 680 | 676 | 529.0 | 537.340027 |
| H1 | 681 | 676 | 504.0 | 515.571106 |
We’ll now plot the forecast for each cutoff period. To make the plots clearer, we’ll rename the actual values in each period.
Notice that in each cutoff period, we generated a forecast for the next 24 hours using only the data y before said period.
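This property can be checked directly on the cross-validation output. The sketch below uses only the five rows displayed in the output table above and assumes an integer ds, as in this dataset; the helper is_out_of_sample is a hypothetical name introduced for illustration.

```python
# Rows copied from the output table above: (ds, cutoff, y, AutoETS).
rows = [
    (677, 676, 691.0, 677.761047),
    (678, 676, 618.0, 607.817871),
    (679, 676, 563.0, 569.437744),
    (680, 676, 529.0, 537.340027),
    (681, 676, 504.0, 515.571106),
]

H = 24  # forecast horizon used in this example


def is_out_of_sample(ds, cutoff, h=H):
    """True when ds falls within the h steps immediately after the cutoff."""
    return cutoff < ds <= cutoff + h


# Every forecasted timestamp must come strictly after its cutoff.
assert all(is_out_of_sample(ds, cutoff) for ds, cutoff, _, _ in rows)

# How many steps ahead of the cutoff each displayed forecast is.
steps_ahead = [ds - cutoff for ds, cutoff, _, _ in rows]
print(steps_ahead)
```

A row whose ds is at or before its cutoff would indicate that the model was scored on data it had already seen.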
Evaluate results
We can now compute the accuracy of the forecast using an appropriate accuracy metric. Here we’ll use the Root Mean Squared Error (RMSE). To do this, we first need to install datasetsforecast, a Python library developed by Nixtla that includes a function to compute the RMSE.
The function to compute the RMSE takes two arguments:
- The actual values.
- The forecasts, in this case, AutoETS.
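To make the metric concrete, here is a plain-Python sketch of the RMSE formula. It is not the library's function, only an equivalent calculation written out by hand, and it is applied to the five rows displayed in the cross-validation table rather than to the full result, so the number it prints is not the tutorial's overall score.

```python
import math


def rmse(actual, forecast):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# The five rows shown in the cross-validation table (not the full result).
y_true = [691.0, 618.0, 563.0, 529.0, 504.0]
y_pred = [677.761047, 607.817871, 569.437744, 537.340027, 515.571106]

print(f"RMSE over the five displayed rows: {rmse(y_true, y_pred):.3f}")
```

Because the errors are squared before averaging, large misses weigh more heavily than small ones, and the result is expressed in the same units as y.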
This measure should better reflect the predictive abilities of our model, since it used different time periods to test its accuracy.
Tip
Cross validation is especially useful when comparing multiple models. Here’s an example with multiple models and time series.