In this notebook, you will make forecasts for the M5 dataset choosing the best model for each time series using cross validation.
SeasonalNaive
and
HistoricAverage
models for this category.CrostonOptimized
,
IMAPA
,
and
ADIDA
.
These models are particularly suited for handling zero-inflated
series.AutoETS
model from the statsforecast library falls under this category.LightGBM
, XGBoost
, and
LinearRegression
can be advantageous due to their capacity to uncover
intricate patterns in data. We’ll use the MLForecast library for this
purpose.
NeuralForecast
Deep Learning: DL models, such as Transformers (AutoTFT
) and Neural
Networks (AutoNHITS
), allow us to handle complex non-linear
dependencies in time series data. We’ll utilize the NeuralForecast
library for these models.
Using the Nixtla suite of libraries, we’ll be able to drive our model
selection process with data, ensuring we utilize the most suitable
models for specific groups of series in our dataset.
Outline:
Warning
This tutorial was originally executed using a c5d.24xlarge
EC2
instance.
30,490
bottom time series.
StatsForecast
class. This method prints 8 random series from the dataset and is useful
for basic
EDA.
StatsForecast
is a comprehensive library providing a suite of popular univariate time
series forecasting models, all designed with a focus on high performance
and scalability.
Here’s what makes StatsForecast a powerful tool for time series
forecasting:
StatsForecast
receives a list of models to fit each time series. Since we are dealing
with Daily data, it would be benefitial to use 7 as seasonality.
models
: a list of models. Select the models you want from models
and import them.freq
: a string indicating the frequency of the data. (See panda’s
available frequencies.)n_jobs
: int, number of jobs used in the parallel processing, use
-1 for all cores.fallback_model
: a model to be used if a model fails. Any settings
are passed into the constructor. Then you call its fit method and
pass in the historical data frame.h
(int): represents the forecast h steps into the future. In this
case, 12 months ahead.level
(list of floats): this optional parameter is used for
probabilistic forecasting. Set the level (or confidence percentile)
of your prediction interval. For example, level=[90] means that
the model expects the real value to be inside that interval 90% of
the times.ds | SeasonalNaive | SeasonalNaive-lo-90 | SeasonalNaive-hi-90 | Naive | Naive-lo-90 | Naive-hi-90 | HistoricAverage | HistoricAverage-lo-90 | HistoricAverage-hi-90 | CrostonOptimized | ADIDA | IMAPA | AutoETS | AutoETS-lo-90 | AutoETS-hi-90 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
unique_id | ||||||||||||||||
FOODS_3_001_CA_1 | 2016-05-23 | 1.0 | -2.847174 | 4.847174 | 2.0 | 0.098363 | 3.901637 | 0.448738 | -1.009579 | 1.907055 | 0.345192 | 0.345477 | 0.347249 | 0.381414 | -1.028122 | 1.790950 |
FOODS_3_001_CA_1 | 2016-05-24 | 0.0 | -3.847174 | 3.847174 | 2.0 | -0.689321 | 4.689321 | 0.448738 | -1.009579 | 1.907055 | 0.345192 | 0.345477 | 0.347249 | 0.286933 | -1.124136 | 1.698003 |
FOODS_3_001_CA_1 | 2016-05-25 | 0.0 | -3.847174 | 3.847174 | 2.0 | -1.293732 | 5.293732 | 0.448738 | -1.009579 | 1.907055 | 0.345192 | 0.345477 | 0.347249 | 0.334987 | -1.077614 | 1.747588 |
FOODS_3_001_CA_1 | 2016-05-26 | 1.0 | -2.847174 | 4.847174 | 2.0 | -1.803274 | 5.803274 | 0.448738 | -1.009579 | 1.907055 | 0.345192 | 0.345477 | 0.347249 | 0.186851 | -1.227280 | 1.600982 |
FOODS_3_001_CA_1 | 2016-05-27 | 0.0 | -3.847174 | 3.847174 | 2.0 | -2.252190 | 6.252190 | 0.448738 | -1.009579 | 1.907055 | 0.345192 | 0.345477 | 0.347249 | 0.308112 | -1.107548 | 1.723771 |
MLForecast
is a powerful library that provides automated feature
creation for time series forecasting, facilitating the use of global
machine learning models. It is designed for high performance and
scalability.
Key features of MLForecast include:
MLForecast
for time series forecasting, we instantiate a new
MLForecast
object and provide it with various parameters to tailor the
modeling process to our specific needs:
models
: This parameter accepts a list of machine learning models
you wish to use for forecasting. You can import your preferred
models from scikit-learn, lightgbm and xgboost.
freq
: This is a string indicating the frequency of your data
(hourly, daily, weekly, etc.). The specific format of this string
should align with pandas’ recognized frequency strings.
target_transforms
: These are transformations applied to the target
variable before model training and after model prediction. This can
be useful when working with data that may benefit from
transformations, such as log-transforms for highly skewed data.
lags
: This parameter accepts specific lag values to be used as
regressors. Lags represent how many steps back in time you want to
look when creating features for your model. For example, if you want
to use the previous day’s data as a feature for predicting today’s
value, you would specify a lag of 1.
lags_transforms
: These are specific transformations for each lag.
This allows you to apply transformations to your lagged features.
date_features
: This parameter specifies date-related features to
be used as regressors. For instance, you might want to include the
day of the week or the month as a feature in your model.
num_threads
: This parameter controls the number of threads to use
for parallelizing feature creation, helping to speed up this process
when working with large datasets.
MLForecast
constructor. Once the
MLForecast
object is initialized with these settings, we call its
fit
method and pass the historical data frame as the argument. The
fit
method trains the models on the provided historical data, readying
them for future forecasting tasks.
fit
models to train the select models. In this case we
are generating conformal prediction intervals.
predict
to generate forecasts.
unique_id | ds | LGBMRegressor | XGBRegressor | LinearRegression | LGBMRegressor-lo-90 | LGBMRegressor-hi-90 | XGBRegressor-lo-90 | XGBRegressor-hi-90 | LinearRegression-lo-90 | LinearRegression-hi-90 | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | FOODS_3_001_CA_1 | 2016-05-23 | 0.549520 | 0.598431 | 0.359638 | -0.213915 | 1.312955 | -0.020050 | 1.216912 | 0.030000 | 0.689277 |
1 | FOODS_3_001_CA_1 | 2016-05-24 | 0.553196 | 0.337268 | 0.100361 | -0.251383 | 1.357775 | -0.201449 | 0.875985 | -0.216195 | 0.416917 |
2 | FOODS_3_001_CA_1 | 2016-05-25 | 0.599668 | 0.349604 | 0.175840 | -0.203974 | 1.403309 | -0.284416 | 0.983624 | -0.150593 | 0.502273 |
3 | FOODS_3_001_CA_1 | 2016-05-26 | 0.638097 | 0.322144 | 0.156460 | 0.118688 | 1.157506 | -0.085872 | 0.730160 | -0.273851 | 0.586771 |
4 | FOODS_3_001_CA_1 | 2016-05-27 | 0.763305 | 0.300362 | 0.328194 | -0.313091 | 1.839701 | -0.296636 | 0.897360 | -0.657089 | 1.313476 |
NeuralForecast
is a robust collection of neural forecasting models
that focuses on usability and performance. It includes a variety of
model architectures, from classic networks such as Multilayer
Perceptrons (MLP) and Recurrent Neural Networks (RNN) to novel
contributions like N-BEATS, N-HITS, Temporal Fusion Transformers (TFT),
and more.
Key features of NeuralForecast
include:
unique_id | ds | AutoNHITS | AutoNHITS-lo-90 | AutoNHITS-hi-90 | AutoTFT | AutoTFT-lo-90 | AutoTFT-hi-90 | |
---|---|---|---|---|---|---|---|---|
0 | FOODS_3_001_CA_1 | 2016-05-23 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 2.0 |
1 | FOODS_3_001_CA_1 | 2016-05-24 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 2.0 |
2 | FOODS_3_001_CA_1 | 2016-05-25 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 1.0 |
3 | FOODS_3_001_CA_1 | 2016-05-26 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 2.0 |
4 | FOODS_3_001_CA_1 | 2016-05-27 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 2.0 |
unique_id | ds | SeasonalNaive | SeasonalNaive-lo-90 | SeasonalNaive-hi-90 | Naive | Naive-lo-90 | Naive-hi-90 | HistoricAverage | HistoricAverage-lo-90 | … | AutoTFT-hi-90 | LGBMRegressor | XGBRegressor | LinearRegression | LGBMRegressor-lo-90 | LGBMRegressor-hi-90 | XGBRegressor-lo-90 | XGBRegressor-hi-90 | LinearRegression-lo-90 | LinearRegression-hi-90 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | FOODS_3_001_CA_1 | 2016-05-23 | 1.0 | -2.847174 | 4.847174 | 2.0 | 0.098363 | 3.901637 | 0.448738 | -1.009579 | … | 2.0 | 0.549520 | 0.598431 | 0.359638 | -0.213915 | 1.312955 | -0.020050 | 1.216912 | 0.030000 | 0.689277 |
1 | FOODS_3_001_CA_1 | 2016-05-24 | 0.0 | -3.847174 | 3.847174 | 2.0 | -0.689321 | 4.689321 | 0.448738 | -1.009579 | … | 2.0 | 0.553196 | 0.337268 | 0.100361 | -0.251383 | 1.357775 | -0.201449 | 0.875985 | -0.216195 | 0.416917 |
2 | FOODS_3_001_CA_1 | 2016-05-25 | 0.0 | -3.847174 | 3.847174 | 2.0 | -1.293732 | 5.293732 | 0.448738 | -1.009579 | … | 1.0 | 0.599668 | 0.349604 | 0.175840 | -0.203974 | 1.403309 | -0.284416 | 0.983624 | -0.150593 | 0.502273 |
3 | FOODS_3_001_CA_1 | 2016-05-26 | 1.0 | -2.847174 | 4.847174 | 2.0 | -1.803274 | 5.803274 | 0.448738 | -1.009579 | … | 2.0 | 0.638097 | 0.322144 | 0.156460 | 0.118688 | 1.157506 | -0.085872 | 0.730160 | -0.273851 | 0.586771 |
4 | FOODS_3_001_CA_1 | 2016-05-27 | 0.0 | -3.847174 | 3.847174 | 2.0 | -2.252190 | 6.252190 | 0.448738 | -1.009579 | … | 2.0 | 0.763305 | 0.300362 | 0.328194 | -0.313091 | 1.839701 | -0.296636 | 0.897360 | -0.657089 | 1.313476 |
StatsForecast
,
MLForecast
, and NeuralForecast
- offer out-of-the-box
cross-validation capabilities specifically designed for time series.
This allows us to evaluate the model’s performance using historical data
to obtain an unbiased assessment of how well each model is likely to
perform on unseen data.
cross_validation
method from the
StatsForecast
class accepts the following arguments:
df
: A DataFrame representing the training data.h
(int): The forecast horizon, represented as the number of steps
into the future that we wish to predict. For example, if we’re
forecasting hourly data, h=24
would represent a 24-hour forecast.step_size
(int): The step size between each cross-validation
window. This parameter determines how often we want to run the
forecasting process.n_windows
(int): The number of windows used for cross validation.
This parameter defines how many past forecasting processes we want
to evaluate.unique_id
index: (If you dont like working with index just run
forecasts_cv_df.resetindex())ds
: datestamp or temporal indexcutoff
: the last datestamp or temporal index for the n_windows. If
n_windows=1, then one unique cuttoff value, if n_windows=2 then two
unique cutoff values.y
: true value"model"
: columns with the model’s name and fitted value.ds | cutoff | y | SeasonalNaive | SeasonalNaive-lo-90 | SeasonalNaive-hi-90 | Naive | Naive-lo-90 | Naive-hi-90 | HistoricAverage | HistoricAverage-lo-90 | HistoricAverage-hi-90 | CrostonOptimized | ADIDA | IMAPA | AutoETS | AutoETS-lo-90 | AutoETS-hi-90 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
unique_id | ||||||||||||||||||
FOODS_3_001_CA_1 | 2016-02-29 | 2016-02-28 | 0.0 | 2.0 | -1.878885 | 5.878885 | 0.0 | -1.917011 | 1.917011 | 0.449111 | -1.021813 | 1.920036 | 0.618472 | 0.618375 | 0.617998 | 0.655286 | -0.765731 | 2.076302 |
FOODS_3_001_CA_1 | 2016-03-01 | 2016-02-28 | 1.0 | 0.0 | -3.878885 | 3.878885 | 0.0 | -2.711064 | 2.711064 | 0.449111 | -1.021813 | 1.920036 | 0.618472 | 0.618375 | 0.617998 | 0.568595 | -0.853966 | 1.991155 |
FOODS_3_001_CA_1 | 2016-03-02 | 2016-02-28 | 1.0 | 0.0 | -3.878885 | 3.878885 | 0.0 | -3.320361 | 3.320361 | 0.449111 | -1.021813 | 1.920036 | 0.618472 | 0.618375 | 0.617998 | 0.618805 | -0.805298 | 2.042908 |
FOODS_3_001_CA_1 | 2016-03-03 | 2016-02-28 | 0.0 | 1.0 | -2.878885 | 4.878885 | 0.0 | -3.834023 | 3.834023 | 0.449111 | -1.021813 | 1.920036 | 0.618472 | 0.618375 | 0.617998 | 0.455891 | -0.969753 | 1.881534 |
FOODS_3_001_CA_1 | 2016-03-04 | 2016-02-28 | 0.0 | 1.0 | -2.878885 | 4.878885 | 0.0 | -4.286568 | 4.286568 | 0.449111 | -1.021813 | 1.920036 | 0.618472 | 0.618375 | 0.617998 | 0.591197 | -0.835987 | 2.018380 |
cross_validation
method from the MLForecast
class takes the following arguments.
data
: training data framewindow_size
(int): represents h steps into the future that are
being forecasted. In this case, 24 hours ahead.step_size
(int): step size between each window. In other words:
how often do you want to run the forecasting processes.n_windows
(int): number of windows used for cross-validation. In
other words: what number of forecasting processes in the past do you
want to evaluate.prediction_intervals
: class to compute conformal intervals.unique_id
index: (If you dont like working with index just run
forecasts_cv_df.resetindex())ds
: datestamp or temporal indexcutoff
: the last datestamp or temporal index for the n_windows. If
n_windows=1, then one unique cuttoff value, if n_windows=2 then two
unique cutoff values.y
: true value"model"
: columns with the model’s name and fitted value.unique_id | ds | cutoff | y | LGBMRegressor | XGBRegressor | LinearRegression | |
---|---|---|---|---|---|---|---|
0 | FOODS_3_001_CA_1 | 2016-02-29 | 2016-02-28 | 0.0 | 0.435674 | 0.556261 | -0.312492 |
1 | FOODS_3_001_CA_1 | 2016-03-01 | 2016-02-28 | 1.0 | 0.639676 | 0.625806 | -0.041924 |
2 | FOODS_3_001_CA_1 | 2016-03-02 | 2016-02-28 | 1.0 | 0.792989 | 0.659650 | 0.263699 |
3 | FOODS_3_001_CA_1 | 2016-03-03 | 2016-02-28 | 0.0 | 0.806868 | 0.535121 | 0.482491 |
4 | FOODS_3_001_CA_1 | 2016-03-04 | 2016-02-28 | 0.0 | 0.829106 | 0.313353 | 0.677326 |
unique_id | ds | cutoff | AutoNHITS | AutoNHITS-lo-90 | AutoNHITS-hi-90 | AutoTFT | AutoTFT-lo-90 | AutoTFT-hi-90 | y | |
---|---|---|---|---|---|---|---|---|---|---|
0 | FOODS_3_001_CA_1 | 2016-02-29 | 2016-02-28 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 2.0 | 0.0 |
1 | FOODS_3_001_CA_1 | 2016-03-01 | 2016-02-28 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 2.0 | 1.0 |
2 | FOODS_3_001_CA_1 | 2016-03-02 | 2016-02-28 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 2.0 | 1.0 |
3 | FOODS_3_001_CA_1 | 2016-03-03 | 2016-02-28 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 2.0 | 0.0 |
4 | FOODS_3_001_CA_1 | 2016-03-04 | 2016-02-28 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 2.0 | 0.0 |
dask
to parallelize the evaluation. The
parallelization will be done using fugue
.
evaluate
function receives a unique combination of a time series
and a window, and calculates different metrics
for each model in df
.
dask
client.
transform
function takes the evaluate
functions and applies it
to each combination of time series (unique_id
) and cross validation
window (cutoff
) using the dask
client we created before.
unique_id | cutoff | metric | SeasonalNaive | Naive | HistoricAverage | CrostonOptimized | ADIDA | IMAPA | AutoETS | AutoNHITS | AutoTFT | LGBMRegressor | XGBRegressor | LinearRegression | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | FOODS_3_003_WI_3 | 2016-02-28 | mse | 1.142857 | 1.142857 | 0.816646 | 0.816471 | 1.142857 | 1.142857 | 1.142857 | 1.142857 | 1.142857 | 0.832010 | 1.020361 | 0.887121 |
1 | FOODS_3_003_WI_3 | 2016-02-28 | mae | 0.571429 | 0.571429 | 0.729592 | 0.731261 | 0.571429 | 0.571429 | 0.571429 | 0.571429 | 0.571429 | 0.772788 | 0.619949 | 0.685413 |
2 | FOODS_3_003_WI_3 | 2016-02-28 | smape | 71.428574 | 71.428574 | 158.813507 | 158.516235 | 200.000000 | 200.000000 | 200.000000 | 71.428574 | 71.428574 | 145.901947 | 188.159164 | 178.883743 |
3 | FOODS_3_013_CA_3 | 2016-04-24 | mse | 4.000000 | 6.214286 | 2.406764 | 3.561202 | 2.267853 | 2.267600 | 2.268677 | 2.750000 | 2.125000 | 2.160508 | 2.370228 | 2.289606 |
4 | FOODS_3_013_CA_3 | 2016-04-24 | mae | 1.500000 | 2.142857 | 1.214286 | 1.340446 | 1.214286 | 1.214286 | 1.214286 | 1.107143 | 1.142857 | 1.140084 | 1.157548 | 1.148813 |
SeasonalNaive | Naive | HistoricAverage | CrostonOptimized | ADIDA | IMAPA | AutoETS | AutoNHITS | AutoTFT | LGBMRegressor | XGBRegressor | LinearRegression | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
cutoff | metric | ||||||||||||
2016-02-28 | mae | 1.744289 | 2.040496 | 1.730704 | 1.633017 | 1.527965 | 1.528772 | 1.497553 | 1.434938 | 1.485419 | 1.688403 | 1.514102 | 1.576320 |
mse | 14.510710 | 19.080585 | 12.858994 | 11.785032 | 11.114497 | 11.100909 | 10.347847 | 10.010982 | 10.964664 | 10.436206 | 10.968788 | 10.792831 | |
smape | 85.202042 | 87.719086 | 125.418488 | 124.749908 | 127.591858 | 127.704102 | 127.790672 | 79.132614 | 80.983368 | 118.489983 | 140.420578 | 127.043137 | |
2016-03-27 | mae | 1.795973 | 2.106449 | 1.754029 | 1.662087 | 1.570701 | 1.572741 | 1.535301 | 1.432412 | 1.502393 | 1.712493 | 1.600193 | 1.601612 |
mse | 14.810259 | 26.044472 | 12.804104 | 12.020620 | 12.083861 | 12.120033 | 11.315013 | 9.445867 | 10.762877 | 10.723589 | 12.924312 | 10.943772 | |
smape | 87.407471 | 89.453247 | 123.587196 | 123.460030 | 123.428459 | 123.538521 | 123.612991 | 79.926781 | 82.013168 | 116.089699 | 138.885941 | 127.304871 | |
2016-04-24 | mae | 1.785983 | 1.990774 | 1.762506 | 1.609268 | 1.527627 | 1.529721 | 1.501820 | 1.447401 | 1.505127 | 1.692946 | 1.541845 | 1.590985 |
mse | 13.476350 | 16.234917 | 13.151311 | 10.647048 | 10.072225 | 10.062395 | 9.393439 | 9.363891 | 10.436214 | 10.347073 | 10.774202 | 10.608137 | |
smape | 89.238815 | 90.685867 | 121.124947 | 119.721245 | 120.325401 | 120.345284 | 120.649582 | 81.402748 | 83.614029 | 113.334198 | 136.755234 | 124.618622 |
model | MSE |
---|---|
MQCNN | 10.09 |
DeepAR-student_t | 10.11 |
DeepAR-lognormal | 30.20 |
DeepAR | 9.13 |
NPTS | 11.53 |
num_samples
in the neural auto models.Theta
,
ARIMA
,
RNN
, LSTM
, …