Amazon Forecast vs StatsForecast
Amazon’s AutoML vs open source statistical methods
Data
We will make use of the M5 competition dataset provided by Walmart. This dataset is interesting not only for its scale but also because it features many time series with infrequent occurrences. Such time series are common in retail scenarios and are difficult for traditional forecasting techniques to address.
The data are ready for download at the following URLs:
- Train set:
https://m5-benchmarks.s3.amazonaws.com/data/train/target.parquet
- Temporal exogenous variables (used by AmazonForecast):
https://m5-benchmarks.s3.amazonaws.com/data/train/temporal.parquet
- Static exogenous variables (used by AmazonForecast):
https://m5-benchmarks.s3.amazonaws.com/data/train/static.parquet
A more detailed description of the data can be found here.
Warning
The M5 competition is hierarchical. That is, forecasts are required at different levels of aggregation: national, state, store, etc. In this experiment, we only generate forecasts using the bottom-level data. The evaluation then uses the bottom-up reconciliation method to obtain the forecasts for the higher levels of the hierarchy.
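Bottom-up reconciliation simply sums the bottom-level forecasts to produce the forecasts of every aggregate. A minimal sketch with pandas (the column names here are hypothetical):

```python
import pandas as pd

# Hypothetical bottom-level forecasts: one row per store.
bottom = pd.DataFrame({
    "state": ["CA", "CA", "TX"],
    "store": ["CA_1", "CA_2", "TX_1"],
    "y_hat": [3.0, 2.0, 5.0],
})

# Bottom-up reconciliation: the state-level forecast is the sum of the
# forecasts of the stores it contains.
state_level = bottom.groupby("state", as_index=False)["y_hat"].sum()
print(state_level)  # CA -> 5.0, TX -> 5.0
```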
Amazon Forecast
Amazon Forecast is a fully automated solution for time series forecasting. The solution can take the time series to forecast and exogenous variables (temporal and static). For this experiment, we used the AutoPredict functionality of Amazon Forecast following the steps of this tutorial. A detailed description of the particular steps for this dataset can be found here.
Amazon Forecast creates predictors with AutoPredictor, which involves applying the optimal combination of algorithms to each time series in your datasets. The predictor is an Amazon Forecast model that is trained using your target time series, related time series, item metadata, and any additional datasets you include.
Included algorithms range from commonly used statistical algorithms like Autoregressive Integrated Moving Average (ARIMA) to complex neural network algorithms like CNN-QR and DeepAR+. The full list is: CNN-QR, DeepAR+, Prophet, NPTS, ARIMA, and ETS.
To leverage the probabilistic features of Amazon Forecast and enable confidence intervals for further analysis, we forecast the following quantiles: 0.1, 0.5, and 0.9.
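For reference, a predictor with these quantiles could be created through the boto3 API roughly as follows. This is a sketch, not the exact configuration used in the experiment; the predictor name and dataset group ARN are placeholders:

```python
import boto3

forecast = boto3.client("forecast")

# Create an AutoPredictor for the 28-day M5 horizon at daily frequency,
# requesting the 0.1, 0.5 and 0.9 quantiles.
forecast.create_auto_predictor(
    PredictorName="m5_auto_predictor",  # placeholder name
    ForecastHorizon=28,
    ForecastFrequency="D",
    ForecastTypes=["0.1", "0.5", "0.9"],
    DataConfig={"DatasetGroupArn": "arn:aws:forecast:..."},  # placeholder ARN
)
```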
The full pipeline of Amazon Forecast took 4.1 hours and the results can be found here: s3://m5-benchmarks/forecasts/amazonforecast-m5.parquet
Nixtla’s StatsForecast
Install necessary libraries
We assume you have StatsForecast already installed. Check this guide for instructions on how to install StatsForecast.
Additionally, we will install s3fs to read from the AWS S3 filesystem. (If you don’t want to use a cloud storage provider, you can read your files locally using pandas.)
Input format
We will use pandas to read the data set, which is stored in a parquet file for efficiency. You can use ordinary pandas operations to read your data in other formats like .csv.
The input to StatsForecast is always a data frame in long format with three columns: `unique_id`, `ds` and `y`:

- `unique_id` (string, int or category) represents an identifier for the series.
- `ds` (datestamp) should be of a format expected by pandas, ideally YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp.
- `y` (numeric) represents the measurement we wish to forecast.

So we will rename the original columns to make them compatible with StatsForecast.
Depending on your internet connection, this step should take around 20 seconds.
Warning
We are reading a file from S3, so you need to install the s3fs library. To install it, run
! pip install s3fs
Read data
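A minimal sketch of this step, assuming the original columns in the parquet file are named `item_id`, `timestamp` and `demand`:

```python
import pandas as pd

# Read the bottom-level train set from S3 (this is what requires s3fs)
# and rename the columns to the unique_id, ds, y convention.
Y_df = pd.read_parquet("s3://m5-benchmarks/data/train/target.parquet")
Y_df = Y_df.rename(
    columns={"item_id": "unique_id", "timestamp": "ds", "demand": "y"}
)
Y_df.head()
```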
|  | unique_id | ds | y |
|---|---|---|---|
| 0 | FOODS_1_001_CA_1 | 2011-01-29 | 3.0 |
| 1 | FOODS_1_001_CA_1 | 2011-01-30 | 0.0 |
| 2 | FOODS_1_001_CA_1 | 2011-01-31 | 0.0 |
| 3 | FOODS_1_001_CA_1 | 2011-02-01 | 1.0 |
| 4 | FOODS_1_001_CA_1 | 2011-02-02 | 4.0 |
Train statistical models
We fit the model by instantiating a new `StatsForecast` object with the following parameters:

- `models`: a list of models. Select the models you want from models and import them. For this example, we will use `AutoETS` and `DynamicOptimizedTheta`. We set `season_length` to 7 because we expect seasonal effects every week. (See: Seasonal periods.)
- `freq`: a string indicating the frequency of the data. (See pandas’ available frequencies.)
- `n_jobs`: int, number of jobs used in the parallel processing, use -1 for all cores.
- `fallback_model`: a model to be used if a model fails.
Any settings are passed into the constructor. Then you call its fit method and pass in the historical data frame.
Note
StatsForecast achieves its blazing speed using JIT compiling through Numba. The first time you call the `StatsForecast` class, the fit method should take around 5 seconds. The second time (once Numba has compiled your settings) it should take less than 0.2s.
- `AutoETS`: Exponential Smoothing model. Automatically selects the best ETS (Error, Trend, Seasonality) model using an information criterion. Ref: `AutoETS`.
- `SeasonalNaive`: memory-efficient Seasonal Naive predictions. Ref: `SeasonalNaive`.
- `DynamicOptimizedTheta`: fits two theta lines to a deseasonalized time series, using different techniques to obtain and combine the two theta lines to produce the final forecasts. Ref: `DynamicOptimizedTheta`.
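Putting this together, the instantiation might look as follows. This is a sketch: using `SeasonalNaive` as the fallback model is an assumption of this example.

```python
from statsforecast import StatsForecast
from statsforecast.models import AutoETS, DynamicOptimizedTheta, SeasonalNaive

# Weekly seasonality (season_length=7), daily data (freq="D"), all cores,
# and a SeasonalNaive fallback in case a model fails on some series.
sf = StatsForecast(
    models=[
        AutoETS(season_length=7),
        DynamicOptimizedTheta(season_length=7),
    ],
    freq="D",
    n_jobs=-1,
    fallback_model=SeasonalNaive(season_length=7),
)
```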
The `forecast` method takes two arguments: `h` (for horizon) and `level`, and produces forecasts for the next `h` periods.

- `h` (int): represents the forecast h steps into the future. In this case, the 28 days of the M5 horizon.
- `level` (list of floats): this optional parameter is used for probabilistic forecasting. Set the `level` (or confidence percentile) of your prediction interval. For example, `level=[90]` means that the model expects the real value to be inside that interval 90% of the times.
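For example, using the `sf` object defined above (in this sketch, `Y_df` is the long-format data frame prepared earlier):

```python
# Forecast the 28-day M5 horizon with a 90% prediction interval.
forecasts_df = sf.forecast(df=Y_df, h=28, level=[90])
forecasts_df.head()
```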
The forecast object here is a new data frame that includes a column with the name of the model and the y hat values, as well as columns for the uncertainty intervals.
Note
The `forecast` method is intended to be compatible with distributed clusters, so it does not store any model parameters. If you want to store the parameters for every model you can use the `fit` and `predict` methods. However, those methods are not defined for distributed engines like Spark, Ray or Dask.
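For completeness, a single-machine sketch of that alternative:

```python
# fit() keeps the fitted models in memory; predict() then produces the
# forecasts. Not available on distributed engines like Spark, Ray or Dask.
sf.fit(df=Y_df)
fitted_forecasts = sf.predict(h=28, level=[90])
```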
Store the results for further evaluation.
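For instance (the file name is arbitrary):

```python
# Persist the forecasts so they can be read back during evaluation.
forecasts_df.to_parquet("statsforecast-m5.parquet")
```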
Evaluation
This section evaluates the performance of `StatsForecast` and Amazon Forecast. To do this, we first need to install datasetsforecast, a Python library developed by Nixtla that includes a large battery of benchmark datasets and evaluation utilities. The library will allow us to calculate the performance of the models using the original evaluation used in the competition.
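As before, the library can be installed from PyPI:

```
! pip install datasetsforecast
```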
The following function will allow us to evaluate a specific model included in the input dataframe. The function is useful for evaluating different models.
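A sketch of such a function, assuming `M5Evaluation.evaluate` from datasetsforecast accepts the forecasts pivoted into wide format (one row per series, one column per day of the horizon) plus a directory where it can download the ground-truth data:

```python
import pandas as pd
from datasetsforecast.m5 import M5Evaluation

def evaluate_forecasts(df: pd.DataFrame, model: str) -> pd.DataFrame:
    """Compute the WRMSSE of one model at every aggregation level."""
    # Pivot the long forecasts into wide format: unique_id x horizon.
    y_hat = df[["unique_id", "ds", model]].set_index(["unique_id", "ds"]).unstack()
    y_hat = y_hat.droplevel(0, axis=1).reset_index()
    evaluation = M5Evaluation.evaluate(y_hat=y_hat, directory="./data")
    evaluation.index = [f"{model}_wrmsse"]
    return evaluation
```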
Now let’s read the forecasts generated for each solution.
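As a sketch (the StatsForecast file name matches the one used when storing the results above):

```python
sf_forecasts = pd.read_parquet("statsforecast-m5.parquet")
af_forecasts = pd.read_parquet(
    "s3://m5-benchmarks/forecasts/amazonforecast-m5.parquet"
)
```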
Finally, let’s use our predefined function to compute the performance of each model.
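Roughly as follows; the model column names (including `p50` for Amazon Forecast’s median forecast) are assumptions about the stored files:

```python
# Evaluate each model and stack the results into a single comparison table.
evaluations = [
    evaluate_forecasts(sf_forecasts, model)
    for model in ["AutoETS", "DynamicOptimizedTheta"]
]
evaluations.append(evaluate_forecasts(af_forecasts, "p50"))
pd.concat(evaluations)
```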
|  | Total | Level1 | Level2 | Level3 | Level4 | Level5 | Level6 | Level7 | Level8 | Level9 | Level10 | Level11 | Level12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| StatsForecast_ThETS_wrmsse | 0.669606 | 0.424331 | 0.515777 | 0.580670 | 0.474098 | 0.552459 | 0.578092 | 0.651079 | 0.642446 | 0.725324 | 1.009390 | 0.967537 | 0.914068 |
| StatsForecast_AutoETS_wrmsse | 0.672404 | 0.430474 | 0.516340 | 0.580736 | 0.482090 | 0.559721 | 0.579939 | 0.655362 | 0.643638 | 0.727967 | 1.010596 | 0.968168 | 0.913820 |
| StatsForecast_DynamicOptimizedTheta_wrmsse | 0.675333 | 0.429670 | 0.521640 | 0.589278 | 0.478730 | 0.557520 | 0.584278 | 0.656283 | 0.650613 | 0.731735 | 1.013910 | 0.971758 | 0.918576 |
| AmazonForecast_p50_wrmsse | 1.617815 | 1.912144 | 1.786991 | 1.736382 | 1.972658 | 2.010498 | 1.805926 | 1.819329 | 1.667225 | 1.619216 | 1.156432 | 1.012942 | 0.914040 |
The results (including processing time and costs) can be summarized in the following table.