AutoARIMA Comparison (Prophet and pmdarima)
Motivation
The AutoARIMA model is widely used to forecast time series in production and as a
benchmark. However, the Python implementation (pmdarima) is so slow that it
prevents data science practitioners from iterating quickly and deploying
AutoARIMA in production for a large number of time series. In this notebook we
present Nixtla's AutoARIMA, based on the R implementation (developed by Rob
Hyndman) and optimized using numba.
Example
Libraries
Useful functions
Data
For testing purposes, we will use the Hourly dataset from the M4 competition.
In this example we will use a subset of the data to avoid waiting too long. You can modify the number of series if you want.
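As a minimal sketch of how such a subset could be taken, the snippet below keeps the first few series of a long-format frame. The toy data, the helper name take_first_series, and the y column name are assumptions for illustration; only the unique_id and ds columns appear in the tables of this notebook.

```python
import numpy as np
import pandas as pd


def take_first_series(df: pd.DataFrame, n_series: int) -> pd.DataFrame:
    """Keep only the rows that belong to the first `n_series` unique ids."""
    keep = df["unique_id"].unique()[:n_series]
    return df[df["unique_id"].isin(keep)].reset_index(drop=True)


# Synthetic stand-in for the hourly data: 5 series with 48 observations each.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "unique_id": np.repeat([f"H{i}" for i in range(1, 6)], 48),
    "ds": np.tile(np.arange(1, 49), 5),
    "y": rng.normal(size=5 * 48),  # assumed target column name
})

subset = take_first_series(toy, n_series=2)
print(subset["unique_id"].unique())
```

Raising n_series trades a longer run for a broader comparison.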
Would an autoregressive model be the right choice for our data? The series
clearly show seasonal periods. The autocorrelation function (acf) can help us
answer the question. Intuitively, we should observe a decaying autocorrelation
across lags to opt for an AR model.
We observe high autocorrelation at the most recent lags and also at the
seasonal lags. Therefore, we will let auto_arima handle our data.
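To make the ACF check concrete, here is a small sketch that computes the sample autocorrelation with numpy. It runs on a synthetic series with a 24-step cycle, not on the M4 data, and the helper sample_acf is a hypothetical name introduced for illustration.

```python
import numpy as np


def sample_acf(y, max_lag):
    """Sample autocorrelation of `y` for lags 0..max_lag."""
    y = np.asarray(y, dtype=float)
    centered = y - y.mean()
    denom = np.dot(centered, centered)
    return np.array([
        np.dot(centered[: len(y) - lag], centered[lag:]) / denom
        for lag in range(max_lag + 1)
    ])


# Synthetic hourly series: a 24-hour cycle plus noise (an assumption, not M4).
rng = np.random.default_rng(42)
t = np.arange(24 * 30)
y = 10 * np.sin(2 * np.pi * t / 24) + rng.normal(scale=1.0, size=t.size)

acf = sample_acf(y, max_lag=48)
print(f"lag 1: {acf[1]:.2f}, lag 12: {acf[12]:.2f}, lag 24: {acf[24]:.2f}")
```

On a series like this, the autocorrelation is high at short lags and peaks again at multiples of 24, which is the pattern described above.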
Training and forecasting
StatsForecast receives a list of models to fit to each time series. Since we
are dealing with hourly data, it is beneficial to use a seasonality of 24.
AutoARIMA accepts a season_length argument, so we define our models with
season_length set to 24. The first rows of the resulting forecasts are:
unique_id | ds | AutoARIMA |
---|---|---|
H1 | 701 | 616.084167 |
H1 | 702 | 544.432129 |
H1 | 703 | 510.414490 |
H1 | 704 | 481.046539 |
H1 | 705 | 460.893066 |
Alternatives
pmdarima
You can use the StatsForecast class to parallelize your own models. In this
section we will use it to run the auto_arima model from pmdarima.
unique_id | ds | pmdarima |
---|---|---|
H1 | 701 | 628.310547 |
H1 | 702 | 571.659851 |
H1 | 703 | 543.504700 |
H1 | 704 | 517.539062 |
H1 | 705 | 502.829559 |
Prophet
Prophet is designed to receive a pandas DataFrame, so we cannot use
StatsForecast. Therefore, we need to parallelize from scratch.
index | unique_id | ds | prophet |
---|---|---|---|
0 | H1 | 701 | 635.914254 |
1 | H1 | 702 | 565.976464 |
2 | H1 | 703 | 505.095507 |
3 | H1 | 704 | 462.559539 |
4 | H1 | 705 | 438.766801 |
… | … | … | … |
43 | H112 | 744 | 6184.686240 |
44 | H112 | 745 | 6188.851888 |
45 | H112 | 746 | 6129.306256 |
46 | H112 | 747 | 6058.040672 |
47 | H112 | 748 | 5991.982370 |
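Parallelizing from scratch, as described for Prophet above, usually means splitting the data by series, forecasting each group in a worker pool, and concatenating the results. The sketch below shows that pattern only. The forecast_one function is a placeholder that repeats the last value (it does not call Prophet), and the function names, the y column, and the forecast column are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def forecast_one(series: pd.DataFrame, h: int) -> pd.DataFrame:
    """Placeholder forecaster: repeats the last observed value h times.

    In the real workflow, this is where a Prophet model would be fit on
    `series` and asked for h future steps.
    """
    last_ds = series["ds"].iloc[-1]
    return pd.DataFrame({
        "unique_id": series["unique_id"].iloc[0],
        "ds": np.arange(last_ds + 1, last_ds + h + 1),
        "forecast": np.full(h, series["y"].iloc[-1]),
    })


def forecast_all(df: pd.DataFrame, h: int, max_workers: int = 4) -> pd.DataFrame:
    """Split by unique_id, forecast each series in a pool, then concatenate."""
    groups = [g for _, g in df.groupby("unique_id", sort=False)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda g: forecast_one(g, h), groups))
    return pd.concat(results, ignore_index=True)


# Toy data: two series with five observations each.
toy = pd.DataFrame({
    "unique_id": np.repeat(["H1", "H2"], 5),
    "ds": np.tile(np.arange(1, 6), 2),
    "y": np.arange(10, dtype=float),
})
out = forecast_all(toy, h=3)
print(out)
```

A thread pool keeps the sketch portable. For CPU-bound model fits, a ProcessPoolExecutor with a module-level worker function (lambdas cannot be pickled) is typically the better choice.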
Evaluation
Time
Since AutoARIMA is compiled with numba, and the first call includes the
just-in-time compilation time, it is useful to also measure the time for just
one time series.
n_series | pmdarima | prophet | AutoARIMA_nixtla |
---|---|---|---|
410 | 181686.758059 | 3093.636144 | 573.128222 |
411 | 182129.896494 | 3101.181598 | 574.480358 |
412 | 182573.034928 | 3108.727052 | 575.832494 |
413 | 183016.173362 | 3116.272506 | 577.184630 |
414 | 183459.311796 | 3123.817960 | 578.536766 |
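The rows above grow by a constant amount per additional series, which is consistent with a linear model of the form total = fixed + per_series * n_series. That reading is an inference from the numbers, not something the text states. The sketch below recovers both terms from the 410 and 411 rows, and the variable names are introduced for illustration.

```python
# Total times for 410 and 411 series, copied from the table above.
times_410 = {"pmdarima": 181686.758059, "prophet": 3093.636144, "AutoARIMA_nixtla": 573.128222}
times_411 = {"pmdarima": 182129.896494, "prophet": 3101.181598, "AutoARIMA_nixtla": 574.480358}

breakdown = {}
for name in times_410:
    per_series = times_411[name] - times_410[name]  # marginal cost of one more series
    fixed = times_410[name] - 410 * per_series      # what is left at zero series
    breakdown[name] = (per_series, fixed)
    print(f"{name}: per_series={per_series:.3f}, fixed={fixed:.3f}")
```

Under this reading, pmdarima and prophet have essentially no fixed term, while AutoARIMA_nixtla has a noticeable one. That would fit a one-off compilation cost, though the source does not confirm this interpretation.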
Performance
pmdarima (only two time series)
index | model | mae |
---|---|---|
0 | AutoARIMA | 20.289669 |
1 | pmdarima | 24.676279 |
2 | prophet | 39.201933 |
Prophet
index | model | mae |
---|---|---|
0 | AutoARIMA | 680.202965 |
1 | prophet | 1058.578963 |
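The mae column is the mean absolute error, the average of |y - y_hat| over the forecast horizon. A minimal sketch of the computation is shown below. The values and model names are made up for illustration and are unrelated to the results in the tables.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error between observed values and forecasts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Illustrative values only.
y_true = [10.0, 12.0, 14.0]
forecasts = {
    "model_a": [11.0, 12.0, 13.0],
    "model_b": [13.0, 9.0, 17.0],
}
scores = {name: mae(y_true, preds) for name, preds in forecasts.items()}
print(scores)
```

Lower values indicate forecasts closer to the observations, and MAE is expressed in the same units as the series.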
For a complete comparison, see the full experiment.