In this notebook, we’ll implement models for intermittent or sparse data.
Tip: You can use Colab to run this notebook interactively.
Tip: For forecasting at scale, we recommend you check this notebook done on Databricks.
pip install statsforecast
Series are plotted with the `plot_series` function from `utilsforecast.plotting`. This function has multiple parameters; the ones required to generate the plots in this notebook are explained below.
- `df`: A pandas dataframe with columns [`unique_id`, `ds`, `y`].
- `forecasts_df`: A pandas dataframe with columns [`unique_id`, `ds`] and models.
- `plot_random`: Plots the time series randomly.
- `max_insample_length`: The maximum number of train/insample observations to be plotted.
- `engine`: The library used to generate the plots. It can also be `matplotlib` for static plots.

The number of observations shown is limited with `max_insample_length`. From these plots, we can confirm that the data is indeed intermittent, since it has multiple periods with zero sales. In fact, in all cases but one, the median value is zero.
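The zero-median observation can also be checked numerically rather than read off the plots. The sketch below uses synthetic Poisson counts as a stand-in for the sales data (the series names, dates, and rate are assumptions, not the notebook's dataset) and computes, per series, the median and the share of zero observations.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the sales data: two sparse daily series.
rng = np.random.default_rng(0)
n_days = 100
df = pd.DataFrame(
    {
        "unique_id": np.repeat(["series_a", "series_b"], n_days),
        "ds": np.tile(pd.date_range("2016-01-01", periods=n_days, freq="D"), 2),
        "y": rng.poisson(lam=0.3, size=2 * n_days),
    }
)

# Per-series median and fraction of periods with zero sales.
summary = df.groupby("unique_id")["y"].agg(
    median="median",
    zero_share=lambda s: (s == 0).mean(),
)
print(summary)
```

A median of zero together with a high share of zeros is a quick indication that a series is intermittent.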
The models are first imported from `statsforecast.models` and then instantiated.
The `StatsForecast` object is then created with the following parameters:

- `models`: The list of models defined in the previous step.
- `freq`: A string indicating the frequency of the data. See pandas’ available frequencies.
- `n_jobs`: An integer that indicates the number of jobs used in parallel processing. Use -1 to select all cores.

Forecasts are generated with the `forecast` method, which requires the forecasting horizon (in this case, 28 days) as an argument.
The models for intermittent series that are currently available in
StatsForecast can only generate point forecasts. If prediction intervals
are needed, then a probabilistic
model should be used.
| | unique_id | ds | ADIDA | CrostonClassic | IMAPA | TSB |
|---|---|---|---|---|---|---|
| 0 | FOODS_1_001_CA_1 | 2016-05-23 | 0.791852 | 0.898247 | 0.705835 | 0.434313 |
| 1 | FOODS_1_001_CA_1 | 2016-05-24 | 0.791852 | 0.898247 | 0.705835 | 0.434313 |
| 2 | FOODS_1_001_CA_1 | 2016-05-25 | 0.791852 | 0.898247 | 0.705835 | 0.434313 |
| 3 | FOODS_1_001_CA_1 | 2016-05-26 | 0.791852 | 0.898247 | 0.705835 | 0.434313 |
| 4 | FOODS_1_001_CA_1 | 2016-05-27 | 0.791852 | 0.898247 | 0.705835 | 0.434313 |
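Each model in the table above returns the same value on every forecast date. For Croston-type methods this follows from the construction: the forecast is a smoothed demand size divided by a smoothed interval between demands, and that ratio is held constant over the horizon. The following is a simplified NumPy sketch of the classic Croston recursion. The function name, the single smoothing parameter `alpha=0.1`, and the toy series are assumptions for illustration; this is not the StatsForecast implementation.

```python
import numpy as np


def croston_classic_sketch(y, alpha=0.1, h=5):
    """Simplified Croston forecast: smoothed demand size / smoothed interval."""
    y = np.asarray(y, dtype=float)
    nonzero_idx = np.flatnonzero(y)
    if nonzero_idx.size == 0:
        # No demand observed at all: forecast zero.
        return np.zeros(h)

    # Non-zero demand sizes and the number of periods between them
    # (the first interval counts the periods up to the first demand).
    demand = y[nonzero_idx]
    intervals = np.diff(np.concatenate(([-1], nonzero_idx)))

    # Simple exponential smoothing of both sequences.
    z, p = demand[0], float(intervals[0])
    for d, q in zip(demand[1:], intervals[1:]):
        z = alpha * d + (1 - alpha) * z
        p = alpha * q + (1 - alpha) * p

    # The same ratio is repeated for every step of the horizon.
    return np.full(h, z / p)


toy_series = [0, 2, 0, 0, 2, 0, 0, 2]
toy_forecast = croston_classic_sketch(toy_series, alpha=0.1, h=5)
print(toy_forecast)
```

Because nothing in the recursion depends on the forecast step, the output is flat, which matches the repeated values in the table.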
The forecasts can be plotted with the `plot_series` function described above.
| | metric | ADIDA | CrostonClassic | IMAPA | TSB |
|---|---|---|---|---|---|
| 0 | mae | 0.948729 | 0.944071 | 0.957256 | 1.023126 |
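For reference, an MAE table of this shape can be computed by joining the forecasts with the held-out observations on `unique_id` and `ds`. The sketch below does this by hand with pandas on toy values; the data, the model names `ModelA` and `ModelB`, and the choice to average per series before averaging across series are assumptions, not necessarily how the numbers above were produced.

```python
import pandas as pd

# Toy held-out observations for two series.
test = pd.DataFrame(
    {
        "unique_id": ["A"] * 3 + ["B"] * 3,
        "ds": list(pd.date_range("2016-05-23", periods=3, freq="D")) * 2,
        "y": [0.0, 2.0, 0.0, 1.0, 0.0, 0.0],
    }
)

# Toy forecasts from two hypothetical models.
forecasts = test[["unique_id", "ds"]].copy()
forecasts["ModelA"] = 0.5
forecasts["ModelB"] = 1.0

# Align forecasts with actuals, then take absolute errors per model.
merged = test.merge(forecasts, on=["unique_id", "ds"], how="inner")
model_cols = ["ModelA", "ModelB"]
abs_err = merged[model_cols].sub(merged["y"], axis=0).abs()

# MAE per series, then averaged across series.
mae_per_series = abs_err.groupby(merged["unique_id"]).mean()
mae_overall = mae_per_series.mean()
print(mae_overall)
```

Lower values are better; in the table above, `CrostonClassic` has the lowest MAE and `TSB` the highest.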