Step-by-step guide on using the `CrostonClassic Model` with `Statsforecast`.

During this walkthrough, we will become familiar with the main `StatsForecast` class and some relevant methods such as `StatsForecast.plot`, `StatsForecast.forecast` and `StatsForecast.cross_validation`, among others.
The text in this article is largely taken from:

1. Changquan Huang and Alla Petukhina (2022). Applied Time Series Analysis and Forecasting with Python. Springer series.
2. Ivan Svetunkov. Forecasting and Analytics with the Augmented Dynamic Adaptive Model (ADAM).
3. James D. Hamilton (1994). Time Series Analysis. Princeton University Press, Princeton, New Jersey, 1st Edition.
4. Rob J. Hyndman and George Athanasopoulos (2018). “Forecasting Principles and Practice (3rd ed)”.
Tip: Statsforecast will be needed. To install, see instructions.

Next, we import plotting libraries and configure the plotting style.
| | date | sales |
|---|---|---|
| 0 | 2022-01-01 00:00:00 | 0 |
| 1 | 2022-01-01 01:00:00 | 10 |
| 2 | 2022-01-01 02:00:00 | 0 |
| 3 | 2022-01-01 03:00:00 | 0 |
| 4 | 2022-01-01 04:00:00 | 100 |
The input data frame uses the following columns:

- `unique_id` (string, int or category) represents an identifier for the series.
- `ds` (datestamp) column should be of a format expected by Pandas, ideally YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp.
- `y` (numeric) represents the measurement we wish to forecast.
| | ds | y | unique_id |
|---|---|---|---|
| 0 | 2022-01-01 00:00:00 | 0 | 1 |
| 1 | 2022-01-01 01:00:00 | 10 | 1 |
| 2 | 2022-01-01 02:00:00 | 0 | 1 |
| 3 | 2022-01-01 03:00:00 | 0 | 1 |
| 4 | 2022-01-01 04:00:00 | 100 | 1 |
The time variable (`ds`) is in an object format, so we need to convert it to a date format.
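As a minimal sketch of what this conversion does, the snippet below parses timestamp strings in the `YYYY-MM-DD HH:MM:SS` layout into datetime objects using only the Python standard library, then checks that the spacing between observations is one hour. In a pandas workflow the equivalent step would typically be `pd.to_datetime` applied to the `ds` column. The variable names are illustrative and not taken from the original notebook.

```python
from datetime import datetime, timedelta

# First timestamps of the series, still stored as plain strings (object dtype in pandas).
raw_ds = ["2022-01-01 00:00:00", "2022-01-01 01:00:00", "2022-01-01 02:00:00"]

# Parse each string into a datetime object.
parsed_ds = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in raw_ds]

# Once parsed, date arithmetic works: collect the distinct gaps between rows.
steps = {later - earlier for earlier, later in zip(parsed_ds, parsed_ds[1:])}
is_hourly = steps == {timedelta(hours=1)}
print(is_hourly)
```

With proper datetime values, the hourly spacing of the data can be checked directly, and that spacing is what the `freq` argument has to match.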
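Croston Classic Model

To learn more about the parameters of the `CrostonClassic` model, visit the documentation. Model-specific arguments such as `season_length` are set when a model is instantiated. The `StatsForecast` object that fits the models takes the following parameters: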
- `freq`: a string indicating the frequency of the data. (See pandas’ available frequencies.)
- `n_jobs`: int, number of jobs used in the parallel processing, use -1 for all cores.
- `fallback_model`: a model to be used if a model fails.
The result of the fitted `Croston Classic Model` can then be inspected.
Forecasts can be produced with the `StatsForecast.forecast` method instead of `.fit` and `.predict`.

The main difference is that `.forecast` does not store the fitted values and is highly scalable in distributed environments.

The forecast method takes two arguments: `h` (horizon) and `level`.
- `h` (int): represents the forecast h steps into the future. In this case, 500 hours ahead.

| | unique_id | ds | CrostonClassic |
|---|---|---|---|
| 0 | 1 | 2023-01-31 20:00:00 | 27.418417 |
| 1 | 1 | 2023-01-31 21:00:00 | 27.418417 |
| 2 | 1 | 2023-01-31 22:00:00 | 27.418417 |
| … | … | … | … |
| 497 | 1 | 2023-02-21 13:00:00 | 27.418417 |
| 498 | 1 | 2023-02-21 14:00:00 | 27.418417 |
| 499 | 1 | 2023-02-21 15:00:00 | 27.418417 |
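Every row of the forecast carries the same value. The sketch below illustrates why, using a plain-Python version of the classic Croston method: the sizes of non-zero demands and the intervals between them are smoothed separately, and their ratio is used for every future step. This is an illustrative implementation and not StatsForecast's own code. The smoothing value `alpha=0.1`, the initialisation at the first non-zero demand, and the function name `croston_classic` are assumptions made for this example.

```python
def croston_classic(demand, alpha=0.1):
    """Return the Croston point forecast for an intermittent demand history.

    alpha=0.1 and the initialisation scheme are assumptions of this sketch.
    """
    level = None              # smoothed size of the non-zero demands
    interval = None           # smoothed number of periods between non-zero demands
    periods_since_demand = 1  # periods elapsed since the last non-zero demand

    for value in demand:
        if value == 0:
            periods_since_demand += 1
            continue
        if level is None:
            # Initialise both components at the first non-zero demand.
            level = float(value)
            interval = float(periods_since_demand)
        else:
            # Update both components only when demand occurs.
            level += alpha * (value - level)
            interval += alpha * (periods_since_demand - interval)
        periods_since_demand = 1

    if level is None:
        return 0.0  # no demand was ever observed
    return level / interval


# First five observations of the series shown earlier in this guide.
history = [0, 10, 0, 0, 100]
point = croston_classic(history)

# The same value is repeated for every step of the horizon.
forecast = [point] * 5
print(forecast)
```

Because neither component changes after the last observed demand, the forecast is a flat line regardless of the horizon length.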
As before, the arguments are `h` (for horizon) and `level`.

- `h` (int): represents the forecast h steps into the future. In this case, 500 hours ahead.

| | unique_id | ds | CrostonClassic |
|---|---|---|---|
| 0 | 1 | 2023-01-31 20:00:00 | 27.418417 |
| 1 | 1 | 2023-01-31 21:00:00 | 27.418417 |
| 2 | 1 | 2023-01-31 22:00:00 | 27.418417 |
| … | … | … | … |
| 497 | 1 | 2023-02-21 13:00:00 | 27.418417 |
| 498 | 1 | 2023-02-21 14:00:00 | 27.418417 |
| 499 | 1 | 2023-02-21 15:00:00 | 27.418417 |
Cross-validation is run over several windows (`n_windows`), moving the forecast origin forward by 50 hours between consecutive windows (`step_size=50`). Depending on your computer, this step should take around 1 min.
The `cross_validation` method from the `StatsForecast` class takes the following arguments.

- `df`: training data frame.
- `h` (int): represents steps into the future that are being forecasted. In this case, 500 hours ahead.
- `step_size` (int): step size between each window. In other words: how often do you want to run the forecasting processes.
- `n_windows` (int): number of windows used for cross validation. In other words: what number of forecasting processes in the past do you want to evaluate.
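To make the interaction of `h`, `step_size` and `n_windows` concrete, the following sketch computes the cutoff (last training timestamp) of each window for an hourly series. It is a simplified model of the windowing logic and not the library's implementation. The helper name `window_cutoffs` is hypothetical, and `n_windows=5` is an assumption inferred from the 2,500 rows of cross-validation output at 500 rows per window.

```python
from datetime import datetime, timedelta


def window_cutoffs(last_timestamp, h, step_size, n_windows, freq=timedelta(hours=1)):
    """Return the cutoff of each cross-validation window, oldest first.

    The most recent window leaves exactly h observations after its cutoff;
    each earlier window moves the cutoff back by step_size observations.
    """
    last_cutoff = last_timestamp - h * freq
    return [last_cutoff - k * step_size * freq for k in range(n_windows - 1, -1, -1)]


# Last timestamp of the series used in this guide.
series_end = datetime(2023, 2, 21, 15)

# n_windows=5 is an assumption (2,500 output rows / 500 steps per window).
cutoffs = window_cutoffs(series_end, h=500, step_size=50, n_windows=5)
for cutoff in cutoffs:
    print(cutoff)
```

Under these assumptions the first and last cutoffs are 2023-01-23 11:00:00 and 2023-01-31 19:00:00, which matches the `cutoff` column of the cross-validation output.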
- `unique_id`: series identifier.
- `ds`: datestamp or temporal index.
- `cutoff`: the last datestamp or temporal index for the `n_windows`.
- `y`: true value.
- `model`: columns with the model’s name and fitted value.

| | unique_id | ds | cutoff | y | CrostonClassic |
|---|---|---|---|---|---|
| 0 | 1 | 2023-01-23 12:00:00 | 2023-01-23 11:00:00 | 0.0 | 23.655830 |
| 1 | 1 | 2023-01-23 13:00:00 | 2023-01-23 11:00:00 | 0.0 | 23.655830 |
| 2 | 1 | 2023-01-23 14:00:00 | 2023-01-23 11:00:00 | 0.0 | 23.655830 |
| … | … | … | … | … | … |
| 2497 | 1 | 2023-02-21 13:00:00 | 2023-01-31 19:00:00 | 60.0 | 27.418417 |
| 2498 | 1 | 2023-02-21 14:00:00 | 2023-01-31 19:00:00 | 20.0 | 27.418417 |
| 2499 | 1 | 2023-02-21 15:00:00 | 2023-01-31 19:00:00 | 20.0 | 27.418417 |
| | unique_id | metric | CrostonClassic |
|---|---|---|---|
| 0 | 1 | mae | 33.704756 |
| 1 | 1 | mape | 0.632593 |
| 2 | 1 | mase | 0.804074 |
| 3 | 1 | rmse | 45.262709 |
| 4 | 1 | smape | 0.767960 |
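For readers who want to see how such metrics are formed, here is a small plain-Python sketch of the five measures on toy numbers. The exact definitions used by the evaluation utilities behind the table may differ, for example in how zero actuals are handled in MAPE or how sMAPE is scaled, so the conventions below are assumptions. The flat prediction of 25.0 is hypothetical.

```python
import math


def mae(y, yhat):
    return sum(abs(a - f) for a, f in zip(y, yhat)) / len(y)


def rmse(y, yhat):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(y, yhat)) / len(y))


def mape(y, yhat):
    # Assumption: periods with zero actual demand are skipped (division by zero).
    terms = [abs(a - f) / abs(a) for a, f in zip(y, yhat) if a != 0]
    return sum(terms) / len(terms)


def smape(y, yhat):
    # Assumption: scaled to the range [0, 1]; a 0/0 term counts as 0.
    terms = []
    for a, f in zip(y, yhat):
        denom = abs(a) + abs(f)
        terms.append(abs(a - f) / denom if denom else 0.0)
    return sum(terms) / len(terms)


def mase(y, yhat, y_train, seasonality=1):
    # Scale the MAE by the in-sample MAE of a (seasonal) naive forecast.
    naive_errors = [
        abs(y_train[i] - y_train[i - seasonality])
        for i in range(seasonality, len(y_train))
    ]
    return mae(y, yhat) / (sum(naive_errors) / len(naive_errors))


train = [0.0, 10.0, 0.0, 0.0, 100.0]  # first observations of the series
actual = [0.0, 60.0, 20.0, 20.0]      # toy hold-out values
predicted = [25.0] * 4                # hypothetical flat forecast

scores = {
    "mae": mae(actual, predicted),
    "mape": mape(actual, predicted),
    "mase": mase(actual, predicted, train),
    "rmse": rmse(actual, predicted),
    "smape": smape(actual, predicted),
}
print(scores)
```

With intermittent demand many actual values are zero, so percentage-based measures such as MAPE deserve extra caution; scale-based measures such as MAE, RMSE and MASE are usually easier to interpret in this setting.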