Introduction to Hierarchical Forecasting using HierarchicalForecast
You can run these experiments using CPU or GPU with Google Colab.
In many applications, a set of time series is hierarchically organized. Examples include the presence of geographic levels, products, or categories that define different types of aggregations.
In such scenarios, forecasters are often required to provide predictions for all disaggregate and aggregate series. A natural requirement is for those predictions to be “coherent”, that is, for the forecasts of the bottom series to add up exactly to the forecasts of the aggregated series.
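The figure above shows a simple hierarchical structure with four bottom-level series, two middle-level series, and a top level representing the total aggregation. Writing $y_{\beta_i,\tau}$ for bottom-level series $i$ at time $\tau$, its hierarchical aggregations, or coherency constraints, are:

$$
y_{\mathrm{Total},\tau} = y_{\beta_1,\tau} + y_{\beta_2,\tau} + y_{\beta_3,\tau} + y_{\beta_4,\tau}, \qquad
y_{\beta_{12},\tau} = y_{\beta_1,\tau} + y_{\beta_2,\tau}, \qquad
y_{\beta_{34},\tau} = y_{\beta_3,\tau} + y_{\beta_4,\tau}
$$

These constraints can be compactly expressed with the following matrices:

$$
\mathbf{S}_{[a,b][b]} =
\begin{bmatrix} \mathbf{A}_{[a][b]} \\ \mathbf{I}_{[b][b]} \end{bmatrix} =
\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

where $[a]$ indexes the aggregate series and $[b]$ the bottom series, $\mathbf{A}_{[a][b]}$ aggregates the bottom series to the upper levels, and $\mathbf{I}_{[b][b]}$ is an identity matrix. The representation of the hierarchical series is then:

$$
\mathbf{y}_{[a,b],\tau} = \mathbf{S}_{[a,b][b]} \, \mathbf{y}_{[b],\tau}
$$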
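To make the matrix form concrete, here is a minimal NumPy sketch for the four-bottom-series hierarchy described above. The names `A`, `S`, `y_bottom` and `y_all`, and the bottom-level values, are illustrative assumptions and not part of the library.

```python
import numpy as np

# Aggregation block: one row per aggregate series, one column per bottom series.
A = np.array([
    [1, 1, 1, 1],  # top level = b1 + b2 + b3 + b4
    [1, 1, 0, 0],  # first middle series = b1 + b2
    [0, 0, 1, 1],  # second middle series = b3 + b4
])
identity = np.eye(4, dtype=int)    # each bottom series maps to itself
S = np.vstack([A, identity])       # stacked aggregation matrix, shape (7, 4)

y_bottom = np.array([10, 10, 100, 100])  # illustrative bottom-level values
y_all = S @ y_bottom                      # aggregates first, then the bottom series

print(y_all.tolist())  # [220, 20, 200, 10, 10, 100, 100]
```

The first three entries are the aggregates and the last four reproduce the bottom series, so the whole hierarchy is determined by the bottom level alone.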
To visualize an example, in Figure 2 the levels of the hierarchical time series structure can be read as geographical aggregations: the top level is the total aggregation of series within a country, the middle level corresponds to its states, and the bottom level to its regions.
To achieve “coherency”, most statistical solutions to the hierarchical forecasting challenge implement a two-stage reconciliation process.
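First, we obtain a set of base forecasts $\mathbf{\hat{y}}_{[a,b],\tau}$ for every series in the hierarchy.
Then, we reconcile them into coherent forecasts $\mathbf{\tilde{y}}_{[a,b],\tau}$.
Most hierarchical reconciliation methods can be expressed by the following transformation:

$$
\mathbf{\tilde{y}}_{[a,b],\tau} = \mathbf{S}_{[a,b][b]} \, \mathbf{P}_{[b][a,b]} \, \mathbf{\hat{y}}_{[a,b],\tau}
$$

where $\mathbf{S}_{[a,b][b]}$ is the summing matrix that encodes the aggregation constraints and $\mathbf{P}_{[b][a,b]}$ is a matrix, specific to each reconciliation method, that maps the base forecasts to bottom-level forecasts.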
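As an illustration of the two-stage process, the sketch below writes bottom-up reconciliation as a product of two matrices: a summing matrix `S` and a projection `P_bottom_up` that simply keeps the bottom-level base forecasts. The matrix names and the base forecast values are hypothetical, chosen so that the base forecasts are visibly incoherent; this is plain NumPy, not the library's implementation.

```python
import numpy as np

# Summing matrix for one total, two middle series and four bottom series.
S = np.vstack([
    np.array([[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]),
    np.eye(4, dtype=int),
])
n_all, n_bottom = S.shape

# Bottom-up projection: zero out the aggregate forecasts, keep the bottom ones.
P_bottom_up = np.hstack([
    np.zeros((n_bottom, n_all - n_bottom), dtype=int),
    np.eye(n_bottom, dtype=int),
])

# Hypothetical base forecasts: the total (250) differs from the bottom sum (220).
y_hat = np.array([250, 30, 190, 10, 10, 100, 100])

# Reconciled forecasts: project to the bottom level, then re-aggregate.
y_tilde = S @ P_bottom_up @ y_hat

print(y_tilde.tolist())  # [220, 20, 200, 10, 10, 100, 100]
```

Other reconciliation methods differ in how the projection matrix is built; the re-aggregation step with `S` is what guarantees that the result is coherent.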
The HierarchicalForecast library offers a Python collection of reconciliation methods, datasets, evaluation and visualization tools for the task. Among its available reconciliation methods we have `BottomUp`, `TopDown`, `MiddleOut`, `MinTrace`, `ERM`. Among its probabilistic coherent methods we have `Normality`, `Bootstrap`, `PERMBU`.
We are going to create a synthetic dataset to illustrate a hierarchical time series structure like the one in Figure 1.
We will create a hierarchy with two aggregation levels (country and state) above four bottom series, where the aggregations of the series are self-evident.
| | ds | top_level | middle_level | bottom_level | y |
---|---|---|---|---|---|
0 | 2000-01-01 | Australia | State1 | r1 | 10 |
1 | 2000-02-01 | Australia | State1 | r1 | 20 |
8 | 2000-01-01 | Australia | State1 | r2 | 10 |
9 | 2000-02-01 | Australia | State1 | r2 | 20 |
16 | 2000-01-01 | Australia | State2 | r3 | 100 |
17 | 2000-02-01 | Australia | State2 | r3 | 200 |
24 | 2000-01-01 | Australia | State2 | r4 | 100 |
25 | 2000-02-01 | Australia | State2 | r4 | 200 |
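A dataframe like the one above could be built with the following pandas sketch. The eight monthly observations per series and the scaling factors are assumptions inferred from the row indices and values shown in the table; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed: eight monthly observations per series, 2000-01 through 2000-08.
ds = pd.date_range(start="2000-01-01", periods=8, freq="MS")
y_base = np.arange(1, 9)

# (state, region, scale) for each bottom series.
bottom_specs = [
    ("State1", "r1", 10),
    ("State1", "r2", 10),
    ("State2", "r3", 100),
    ("State2", "r4", 100),
]

frames = [
    pd.DataFrame({
        "ds": ds,
        "top_level": "Australia",
        "middle_level": state,
        "bottom_level": region,
        "y": y_base * scale,
    })
    for state, region, scale in bottom_specs
]
df = pd.concat(frames, ignore_index=True)

# Show the first two rows of every bottom series, as in the table above.
print(df.groupby("bottom_level").head(2))
```

Because State2 is exactly ten times State1 and both regions within a state are identical, the sums at each level are easy to check by eye.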
The hierarchical series introduced above is captured in the `Y_hier_df` dataframe.
The aggregation constraints matrix is captured in the `S_df` dataframe.
Finally, `tags` lists the series in `Y_hier_df` that compose each hierarchical level; for example, `tags['top_level']` contains the index of `Australia`’s aggregated series.
| | unique_id | ds | y |
---|---|---|---|
0 | Australia | 2000-01-01 | 220 |
1 | Australia | 2000-02-01 | 440 |
8 | Australia/State1 | 2000-01-01 | 20 |
9 | Australia/State1 | 2000-02-01 | 40 |
16 | Australia/State2 | 2000-01-01 | 200 |
17 | Australia/State2 | 2000-02-01 | 400 |
24 | Australia/State1/r1 | 2000-01-01 | 10 |
25 | Australia/State1/r1 | 2000-02-01 | 20 |
32 | Australia/State1/r2 | 2000-01-01 | 10 |
33 | Australia/State1/r2 | 2000-02-01 | 20 |
40 | Australia/State2/r3 | 2000-01-01 | 100 |
41 | Australia/State2/r3 | 2000-02-01 | 200 |
48 | Australia/State2/r4 | 2000-01-01 | 100 |
49 | Australia/State2/r4 | 2000-02-01 | 200 |
| | unique_id | Australia/State1/r1 | Australia/State1/r2 | Australia/State2/r3 | Australia/State2/r4 |
---|---|---|---|---|---|
0 | Australia | 1.0 | 1.0 | 1.0 | 1.0 |
1 | Australia/State1 | 1.0 | 1.0 | 0.0 | 0.0 |
2 | Australia/State2 | 0.0 | 0.0 | 1.0 | 1.0 |
3 | Australia/State1/r1 | 1.0 | 0.0 | 0.0 | 0.0 |
4 | Australia/State1/r2 | 0.0 | 1.0 | 0.0 | 0.0 |
5 | Australia/State2/r3 | 0.0 | 0.0 | 1.0 | 0.0 |
6 | Australia/State2/r4 | 0.0 | 0.0 | 0.0 | 1.0 |
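To show how these three objects relate to the bottom-level data, the sketch below rebuilds them with plain pandas. It is an illustration of the idea under stated assumptions, not the library's own utility: the `levels` dictionary, the `/`-joined identifiers and the prefix test used to fill `S_df` are choices made for this example, and only two months of data are used to keep it short.

```python
import pandas as pd

# Small bottom-level dataset: two months for each of the four regions.
records = []
for state, region, scale in [("State1", "r1", 10), ("State1", "r2", 10),
                             ("State2", "r3", 100), ("State2", "r4", 100)]:
    for month, step in [("2000-01-01", 1), ("2000-02-01", 2)]:
        records.append({"ds": pd.Timestamp(month), "top_level": "Australia",
                        "middle_level": state, "bottom_level": region,
                        "y": step * scale})
df = pd.DataFrame(records)

# Each level is defined by the columns joined with "/" to form its unique_id.
levels = {
    "top_level": ["top_level"],
    "top_level/middle_level": ["top_level", "middle_level"],
    "top_level/middle_level/bottom_level": ["top_level", "middle_level", "bottom_level"],
}

parts, tags = [], {}
for name, cols in levels.items():
    # Sum the bottom series within each group and date.
    level_df = df.groupby(cols + ["ds"], as_index=False)["y"].sum()
    level_df["unique_id"] = level_df[cols].apply(lambda row: "/".join(row), axis=1)
    tags[name] = level_df["unique_id"].unique().tolist()
    parts.append(level_df[["unique_id", "ds", "y"]])
Y_hier_df = pd.concat(parts, ignore_index=True)

# S_df entry is 1.0 when the bottom series (column) equals or descends from the row series.
bottom_ids = tags["top_level/middle_level/bottom_level"]
all_ids = [uid for ids in tags.values() for uid in ids]
S_df = pd.DataFrame(
    [[float(b == u or b.startswith(u + "/")) for b in bottom_ids] for u in all_ids],
    index=pd.Index(all_ids, name="unique_id"),
    columns=bottom_ids,
)

print(Y_hier_df.head())
print(S_df)
```

The prefix test works only because every identifier is built from its ancestors' names; a hierarchy with other naming rules would need an explicit parent-child mapping.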
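Next, we compute the base forecast for each time series using the `naive` model. Observe that `Y_hat_df` contains the forecasts, but they are not guaranteed to be coherent. The table below shows the `Naive` base forecasts next to their `BottomUp` reconciliation (`Naive/BottomUp`).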
| | unique_id | ds | Naive | Naive/BottomUp |
---|---|---|---|---|
0 | Australia | 2000-05-01 | 880.0 | 880.0 |
1 | Australia | 2000-06-01 | 880.0 | 880.0 |
4 | Australia/State1 | 2000-05-01 | 80.0 | 80.0 |
5 | Australia/State1 | 2000-06-01 | 80.0 | 80.0 |
8 | Australia/State2 | 2000-05-01 | 800.0 | 800.0 |
9 | Australia/State2 | 2000-06-01 | 800.0 | 800.0 |
12 | Australia/State1/r1 | 2000-05-01 | 40.0 | 40.0 |
13 | Australia/State1/r1 | 2000-06-01 | 40.0 | 40.0 |
16 | Australia/State1/r2 | 2000-05-01 | 40.0 | 40.0 |
17 | Australia/State1/r2 | 2000-06-01 | 40.0 | 40.0 |
20 | Australia/State2/r3 | 2000-05-01 | 400.0 | 400.0 |
21 | Australia/State2/r3 | 2000-06-01 | 400.0 | 400.0 |
24 | Australia/State2/r4 | 2000-05-01 | 400.0 | 400.0 |
25 | Australia/State2/r4 | 2000-06-01 | 400.0 | 400.0 |
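The values in the table above can be checked by hand with a short NumPy and pandas sketch. It assumes a training window of the first four months (so the last observed bottom values are 40, 40, 400 and 400), computes a naive forecast as the last observed value of each series, and applies bottom-up reconciliation by re-aggregating the bottom forecasts. It does not use the library's forecasting or reconciliation classes; all variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed training window: the first four months of the synthetic series.
train_months = np.arange(1, 5)
bottom_train = pd.DataFrame({
    "Australia/State1/r1": train_months * 10,
    "Australia/State1/r2": train_months * 10,
    "Australia/State2/r3": train_months * 100,
    "Australia/State2/r4": train_months * 100,
})

# Summing matrix: three aggregate rows followed by the four bottom rows.
S = np.array([
    [1, 1, 1, 1],  # Australia
    [1, 1, 0, 0],  # Australia/State1
    [0, 0, 1, 1],  # Australia/State2
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
])
series_ids = ["Australia", "Australia/State1", "Australia/State2"] + list(bottom_train.columns)

# Build every series of the hierarchy, then forecast each one with its last value.
all_train = bottom_train.to_numpy() @ S.T      # shape (4 months, 7 series)
naive = all_train[-1].astype(float)            # naive forecast for every series

# Bottom-up: keep only the bottom forecasts and re-aggregate them with S.
bottom_up = (S @ naive[3:]).astype(float)

result = pd.DataFrame({"unique_id": series_ids, "Naive": naive, "Naive/BottomUp": bottom_up})
print(result)
```

Here the two columns coincide because the naive forecast is a linear function of the history (it copies the last observation), so forecasting the aggregates directly gives the same result as aggregating the bottom forecasts. Base forecasts from most other models would not have this property.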