Temporal Hierarchical Forecasting on M3 monthly and quarterly data with THIEF
In this notebook we show how to use HierarchicalForecast to produce
forecasts that are coherent across temporal levels. We use the monthly
and quarterly time series of the M3 dataset. We first load the M3 data
and produce base forecasts using an AutoETS model from StatsForecast.
Then, we reconcile the forecasts with THIEF (Temporal HIerarchical
Forecasting) from HierarchicalForecast according to a specified
temporal hierarchy.
References
Athanasopoulos, G., Hyndman, R. J., Kourentzes, N., Petropoulos, F. (2017). Forecasting with temporal hierarchies. European Journal of Operational Research, 262, 60-74.
You can run these experiments using CPU or GPU with Google Colab.
1. Load and Process Data
The series in the dataset have different lengths. For example, the
monthly series with unique_id='M1' has 68 timesteps. This is not a
multiple of 12 (12 months in one year), so we would not be able to
aggregate all timesteps into full years. Hence, we truncate (remove)
the first 8 timesteps, resulting in 60 timesteps for this series. We do
the same for the quarterly data, using a multiple of 4 (4 quarters in
one year).
Depending on the highest temporal aggregation in your reconciliation
problem, you may want to truncate your data differently.
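The truncation step can be sketched with pandas as follows. This is a minimal illustration rather than the notebook's own code: the helper name `truncate_to_full_periods` is hypothetical, and the column names `ds` and `y` are assumed to follow the usual Nixtla long format alongside `unique_id`.

```python
import pandas as pd


def truncate_to_full_periods(df: pd.DataFrame, periods: int) -> pd.DataFrame:
    """Drop the earliest rows of each series so its length is a multiple of `periods`."""
    df = df.sort_values(["unique_id", "ds"])
    # Length of the series each row belongs to
    sizes = df.groupby("unique_id")["ds"].transform("size")
    # Position of each row counted from the end of its series (0 = last row)
    pos_from_end = df.groupby("unique_id").cumcount(ascending=False)
    # Keep only the most recent rows that fill complete periods
    keep = pos_from_end < (sizes // periods) * periods
    return df[keep].reset_index(drop=True)


# Toy monthly series with 68 observations (made-up values)
toy = pd.DataFrame(
    {
        "unique_id": "M1",
        "ds": pd.date_range("2000-01-01", periods=68, freq="MS"),
        "y": range(68),
    }
)
truncated = truncate_to_full_periods(toy, periods=12)
print(len(truncated))  # 60
```

For the quarterly data the same helper would be called with `periods=4`.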
2. Temporal reconciliation
2a. Split Train/Test sets
We use the last 24 observations of each monthly series and the last 8 observations of each quarterly series as test samples, following the original THIEF paper.
2a. Aggregating the dataset according to temporal hierarchy
We first define the temporal aggregation spec. The spec is a dictionary in which each key is the name of an aggregation and each value is the number of bottom-level timesteps that are aggregated in that aggregation. For example, a year consists of 12 months, so we define the
key-value pair "yearly": 12. We can do the same for the other
aggregations that we are interested in.
We then compute the temporally aggregated train and test sets with the
aggregate_temporal function. Note that the train and test sets have
different aggregation matrices S, as the test set contains temporal
hierarchies that are not included in the train set.
| | temporal_id | monthly-1 | monthly-2 | monthly-3 | monthly-4 |
|---|---|---|---|---|---|
| 0 | yearly-1 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | yearly-2 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | yearly-3 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | yearly-4 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | yearly-5 | 0.0 | 0.0 | 0.0 | 0.0 |

| | temporal_id | monthly-1 | monthly-2 | monthly-3 | monthly-4 |
|---|---|---|---|---|---|
| 0 | yearly-1 | 1.0 | 1.0 | 1.0 | 1.0 |
| 1 | yearly-2 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | semiannually-1 | 1.0 | 1.0 | 1.0 | 1.0 |
| 3 | semiannually-2 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | semiannually-3 | 0.0 | 0.0 | 0.0 | 0.0 |
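To make the structure of S more concrete, the sketch below builds a temporal aggregation matrix from a spec with plain NumPy. It is an illustration only, not the implementation behind aggregate_temporal: the helper `temporal_summing_matrix` is hypothetical, and apart from "yearly": 12 the spec values are assumptions inferred from the level names.

```python
import numpy as np


def temporal_summing_matrix(spec: dict, n_bottom: int):
    """Build a 0/1 matrix with one row per aggregate and one column per bottom-level timestep."""
    rows, labels = [], []
    for name, size in spec.items():
        # Number of complete aggregates of this size that fit in the window
        for i in range(n_bottom // size):
            row = np.zeros(n_bottom)
            row[i * size:(i + 1) * size] = 1.0  # timesteps covered by this aggregate
            rows.append(row)
            labels.append(f"{name}-{i + 1}")
    return np.vstack(rows), labels


# Assumed monthly spec: number of months in each aggregation
spec_monthly = {
    "yearly": 12,
    "semiannually": 6,
    "fourmonthly": 4,
    "quarterly": 3,
    "bimonthly": 2,
    "monthly": 1,
}

# A 24-month window, matching the length of the monthly test set
S, labels = temporal_summing_matrix(spec_monthly, n_bottom=24)
print(S.shape)     # (56, 24)
print(labels[:3])  # ['yearly-1', 'yearly-2', 'semiannually-1']
```

Each row sums the bottom-level timesteps that belong to one aggregate, so multiplying S by a vector of monthly values yields every level of the hierarchy at once.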
2b. Computing base forecasts
Now, we need to compute base forecasts for each temporal aggregation. The following cell computes the base forecasts for each temporal aggregation in Y_monthly_train and Y_quarterly_train using the
AutoETS model. Observe that Y_hats contains the forecasts, but they
are not yet coherent.
Note also that both the frequency and the horizon differ for each
temporal aggregation. For the monthly data, the lowest level has a
monthly frequency and a horizon of 24 (constituting 2 years).
The yearly aggregation, in contrast, has a yearly frequency and
a horizon of 2.
It is of course possible to choose a different model for each level in
the temporal aggregation - you can be as creative as you like!
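The horizon of each level follows directly from the aggregation spec. The short sketch below derives it for the monthly data. The spec values other than "yearly": 12 are assumptions inferred from the level names, and the variable names are illustrative.

```python
# Assumed monthly spec: number of months in each aggregation
spec_monthly = {
    "yearly": 12,
    "semiannually": 6,
    "fourmonthly": 4,
    "quarterly": 3,
    "bimonthly": 2,
    "monthly": 1,
}

bottom_horizon = 24  # test length at the monthly level

# The bottom-level horizon must split into complete aggregates at every level
assert all(bottom_horizon % size == 0 for size in spec_monthly.values())

# Each level forecasts as many steps as there are complete aggregates in the test window
horizons = {level: bottom_horizon // size for level, size in spec_monthly.items()}
print(horizons)
# {'yearly': 2, 'semiannually': 4, 'fourmonthly': 6, 'quarterly': 8, 'bimonthly': 12, 'monthly': 24}
```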
2c. Reconcile forecasts
We can use the HierarchicalReconciliation class to reconcile the
forecasts. In this example we use BottomUp and MinTrace(wls_struct).
The latter is the ‘structural scaling’ method introduced in Forecasting
with temporal hierarchies.
Note that we have to set temporal=True in the reconcile function.
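For intuition, the two reconcilers can be written out in a few lines of NumPy for a single year of quarterly data. This is a sketch of the underlying linear algebra, assuming the standard formulation in which structural scaling weights each node by the number of bottom-level timesteps it aggregates. It does not call HierarchicalForecast, and the base forecasts are made-up numbers.

```python
import numpy as np

# Aggregation matrix for one year of quarterly data
S = np.array(
    [
        [1, 1, 1, 1],  # yearly-1
        [1, 1, 0, 0],  # semiannually-1
        [0, 0, 1, 1],  # semiannually-2
        [1, 0, 0, 0],  # quarterly-1
        [0, 1, 0, 0],  # quarterly-2
        [0, 0, 1, 0],  # quarterly-3
        [0, 0, 0, 1],  # quarterly-4
    ],
    dtype=float,
)
n_bottom = S.shape[1]

# Made-up, incoherent base forecasts in the same row order as S
y_hat = np.array([110.0, 50.0, 55.0, 24.0, 27.0, 26.0, 28.0])

# BottomUp: keep the bottom-level forecasts and sum them upwards
y_bottom_up = S @ y_hat[-n_bottom:]

# Structural scaling: weight each node by the number of quarters it aggregates
w = S.sum(axis=1)
W_inv = np.diag(1.0 / w)
# P maps all base forecasts to reconciled bottom-level forecasts
P = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
y_struct = S @ P @ y_hat

print(y_bottom_up[0])  # 105.0, the sum of the four quarterly forecasts
# After reconciliation, the yearly value equals the sum of the quarters
print(np.isclose(y_struct[0], y_struct[-n_bottom:].sum()))  # True
```

BottomUp ignores the forecasts made at the aggregated levels, whereas structural scaling combines information from every level before summing back up.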
3. Evaluation
The HierarchicalForecast package includes the evaluate function to
evaluate the different hierarchies.
We evaluate the temporally aggregated forecasts across all temporal
aggregations.
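The tables in this section report a metric labelled mae-scaled. A plausible reading, consistent with the Base column always being 1.0, is that each method's MAE is divided by the MAE of the base forecasts. The sketch below illustrates that calculation on made-up numbers. It is an assumption about the metric, not the code of the evaluate function.

```python
import numpy as np


def mae(y_true, y_pred) -> float:
    """Mean absolute error between two equally sized arrays."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


# Made-up actuals and forecasts for one temporal level
y_true = np.array([100.0, 120.0, 130.0])
forecasts = {
    "Base": np.array([90.0, 125.0, 140.0]),
    "BottomUp": np.array([95.0, 118.0, 136.0]),
}

# Scale every method by the MAE of the base forecasts
base_mae = mae(y_true, forecasts["Base"])
scaled = {name: mae(y_true, y_pred) / base_mae for name, y_pred in forecasts.items()}
print(scaled["Base"])  # 1.0
```

Under this reading, values below 1.0 mean that a reconciled forecast is more accurate than the base forecast at that level.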
3a. Monthly
| | level | metric | Base | BottomUp | MinTrace(wls_struct) |
|---|---|---|---|---|---|
| 0 | yearly | mae-scaled | 1.0 | 0.78 | 0.75 |
| 1 | semiannually | mae-scaled | 1.0 | 0.99 | 0.95 |
| 2 | fourmonthly | mae-scaled | 1.0 | 0.96 | 0.93 |
| 3 | quarterly | mae-scaled | 1.0 | 0.95 | 0.93 |
| 4 | bimonthly | mae-scaled | 1.0 | 0.96 | 0.94 |
| 5 | monthly | mae-scaled | 1.0 | 1.00 | 0.99 |
| 6 | Overall | mae-scaled | 1.0 | 0.94 | 0.92 |
MinTrace(wls_struct) is the best overall method, scoring the lowest
scaled MAE on all levels.
3b. Quarterly
| | level | metric | Base | BottomUp | MinTrace(wls_struct) |
|---|---|---|---|---|---|
| 0 | yearly | mae-scaled | 1.0 | 0.87 | 0.85 |
| 1 | semiannually | mae-scaled | 1.0 | 1.03 | 1.00 |
| 2 | quarterly | mae-scaled | 1.0 | 1.00 | 0.97 |
| 3 | Overall | mae-scaled | 1.0 | 0.97 | 0.94 |
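MinTrace(wls_struct) is again the best overall method. It scores the
lowest scaled MAE on the yearly and quarterly levels and overall, and
it matches the base forecasts (1.00) on the semiannual level, where
BottomUp is slightly worse than the base (1.03).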
