Temporal Aggregation (Tourism)
Temporal Hierarchical Forecasting on Australian Tourism Data
In many applications, a set of time series is hierarchically organized. Examples include the presence of geographic levels, products, or categories that define different types of aggregations. In such scenarios, forecasters are often required to provide predictions for all disaggregate and aggregate series. A natural desire is for those predictions to be “coherent”, that is, for the bottom series to add up precisely to the forecasts of the aggregated series.
In this notebook we present an example of how to use `HierarchicalForecast` to produce coherent forecasts between temporal levels. We will use the classic Australian Domestic Tourism (`Tourism`) dataset, which contains quarterly time series of the number of visitors to each state of Australia.
We will first load the `Tourism` data and produce base forecasts using an `AutoETS` model from `StatsForecast`. Then, we reconcile the forecasts with several reconciliation algorithms from `HierarchicalForecast` according to a temporal hierarchy.
You can run these experiments using CPU or GPU with Google Colab.
1. Load and Process Data
In this example we will use the Tourism dataset from the Forecasting: Principles and Practice book.
The dataset only contains the time series at the lowest level, so we need to create the time series for all hierarchies.
| | Country | Region | State | Purpose | ds | y |
---|---|---|---|---|---|---|
0 | Australia | Adelaide | South Australia | Business | 1998-01-01 | 135.077690 |
1 | Australia | Adelaide | South Australia | Business | 1998-04-01 | 109.987316 |
2 | Australia | Adelaide | South Australia | Business | 1998-07-01 | 166.034687 |
3 | Australia | Adelaide | South Australia | Business | 1998-10-01 | 127.160464 |
4 | Australia | Adelaide | South Australia | Business | 1999-01-01 | 137.448533 |
2. Temporal reconciliation
First, we add a `unique_id` to the data (for example, `Australia/ACT/Canberra/Business`).
2a. Split Train/Test sets
We use the final two years (8 quarters) as the test set. Consequently, our forecast horizon is 8.
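The split can be sketched in plain Python as follows. This is not the notebook's own code: the record layout (tuples of `unique_id`, `ds`, `y`), the helper name `split_train_test` and the toy values are assumptions made for illustration. The same idea applies to a dataframe grouped by `unique_id`.

```python
from collections import defaultdict


def split_train_test(records, horizon):
    """Hold out the last `horizon` observations of every series as the test set."""
    by_series = defaultdict(list)
    for unique_id, ds, y in records:
        by_series[unique_id].append((ds, y))

    train, test = [], []
    for unique_id, points in by_series.items():
        points.sort()  # ISO-formatted dates sort chronologically as strings
        cut = max(len(points) - horizon, 0)
        train += [(unique_id, ds, y) for ds, y in points[:cut]]
        test += [(unique_id, ds, y) for ds, y in points[cut:]]
    return train, test


# Toy data (made up): one series with 12 quarters, i.e. 3 years.
quarter_starts = [
    f"{year}-{month:02d}-01"
    for year in (2015, 2016, 2017)
    for month in (1, 4, 7, 10)
]
records = [("series-A", ds, float(i)) for i, ds in enumerate(quarter_starts)]

# The final 8 quarters (2016 and 2017) become the test set.
train, test = split_train_test(records, horizon=8)
```

Splitting per series rather than on a global date keeps the test set at exactly `horizon` rows for every `unique_id`.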
2b. Aggregating the dataset according to the temporal hierarchy
We first define the temporal aggregation spec. The spec is a dictionary in which each key is the name of an aggregation and each value is the number of bottom-level timesteps that are summed in that aggregation. For example, with monthly bottom-level data a year consists of 12 months, so we would define the key-value pair `"yearly": 12`. We can do something similar for other aggregations that we are interested in.
In this example, we choose temporal aggregations of `year`, `semiannual` and `quarter`. The bottom-level timesteps have a quarterly frequency, so these aggregations span 4, 2 and 1 bottom-level timesteps, respectively.
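To make the meaning of the spec concrete, here is a minimal sketch of what summing by a spec does to a single quarterly series. The variable name `spec_temporal`, the helper `aggregate_by_spec` and the eight values are assumptions for illustration; this is not how `aggregate_temporal` is implemented, and raising an error for incomplete periods is a simplification of this sketch.

```python
def aggregate_by_spec(values, spec):
    """Sum consecutive blocks of bottom-level values for every aggregation level."""
    aggregated = {}
    for name, size in spec.items():
        if len(values) % size != 0:
            raise ValueError(f"series length must be a multiple of {size} for '{name}'")
        aggregated[name] = [
            sum(values[i:i + size]) for i in range(0, len(values), size)
        ]
    return aggregated


# Spec for quarterly bottom-level data: number of quarters in each aggregate.
spec_temporal = {"year": 4, "semiannual": 2, "quarter": 1}

# Eight made-up quarterly values (two years).
quarters = [10, 20, 30, 40, 50, 60, 70, 80]
levels = aggregate_by_spec(quarters, spec_temporal)
# levels["year"]       -> [100, 260]
# levels["semiannual"] -> [30, 70, 110, 150]
# levels["quarter"]    -> the original eight values
```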
We next compute the temporally aggregated train and test sets using the `aggregate_temporal` function. Note that we have different aggregation matrices `S` for the train and test sets, as the test set covers temporal aggregates (time periods) that are not included in the train set.
Our aggregation matrices aggregate the lowest temporal granularity (quarters) up to years.
The first rows and columns of the aggregation matrix for the train set:

| | temporal_id | quarter-1 | quarter-2 | quarter-3 | quarter-4 |
---|---|---|---|---|---|
0 | year-1 | 1.0 | 1.0 | 1.0 | 1.0 |
1 | year-2 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | year-3 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | year-4 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | year-5 | 0.0 | 0.0 | 0.0 | 0.0 |

The first rows and columns of the aggregation matrix for the test set:

| | temporal_id | quarter-1 | quarter-2 | quarter-3 | quarter-4 |
---|---|---|---|---|---|
0 | year-1 | 1.0 | 1.0 | 1.0 | 1.0 |
1 | year-2 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | semiannual-1 | 1.0 | 1.0 | 0.0 | 0.0 |
3 | semiannual-2 | 0.0 | 0.0 | 1.0 | 1.0 |
4 | semiannual-3 | 0.0 | 0.0 | 0.0 | 0.0 |
If you don’t have a test set available, as is usually the case when you’re making forecasts, it is necessary to create a future dataframe that holds the correct bottom-level `unique_id`s and timestamps so that they can be temporally aggregated. We can use the `make_future_dataframe` helper function for that.
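As an illustration of what such a future dataframe contains, the sketch below generates the next `h` quarter-start timestamps for each series using only the standard library. It is not the implementation of `make_future_dataframe`; the helper names, the dictionary of last observed dates, and the assumption that every series ends on a quarter start (months 1, 4, 7 or 10) are all made up for this example.

```python
from datetime import date


def next_quarter_starts(last_ds, h):
    """Return the `h` quarter-start dates that follow `last_ds`."""
    year, month = last_ds.year, last_ds.month
    future = []
    for _ in range(h):
        month += 3
        if month > 12:
            month -= 12
            year += 1
        future.append(date(year, month, 1))
    return future


def future_frame(last_observed, h):
    """Build (unique_id, ds) rows for every series; there is no `y` column."""
    return [
        (unique_id, ds)
        for unique_id, last_ds in last_observed.items()
        for ds in next_quarter_starts(last_ds, h)
    ]


# Last training timestamp per series (the train set ends in 2015 Q4).
last_observed = {"Australia/ACT/Canberra/Business": date(2015, 10, 1)}
rows = future_frame(last_observed, h=8)
# rows runs from 2016-01-01 to 2017-10-01 in quarterly steps.
```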
`Y_test_df_new` can then be used in `aggregate_temporal` to construct the temporally aggregated structures.
We can verify that we obtain the same temporally aggregated test set, except that `Y_test_df_new` doesn’t contain the ground truth values `y`.
The full aggregation matrix for the 8-quarter test horizon:

| | temporal_id | quarter-1 | quarter-2 | quarter-3 | quarter-4 | quarter-5 | quarter-6 | quarter-7 | quarter-8 |
---|---|---|---|---|---|---|---|---|---|
0 | year-1 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
1 | year-2 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |
2 | semiannual-1 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | semiannual-2 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | semiannual-3 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 |
5 | semiannual-4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
6 | quarter-1 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
7 | quarter-2 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
8 | quarter-3 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
9 | quarter-4 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
10 | quarter-5 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
11 | quarter-6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
12 | quarter-7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
13 | quarter-8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |

The temporally aggregated test set, including the ground truth values `y`:

| | temporal_id | unique_id | ds | y |
---|---|---|---|---|
0 | year-1 | Australia/ACT/Canberra/Business | 2016-10-01 | 754.139245 |
1 | year-2 | Australia/ACT/Canberra/Business | 2017-10-01 | 809.950839 |
2 | year-1 | Australia/ACT/Canberra/Holiday | 2016-10-01 | 735.365896 |
3 | year-2 | Australia/ACT/Canberra/Holiday | 2017-10-01 | 834.717900 |
4 | year-1 | Australia/ACT/Canberra/Other | 2016-10-01 | 175.239916 |
… | … | … | … | … |
4251 | quarter-4 | Australia/Western Australia/Experience Perth/V… | 2016-10-01 | 439.699451 |
4252 | quarter-5 | Australia/Western Australia/Experience Perth/V… | 2017-01-01 | 356.867038 |
4253 | quarter-6 | Australia/Western Australia/Experience Perth/V… | 2017-04-01 | 302.296119 |
4254 | quarter-7 | Australia/Western Australia/Experience Perth/V… | 2017-07-01 | 373.442070 |
4255 | quarter-8 | Australia/Western Australia/Experience Perth/V… | 2017-10-01 | 455.316702 |

The temporally aggregated future dataframe, without `y`:

| | temporal_id | unique_id | ds |
---|---|---|---|
0 | year-1 | Australia/ACT/Canberra/Business | 2016-10-01 |
1 | year-2 | Australia/ACT/Canberra/Business | 2017-10-01 |
2 | year-1 | Australia/ACT/Canberra/Holiday | 2016-10-01 |
3 | year-2 | Australia/ACT/Canberra/Holiday | 2017-10-01 |
4 | year-1 | Australia/ACT/Canberra/Other | 2016-10-01 |
… | … | … | … |
4251 | quarter-4 | Australia/Western Australia/Experience Perth/V… | 2016-10-01 |
4252 | quarter-5 | Australia/Western Australia/Experience Perth/V… | 2017-01-01 |
4253 | quarter-6 | Australia/Western Australia/Experience Perth/V… | 2017-04-01 |
4254 | quarter-7 | Australia/Western Australia/Experience Perth/V… | 2017-07-01 |
4255 | quarter-8 | Australia/Western Australia/Experience Perth/V… | 2017-10-01 |
2c. Computing base forecasts
Now, we need to compute base forecasts for each temporal aggregation. We compute the base forecasts for each temporal aggregation in `Y_train_df` using the `AutoETS` model. Observe that `Y_hat_df` contains the forecasts but they are not coherent.
Note also that both frequency and horizon are different for each temporal aggregation. In this example, the lowest level has a quarterly frequency and a horizon of `8` (constituting `2` years). The `year` aggregation thus has a yearly frequency with a horizon of `2`.
It is of course possible to choose a different model for each level in the temporal aggregation - you can be as creative as you like!
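The per-level horizon follows directly from the spec: divide the bottom-level horizon by the number of bottom-level timesteps in each aggregate. A small sketch, assuming the spec is stored in a dictionary named `spec_temporal` (a name chosen here for illustration):

```python
spec_temporal = {"year": 4, "semiannual": 2, "quarter": 1}
horizon = 8  # bottom-level (quarterly) forecast horizon
months_per_quarter = 3

horizons = {}
months_per_step = {}
for level, steps_per_period in spec_temporal.items():
    if horizon % steps_per_period != 0:
        raise ValueError(f"horizon {horizon} does not cover whole '{level}' periods")
    # Number of forecasts needed at this level.
    horizons[level] = horizon // steps_per_period
    # Length of one timestep at this level, in months.
    months_per_step[level] = steps_per_period * months_per_quarter

# horizons        -> {"year": 2, "semiannual": 4, "quarter": 8}
# months_per_step -> {"year": 12, "semiannual": 6, "quarter": 3}
```

Each level is then forecast with its own frequency and horizon, so a separate model fit per level is needed.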
2d. Reconcile forecasts
We can use the `HierarchicalReconciliation` class to reconcile the forecasts. In this example we use `BottomUp` and `MinTrace` (with the `ols` method). Note that we have to set `temporal=True` in the `reconcile` function.
Note that temporal reconciliation currently isn’t supported for insample reconciliation methods, such as `MinTrace(method='mint_shrink')`.
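To show what these two reconcilers do conceptually, here is a minimal pure-Python sketch on a one-year temporal hierarchy. It is not the library's implementation: the helper names and base forecast values are made up, and the bottom level is assumed to occupy the last rows of `S`. Bottom-up keeps the quarterly forecasts and sums them upward, while the OLS variant of MinTrace projects all base forecasts onto the coherent subspace, $\tilde{y} = S (S^\top S)^{-1} S^\top \hat{y}$.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def bottom_up(S, y_hat):
    """Keep the bottom-level forecasts (last rows) and sum them upward."""
    n_bottom = len(S[0])
    return matvec(S, y_hat[-n_bottom:])


def mintrace_ols(S, y_hat):
    """OLS reconciliation: S (S^T S)^{-1} S^T y_hat."""
    St = [list(col) for col in zip(*S)]
    StS = [[sum(a * b for a, b in zip(ri, rj)) for rj in St] for ri in St]
    bottom = solve(StS, matvec(St, y_hat))
    return matvec(S, bottom)


# Summing matrix for one year of quarterly data.
S = [
    [1.0, 1.0, 1.0, 1.0],  # year-1
    [1.0, 1.0, 0.0, 0.0],  # semiannual-1
    [0.0, 0.0, 1.0, 1.0],  # semiannual-2
    [1.0, 0.0, 0.0, 0.0],  # quarter-1
    [0.0, 1.0, 0.0, 0.0],  # quarter-2
    [0.0, 0.0, 1.0, 0.0],  # quarter-3
    [0.0, 0.0, 0.0, 1.0],  # quarter-4
]

# Made-up, incoherent base forecasts in the same row order as S.
y_hat = [100.0, 45.0, 50.0, 20.0, 30.0, 25.0, 15.0]

y_bu = bottom_up(S, y_hat)      # quarters unchanged, aggregates recomputed
y_ols = mintrace_ols(S, y_hat)  # all levels adjusted, result is coherent
```

Both outputs are coherent by construction because they are of the form `S` times a bottom-level vector; they differ in how much information from the aggregate forecasts is used.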
3. Evaluation
The `HierarchicalForecast` package includes the `evaluate` function to evaluate the different hierarchies.
We evaluate the temporally aggregated forecasts across all temporal aggregations.
| | level | metric | Base | BottomUp | MinTrace(ols) |
---|---|---|---|---|---|
0 | year | mae | 47.0000 | 50.8000 | 46.7000 |
1 | year | scaled_crps | 0.0562 | 0.0620 | 0.0666 |
2 | semiannual | mae | 29.5000 | 30.5000 | 29.1000 |
3 | semiannual | scaled_crps | 0.0643 | 0.0681 | 0.0727 |
4 | quarter | mae | 19.4000 | 19.4000 | 18.7000 |
5 | quarter | scaled_crps | 0.0876 | 0.0876 | 0.0864 |
6 | Overall | mae | 26.2000 | 27.1000 | 25.7000 |
7 | Overall | scaled_crps | 0.0765 | 0.0784 | 0.0797 |
`MinTrace(ols)` is the best overall point method, scoring the lowest `mae` on the `year` and `semiannual` aggregated forecasts as well as on the `quarter` bottom-level forecasts. However, the `Base` method is better overall on the probabilistic measure `scaled_crps`: it scores the lowest on the `year` and `semiannual` levels and overall, although `MinTrace(ols)` is slightly better on the `quarter` level. This indicates that the uncertainty levels predicted with the `Base` method are better in this example.
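One way to read the `Overall` rows: the reported values are consistent with averaging over all temporal steps in the horizon (2 years, 4 half-years and 8 quarters per series) rather than taking a simple mean of the three levels. This is an observation drawn from the numbers in the table, not a statement about how `evaluate` is implemented. The check below uses the rounded `mae` values from the table.

```python
# Number of forecast steps per level within the 8-quarter horizon.
steps_per_level = {"year": 2, "semiannual": 4, "quarter": 8}

# Rounded MAE values copied from the evaluation table.
mae = {
    "Base": {"year": 47.0, "semiannual": 29.5, "quarter": 19.4},
    "BottomUp": {"year": 50.8, "semiannual": 30.5, "quarter": 19.4},
    "MinTrace(ols)": {"year": 46.7, "semiannual": 29.1, "quarter": 18.7},
}


def weighted_overall(per_level, weights):
    """Average the per-level scores, weighting each level by its step count."""
    total = sum(weights.values())
    return sum(per_level[level] * weights[level] for level in weights) / total


overall = {
    method: round(weighted_overall(scores, steps_per_level), 1)
    for method, scores in mae.items()
}
# overall -> {"Base": 26.2, "BottomUp": 27.1, "MinTrace(ols)": 25.7}
```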
Appendix: plotting the S matrix
We plot our summing matrix for the test set. It’s fairly straightforward: there are two years in the test set, consisting of 4 quarters each.
* The first row of the `S` matrix shows how the aggregation `2016` can be obtained by summing the 4 quarters in 2016.
* The second row of the `S` matrix shows how the aggregation `2017` can be obtained by summing the 4 quarters in 2017.
* The next 4 rows show how the semi-annual aggregations can be obtained.
* The final rows are the identity matrix for each quarter, denoting the bottom temporal level (each quarter).
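The row structure described above can be reproduced with a short sketch that builds the summing matrix from a spec and prints it as text instead of plotting it. The helper name `build_summing_matrix` and the variable `spec_temporal` are assumptions for illustration, and the row labels follow the `temporal_id` naming seen in the tables.

```python
def build_summing_matrix(spec, n_bottom):
    """Return (row_labels, S) for a temporal hierarchy over `n_bottom` timesteps."""
    labels, S = [], []
    for name, size in spec.items():
        for k, start in enumerate(range(0, n_bottom, size), start=1):
            labels.append(f"{name}-{k}")
            # A row has ones for the bottom-level timesteps it sums over.
            S.append([1.0 if start <= j < start + size else 0.0 for j in range(n_bottom)])
    return labels, S


spec_temporal = {"year": 4, "semiannual": 2, "quarter": 1}
labels, S = build_summing_matrix(spec_temporal, n_bottom=8)

# Print one row per temporal aggregate: 2 years, 4 half-years, 8 quarters.
for label, row in zip(labels, S):
    print(f"{label:<13}", " ".join(f"{value:.0f}" for value in row))
```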