In many applications, a set of time series is hierarchically organized. Examples include the presence of geographic levels, products, or categories that define different types of aggregations. In such scenarios, forecasters are often required to provide predictions for all disaggregate and aggregate series. A natural desire is for those predictions to be “coherent”, that is, for the bottom series to add up precisely to the forecasts of the aggregated series.

In this notebook we present an example on how to use HierarchicalForecast to produce coherent forecasts between geographical levels. We will use the Australian Prison Population dataset.

We will first load the dataset and produce base forecasts using an ETS model from StatsForecast, and then reconciliate the forecasts with several reconciliation algorithms from HierarchicalForecast. Finally, we show the performance is comparable with the results reported by the Forecasting: Principles and Practice which uses the R package fable.

You can run these experiments using CPU or GPU with Google Colab.

!pip install hierarchicalforecast statsforecast

1. Load and Process Data

The dataset only contains the time series at the lowest level, so we need to create the time series for all hierarchies.

import numpy as np
import pandas as pd
Y_df = pd.read_csv('https://OTexts.com/fpp3/extrafiles/prison_population.csv')
Y_df = Y_df.rename({'Count': 'y', 'Date': 'ds'}, axis=1)
Y_df.insert(0, 'Country', 'Australia')
Y_df = Y_df[['Country', 'State', 'Gender', 'Legal', 'Indigenous', 'ds', 'y']]
Y_df['ds'] = pd.to_datetime(Y_df['ds']) + pd.DateOffset(months=1)
Y_df.head()
CountryStateGenderLegalIndigenousdsy
0AustraliaACTFemaleRemandedATSI2005-04-010
1AustraliaACTFemaleRemandedNon-ATSI2005-04-012
2AustraliaACTFemaleSentencedATSI2005-04-010
3AustraliaACTFemaleSentencedNon-ATSI2005-04-015
4AustraliaACTMaleRemandedATSI2005-04-017

The dataset can be grouped in the following grouped structure.

hiers = [
    ['Country'],
    ['Country', 'State'], 
    ['Country', 'Gender'], 
    ['Country', 'Legal'], 
    ['Country', 'State', 'Gender', 'Legal']
]

Using the aggregate function from HierarchicalForecast we can get the full set of time series.

from hierarchicalforecast.utils import aggregate
Y_df, S_df, tags = aggregate(Y_df, hiers)
Y_df['y'] = Y_df['y']/1e3
Y_df.head()
unique_iddsy
0Australia2005-04-0124.296
1Australia2005-07-0124.643
2Australia2005-10-0124.511
3Australia2006-01-0124.393
4Australia2006-04-0124.524
S_df.iloc[:5, :5]
unique_idAustralia/ACT/Female/RemandedAustralia/ACT/Female/SentencedAustralia/ACT/Male/RemandedAustralia/ACT/Male/Sentenced
0Australia1.01.01.01.0
1Australia/ACT1.01.01.01.0
2Australia/NSW0.00.00.00.0
3Australia/NT0.00.00.00.0
4Australia/QLD0.00.00.00.0
tags
{'Country': array(['Australia'], dtype=object),
 'Country/State': array(['Australia/ACT', 'Australia/NSW', 'Australia/NT', 'Australia/QLD',
        'Australia/SA', 'Australia/TAS', 'Australia/VIC', 'Australia/WA'],
       dtype=object),
 'Country/Gender': array(['Australia/Female', 'Australia/Male'], dtype=object),
 'Country/Legal': array(['Australia/Remanded', 'Australia/Sentenced'], dtype=object),
 'Country/State/Gender/Legal': array(['Australia/ACT/Female/Remanded', 'Australia/ACT/Female/Sentenced',
        'Australia/ACT/Male/Remanded', 'Australia/ACT/Male/Sentenced',
        'Australia/NSW/Female/Remanded', 'Australia/NSW/Female/Sentenced',
        'Australia/NSW/Male/Remanded', 'Australia/NSW/Male/Sentenced',
        'Australia/NT/Female/Remanded', 'Australia/NT/Female/Sentenced',
        'Australia/NT/Male/Remanded', 'Australia/NT/Male/Sentenced',
        'Australia/QLD/Female/Remanded', 'Australia/QLD/Female/Sentenced',
        'Australia/QLD/Male/Remanded', 'Australia/QLD/Male/Sentenced',
        'Australia/SA/Female/Remanded', 'Australia/SA/Female/Sentenced',
        'Australia/SA/Male/Remanded', 'Australia/SA/Male/Sentenced',
        'Australia/TAS/Female/Remanded', 'Australia/TAS/Female/Sentenced',
        'Australia/TAS/Male/Remanded', 'Australia/TAS/Male/Sentenced',
        'Australia/VIC/Female/Remanded', 'Australia/VIC/Female/Sentenced',
        'Australia/VIC/Male/Remanded', 'Australia/VIC/Male/Sentenced',
        'Australia/WA/Female/Remanded', 'Australia/WA/Female/Sentenced',
        'Australia/WA/Male/Remanded', 'Australia/WA/Male/Sentenced'],
       dtype=object)}

Split Train/Test sets

We use the final two years (8 quarters) as test set.

Y_test_df = Y_df.groupby('unique_id', as_index=False).tail(8)
Y_train_df = Y_df.drop(Y_test_df.index)

2. Computing base forecasts

The following cell computes the base forecasts for each time series in Y_df using the ETS model. Observe that Y_hat_df contains the forecasts but they are not coherent.

from statsforecast.models import AutoETS
from statsforecast.core import StatsForecast
fcst = StatsForecast(models=[AutoETS(season_length=4, model='ZMZ')], 
                     freq='QS', n_jobs=-1)
Y_hat_df = fcst.forecast(df=Y_train_df, h=8, fitted=True)
Y_fitted_df = fcst.forecast_fitted_values()
Y_test_df
unique_iddsy
40Australia2015-04-0135.271
41Australia2015-07-0135.921
42Australia2015-10-0136.067
43Australia2016-01-0136.983
44Australia2016-04-0137.830
2155Australia/WA/Male/Sentenced2016-01-013.894
2156Australia/WA/Male/Sentenced2016-04-013.876
2157Australia/WA/Male/Sentenced2016-07-013.969
2158Australia/WA/Male/Sentenced2016-10-014.076
2159Australia/WA/Male/Sentenced2017-01-014.088
Y_train_df
unique_iddsy
0Australia2005-04-0124.296
1Australia2005-07-0124.643
2Australia2005-10-0124.511
3Australia2006-01-0124.393
4Australia2006-04-0124.524
2147Australia/WA/Male/Sentenced2014-01-013.614
2148Australia/WA/Male/Sentenced2014-04-013.635
2149Australia/WA/Male/Sentenced2014-07-013.692
2150Australia/WA/Male/Sentenced2014-10-013.726
2151Australia/WA/Male/Sentenced2015-01-013.780

3. Reconcile forecasts

The following cell makes the previous forecasts coherent using the HierarchicalReconciliation class. Since the hierarchy structure is not strict, we can’t use methods such as TopDown or MiddleOut. In this example we use BottomUp and MinTrace.

from hierarchicalforecast.methods import BottomUp, MinTrace
from hierarchicalforecast.core import HierarchicalReconciliation
reconcilers = [
    BottomUp(),
    MinTrace(method='mint_shrink')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_fitted_df, S=S_df, tags=tags)

The dataframe Y_rec_df contains the reconciled forecasts.

Y_rec_df.head()
unique_iddsAutoETSAutoETS/BottomUpAutoETS/MinTrace_method-mint_shrink
0Australia2015-04-0134.79949734.94647634.923548
1Australia2015-07-0135.19263835.41034235.432421
2Australia2015-10-0135.18821635.58084935.473386
3Australia2016-01-0135.88862835.95187835.939526
4Australia2016-04-0136.04543736.41682936.245158

4. Evaluation

The HierarchicalForecast package includes the HierarchicalEvaluation class to evaluate the different hierarchies and also is capable of compute scaled metrics compared to a benchmark model.

from hierarchicalforecast.evaluation import evaluate
from utilsforecast.losses import mase
from functools import partial
eval_tags = {}
eval_tags['Total'] = tags['Country']
eval_tags['State'] = tags['Country/State']
eval_tags['Legal status'] = tags['Country/Legal']
eval_tags['Gender'] = tags['Country/Gender']
eval_tags['Bottom'] = tags['Country/State/Gender/Legal']

df = Y_rec_df.merge(Y_test_df, on=['unique_id', 'ds'])
evaluation = evaluate(df = df,
                      tags = eval_tags,
                      train_df = Y_train_df,
                      metrics = [partial(mase, seasonality=4)])

numeric_cols = evaluation.select_dtypes(include="number").columns
evaluation[numeric_cols] = evaluation[numeric_cols].map('{:.2f}'.format).astype(np.float64)
evaluation.rename(columns={'AutoETS': 'Base'}, inplace=True)
evaluation
levelmetricBaseAutoETS/BottomUpAutoETS/MinTrace_method-mint_shrink
0Totalmase1.361.071.17
1Statemase1.531.551.59
2Legal statusmase2.402.482.38
3Gendermase1.080.820.93
4Bottommase2.162.162.14
5Overallmase1.991.981.98

Fable Comparison

Observe that we can recover the results reported by the Forecasting: Principles and Practice book. The original results were calculated using the R package fable.

References