Quick Start (Polars)

Large collections of time series organized into structures at different aggregation levels often require their forecasts to follow their aggregation constraints, which poses the challenge of creating novel algorithms capable of coherent forecasts. The HierarchicalForecast package provides a wide collection of Python implementations of hierarchical forecasting algorithms that follow classic hierarchical reconciliation. In this notebook we will show how to use the StatsForecast library to produce base forecasts, and use HierarchicalForecast package to perform hierarchical reconciliation. You can run these experiments using CPU or GPU with Google Colab.

1. Libraries

!pip install hierarchicalforecast statsforecast datasetsforecast

2. Load Data

In this example we will use the TourismSmall dataset. The following cell gets the time series for the different levels in the hierarchy, the summing matrix S which recovers the full dataset from the bottom level hierarchy and the indices of each hierarchy denoted by tags.

import numpy as np
import polars as pl

from datasetsforecast.hierarchical import HierarchicalData, HierarchicalInfo

group_name = 'TourismSmall'
group = HierarchicalInfo.get_group(group_name)
Y_df, S_df, tags = HierarchicalData.load('./data', group_name)

Y_df = pl.from_pandas(Y_df)
S_df = pl.from_pandas(S_df.reset_index(names="unique_id"))
Y_df = Y_df.with_columns(pl.col('ds').cast(pl.Date))

S_df[:6, :6]

unique_id	nsw-hol-city	nsw-hol-noncity	vic-hol-city	vic-hol-noncity	qld-hol-city
str	f64	f64	f64	f64	f64
”total”	1.0	1.0	1.0	1.0	1.0
”hol”	1.0	1.0	1.0	1.0	1.0
”vfr”	0.0	0.0	0.0	0.0	0.0
”bus”	0.0	0.0	0.0	0.0	0.0
”oth”	0.0	0.0	0.0	0.0	0.0
”nsw-hol”	1.0	1.0	0.0	0.0	0.0

tags

{'Country': array(['total'], dtype=object),
 'Country/Purpose': array(['hol', 'vfr', 'bus', 'oth'], dtype=object),
 'Country/Purpose/State': array(['nsw-hol', 'vic-hol', 'qld-hol', 'sa-hol', 'wa-hol', 'tas-hol',
        'nt-hol', 'nsw-vfr', 'vic-vfr', 'qld-vfr', 'sa-vfr', 'wa-vfr',
        'tas-vfr', 'nt-vfr', 'nsw-bus', 'vic-bus', 'qld-bus', 'sa-bus',
        'wa-bus', 'tas-bus', 'nt-bus', 'nsw-oth', 'vic-oth', 'qld-oth',
        'sa-oth', 'wa-oth', 'tas-oth', 'nt-oth'], dtype=object),
 'Country/Purpose/State/CityNonCity': array(['nsw-hol-city', 'nsw-hol-noncity', 'vic-hol-city',
        'vic-hol-noncity', 'qld-hol-city', 'qld-hol-noncity',
        'sa-hol-city', 'sa-hol-noncity', 'wa-hol-city', 'wa-hol-noncity',
        'tas-hol-city', 'tas-hol-noncity', 'nt-hol-city', 'nt-hol-noncity',
        'nsw-vfr-city', 'nsw-vfr-noncity', 'vic-vfr-city',
        'vic-vfr-noncity', 'qld-vfr-city', 'qld-vfr-noncity',
        'sa-vfr-city', 'sa-vfr-noncity', 'wa-vfr-city', 'wa-vfr-noncity',
        'tas-vfr-city', 'tas-vfr-noncity', 'nt-vfr-city', 'nt-vfr-noncity',
        'nsw-bus-city', 'nsw-bus-noncity', 'vic-bus-city',
        'vic-bus-noncity', 'qld-bus-city', 'qld-bus-noncity',
        'sa-bus-city', 'sa-bus-noncity', 'wa-bus-city', 'wa-bus-noncity',
        'tas-bus-city', 'tas-bus-noncity', 'nt-bus-city', 'nt-bus-noncity',
        'nsw-oth-city', 'nsw-oth-noncity', 'vic-oth-city',
        'vic-oth-noncity', 'qld-oth-city', 'qld-oth-noncity',
        'sa-oth-city', 'sa-oth-noncity', 'wa-oth-city', 'wa-oth-noncity',
        'tas-oth-city', 'tas-oth-noncity', 'nt-oth-city', 'nt-oth-noncity'],
       dtype=object)}

We split the dataframe in train/test splits.

Y_test_df = Y_df.group_by('unique_id').tail(group.horizon)
Y_train_df = Y_df.filter(pl.col('ds') < Y_test_df['ds'].min())

3. Base forecasts

The following cell computes the base forecast for each time series using the auto_arima and naive models. Observe that Y_hat_df contains the forecasts but they are not coherent.

from statsforecast.core import StatsForecast
from statsforecast.models import AutoARIMA, Naive

fcst = StatsForecast(
    models=[AutoARIMA(season_length=group.seasonality), Naive()], 
    freq="1q", 
    n_jobs=-1
)
Y_hat_df = fcst.forecast(df=Y_train_df, h=group.horizon)

4. Hierarchical reconciliation

The following cell makes the previous forecasts coherent using the HierarchicalReconciliation class. The used methods to make the forecasts coherent are:

BottomUp: The reconciliation of the method is a simple addition to the upper levels.
TopDown: The second method constrains the base-level predictions to the top-most aggregate-level serie and then distributes it to the disaggregate series through the use of proportions.
MiddleOut: Anchors the base predictions in a middle level.

from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, TopDown, MiddleOut

reconcilers = [
    BottomUp(),
    TopDown(method='forecast_proportions'),
    MiddleOut(middle_level='Country/Purpose/State', 
              top_down_method='forecast_proportions')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_train_df, S_df=S_df, tags=tags)

5. Evaluation

The HierarchicalForecast package includes the evaluate function to evaluate the different hierarchies and we can use utilsforecast to compute the mean absolute error relative to a baseline model.

from hierarchicalforecast.evaluation import evaluate
from utilsforecast.losses import mse

df = Y_rec_df.join(Y_test_df, on=['unique_id', 'ds'])
evaluation = evaluate(df = df,
                      tags = tags,
                      train_df = Y_train_df,
                      metrics = [mse],
                      benchmark="Naive")

evaluation[["level", "metric", "AutoARIMA", "AutoARIMA/BottomUp", "AutoARIMA/TopDown_method-forecast_proportions"]]

level	metric	AutoARIMA	AutoARIMA/BottomUp	AutoARIMA/TopDown_method-forecast_proportions
str	str	f64	f64	f64
”Country"	"mse-scaled”	0.317897	0.226999	0.317897
”Country/Purpose"	"mse-scaled”	0.323207	0.199359	0.251368
”Country/Purpose/State"	"mse-scaled”	0.266118	0.305711	0.308241
”Country/Purpose/State/CityNonC…	“mse-scaled”	0.305173	0.305173	0.305913
”Overall"	"mse-scaled”	0.311707	0.234934	0.289406

Getting Started

Tutorials

API Reference

Quick Start (Polars)

1. Libraries

2. Load Data

3. Base forecasts

4. Hierarchical reconciliation

5. Evaluation

References

Getting Started

Tutorials

API Reference

​1. Libraries

​2. Load Data

​3. Base forecasts

​4. Hierarchical reconciliation

​5. Evaluation

​References

1. Libraries

2. Load Data

3. Base forecasts

4. Hierarchical reconciliation

5. Evaluation

References