HierarchicalForecast contains pure Python implementations of hierarchical reconciliation methods as well as a core.HierarchicalReconciliation wrapper class that enables easy interaction with these methods through pandas DataFrames containing the hierarchical time series and the base predictions.

The core.HierarchicalReconciliation class operates on the hierarchical time series pd.DataFrame Y_df, the base predictions pd.DataFrame Y_hat_df, and the aggregation constraints matrix S. For more information on the creation of the aggregation constraints matrix, see the utils aggregate method.
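As a quick sketch of how these objects fit together, the snippet below builds Y_df, S_df and tags with the aggregate utility on a tiny synthetic two-state hierarchy. The data and the hierarchy spec are invented for illustration; aggregate and its outputs follow the conventions described on this page, and a full worked example is shown in the Example section at the end.

import numpy as np
import pandas as pd
from hierarchicalforecast.utils import aggregate

# Tiny synthetic dataset: two bottom-level series under one country.
dates = pd.date_range('2022-01-01', periods=4, freq='QS')
df = pd.DataFrame({
    'Country': 'Australia',
    'State': np.repeat(['NSW', 'VIC'], 4),
    'ds': np.tile(dates, 2),
    'y': np.arange(8, dtype=float),
})

# One aggregation level per inner list: the country total and the per-state series.
spec = [['Country'], ['Country', 'State']]
Y_df, S_df, tags = aggregate(df=df, spec=spec)

print(Y_df.head())   # long DataFrame with columns ['unique_id', 'ds', 'y']
print(S_df)          # summing matrix mapping bottom series to all series
print(tags)          # dict: hierarchy level -> array of unique_id values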

HierarchicalReconciliation


source

HierarchicalReconciliation

 HierarchicalReconciliation (reconcilers:list[hierarchicalforecast.methods.HReconciler])

*Hierarchical Reconciliation Class.

The core.HierarchicalReconciliation class allows you to efficiently fit multiple HierarchicalForecast methods for a collection of time series and base predictions stored in pandas DataFrames. The Y_df dataframe identifies series and datestamps with the unique_id and ds columns, while the y column denotes the target time series variable. The Y_hat_df dataframe stores the base predictions (e.g., from AutoARIMA, ETS, etc.).

Parameters:
reconcilers: A list of instantiated reconciliation classes from the methods module.

References:
Rob J. Hyndman and George Athanasopoulos (2018). “Forecasting principles and practice, Hierarchical and Grouped Series”.*
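For instance, the constructor simply receives the instantiated reconcilers. A minimal sketch using BottomUp and MinTrace from the methods module (the same methods used in the Example section below):

from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, MinTrace

# Each reconciler is instantiated; the wrapper applies all of them to the base forecasts.
hrec = HierarchicalReconciliation(
    reconcilers=[BottomUp(), MinTrace(method='ols')],
)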


source

reconcile

 reconcile (Y_hat_df:Union[DataFrame[Any],LazyFrame[Any]],
            S:Union[DataFrame[Any],LazyFrame[Any]],
            tags:dict[str,numpy.ndarray],
            Y_df:Optional[Union[DataFrame[Any],LazyFrame[Any]]]=None,
            level:Optional[list[int]]=None,
            intervals_method:str='normality', num_samples:int=-1,
            seed:int=0, is_balanced:bool=False, id_col:str='unique_id',
            time_col:str='ds', target_col:str='y')

*Hierarchical Reconciliation Method.

The reconcile method is analogous to scikit-learn's fit_predict method; it applies the different reconciliation techniques instantiated in the reconcilers list.

Most reconciliation methods can be described by the following convenient linear algebra notation:

$$\tilde{\mathbf{y}}_{[a,b],\tau} = \mathbf{S}_{[a,b][b]} \mathbf{P}_{[b][a,b]} \hat{\mathbf{y}}_{[a,b],\tau}$$

where $a, b$ represent the aggregate and bottom levels, $\mathbf{S}_{[a,b][b]}$ contains the hierarchical aggregation constraints, and $\mathbf{P}_{[b][a,b]}$ varies across reconciliation methods. The reconciled predictions are $\tilde{\mathbf{y}}_{[a,b],\tau}$, and the base predictions $\hat{\mathbf{y}}_{[a,b],\tau}$.
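A minimal numeric sketch of this identity for a toy hierarchy with one total and two bottom series, using the bottom-up choice of $\mathbf{P}$; the matrices below are written out in plain NumPy for illustration and are not part of the library API:

import numpy as np

# Hierarchy: total = NSW + VIC  ->  rows ordered [total, NSW, VIC]
S = np.array([[1., 1.],
              [1., 0.],
              [0., 1.]])          # aggregation constraints S_{[a,b][b]}

# Bottom-up: P simply selects the bottom-level base forecasts.
P = np.array([[0., 1., 0.],
              [0., 0., 1.]])      # P_{[b][a,b]}

y_hat = np.array([10., 6., 5.])   # incoherent base forecasts (6 + 5 != 10)
y_tilde = S @ P @ y_hat           # reconciled forecasts

print(y_tilde)                    # [11., 6., 5.] -> coherent: 6 + 5 == 11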

Parameters:
Y_hat_df: DataFrame, base forecasts with columns ['unique_id', 'ds'] and models to reconcile.
Y_df: DataFrame, training set of base time series with columns ['unique_id', 'ds', 'y'].
If a class of self.reconcilers receives y_hat_insample, Y_df must also include the models' in-sample predictions as columns.
S: DataFrame with summing matrix of size (base, bottom), see aggregate method.
tags: Each key is a level and its value contains tags associated to that level.
level: positive float list [0,100), confidence levels for prediction intervals.
intervals_method: str, method used to calculate prediction intervals, one of normality, bootstrap, permbu.
num_samples: int=-1, if positive return that many probabilistic coherent samples.
seed: int=0, random seed for numpy generator's replicability.
is_balanced: bool=False, whether Y_df is balanced; set it to True to speed things up if Y_df is balanced.
id_col: str='unique_id', column that identifies each series.
time_col: str='ds', column that identifies each timestep, its values can be timestamps or integers.
target_col: str='y', column that contains the target.

Returns:
Y_tilde_df: DataFrame, with reconciled predictions.*
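A hedged usage sketch of reconcile with prediction intervals; it assumes Y_hat_df, Y_fitted_df, S_df and tags have already been built as in the Example section below, and the interval column naming mentioned in the final comment is the usual model/reconciler pattern:

# Sketch only: assumes Y_hat_df, Y_fitted_df, S_df and tags already exist
# (see the Example section below for how they are built).
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp

hrec = HierarchicalReconciliation(reconcilers=[BottomUp()])
Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    Y_df=Y_fitted_df,          # in-sample values, needed by several interval methods
    S=S_df,
    tags=tags,
    level=[80, 95],            # 80% and 95% prediction intervals
    intervals_method='bootstrap',
)
# Interval columns typically follow the '<model>/<reconciler>-lo-<level>' pattern.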


source

bootstrap_reconcile

 bootstrap_reconcile (Y_hat_df:Union[DataFrame[Any],LazyFrame[Any]],
                      S_df:Union[DataFrame[Any],LazyFrame[Any]],
                      tags:dict[str,numpy.ndarray],
                      Y_df:Optional[Union[DataFrame[Any],LazyFrame[Any]]]=None,
                      level:Optional[list[int]]=None,
                      intervals_method:str='normality',
                      num_samples:int=-1, num_seeds:int=1,
                      id_col:str='unique_id', time_col:str='ds',
                      target_col:str='y')

*Bootstrapped Hierarchical Reconciliation Method.

Applies the reconcile method N times, each time with a different random seed, for the reconciliation techniques instantiated in the reconcilers list.

Parameters:
Y_hat_df: DataFrame, base forecasts with columns ['unique_id', 'ds'] and models to reconcile.
Y_df: DataFrame, training set of base time series with columns ['unique_id', 'ds', 'y'].
If a class of self.reconcilers receives y_hat_insample, Y_df must also include the models' in-sample predictions as columns.
S_df: DataFrame with summing matrix of size (base, bottom), see aggregate method.
tags: Each key is a level and its value contains tags associated to that level.
level: positive float list [0,100), confidence levels for prediction intervals.
intervals_method: str, method used to calculate prediction intervals, one of normality, bootstrap, permbu.
num_samples: int=-1, if positive return that many probabilistic coherent samples.
num_seeds: int=1, number of random seeds used to repeat the reconciliation.
id_col: str='unique_id', column that identifies each series.
time_col: str='ds', column that identifies each timestep, its values can be timestamps or integers.
target_col: str='y', column that contains the target.

Returns:
Y_bootstrap_df: DataFrame, with bootstrapped reconciled predictions.*
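A similar hedged sketch for bootstrap_reconcile, again assuming the Y_hat_df, Y_fitted_df, S_df and tags objects built in the Example section below:

# Sketch only: assumes Y_hat_df, Y_fitted_df, S_df and tags already exist.
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp

hrec = HierarchicalReconciliation(reconcilers=[BottomUp()])
Y_bootstrap_df = hrec.bootstrap_reconcile(
    Y_hat_df=Y_hat_df,
    Y_df=Y_fitted_df,
    S_df=S_df,
    tags=tags,
    intervals_method='bootstrap',
    num_seeds=10,              # repeat the reconciliation with 10 different seeds
)
# The stacked output contains one reconciled set of predictions per repetition
# (typically identified by a seed column).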

Example

import pandas as pd

from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, MinTrace
from hierarchicalforecast.utils import aggregate
from hierarchicalforecast.evaluation import evaluate
from statsforecast.core import StatsForecast
from statsforecast.models import AutoETS
from utilsforecast.losses import mase, rmse
from functools import partial

# Load TourismSmall dataset
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/tourism.csv')
df = df.rename({'Trips': 'y', 'Quarter': 'ds'}, axis=1)
df.insert(0, 'Country', 'Australia')
# Convert quarterly 'ds' strings to pd.Timestamp (start of each quarter)
qs = df['ds'].str.replace(r'(\d+) (Q\d)', r'\1-\2', regex=True)
df['ds'] = pd.PeriodIndex(qs, freq='Q').to_timestamp()

# Create hierarchical series based on geographic levels and purpose
hierarchy_levels = [['Country'],
                    ['Country', 'State'], 
                    ['Country', 'Purpose'], 
                    ['Country', 'State', 'Region'], 
                    ['Country', 'State', 'Purpose'], 
                    ['Country', 'State', 'Region', 'Purpose']]

Y_df, S_df, tags = aggregate(df=df, spec=hierarchy_levels)

# Split train/test sets
Y_test_df  = Y_df.groupby('unique_id').tail(8)
Y_train_df = Y_df.drop(Y_test_df.index)

# Compute base auto-ETS predictions
# Be careful to identify the correct data frequency; this data is quarterly, so freq='QS'
fcst = StatsForecast(models=[AutoETS(season_length=4, model='ZZA')], freq='QS', n_jobs=-1)
Y_hat_df = fcst.forecast(df=Y_train_df, h=8, fitted=True)
Y_fitted_df = fcst.forecast_fitted_values()

reconcilers = [
                BottomUp(),
                MinTrace(method='ols'),
                MinTrace(method='mint_shrink'),
               ]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, 
                          Y_df=Y_fitted_df,
                          S=S_df, tags=tags)

# Evaluate
eval_tags = {}
eval_tags['Total'] = tags['Country']
eval_tags['Purpose'] = tags['Country/Purpose']
eval_tags['State'] = tags['Country/State']
eval_tags['Regions'] = tags['Country/State/Region']
eval_tags['Bottom'] = tags['Country/State/Region/Purpose']

Y_rec_df_with_y = Y_rec_df.merge(Y_test_df, on=['unique_id', 'ds'], how='left')
mase_p = partial(mase, seasonality=4)

evaluation = evaluate(Y_rec_df_with_y, 
         metrics=[mase_p, rmse], 
         tags=eval_tags, 
         train_df=Y_train_df)

numeric_cols = evaluation.select_dtypes(include="number").columns
evaluation[numeric_cols] = evaluation[numeric_cols].map('{:.2f}'.format)
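As an optional sanity check on the reconciled output (an illustrative addition; the 'AutoETS/BottomUp' column name assumes the model/reconciler naming convention of the reconciled DataFrame), one can verify that the bottom-level forecasts add up to the country total:

import numpy as np

# Coherence check: bottom-level BottomUp forecasts should sum to the total.
bottom_ids = tags['Country/State/Region/Purpose']
bottom_sum = (Y_rec_df[Y_rec_df['unique_id'].isin(bottom_ids)]
              .groupby('ds')['AutoETS/BottomUp'].sum())
total = (Y_rec_df[Y_rec_df['unique_id'] == 'Australia']
         .set_index('ds')['AutoETS/BottomUp'])
print(np.allclose(bottom_sum.sort_index(), total.sort_index()))  # expected: True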