Long-Horizon Probabilistic Forecasting

Long-horizon forecasting is challenging because of the volatility of the predictions and the computational complexity. To solve this problem we created the NHITS model and made the code available NeuralForecast library. NHITS specializes its partial outputs in the different frequencies of the time series through hierarchical interpolation and multi-rate input processing. We model the target time-series with Student’s t-distribution. The NHITS will output the distribution parameters for each timestamp. In this notebook we show how to use NHITS on the ETTm2 benchmark dataset for probabilistic forecasting. This data set includes data points for 2 Electricity Transformers at 2 stations, including load, oil temperature. We will show you how to load data, train, and perform automatic hyperparameter tuning, to achieve SoTA performance, outperforming even the latest Transformer architectures for a fraction of their computational cost (50x faster). You can run these experiments using GPU with Google Colab.

1. Libraries

!pip install neuralforecast datasetsforecast

2. Load ETTm2 Data

The LongHorizon class will automatically download the complete ETTm2 dataset and process it. It return three Dataframes: Y_df contains the values for the target variables, X_df contains exogenous calendar features and S_df contains static features for each time-series (none for ETTm2). For this example we will only use Y_df. If you want to use your own data just replace Y_df. Be sure to use a long format and have a simmilar structure than our data set.

import pandas as pd
from datasetsforecast.long_horizon import LongHorizon

# Change this to your own data to try the model
Y_df, _, _ = LongHorizon.load(directory='./', group='ETTm2')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])

# For this excercise we are going to take 960 timestamps as validation and test
n_time = len(Y_df.ds.unique())
val_size = 96*10
test_size = 96*10

Y_df.groupby('unique_id').head(2)

	unique_id	ds	y
0	HUFL	2016-07-01 00:00:00	-0.041413
1	HUFL	2016-07-01 00:15:00	-0.185467
57600	HULL	2016-07-01 00:00:00	0.040104
57601	HULL	2016-07-01 00:15:00	-0.214450
115200	LUFL	2016-07-01 00:00:00	0.695804
115201	LUFL	2016-07-01 00:15:00	0.434685
172800	LULL	2016-07-01 00:00:00	0.434430
172801	LULL	2016-07-01 00:15:00	0.428168
230400	MUFL	2016-07-01 00:00:00	-0.599211
230401	MUFL	2016-07-01 00:15:00	-0.658068
288000	MULL	2016-07-01 00:00:00	-0.393536
288001	MULL	2016-07-01 00:15:00	-0.659338
345600	OT	2016-07-01 00:00:00	1.018032
345601	OT	2016-07-01 00:15:00	0.980124

Important DataFrames must include all ['unique_id', 'ds', 'y'] columns. Make sure y column does not have missing or non-numeric values.

Next, plot the HUFL variable marking the validation and train splits.

import matplotlib.pyplot as plt
from utilsforecast.plotting import plot_series

u_id = 'HUFL'
fig = plot_series(Y_df, ids=[u_id])
ax = fig.axes[0]

x_plot = pd.to_datetime(Y_df[Y_df.unique_id==u_id].ds)
y_plot = Y_df[Y_df.unique_id==u_id].y.values
x_val = x_plot[n_time - val_size - test_size]
x_test = x_plot[n_time - test_size]

ax.axvline(x_val, color='black', linestyle='-.')
ax.axvline(x_test, color='black', linestyle='-.')
ax.text(x_val, 5, '  Validation', fontsize=12)
ax.text(x_test, 3, '  Test', fontsize=12)
fig

3. Hyperparameter selection and forecasting

The AutoNHITS class will automatically perform hyperparamter tunning using Tune library, exploring a user-defined or default search space. Models are selected based on the error on a validation set and the best model is then stored and used during inference. The AutoNHITS.default_config attribute contains a suggested hyperparameter space. Here, we specify a different search space following the paper’s hyperparameters. Notice that 1000 Stochastic Gradient Steps are enough to achieve SoTA performance. Feel free to play around with this space.

import logging

import torch
from neuralforecast.auto import AutoNHITS
from neuralforecast.core import NeuralForecast
from neuralforecast.losses.pytorch import DistributionLoss
from ray import tune

logging.getLogger("pytorch_lightning").setLevel(logging.WARNING)
torch.set_float32_matmul_precision('high')

horizon = 96 # 24hrs = 4 * 15 min.

# Use your own config or AutoNHITS.default_config
nhits_config = {
       "learning_rate": tune.choice([1e-3]),                                     # Initial Learning rate
       "max_steps": tune.choice([1000]),                                         # Number of SGD steps
       "input_size": tune.choice([5 * horizon]),                                 # input_size = multiplier * horizon
       "batch_size": tune.choice([7]),                                           # Number of series in windows
       "windows_batch_size": tune.choice([256]),                                 # Number of windows in batch
       "n_pool_kernel_size": tune.choice([[2, 2, 2], [16, 8, 1]]),               # MaxPool's Kernelsize
       "n_freq_downsample": tune.choice([[168, 24, 1], [24, 12, 1], [1, 1, 1]]), # Interpolation expressivity ratios
       "activation": tune.choice(['ReLU']),                                      # Type of non-linear activation
       "n_blocks":  tune.choice([[1, 1, 1]]),                                    # Blocks per each 3 stacks
       "mlp_units":  tune.choice([[[512, 512], [512, 512], [512, 512]]]),        # 2 512-Layers per block for each stack
       "interpolation_mode": tune.choice(['linear']),                            # Type of multi-step interpolation
       "random_seed": tune.randint(1, 10),
       "scaler_type": tune.choice(['robust']),
       "val_check_steps": tune.choice([100])
    }

Tip Refer to https://docs.ray.io/en/latest/tune/index.html for more information on the different space options, such as lists and continous intervals.m

To instantiate AutoNHITS you need to define:

h: forecasting horizon
loss: training loss. Use the DistributionLoss to produce probabilistic forecasts.
config: hyperparameter search space. If None, the AutoNHITS class will use a pre-defined suggested hyperparameter space.
num_samples: number of configurations explored.

models = [AutoNHITS(h=horizon,
                    loss=DistributionLoss(distribution='StudentT', level=[80, 90]), 
                    config=nhits_config,
                    num_samples=5)]

Fit the model by instantiating a NeuralForecast object with the following required parameters:

models: a list of models.
freq: a string indicating the frequency of the data. (See panda’s available frequencies.)

# Fit and predict
nf = NeuralForecast(models=models, freq='15min')

The cross_validation method allows you to simulate multiple historic forecasts, greatly simplifying pipelines by replacing for loops with fit and predict methods. With time series data, cross validation is done by defining a sliding window across the historical data and predicting the period following it. This form of cross validation allows us to arrive at a better estimation of our model’s predictive abilities across a wider range of temporal instances while also keeping the data in the training set contiguous as is required by our models. The cross_validation method will use the validation set for hyperparameter selection, and will then produce the forecasts for the test set.

Y_hat_df = nf.cross_validation(df=Y_df, val_size=val_size,
                               test_size=test_size, n_windows=None)

4. Visualization

Finally, we merge the forecasts with the Y_df dataset and plot the forecasts.

Y_hat_df

	unique_id	ds	cutoff	AutoNHITS	AutoNHITS-median	AutoNHITS-lo-90	AutoNHITS-lo-80	AutoNHITS-hi-80	AutoNHITS-hi-90	y
0	HUFL	2018-02-11 00:00:00	2018-02-10 23:45:00	-0.922304	-0.914175	-1.217987	-1.138274	-0.708157	-0.617799	-0.849571
1	HUFL	2018-02-11 00:15:00	2018-02-10 23:45:00	-0.954299	-0.957198	-1.403932	-1.263984	-0.618467	-0.442688	-1.049700
2	HUFL	2018-02-11 00:30:00	2018-02-10 23:45:00	-0.987538	-0.972558	-1.512509	-1.310191	-0.621673	-0.444359	-1.185730
3	HUFL	2018-02-11 00:45:00	2018-02-10 23:45:00	-1.067760	-1.063188	-1.614276	-1.475302	-0.665729	-0.521775	-1.329785
4	HUFL	2018-02-11 01:00:00	2018-02-10 23:45:00	-1.001276	-1.001494	-1.508795	-1.390156	-0.629212	-0.470608	-1.369715
…	…	…	…	…	…	…	…	…	…	…
581275	OT	2018-02-20 22:45:00	2018-02-19 23:45:00	-1.200041	-1.200862	-1.591271	-1.490571	-0.907190	-0.779424	-1.581325
581276	OT	2018-02-20 23:00:00	2018-02-19 23:45:00	-1.237206	-1.225333	-1.618691	-1.518204	-0.960075	-0.838512	-1.581325
581277	OT	2018-02-20 23:15:00	2018-02-19 23:45:00	-1.232434	-1.229675	-1.591164	-1.481251	-0.989993	-0.870404	-1.581325
581278	OT	2018-02-20 23:30:00	2018-02-19 23:45:00	-1.259237	-1.258848	-1.659239	-1.536979	-0.985581	-0.822370	-1.562328
581279	OT	2018-02-20 23:45:00	2018-02-19 23:45:00	-1.247161	-1.251899	-1.631909	-1.520350	-0.949529	-0.832602	-1.562328

Y_hat_df = Y_hat_df.reset_index(drop=True)
Y_hat_df = Y_hat_df[(Y_hat_df['unique_id']=='OT') & (Y_hat_df['cutoff']=='2018-02-11 12:00:00')]
Y_hat_df = Y_hat_df.drop(columns=['y','cutoff'])

plot_df = Y_df.merge(Y_hat_df, on=['unique_id','ds'], how='outer').tail(96*10+50+96*4).head(96*2+96*4)
plot_series(forecasts_df=plot_df.drop(columns='AutoNHITS').rename(columns={'AutoNHITS-median': 'AutoNHITS'}), level=[90])

References

Cristian Challu, Kin G. Olivares, Boris N. Oreshkin, Federico Garza, Max Mergenthaler-Canseco, Artur Dubrawski (2021). NHITS: Neural Hierarchical Interpolation for Time Series Forecasting. Accepted at AAAI 2023.

Getting Started

Capabilities

Tutorials

Use cases

API Reference

Long-Horizon Probabilistic Forecasting

1. Libraries

2. Load ETTm2 Data

3. Hyperparameter selection and forecasting

4. Visualization

References

Getting Started

Capabilities

Tutorials

Use cases

API Reference

​1. Libraries

​2. Load ETTm2 Data

​3. Hyperparameter selection and forecasting

​4. Visualization

​References

1. Libraries

2. Load ETTm2 Data

3. Hyperparameter selection and forecasting

4. Visualization

References