> ## Documentation Index
> Fetch the complete documentation index at: https://nixtlaverse.nixtla.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Long-Horizon Probabilistic Forecasting

Long-horizon forecasting is challenging because of the *volatility* of
the predictions and the *computational complexity*. To solve this
problem we created the [NHITS](https://arxiv.org/abs/2201.12886) model
and made the code available [NeuralForecast
library](https://nixtlaverse.nixtla.io/neuralforecast/models.nhits.html).
`NHITS` specializes its partial outputs in the different frequencies of
the time series through hierarchical interpolation and multi-rate input
processing. We model the target time-series with Student’s
t-distribution. The `NHITS` will output the distribution parameters for
each timestamp.

In this notebook we show how to use `NHITS` on the
[ETTm2](https://github.com/zhouhaoyi/ETDataset) benchmark dataset for
probabilistic forecasting. This data set includes data points for 2
Electricity Transformers at 2 stations, including load, oil temperature.

We will show you how to load data, train, and perform automatic
hyperparameter tuning, **to achieve SoTA performance**, outperforming
even the latest Transformer architectures for a fraction of their
computational cost (50x faster).

You can run these experiments using GPU with Google Colab.

<a href="https://colab.research.google.com/github/Nixtla/neuralforecast/blob/main/nbs/docs/tutorials/longhorizon_probabilistic.ipynb" target="_parent">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" />
</a>

## 1. Libraries

```python theme={null}
%%capture
!pip install neuralforecast datasetsforecast
```

## 2. Load ETTm2 Data

The `LongHorizon` class will automatically download the complete ETTm2
dataset and process it.

It return three Dataframes: `Y_df` contains the values for the target
variables, `X_df` contains exogenous calendar features and `S_df`
contains static features for each time-series (none for ETTm2). For this
example we will only use `Y_df`.

If you want to use your own data just replace `Y_df`. Be sure to use a
long format and have a similar structure to our data set.

```python theme={null}
import pandas as pd
from datasetsforecast.long_horizon import LongHorizon
```

```python theme={null}
# Change this to your own data to try the model
Y_df, _, _ = LongHorizon.load(directory='./', group='ETTm2')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])

# For this excercise we are going to take 960 timestamps as validation and test
n_time = len(Y_df.ds.unique())
val_size = 96*10
test_size = 96*10

Y_df.groupby('unique_id').head(2)
```

|        | unique\_id | ds                  | y         |
| ------ | ---------- | ------------------- | --------- |
| 0      | HUFL       | 2016-07-01 00:00:00 | -0.041413 |
| 1      | HUFL       | 2016-07-01 00:15:00 | -0.185467 |
| 57600  | HULL       | 2016-07-01 00:00:00 | 0.040104  |
| 57601  | HULL       | 2016-07-01 00:15:00 | -0.214450 |
| 115200 | LUFL       | 2016-07-01 00:00:00 | 0.695804  |
| 115201 | LUFL       | 2016-07-01 00:15:00 | 0.434685  |
| 172800 | LULL       | 2016-07-01 00:00:00 | 0.434430  |
| 172801 | LULL       | 2016-07-01 00:15:00 | 0.428168  |
| 230400 | MUFL       | 2016-07-01 00:00:00 | -0.599211 |
| 230401 | MUFL       | 2016-07-01 00:15:00 | -0.658068 |
| 288000 | MULL       | 2016-07-01 00:00:00 | -0.393536 |
| 288001 | MULL       | 2016-07-01 00:15:00 | -0.659338 |
| 345600 | OT         | 2016-07-01 00:00:00 | 1.018032  |
| 345601 | OT         | 2016-07-01 00:15:00 | 0.980124  |

> **Important**
>
> DataFrames must include all `['unique_id', 'ds', 'y']` columns. Make
> sure `y` column does not have missing or non-numeric values.

Next, plot the `HUFL` variable marking the validation and train splits.

```python theme={null}
import matplotlib.pyplot as plt
from utilsforecast.plotting import plot_series
```

```python theme={null}
u_id = 'HUFL'
fig = plot_series(Y_df, ids=[u_id])
ax = fig.axes[0]

x_plot = pd.to_datetime(Y_df[Y_df.unique_id==u_id].ds)
y_plot = Y_df[Y_df.unique_id==u_id].y.values
x_val = x_plot[n_time - val_size - test_size]
x_test = x_plot[n_time - test_size]

ax.axvline(x_val, color='black', linestyle='-.')
ax.axvline(x_test, color='black', linestyle='-.')
ax.text(x_val, 5, '  Validation', fontsize=12)
ax.text(x_test, 3, '  Test', fontsize=12)
fig
```

<img src="https://mintcdn.com/nixtla/0bpBL0UL20A7UQ3S/neuralforecast/docs/tutorials/longhorizon_probabilistic_files/figure-markdown_strict/cell-6-output-1.png?fit=max&auto=format&n=0bpBL0UL20A7UQ3S&q=85&s=43db6cd599b18b23ffd9de88a8662010" alt="" width="1697" height="361" data-path="neuralforecast/docs/tutorials/longhorizon_probabilistic_files/figure-markdown_strict/cell-6-output-1.png" />

## 3. Hyperparameter selection and forecasting

The `AutoNHITS` class will automatically perform hyperparamter tunning
using [Tune library](https://docs.ray.io/en/latest/tune/index.html),
exploring a user-defined or default search space. Models are selected
based on the error on a validation set and the best model is then stored
and used during inference.

The `AutoNHITS.default_config` attribute contains a suggested
hyperparameter space. Here, we specify a different search space
following the paper’s hyperparameters. Notice that *1000 Stochastic
Gradient Steps* are enough to achieve SoTA performance. Feel free to
play around with this space.

```python theme={null}
import logging

import torch
from neuralforecast.auto import AutoNHITS
from neuralforecast.core import NeuralForecast
from neuralforecast.losses.pytorch import DistributionLoss
from ray import tune
```

```python theme={null}
logging.getLogger("pytorch_lightning").setLevel(logging.WARNING)
torch.set_float32_matmul_precision('high')
```

```python theme={null}
horizon = 96 # 24hrs = 4 * 15 min.

# Use your own config or AutoNHITS.default_config
nhits_config = {
       "learning_rate": tune.choice([1e-3]),                                     # Initial Learning rate
       "max_steps": tune.choice([1000]),                                         # Number of SGD steps
       "input_size": tune.choice([5 * horizon]),                                 # input_size = multiplier * horizon
       "batch_size": tune.choice([7]),                                           # Number of series in windows
       "windows_batch_size": tune.choice([256]),                                 # Number of windows in batch
       "n_pool_kernel_size": tune.choice([[2, 2, 2], [16, 8, 1]]),               # MaxPool's Kernelsize
       "n_freq_downsample": tune.choice([[168, 24, 1], [24, 12, 1], [1, 1, 1]]), # Interpolation expressivity ratios
       "activation": tune.choice(['ReLU']),                                      # Type of non-linear activation
       "n_blocks":  tune.choice([[1, 1, 1]]),                                    # Blocks per each 3 stacks
       "mlp_units":  tune.choice([[[512, 512], [512, 512], [512, 512]]]),        # 2 512-Layers per block for each stack
       "interpolation_mode": tune.choice(['linear']),                            # Type of multi-step interpolation
       "random_seed": tune.randint(1, 10),
       "scaler_type": tune.choice(['robust']),
       "val_check_steps": tune.choice([100])
    }
```

> **Tip**
>
> Refer to [https://docs.ray.io/en/latest/tune/index.html](https://docs.ray.io/en/latest/tune/index.html) for more
> information on the different space options, such as lists and
> continous intervals.m

To instantiate `AutoNHITS` you need to define:

* `h`: forecasting horizon
* `loss`: training loss. Use the `DistributionLoss` to produce
  probabilistic forecasts.
* `config`: hyperparameter search space. If `None`, the `AutoNHITS`
  class will use a pre-defined suggested hyperparameter space.
* `num_samples`: number of configurations explored.

```python theme={null}
models = [AutoNHITS(h=horizon,
                    loss=DistributionLoss(distribution='StudentT', level=[80, 90]), 
                    config=nhits_config,
                    num_samples=5)]
```

Fit the model by instantiating a `NeuralForecast` object with the
following required parameters:

* `models`: a list of models.

* `freq`: a string indicating the frequency of the data. (See [panda’s
  available
  frequencies](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases).)

```python theme={null}
# Fit and predict
nf = NeuralForecast(models=models, freq='15min')
```

The `cross_validation` method allows you to simulate multiple historic
forecasts, greatly simplifying pipelines by replacing for loops with
`fit` and `predict` methods.

With time series data, cross validation is done by defining a sliding
window across the historical data and predicting the period following
it. This form of cross validation allows us to arrive at a better
estimation of our model’s predictive abilities across a wider range of
temporal instances while also keeping the data in the training set
contiguous as is required by our models.

The `cross_validation` method will use the validation set for
hyperparameter selection, and will then produce the forecasts for the
test set.

```python theme={null}
%%capture
Y_hat_df = nf.cross_validation(df=Y_df, val_size=val_size,
                               test_size=test_size, n_windows=None)
```

## 4. Visualization

Finally, we merge the forecasts with the `Y_df` dataset and plot the
forecasts.

```python theme={null}
Y_hat_df
```

|        | unique\_id | ds                  | cutoff              | AutoNHITS | AutoNHITS-median | AutoNHITS-lo-90 | AutoNHITS-lo-80 | AutoNHITS-hi-80 | AutoNHITS-hi-90 | y         |
| ------ | ---------- | ------------------- | ------------------- | --------- | ---------------- | --------------- | --------------- | --------------- | --------------- | --------- |
| 0      | HUFL       | 2018-02-11 00:00:00 | 2018-02-10 23:45:00 | -0.922304 | -0.914175        | -1.217987       | -1.138274       | -0.708157       | -0.617799       | -0.849571 |
| 1      | HUFL       | 2018-02-11 00:15:00 | 2018-02-10 23:45:00 | -0.954299 | -0.957198        | -1.403932       | -1.263984       | -0.618467       | -0.442688       | -1.049700 |
| 2      | HUFL       | 2018-02-11 00:30:00 | 2018-02-10 23:45:00 | -0.987538 | -0.972558        | -1.512509       | -1.310191       | -0.621673       | -0.444359       | -1.185730 |
| 3      | HUFL       | 2018-02-11 00:45:00 | 2018-02-10 23:45:00 | -1.067760 | -1.063188        | -1.614276       | -1.475302       | -0.665729       | -0.521775       | -1.329785 |
| 4      | HUFL       | 2018-02-11 01:00:00 | 2018-02-10 23:45:00 | -1.001276 | -1.001494        | -1.508795       | -1.390156       | -0.629212       | -0.470608       | -1.369715 |
| ...    | ...        | ...                 | ...                 | ...       | ...              | ...             | ...             | ...             | ...             | ...       |
| 581275 | OT         | 2018-02-20 22:45:00 | 2018-02-19 23:45:00 | -1.200041 | -1.200862        | -1.591271       | -1.490571       | -0.907190       | -0.779424       | -1.581325 |
| 581276 | OT         | 2018-02-20 23:00:00 | 2018-02-19 23:45:00 | -1.237206 | -1.225333        | -1.618691       | -1.518204       | -0.960075       | -0.838512       | -1.581325 |
| 581277 | OT         | 2018-02-20 23:15:00 | 2018-02-19 23:45:00 | -1.232434 | -1.229675        | -1.591164       | -1.481251       | -0.989993       | -0.870404       | -1.581325 |
| 581278 | OT         | 2018-02-20 23:30:00 | 2018-02-19 23:45:00 | -1.259237 | -1.258848        | -1.659239       | -1.536979       | -0.985581       | -0.822370       | -1.562328 |
| 581279 | OT         | 2018-02-20 23:45:00 | 2018-02-19 23:45:00 | -1.247161 | -1.251899        | -1.631909       | -1.520350       | -0.949529       | -0.832602       | -1.562328 |

```python theme={null}
Y_hat_df = Y_hat_df.reset_index(drop=True)
Y_hat_df = Y_hat_df[(Y_hat_df['unique_id']=='OT') & (Y_hat_df['cutoff']=='2018-02-11 12:00:00')]
Y_hat_df = Y_hat_df.drop(columns=['y','cutoff'])
```

```python theme={null}
plot_df = Y_df.merge(Y_hat_df, on=['unique_id','ds'], how='outer').tail(96*10+50+96*4).head(96*2+96*4)
plot_series(forecasts_df=plot_df.drop(columns='AutoNHITS').rename(columns={'AutoNHITS-median': 'AutoNHITS'}), level=[90])
```

<img src="https://mintcdn.com/nixtla/0bpBL0UL20A7UQ3S/neuralforecast/docs/tutorials/longhorizon_probabilistic_files/figure-markdown_strict/cell-15-output-1.png?fit=max&auto=format&n=0bpBL0UL20A7UQ3S&q=85&s=14a65fc7b77e58b7494a18312fe01c97" alt="" width="1827" height="361" data-path="neuralforecast/docs/tutorials/longhorizon_probabilistic_files/figure-markdown_strict/cell-15-output-1.png" />

## References

[Cristian Challu, Kin G. Olivares, Boris N. Oreshkin, Federico Garza,
Max Mergenthaler-Canseco, Artur Dubrawski (2021). NHITS: Neural
Hierarchical Interpolation for Time Series Forecasting. Accepted at AAAI
2023.](https://arxiv.org/abs/2201.12886)
