> BaseAuto class for hyperparameter optimization in NeuralForecast. Integrates Optuna, HyperOpt, Dragonfly through Ray for automated model tuning with cross-validation.

# Hyperparameter Optimization | NeuralForecast

Machine learning forecasting methods are defined by many hyperparameters that
control their behavior, with effects ranging from their speed and memory
requirements to their predictive performance. For a long time, manual
hyperparameter tuning prevailed. Because this approach is time-consuming,
**automated hyperparameter optimization** methods have been introduced, proving
more efficient than manual tuning, grid search, and random search.

The `BaseAuto` class offers a shared API to hyperparameter optimization
algorithms like
[Optuna](https://docs.ray.io/en/latest/tune/examples/bayesopt_example.html),
[HyperOpt](https://docs.ray.io/en/latest/tune/examples/hyperopt_example.html), and
[Dragonfly](https://docs.ray.io/en/releases-2.7.0/tune/examples/dragonfly_example.html),
among others, through `ray`, which gives you access to grid search, Bayesian
optimization, and other state-of-the-art tools like HyperBand.

Understanding the impact of hyperparameters is still a
valuable skill, as it helps guide the design of informed hyperparameter
spaces that are faster to explore automatically.
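As a rough illustration of why an informed space is faster to explore, the sketch below counts how many configurations an exhaustive grid search would visit in a broad space versus a narrowed one. The hyperparameter names and candidate values are hypothetical and chosen only for this example; they are not defaults of any NeuralForecast model.

```python
from itertools import product

# Hypothetical broad search space (illustrative values, not library defaults).
broad_space = {
    "hidden_size": [64, 128, 256, 512, 1024],
    "num_layers": [1, 2, 3, 4, 5, 6],
    "learning_rate": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
}

# The same hyperparameters, narrowed using prior knowledge of what tends to work.
informed_space = {
    "hidden_size": [256, 512],
    "num_layers": [3, 4],
    "learning_rate": [1e-4, 1e-3],
}


def grid_size(space):
    """Count the configurations an exhaustive grid search would visit."""
    return len(list(product(*space.values())))


broad_count = grid_size(broad_space)        # 5 * 6 * 5 = 150
informed_count = grid_size(informed_space)  # 2 * 2 * 2 = 8
print(f"broad: {broad_count} configurations, informed: {informed_count}")
```

With a fixed budget of trials, the narrowed space is covered far more densely, which is the practical payoff of knowing which ranges matter.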

<img src="https://mintcdn.com/nixtla/ldwvWbCUC65OBWwN/neuralforecast/imgs_models/data_splits.png?fit=max&auto=format&n=ldwvWbCUC65OBWwN&q=85&s=958e6fb0a17a49b3c2516c1582a45e2f" alt="" width="1075" height="420" data-path="neuralforecast/imgs_models/data_splits.png" />

*Figure 1. Example of a dataset split: train (left), validation (yellow), and test (orange). The hyperparameter optimization guiding signal is obtained from the validation set.*

### `BaseAuto`

```python theme={null}
BaseAuto(
    cls_model,
    h,
    loss,
    valid_loss,
    config,
    search_alg=BasicVariantGenerator(random_state=1),
    num_samples=10,
    time_budget=None,
    cpus=cpu_count(),
    gpus=torch.cuda.device_count(),
    refit_with_val=False,
    verbose=False,
    alias=None,
    backend="ray",
    callbacks=None,
)
```

Bases: <code>[LightningModule](#pytorch_lightning.LightningModule)</code>

Class for automatic hyperparameter optimization. It builds on top of `ray` to
give access to a wide variety of hyperparameter optimization tools, ranging
from classic grid search to Bayesian optimization and the HyperBand algorithm.

The validation loss to be optimized is defined by the `config['loss']` dictionary
value; the config also contains the rest of the hyperparameter search space.

The success of this hyperparameter optimization relies heavily on a strong
correlation between the validation and test periods: a configuration chosen
because it scores well on validation only helps if that ranking carries over to
the test period.
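To make the role of the validation loss concrete, here is a minimal random-search sketch in plain Python. It is not how `BaseAuto` is implemented (the class delegates the search to `ray` or `optuna`); `toy_validation_loss` is a hypothetical stand-in for training a model and scoring it on the validation set, and `num_samples` simply mirrors the name of the constructor argument.

```python
import random

# Hypothetical search space with categorical choices (illustrative values).
search_space = {"hidden_size": [128, 256, 512], "num_layers": [2, 3, 4]}


def toy_validation_loss(config):
    """Stand-in for 'train a model, then score it on the validation set'.

    The closed-form expression is an assumption made for illustration only.
    """
    return abs(config["hidden_size"] - 256) / 256 + abs(config["num_layers"] - 3)


def random_search(space, objective, num_samples, seed=1):
    """Sample `num_samples` configurations and keep the lowest validation loss."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_samples):
        config = {name: rng.choice(values) for name, values in space.items()}
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(search_space, toy_validation_loss, num_samples=10)
print(best_config, best_loss)
```

The test set never enters the loop: the selected configuration is only as good as the validation loss is at predicting test-period performance.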

**Parameters:**

| Name             | Type                                                   | Description                                                                                                                                                                                                                                                                                                     | Default                                                                                                     |
| ---------------- | ------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| `cls_model`      | <code>PyTorch/PyTorchLightning model</code>            | See `neuralforecast.models` [collection here](./models.html).                                                                                                                                                                                                                                                   | *required*                                                                                                  |
| `h`              | <code>[int](#int)</code>                               | Forecast horizon                                                                                                                                                                                                                                                                                                | *required*                                                                                                  |
| `loss`           | <code>PyTorch module</code>                            | Instantiated train loss class from [losses collection](./losses.pytorch.html).                                                                                                                                                                                                                                  | *required*                                                                                                  |
| `valid_loss`     | <code>PyTorch module</code>                            | Instantiated valid loss class from [losses collection](./losses.pytorch.html).                                                                                                                                                                                                                                  | *required*                                                                                                  |
| `config`         | <code>[dict](#dict) or [callable](#callable)</code>    | Dictionary with ray.tune defined search space or function that takes an optuna trial and returns a configuration dict.                                                                                                                                                                                          | *required*                                                                                                  |
| `search_alg`     | <code>ray.tune.search variant or optuna.sampler</code> | For ray, see [https://docs.ray.io/en/latest/tune/api\_docs/suggestion.html](https://docs.ray.io/en/latest/tune/api_docs/suggestion.html). For optuna, see [https://optuna.readthedocs.io/en/stable/reference/samplers/index.html](https://optuna.readthedocs.io/en/stable/reference/samplers/index.html).        | <code>[BasicVariantGenerator](#ray.tune.search.basic_variant.BasicVariantGenerator)(random\_state=1)</code> |
| `num_samples`    | <code>[int](#int)</code>                               | Number of hyperparameter optimization steps/samples.                                                                                                                                                                                                                                                            | <code>10</code>                                                                                             |
| `time_budget`    | <code>[int](#int)</code>                               | Time budget in seconds for the hyperparameter search.                                                                                                                                                                                                                                                           | <code>None</code>                                                                                           |
| `cpus`           | <code>[int](#int)</code>                               | Number of cpus to use during optimization. Only used with ray tune.                                                                                                                                                                                                                                             | <code>[cpu\_count](#os.cpu_count)()</code>                                                                  |
| `gpus`           | <code>[int](#int)</code>                               | Number of gpus to use during optimization, default all available. Only used with ray tune.                                                                                                                                                                                                                      | <code>[device\_count](#torch.cuda.device_count)()</code>                                                    |
| `refit_with_val` | <code>[bool](#bool)</code>                             | Refit of best model should preserve val\_size.                                                                                                                                                                                                                                                                  | <code>False</code>                                                                                          |
| `verbose`        | <code>[bool](#bool)</code>                             | Track progress.                                                                                                                                                                                                                                                                                                 | <code>False</code>                                                                                          |
| `alias`          | <code>[str](#str)</code>                               | Custom name of the model.                                                                                                                                                                                                                                                                                       | <code>None</code>                                                                                           |
| `backend`        | <code>[str](#str)</code>                               | Backend to use for searching the hyperparameter space, can be either 'ray' or 'optuna'.                                                                                                                                                                                                                         | <code>'ray'</code>                                                                                          |
| `callbacks`      | <code>list of callable</code>                          | List of functions to call during the optimization process. For ray, see [https://docs.ray.io/en/latest/tune/tutorials/tune-metrics.html](https://docs.ray.io/en/latest/tune/tutorials/tune-metrics.html). For optuna, see [https://optuna.readthedocs.io/en/stable](https://optuna.readthedocs.io/en/stable). | <code>None</code>                                                                                           |

#### `BaseAuto.fit`

```python theme={null}
fit(
    dataset, val_size=0, test_size=0, random_seed=None, distributed_config=None
)
```

Perform the hyperparameter optimization as specified by the BaseAuto configuration
dictionary `config`.

The optimization is performed on the `TimeSeriesDataset` using temporal cross-validation,
with a validation set that immediately precedes the test set.
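The ordering of the three periods can be sketched with plain index arithmetic. This is an illustration of the layout described above under the assumption that the test set is the final `test_size` steps and the validation set is the `val_size` steps just before it; `temporal_split` is a hypothetical helper, not a NeuralForecast function.

```python
def temporal_split(n_time, val_size, test_size):
    """Return (train, val, test) index ranges for a series with n_time steps.

    The test set takes the final `test_size` steps and the validation set
    takes the `val_size` steps immediately before it.
    """
    if val_size <= 0:
        raise ValueError("val_size must be greater than 0")
    if test_size < 0:
        raise ValueError("test_size must be non-negative")
    if val_size + test_size >= n_time:
        raise ValueError("val_size + test_size must leave room for training data")
    test_start = n_time - test_size
    val_start = test_start - val_size
    return range(0, val_start), range(val_start, test_start), range(test_start, n_time)


# Example: 144 time steps, with 12 steps each for validation and test.
train_idx, val_idx, test_idx = temporal_split(n_time=144, val_size=12, test_size=12)
print(len(train_idx), len(val_idx), len(test_idx))
```

Keeping the validation window adjacent to the test window is what makes the validation loss a plausible proxy for test-period performance.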

**Parameters:**

| Name          | Type                                              | Description                                                                 | Default           |
| ------------- | ------------------------------------------------- | --------------------------------------------------------------------------- | ----------------- |
| `dataset`     | <code>NeuralForecast's `TimeSeriesDataset`</code> | NeuralForecast's `TimeSeriesDataset` see details [here](./tsdataset.html)   | *required*        |
| `val_size`    | <code>[int](#int)</code>                          | Size of temporal validation set (needs to be bigger than 0).                | <code>0</code>    |
| `test_size`   | <code>[int](#int)</code>                          | Size of temporal test set (default 0).                                      | <code>0</code>    |
| `random_seed` | <code>[int](#int)</code>                          | Random seed for hyperparameter exploration algorithms, not yet implemented. | <code>None</code> |

**Returns:**

| Name   | Type | Description                                                          |
| ------ | ---- | -------------------------------------------------------------------- |
| `self` |      | Fitted instance of `BaseAuto` with best hyperparameters and results. |

#### `BaseAuto.predict`

```python theme={null}
predict(dataset, step_size=1, h=None, **data_kwargs)
```

Predictions of the model that performed best on the validation set.

**Parameters:**

| Name           | Type                                              | Description                                                                     | Default           |
| -------------- | ------------------------------------------------- | ------------------------------------------------------------------------------- | ----------------- |
| `dataset`      | <code>NeuralForecast's `TimeSeriesDataset`</code> | NeuralForecast's `TimeSeriesDataset` see details [here](./tsdataset.html)       | *required*        |
| `step_size`    | <code>[int](#int)</code>                          | Steps between sequential predictions.                                           | <code>1</code>    |
| `h`            | <code>[int](#int)</code>                          | Prediction horizon. If None, uses the model's fitted horizon.                   | <code>None</code> |
| `**data_kwargs` |                                                  | Additional parameters for the dataset module.                                   |                   |

**Returns:**

| Name    | Type | Description                                      |
| ------- | ---- | ------------------------------------------------ |
| `y_hat` |      | Numpy predictions of the `NeuralForecast` model. |

### Usage Example

```python theme={null}
from ray import tune


class RayLogLossesCallback(tune.Callback):
    """Print the train and validation losses each time a ray trial completes."""

    def on_trial_complete(self, iteration, trials, trial, **info):
        result = trial.last_result
        print(40 * '-' + 'Trial finished' + 40 * '-')
        print(f'Train loss: {result["train_loss"]:.2f}. Valid loss: {result["loss"]:.2f}')
        print(80 * '-')
```

```python theme={null}
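from ray import tune
from neuralforecast.common._base_auto import BaseAuto
from neuralforecast.losses.numpy import mae
from neuralforecast.losses.pytorch import MAE, MSE
from neuralforecast.models import MLP
from neuralforecast.tsdataset import TimeSeriesDataset
from neuralforecast.utils import AirPassengersDF

# Assumed setup (not shown in the original): AirPassengers with the last 12 months held out.
Y_train_df = AirPassengersDF[AirPassengersDF.ds <= '1959-12-31']
Y_test_df = AirPassengersDF[AirPassengersDF.ds > '1959-12-31']
dataset, *_ = TimeSeriesDataset.from_df(Y_train_df)

config = {
    "hidden_size": tune.choice([512]),
    "num_layers": tune.choice([3, 4]),
    "input_size": 12,
    "max_steps": 10,
    "val_check_steps": 5
}
auto = BaseAuto(h=12, loss=MAE(), valid_loss=MSE(), cls_model=MLP, config=config, num_samples=2, cpus=1, gpus=0, callbacks=[RayLogLossesCallback()])
auto.fit(dataset=dataset)
y_hat = auto.predict(dataset=dataset)
assert mae(Y_test_df['y'].values, y_hat[:, 0]) < 200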
```

```python theme={null}
def config_f(trial):
    return {
        "hidden_size": trial.suggest_categorical('hidden_size', [512]),
        "num_layers": trial.suggest_categorical('num_layers', [3, 4]),
        "input_size": 12,
        "max_steps": 10,
        "val_check_steps": 5
    }

class OptunaLogLossesCallback:
    def __call__(self, study, trial):
        metrics = trial.user_attrs['METRICS']
        print(40 * '-' + 'Trial finished' + 40 * '-')
        print(f'Train loss: {metrics["train_loss"]:.2f}. Valid loss: {metrics["loss"]:.2f}')
        print(80 * '-')
```

```python theme={null}
import optuna

# Same search with the optuna backend: `config_f` builds the configuration from a trial.
auto2 = BaseAuto(h=12, loss=MAE(), valid_loss=MSE(), cls_model=MLP, config=config_f, search_alg=optuna.samplers.RandomSampler(), num_samples=2, backend='optuna', callbacks=[OptunaLogLossesCallback()])
auto2.fit(dataset=dataset)
assert isinstance(auto2.results, optuna.Study)
y_hat2 = auto2.predict(dataset=dataset)
assert mae(Y_test_df['y'].values, y_hat2[:, 0]) < 200
```

### References

* [James Bergstra, Remi Bardenet, Yoshua Bengio, and Balazs Kegl
  (2011). “Algorithms for Hyper-Parameter Optimization”. In: Advances
  in Neural Information Processing Systems. url:
  https://proceedings.neurips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf](https://proceedings.neurips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf)
* [Kirthevasan Kandasamy, Karun Raju Vysyaraju, Willie Neiswanger,
  Biswajit Paria, Christopher R. Collins, Jeff Schneider, Barnabas
  Poczos, Eric P. Xing (2019). “Tuning Hyperparameters without Grad
  Students: Scalable and Robust Bayesian Optimisation with Dragonfly”.
  Journal of Machine Learning Research. url:
  https://arxiv.org/abs/1903.06694](https://arxiv.org/abs/1903.06694)
* [Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh,
  Ameet Talwalkar (2016). “Hyperband: A Novel Bandit-Based Approach to
  Hyperparameter Optimization”. Journal of Machine Learning Research.
  url:
  https://arxiv.org/abs/1603.06560](https://arxiv.org/abs/1603.06560)
