# M5

> M5 dataset

### `M5`

```python theme={null}
M5(source_url='https://github.com/Nixtla/m5-forecasts/raw/main/datasets/m5.zip')
```

#### `M5.download`

```python theme={null}
download(directory)
```

Downloads M5 Competition Dataset.

**Parameters:**

| Name        | Type                     | Description                         | Default    |
| ----------- | ------------------------ | ----------------------------------- | ---------- |
| `directory` | <code>[str](#str)</code> | Directory path to download dataset. | *required* |

#### `M5.load`

```python theme={null}
load(directory, cache=True)
```

Downloads and loads M5 data.

**Parameters:**

| Name        | Type                       | Description                                                                 | Default           |
| ----------- | -------------------------- | --------------------------------------------------------------------------- | ----------------- |
| `directory` | <code>[str](#str)</code>   | Directory where data will be downloaded.                                    | *required*        |
| `cache`     | <code>[bool](#bool)</code> | If `True`, caches the processed data on disk and reloads it on later calls. | <code>True</code> |

**Returns:**

| Type                                                                                                                                 | Description                                                                                                                                                                                                                                                      |
| ------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <code>[Tuple](#typing.Tuple)\[[DataFrame](#pandas.DataFrame), [DataFrame](#pandas.DataFrame), [DataFrame](#pandas.DataFrame)]</code> | Tuple\[pd.DataFrame, pd.DataFrame, pd.DataFrame]: Target time series with columns \['unique\_id', 'ds', 'y'], Exogenous time series with columns \['unique\_id', 'ds', 'y'], Static exogenous variables with columns \['unique\_id', 'ds'] and static variables. |

#### `M5.source_url`

```python theme={null}
source_url: str = 'https://github.com/Nixtla/m5-forecasts/raw/main/datasets/m5.zip'
```

## Evaluation class

### `M5Evaluation`

#### `M5Evaluation.aggregate_levels`

```python theme={null}
aggregate_levels(y_hat, categories=None)
```

Aggregates the 30\_490 bottom-level series into the 42\_840 series of the full M5 hierarchy.

**Parameters:**

| Name         | Type                                        | Description                                                      | Default           |
| ------------ | ------------------------------------------- | ---------------------------------------------------------------- | ----------------- |
| `y_hat`      | <code>[DataFrame](#pandas.DataFrame)</code> | Forecasts as wide pandas dataframe with columns \['unique\_id']. | *required*        |
| `categories` | <code>[DataFrame](#pandas.DataFrame)</code> | Categories of M5 dataset (not used). Defaults to None.           | <code>None</code> |

**Returns:**

| Type                                        | Description                                                                               |
| ------------------------------------------- | ----------------------------------------------------------------------------------------- |
| <code>[DataFrame](#pandas.DataFrame)</code> | pd.DataFrame: Aggregated forecasts as wide pandas dataframe with columns \['unique\_id']. |
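
As a rough check on the series counts involved in `aggregate_levels`, the number of series at each aggregation level can be derived from the sizes of the M5 hierarchy. The sketch below hard-codes those sizes (3 states, 10 stores, 3 categories, 7 departments, 3,049 items); they are assumptions taken from the M5 competition description, not values read from this library. The level names mirror the keys of `M5Evaluation.levels`.

```python
# Series per aggregation level, derived from the M5 hierarchy sizes.
# The sizes are assumptions taken from the M5 competition description.
n_states, n_stores, n_cats, n_depts, n_items = 3, 10, 3, 7, 3049

series_per_level = {
    "Level1": 1,                    # total
    "Level2": n_states,             # state_id
    "Level3": n_stores,             # store_id
    "Level4": n_cats,               # cat_id
    "Level5": n_depts,              # dept_id
    "Level6": n_states * n_cats,    # state_id x cat_id
    "Level7": n_states * n_depts,   # state_id x dept_id
    "Level8": n_stores * n_cats,    # store_id x cat_id
    "Level9": n_stores * n_depts,   # store_id x dept_id
    "Level10": n_items,             # item_id
    "Level11": n_states * n_items,  # state_id x item_id
    "Level12": n_items * n_stores,  # item_id x store_id (bottom level)
}

n_bottom = series_per_level["Level12"]
n_total = sum(series_per_level.values())
print(n_bottom, n_total)  # 30490 42840
```

The bottom level contributes 30,490 series, and summing over all twelve levels gives 42,840.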

#### `M5Evaluation.evaluate`

```python theme={null}
evaluate(directory, y_hat, validation=False)
```

Evaluates `y_hat` according to the M5 methodology.

**Parameters:**

| Name         | Type                                                                              | Description                                                                                                                                                                                                                   | Default            |
| ------------ | --------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ |
| `directory`  | <code>[str](#str)</code>                                                          | Directory where data will be downloaded.                                                                                                                                                                                      | *required*         |
| `validation` | <code>[bool](#bool)</code>                                                        | Whether to run the validation evaluation. If `False` (default), runs the test evaluation.                                                                                                                                     | <code>False</code> |
| `y_hat`      | <code>[Union](#typing.Union)\[[DataFrame](#pandas.DataFrame), [str](#str)]</code> | Forecasts as wide pandas dataframe with columns \['unique\_id'] and forecasts or benchmark url from [https://github.com/Nixtla/m5-forecasts/tree/main/forecasts](https://github.com/Nixtla/m5-forecasts/tree/main/forecasts). | *required*         |

**Returns:**

| Type                                        | Description                                                                          |
| ------------------------------------------- | ------------------------------------------------------------------------------------ |
| <code>[DataFrame](#pandas.DataFrame)</code> | pd.DataFrame: WRMSSE scores with the aggregation level (including `Total`) as index. |

Examples:

```python theme={null}
from datasetsforecast.m5 import M5Evaluation

m5_winner_url = 'https://github.com/Nixtla/m5-forecasts/raw/main/forecasts/0001 YJ_STU.zip'
winner_evaluation = M5Evaluation.evaluate('data', m5_winner_url)

m5_second_place_url = 'https://github.com/Nixtla/m5-forecasts/raw/main/forecasts/0002 Matthias.zip'
m5_second_place_forecasts = M5Evaluation.load_benchmark('data', m5_second_place_url)
second_place_evaluation = M5Evaluation.evaluate('data', m5_second_place_forecasts)
```
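
The M5 accuracy competition scores forecasts with the weighted root mean squared scaled error (WRMSSE). The following is a minimal sketch of that metric on toy data, not the code `evaluate` runs internally. It assumes the weights are given and sum to 1, and it skips details of the official procedure such as trimming leading zeros from the history and deriving weights from recent dollar sales. The function names and data are hypothetical.

```python
import math

def rmsse(history, actual, forecast):
    """Root mean squared scaled error for a single series.

    The mean squared forecast error is scaled by the mean squared
    one-step difference of the training history.
    """
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    sq_diffs = [(history[t] - history[t - 1]) ** 2 for t in range(1, len(history))]
    scale = sum(sq_diffs) / len(sq_diffs)
    return math.sqrt(mse / scale)

def wrmsse(series, weights):
    """Weighted sum of per-series RMSSE values (weights assumed to sum to 1)."""
    return sum(w * rmsse(*s) for s, w in zip(series, weights))

# Hypothetical toy data: (history, actual, forecast) for each series.
series = [
    ([1.0, 3.0, 2.0, 4.0], [3.0, 5.0], [4.0, 4.0]),
    ([0.0, 2.0, 2.0, 0.0], [1.0, 1.0], [1.0, 3.0]),
]
weights = [0.75, 0.25]  # made-up weights for illustration
score = wrmsse(series, weights)
print(round(score, 4))  # 0.6495
```

Lower is better: a score of 0 means every forecast matches the actual values exactly.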

#### `M5Evaluation.levels`

```python theme={null}
levels: dict = dict(Level1=['total'], Level2=['state_id'], Level3=['store_id'], Level4=['cat_id'], Level5=['dept_id'], Level6=['state_id', 'cat_id'], Level7=['state_id', 'dept_id'], Level8=['store_id', 'cat_id'], Level9=['store_id', 'dept_id'], Level10=['item_id'], Level11=['state_id', 'item_id'], Level12=['item_id', 'store_id'])
```
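
To make these grouping keys concrete, the sketch below sums a handful of made-up bottom-level forecasts by a subset of the levels. It only illustrates the grouping idea and is not the library's implementation of `aggregate_levels`; the rows, ids, and the `aggregate` helper are hypothetical.

```python
from collections import defaultdict

HORIZON = 2

# Hypothetical bottom-level forecasts: one row per (item_id, store_id) series.
bottom = [
    {"state_id": "CA", "store_id": "CA_1", "item_id": "A", "forecast": [1.0, 2.0]},
    {"state_id": "CA", "store_id": "CA_2", "item_id": "A", "forecast": [3.0, 1.0]},
    {"state_id": "TX", "store_id": "TX_1", "item_id": "A", "forecast": [2.0, 2.0]},
    {"state_id": "TX", "store_id": "TX_1", "item_id": "B", "forecast": [0.0, 4.0]},
]

# A subset of the grouping keys, in the same format as `M5Evaluation.levels`.
levels = {
    "Level1": ["total"],
    "Level2": ["state_id"],
    "Level10": ["item_id"],
    "Level12": ["item_id", "store_id"],
}

def aggregate(rows, keys):
    """Sum forecasts over all rows that share the same values for `keys`."""
    sums = defaultdict(lambda: [0.0] * HORIZON)
    for row in rows:
        # 'total' is not a column: every row falls into one single group.
        group = "Total" if keys == ["total"] else "_".join(row[k] for k in keys)
        sums[group] = [a + b for a, b in zip(sums[group], row["forecast"])]
    return dict(sums)

aggregated = {name: aggregate(bottom, keys) for name, keys in levels.items()}
print(aggregated["Level2"])  # {'CA': [4.0, 3.0], 'TX': [2.0, 6.0]}
```

Each level produces one aggregated series per distinct combination of its keys, so the coarser the keys, the fewer series the level contains.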

#### `M5Evaluation.load_benchmark`

```python theme={null}
load_benchmark(directory, source_url=None, validation=False)
```

Downloads and loads benchmark forecasts.

**Parameters:**

| Name         | Type                       | Description                                                                                                                                                                                         | Default            |
| ------------ | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ |
| `directory`  | <code>[str](#str)</code>   | Directory where data will be downloaded.                                                                                                                                                            | *required*         |
| `source_url` | <code>[str](#str)</code>   | Optional benchmark url obtained from [https://github.com/Nixtla/m5-forecasts/tree/master/forecasts](https://github.com/Nixtla/m5-forecasts/tree/master/forecasts). If `None` returns the M5 winner. | <code>None</code>  |
| `validation` | <code>[bool](#bool)</code> | Whether to return validation forecasts. If `False` (default), returns test forecasts.                                                                                                               | <code>False</code> |

**Returns:**

| Type                                   | Description                                            |
| -------------------------------------- | ------------------------------------------------------ |
| <code>[ndarray](#numpy.ndarray)</code> | np.ndarray: Numpy array of shape (n\_series, horizon). |

Example:

```python theme={null}
winner_benchmark = M5Evaluation.load_benchmark('data')
winner_evaluation = M5Evaluation.evaluate('data', winner_benchmark)
```

### URL-based evaluation

The `evaluate` method of
[`M5Evaluation`](https://Nixtla.github.io/datasetsforecast/m5.html#m5evaluation)
accepts the URL of a [submission to the M5
competition](https://github.com/Nixtla/m5-forecasts/tree/main/forecasts).

The reference scores that the on-the-fly evaluation is checked against come from the
[official
evaluation](https://github.com/Mcompetitions/M5-methods/blob/master/Scores%20and%20Ranks.xlsx).

```python theme={null}
from fastcore.test import test_close

from datasetsforecast.m5 import M5Evaluation

m5_winner_url = 'https://github.com/Nixtla/m5-forecasts/raw/main/forecasts/0001 YJ_STU.zip'
winner_evaluation = M5Evaluation.evaluate('data', m5_winner_url)
# Check that the score matches the official evaluation
test_close(winner_evaluation.loc['Total'].item(), 0.520, eps=1e-3)
winner_evaluation
```

### Pandas-based evaluation

The `evaluate` method can also receive a pandas DataFrame of forecasts.

```python theme={null}
m5_second_place_url = 'https://github.com/Nixtla/m5-forecasts/raw/main/forecasts/0002 Matthias.zip'
m5_second_place_forecasts = M5Evaluation.load_benchmark('data', m5_second_place_url)
second_place_evaluation = M5Evaluation.evaluate('data', m5_second_place_forecasts)
# Check that the score matches the official evaluation
test_close(second_place_evaluation.loc['Total'].item(), 0.528, eps=1e-3)
second_place_evaluation
```

If `source_url` is omitted, `load_benchmark` loads the winner's forecasts.

```python theme={null}
winner_benchmark = M5Evaluation.load_benchmark('data')
winner_evaluation = M5Evaluation.evaluate('data', winner_benchmark)
# Check that the score matches the official evaluation
test_close(winner_evaluation.loc['Total'].item(), 0.520, eps=1e-3)
winner_evaluation
```

### Validation evaluation

You can also evaluate the official validation set.

```python theme={null}
winner_benchmark_val = M5Evaluation.load_benchmark('data', validation=True)
winner_evaluation_val = M5Evaluation.evaluate('data', winner_benchmark_val, validation=True)
winner_evaluation_val
```

## Kaggle-Competition-M5 References

The evaluation metric of the Favorita Kaggle competition was the
normalized weighted root mean squared logarithmic error (NWRMSLE).
Perishable items have a score weight of 1.25; otherwise, the weight is
1.0.

$\mathrm{NWRMSLE} = \sqrt{\frac{\sum_{i=1}^{n} w_{i}\left(\log(\hat{y}_{i}+1) - \log(y_{i}+1)\right)^{2}}{\sum_{i=1}^{n} w_{i}}}$
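
The formula can be sketched in a few lines of plain Python. The function name and the sample values are hypothetical and serve only to show how the perishable weight of 1.25 enters the score.

```python
import math

def nwrmsle(y_true, y_pred, weights):
    """Normalized weighted root mean squared logarithmic error."""
    weighted_sq_err = sum(
        w * (math.log1p(p) - math.log1p(t)) ** 2
        for t, p, w in zip(y_true, y_pred, weights)
    )
    return math.sqrt(weighted_sq_err / sum(weights))

# Hypothetical unit sales; the second item is perishable (weight 1.25).
y_true = [0.0, 3.0, 7.0]
y_pred = [0.0, 1.0, 7.0]
weights = [1.0, 1.25, 1.0]
result = nwrmsle(y_true, y_pred, weights)
print(round(result, 4))  # 0.4299
```

Only the second item is mispredicted here, so the whole score comes from its weighted log error, normalized by the sum of the weights (3.25).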

|                                Kaggle Competition Forecasting Methods                                | 16D ahead NWRMSLE |
| :--------------------------------------------------------------------------------------------------: | :---------------: |
| [LGBM](https://www.kaggle.com/shixw125/1st-place-lgb-model-public-0-506-private-0-511/comments) \[1] |       0.5091      |
|                       [Seq2Seq WaveNet](https://arxiv.org/abs/1803.04037) \[2]                       |       0.5129      |

1. [Corporación Favorita. Corporación Favorita grocery sales
   forecasting. Kaggle Competition Leaderboard,
   2018](https://www.kaggle.com/c/favorita-grocery-sales-forecasting/leaderboard).
2. [Glib Kechyn, Lucius Yu, Yangguang Zang, and Svyatoslav Kechyn.
   Sales forecasting using wavenet within the framework of the Favorita
   Kaggle competition. Computing Research Repository, abs/1803.04037,
   2018](https://arxiv.org/abs/1803.04037).
