module datasetsforecast.m5


class M5

M5(source_url: str = 'https://github.com/Nixtla/m5-forecasts/raw/main/datasets/m5.zip')

method __init__

__init__(
    source_url: str = 'https://github.com/Nixtla/m5-forecasts/raw/main/datasets/m5.zip'
) → None

method download

download(directory: str) → None
Downloads the M5 Competition dataset. Args:
  • directory (str): Directory where the dataset will be downloaded.
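
Example (a minimal sketch; 'data' is an arbitrary target directory):
M5().download('data')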

method load

load(
    directory: str,
    cache: bool = True
) → Tuple[DataFrame, DataFrame, DataFrame]
Downloads and loads M5 data. Args:
  • directory (str): Directory where data will be downloaded.
  • cache (bool): If True, saves the processed dataset to disk and loads it from that cache on subsequent calls.
Returns: Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]: Target time series with columns ['unique_id', 'ds', 'y'], exogenous time series with columns ['unique_id', 'ds'] and the exogenous variables, and static exogenous variables with column ['unique_id'] and the static variables.
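
Example (a minimal sketch; the unpacking order follows the return description above):
Y_df, X_df, S_df = M5().load('data')  # targets, exogenous, static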

class M5Evaluation


method aggregate_levels

aggregate_levels(y_hat: DataFrame, categories: DataFrame = None) → DataFrame
Aggregates the 30,490 bottom-level series to obtain the 42,840 series of all aggregation levels. Args:
  • y_hat (pd.DataFrame): Forecasts as a wide pandas dataframe with column ['unique_id'] and one column per forecast step.
  • categories (pd.DataFrame, optional): Categories of the M5 dataset (not used). Defaults to None.
Returns:
  • pd.DataFrame: Aggregated forecasts as a wide pandas dataframe with column ['unique_id'] and one column per forecast step.
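
Example (a sketch; y_hat is assumed to hold bottom-level forecasts, and y_hat_levels is a hypothetical result name):
y_hat_levels = M5Evaluation.aggregate_levels(y_hat)  # y_hat_levels is a hypothetical name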

method evaluate

evaluate(
    directory: str,
    y_hat: Union[DataFrame, str],
    validation: bool = False
) → DataFrame
Evaluates y_hat according to the M5 methodology. Args:
  • directory (str): Directory where data will be downloaded.
  • y_hat (Union[pd.DataFrame, str]): Forecasts as a wide pandas dataframe with column ['unique_id'] and forecast columns, or a benchmark url from https://github.com/Nixtla/m5-forecasts/tree/main/forecasts.
  • validation (bool): Whether to perform validation evaluation. Default False, returning the test evaluation.
Returns:
  • pd.DataFrame: DataFrame with the WRMSSE score per aggregation level, with the level group as index.
Examples:
m5_winner_url = 'https://github.com/Nixtla/m5-forecasts/raw/main/forecasts/0001 YJ_STU.zip'
winner_evaluation = M5Evaluation.evaluate('data', m5_winner_url)

m5_second_place_url = 'https://github.com/Nixtla/m5-forecasts/raw/main/forecasts/0002 Matthias.zip'
m5_second_place_forecasts = M5Evaluation.load_benchmark('data', m5_second_place_url)
second_place_evaluation = M5Evaluation.evaluate('data', m5_second_place_forecasts)

method load_benchmark

load_benchmark(
    directory: str,
    source_url: Optional[str] = None,
    validation: bool = False
) → ndarray
Downloads and loads benchmark forecasts. Args:
  • directory (str): Directory where data will be downloaded.
  • source_url (str, optional): Optional benchmark url obtained from https://github.com/Nixtla/m5-forecasts/tree/master/forecasts. If None, returns the M5 winner's forecasts.
  • validation (bool): Whether to return validation forecasts. Default False, returning test forecasts.
Returns:
  • np.ndarray: Numpy array of shape (n_series, horizon).
Example:
winner_benchmark = M5Evaluation.load_benchmark('data')
winner_evaluation = M5Evaluation.evaluate('data', winner_benchmark)