module datasetsforecast.long_horizon


class ETTh1

The ETTh1 dataset monitors an electricity transformer in a region of a province of China. It records oil temperature and several load variants (such as high useful load and high useless load) from July 2016 to July 2018 at an hourly frequency.

method __init__

__init__(
    freq: str = 'H',
    name: str = 'ETTh1',
    n_ts: int = 1,
    test_size: int = 11520,
    val_size: int = 11520,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None
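The relationship between `test_size` and `horizons` can be illustrated with a small sketch. It assumes a rolling evaluation in which a forecast window of length `horizon` slides over the test split one step at a time; the signature does not state this protocol, and `count_windows` is a hypothetical helper, not part of the module.

```python
# Hypothetical helper (not part of datasetsforecast): counts the forecast
# windows of length `horizon` that fit in a test split when the window
# start advances by `step` observations.
def count_windows(test_size: int, horizon: int, step: int = 1) -> int:
    if horizon > test_size:
        return 0
    return (test_size - horizon) // step + 1


test_size = 11520                 # ETTh1 default test_size
horizons = (96, 192, 336, 720)    # ETTh1 default horizons

# Assumption: stride-1 rolling windows over the test split.
windows = {h: count_windows(test_size, h) for h in horizons}
for h, n in windows.items():
    print(f"horizon={h}: {n} windows")
```

Longer horizons leave fewer complete windows, so metrics for different horizons are averaged over different numbers of forecasts.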

class ETTh2

The ETTh2 dataset monitors an electricity transformer in a region of a province of China. It records oil temperature and several load variants (such as high useful load and high useless load) from July 2016 to July 2018 at an hourly frequency.

method __init__

__init__(
    freq: str = 'H',
    name: str = 'ETTh2',
    n_ts: int = 1,
    test_size: int = 11520,
    val_size: int = 11520,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None

class ETTm1

The ETTm1 dataset monitors an electricity transformer in a region of a province of China. It records oil temperature and several load variants (such as high useful load and high useless load) from July 2016 to July 2018 at a fifteen-minute frequency.

method __init__

__init__(
    freq: str = '15T',
    name: str = 'ETTm1',
    n_ts: int = 7,
    test_size: int = 11520,
    val_size: int = 11520,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None

class ETTm2

The ETTm2 dataset monitors an electricity transformer in a region of a province of China. It records oil temperature and several load variants (such as high useful load and high useless load) from July 2016 to July 2018 at a fifteen-minute frequency. Reference: Zhou, et al. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. AAAI 2021. https://arxiv.org/abs/2012.07436

method __init__

__init__(
    freq: str = '15T',
    name: str = 'ETTm2',
    n_ts: int = 7,
    test_size: int = 11520,
    val_size: int = 11520,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None

class ECL

The Electricity dataset reports the fifteen-minute electricity consumption (kWh) of 321 customers from 2012 to 2014. For comparability, it is aggregated hourly. Reference: Li, S., et al. Enhancing the locality and breaking the memory bottleneck of Transformer on time series forecasting. NeurIPS 2019. http://arxiv.org/abs/1907.00235.
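A minimal sketch of the hourly aggregation mentioned in the description, using only the standard library. The readings are made up, and summing the four fifteen-minute values of each hour is an assumption; the description does not say whether consumption is summed or averaged.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Made-up readings for one customer: eight 15-minute intervals (two hours).
start = datetime(2012, 1, 1, 0, 0)
readings = [(start + timedelta(minutes=15 * i), 1.0 + 0.5 * i) for i in range(8)]

# Group each reading under the hour it falls in.
hourly = defaultdict(float)
for ts, kwh in readings:
    hour = ts.replace(minute=0, second=0, microsecond=0)
    hourly[hour] += kwh  # assumption: consumption is summed, not averaged

for hour in sorted(hourly):
    print(hour.isoformat(), hourly[hour])
```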

method __init__

__init__(
    freq: str = '15T',
    name: str = 'ECL',
    n_ts: int = 321,
    test_size: int = 5260,
    val_size: int = 2632,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None

class Exchange

The Exchange dataset is a collection of daily exchange rates of eight countries relative to the US dollar. The countries include Australia, UK, Canada, Switzerland, China, Japan, New Zealand and Singapore from 1990 to 2016. Reference: Lai, G., Chang, W., Yang, Y., and Liu, H. Modeling Long and Short-Term Temporal Patterns with Deep Neural Networks. SIGIR 2018. http://arxiv.org/abs/1703.07015.

method __init__

__init__(
    freq: str = 'D',
    name: str = 'Exchange',
    n_ts: int = 8,
    test_size: int = 1517,
    val_size: int = 760,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None

class TrafficL

This large Traffic dataset was collected by the California Department of Transportation. It reports hourly road occupancy rates from 862 sensors, from January 2015 to December 2016. Reference: Lai, G., Chang, W., Yang, Y., and Liu, H. Modeling Long and Short-Term Temporal Patterns with Deep Neural Networks. SIGIR 2018. http://arxiv.org/abs/1703.07015. Wu, H., Xu, J., Wang, J., and Long, M. Autoformer: Decomposition Transformers with auto-correlation for long-term series forecasting. NeurIPS 2021. https://arxiv.org/abs/2106.13008.

method __init__

__init__(
    freq: str = 'H',
    name: str = 'traffic',
    n_ts: int = 862,
    test_size: int = 3508,
    val_size: int = 1756,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None

class ILI

This dataset reports weekly counts of influenza-like illness (ILI) patients recorded by the Centers for Disease Control and Prevention of the United States from 2002 to 2021. It is measured as the ratio of ILI patients to the total number of patients in the week. Reference: Wu, H., Xu, J., Wang, J., and Long, M. Autoformer: Decomposition Transformers with auto-correlation for long-term series forecasting. NeurIPS 2021. https://arxiv.org/abs/2106.13008.

method __init__

__init__(
    freq: str = 'W',
    name: str = 'ili',
    n_ts: int = 7,
    test_size: int = 193,
    val_size: int = 97,
    horizons: Tuple[int] = (24, 36, 48, 60)
) → None

class Weather

This Weather dataset contains 21 meteorological measurements recorded every 10 minutes throughout 2020 at the Weather Station of the Max Planck Biogeochemistry Institute in Jena, Germany. Reference: Wu, H., Xu, J., Wang, J., and Long, M. Autoformer: Decomposition Transformers with auto-correlation for long-term series forecasting. NeurIPS 2021. https://arxiv.org/abs/2106.13008.

method __init__

__init__(
    freq: str = '10M',
    name: str = 'weather',
    n_ts: int = 21,
    test_size: int = 10539,
    val_size: int = 5270,
    horizons: Tuple[int] = (96, 192, 336, 720)
) → None
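To get a feel for the default split sizes, the sketch below converts `test_size` and `val_size` into calendar spans. It assumes the ten-minute sampling interval stated in the description and a gap-free series; both are assumptions of this illustration, not guarantees of the class.

```python
from datetime import timedelta

step = timedelta(minutes=10)  # sampling interval from the dataset description
test_size = 10539             # Weather default test_size
val_size = 5270               # Weather default val_size

# Assumption: observations are evenly spaced with no gaps.
test_span = test_size * step
val_span = val_size * step

print("test split covers", test_span)
print("validation split covers", val_span)
```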

class LongHorizon

This wrapper class provides utilities to download and wrangle the following long-horizon datasets: ETT, ECL, Exchange, Traffic, ILI and Weather.
  • Each dataset is normalized with the mean and standard deviation of its train split.
  • Datasets are partitioned into train, validation and test splits.
  • All datasets use 70%, 10%, and 20% of observations for train, validation, and test, except ETT, which uses 20% for validation.
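The partitioning and normalization described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `split_and_scale` is a hypothetical helper, the splits are taken chronologically from the end of the series, and the population standard deviation is used.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def split_and_scale(
    values: Sequence[float], val_size: int, test_size: int
) -> Tuple[List[float], List[float], List[float]]:
    """Hypothetical helper: chronological split, then z-score with train statistics."""
    n_train = len(values) - val_size - test_size
    if n_train < 1:
        raise ValueError("val_size + test_size leave no training observations")
    train = values[:n_train]
    mu = mean(train)
    sigma = pstdev(train)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("training split is constant; cannot standardize")
    # Every split is scaled with the train statistics only.
    scaled = [(v - mu) / sigma for v in values]
    val_end = n_train + val_size
    return scaled[:n_train], scaled[n_train:val_end], scaled[val_end:]


# Ten made-up observations split 70% / 10% / 20%.
series = [float(v) for v in range(1, 11)]
train, val, test = split_and_scale(series, val_size=1, test_size=2)
print(train, val, test)
```

Computing the mean and standard deviation on the train split alone keeps information from the validation and test periods out of the scaling step.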

method __init__

__init__(
    source_url: str = 'https://nhits-experiments.s3.amazonaws.com/datasets.zip'
) → None

method download

download(directory: str) → None
Downloads the long-horizon datasets archive. Args:
  • directory (str): Directory path to download dataset.

method load

load(
    directory: str,
    group: str,
    cache: bool = True
) → Tuple[DataFrame, Optional[DataFrame], Optional[DataFrame]]
Downloads and loads long-horizon forecasting benchmark datasets. Args:
  • directory (str): Directory where data will be downloaded.
  • group (str): Group name. Allowed groups: 'ETTh1', 'ETTh2', 'ETTm1', 'ETTm2', 'ECL', 'Exchange', 'Traffic', 'Weather', 'ILI'.
  • cache (bool): If True saves and loads
Returns: Tuple[pd.DataFrame, Optional[pd.DataFrame], Optional[pd.DataFrame]]: target time series with columns ['unique_id', 'ds', 'y']; exogenous time series with columns ['unique_id', 'ds', 'y']; static exogenous variables with columns ['unique_id', 'ds'] and static variables.
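A hedged usage sketch for `load`. The wrapper `load_target` and the constant `ALLOWED_GROUPS` are hypothetical names introduced here; the group names are copied from the list in the Args section. The sketch also assumes that `load` can be called on the class without creating an instance, which the signature above does not confirm.

```python
ALLOWED_GROUPS = (
    "ETTh1", "ETTh2", "ETTm1", "ETTm2",
    "ECL", "Exchange", "Traffic", "Weather", "ILI",
)  # copied from the documented list of allowed groups


def load_target(directory: str, group: str):
    """Hypothetical wrapper: validate the group name, then return the target frame."""
    if group not in ALLOWED_GROUPS:
        raise ValueError(f"unknown group {group!r}; expected one of {ALLOWED_GROUPS}")
    # Imported lazily so the check above works without the package installed.
    from datasetsforecast.long_horizon import LongHorizon

    # Assumption: load is callable on the class itself. The first call is
    # expected to download the data into `directory`.
    y_df, x_df, s_df = LongHorizon.load(directory=directory, group=group)
    return y_df


# Example call (not executed here because it triggers a download):
# y_df = load_target("./data", "ETTm2")
```

Checking the group name before the import keeps a typo from starting a download.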