module utilsforecast.feature_engineering

Create exogenous regressors for your models

function fourier

fourier(
    df: DataFrame,
    freq: Union[str, int],
    season_length: int,
    k: int,
    h: int = 0,
    id_col: str = 'unique_id',
    time_col: str = 'ds'
) → Tuple[DataFrame, DataFrame]
Compute Fourier seasonal terms for training and forecasting.
Args:
  • df (pandas or polars DataFrame): DataFrame with ids, times and values for the exogenous regressors.
  • freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
  • season_length (int): Number of observations per unit of time (the seasonal period). Example: 24 for hourly data.
  • k (int): Maximum order of the Fourier terms.
  • h (int, optional): Forecast horizon. Defaults to 0.
  • id_col (str, optional): Column that identifies each series. Defaults to 'unique_id'.
  • time_col (str, optional): Column that identifies each timestep; its values can be timestamps or integers. Defaults to 'ds'.
Returns:
  • tuple[pandas or polars DataFrame, pandas or polars DataFrame]: A tuple containing the original DataFrame with the computed features and a DataFrame with the future values.
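
To make the roles of season_length, k and h concrete, here is a minimal standard-library sketch of what Fourier seasonal terms look like. It is an illustration only, not the library's implementation: the helper name fourier_terms, the column names and the assumption of an integer time index starting at 0 are all hypothetical.

```python
import math
from typing import Dict, List


def fourier_terms(t: List[int], season_length: int, k: int) -> Dict[str, List[float]]:
    """Return sine and cosine terms of orders 1..k for the integer time steps in t."""
    terms: Dict[str, List[float]] = {}
    for order in range(1, k + 1):
        # Angular frequency of the `order`-th harmonic of the seasonal period.
        angular = 2.0 * math.pi * order / season_length
        # Column names are an assumption made for this sketch.
        terms[f"sin{order}_{season_length}"] = [math.sin(angular * step) for step in t]
        terms[f"cos{order}_{season_length}"] = [math.cos(angular * step) for step in t]
    return terms


n_train, h = 14, 7  # two weeks of daily history, one week of horizon
# Terms for the observed period (the "training" part of the returned tuple).
train_terms = fourier_terms(list(range(n_train)), season_length=7, k=2)
# Terms for the next h steps (the "future" part), continuing the same time index.
future_terms = fourier_terms(list(range(n_train, n_train + h)), season_length=7, k=2)
```

Each order contributes one sine and one cosine column, so k=2 yields four features, and the future terms simply continue the same periodic pattern past the last observed step.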

function trend

trend(
    df: DataFrame,
    freq: Union[str, int],
    h: int = 0,
    id_col: str = 'unique_id',
    time_col: str = 'ds'
) → Tuple[DataFrame, DataFrame]
Add a trend column with consecutive integers for training and forecasting.
Args:
  • df (pandas or polars DataFrame): DataFrame with ids, times and values for the exogenous regressors.
  • freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
  • h (int, optional): Forecast horizon. Defaults to 0.
  • id_col (str, optional): Column that identifies each series. Defaults to 'unique_id'.
  • time_col (str, optional): Column that identifies each timestep; its values can be timestamps or integers. Defaults to 'ds'.
Returns:
  • tuple[pandas or polars DataFrame, pandas or polars DataFrame]: A tuple containing the original DataFrame with the computed features and a DataFrame with the future values.
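
The idea of a trend column that continues into the forecast horizon can be sketched without any dependencies. The helper trend_values is hypothetical, and starting the count at 1 for each series is an assumption of this sketch rather than documented behavior.

```python
from typing import Dict, List, Tuple


def trend_values(
    sizes: Dict[str, int], h: int = 0, start: int = 1
) -> Tuple[Dict[str, List[int]], Dict[str, List[int]]]:
    """Build consecutive integers per series for the history and for the next h steps."""
    train: Dict[str, List[int]] = {}
    future: Dict[str, List[int]] = {}
    for uid, n_obs in sizes.items():
        # History: start, start + 1, ..., one value per observed timestep.
        train[uid] = list(range(start, start + n_obs))
        # Horizon: keep counting where the history stopped.
        future[uid] = list(range(start + n_obs, start + n_obs + h))
    return train, future


# Two series with 3 and 2 observations, forecasting 2 steps ahead.
train_trend, future_trend = trend_values({"a": 3, "b": 2}, h=2)
```

Because the future values pick up where each series ends, a model fitted on the training trend can extrapolate it over the horizon.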

function time_features

time_features(
    df: DataFrame,
    freq: Union[str, int],
    features: List[Union[str, Callable]],
    h: int = 0,
    id_col: str = 'unique_id',
    time_col: str = 'ds'
) → Tuple[DataFrame, DataFrame]
Compute timestamp-based features for training and forecasting.
Args:
  • df (pandas or polars DataFrame): DataFrame with ids, times and values for the exogenous regressors.
  • freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
  • features (list of str or callable): Features to compute. Can be string aliases of timestamp attributes or functions to apply to the times.
  • h (int, optional): Forecast horizon. Defaults to 0.
  • id_col (str, optional): Column that identifies each series. Defaults to 'unique_id'.
  • time_col (str, optional): Column that identifies each timestep; its values can be timestamps or integers. Defaults to 'ds'.
Returns:
  • tuple[pandas or polars DataFrame, pandas or polars DataFrame]: A tuple containing the original DataFrame with the computed features and a DataFrame with the future values.
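
The distinction between string aliases and callables in features can be illustrated with plain datetime objects. This is a sketch under assumptions, not the library's code: compute_time_features and is_weekend are hypothetical helpers, and naming a callable's column after its __name__ is a choice made here for illustration.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Union

Feature = Union[str, Callable[[datetime], int]]


def is_weekend(ts: datetime) -> int:
    """Custom feature: 1 on Saturday or Sunday, 0 otherwise."""
    return int(ts.weekday() >= 5)


def compute_time_features(times: List[datetime], features: List[Feature]) -> Dict[str, List[int]]:
    """Evaluate each feature on every timestamp."""
    out: Dict[str, List[int]] = {}
    for feat in features:
        if callable(feat):
            # Callables are applied to each timestamp directly.
            out[feat.__name__] = [feat(ts) for ts in times]
        else:
            # Strings are looked up as attributes of the timestamp (e.g. "day").
            out[feat] = [getattr(ts, feat) for ts in times]
    return out


start = datetime(2024, 1, 1)  # a Monday
train_times = [start + timedelta(days=i) for i in range(5)]
# Daily frequency, horizon h = 2: the two days after the last observation.
future_times = [train_times[-1] + timedelta(days=i) for i in range(1, 3)]

train_feats = compute_time_features(train_times, ["day", "month", is_weekend])
future_feats = compute_time_features(future_times, ["day", "month", is_weekend])
```

Since these features depend only on the timestamps, they can be computed for the future dates as soon as the horizon and frequency are known.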

function future_exog_to_historic

future_exog_to_historic(
    df: DataFrame,
    freq: Union[str, int],
    features: List[str],
    h: int = 0,
    id_col: str = 'unique_id',
    time_col: str = 'ds'
) → Tuple[DataFrame, DataFrame]
Turn future exogenous features into historic ones by shifting them h steps.
Args:
  • df (pandas or polars DataFrame): DataFrame with ids, times and values for the exogenous regressors.
  • freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
  • features (list of str): Features to be converted into historic.
  • h (int, optional): Forecast horizon. Defaults to 0.
  • id_col (str, optional): Column that identifies each series. Defaults to 'unique_id'.
  • time_col (str, optional): Column that identifies each timestep; its values can be timestamps or integers. Defaults to 'ds'.
Returns:
  • tuple[pandas or polars DataFrame, pandas or polars DataFrame]: A tuple containing the original DataFrame with the computed features and a DataFrame with the future values.
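
One way to read "shifting them h steps" is as a lag of h within a single series: the value used at step t is the one observed at step t - h, so the next h steps only need values that are already known. The sketch below shows that reading; shift_to_historic is a hypothetical helper, and the handling of the first h rows (filled with None) is an assumption, not documented behavior.

```python
from typing import List, Optional, Tuple


def shift_to_historic(
    values: List[float], h: int
) -> Tuple[List[Optional[float]], List[float]]:
    """Lag one series by h steps and return (lagged history, values for the next h steps)."""
    if h <= 0:
        return list(values), []
    if h > len(values):
        raise ValueError("h must not exceed the number of observations")
    # The first h steps have no observation h steps earlier.
    lagged: List[Optional[float]] = [None] * h
    lagged.extend(values[: len(values) - h])
    # For the next h steps, the lagged values are the last h observations.
    future = list(values[-h:])
    return lagged, future


price = [10.0, 11.0, 12.0, 13.0, 14.0]
lagged_price, future_price = shift_to_historic(price, h=2)
```

With this reading, a feature that would otherwise require future knowledge can be used at forecast time, at the cost of losing the first h rows of history.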

function pipeline

pipeline(
    df: DataFrame,
    features: List[Callable],
    freq: Union[str, int],
    h: int = 0,
    id_col: str = 'unique_id',
    time_col: str = 'ds'
) → Tuple[DataFrame, DataFrame]
Compute several features for training and forecasting.
Args:
  • df (pandas or polars DataFrame): DataFrame with ids, times and values for the exogenous regressors.
  • features (list of callable): List of features to compute. Each callable must take only df, freq, h, id_col and time_col (any other arguments must be fixed beforehand).
  • freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
  • h (int, optional): Forecast horizon. Defaults to 0.
  • id_col (str, optional): Column that identifies each series. Defaults to 'unique_id'.
  • time_col (str, optional): Column that identifies each timestep; its values can be timestamps or integers. Defaults to 'ds'.
Returns:
  • tuple[pandas or polars DataFrame, pandas or polars DataFrame]: A tuple containing the original DataFrame with the computed features and a DataFrame with the future values.
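
The requirement that extra arguments be fixed is commonly met with functools.partial. The sketch below demonstrates the pattern on a stand-in feature function over plain dictionaries; scaled_step and its scale argument are hypothetical and exist only to show the calling convention, not how pipeline itself works.

```python
from functools import partial
from typing import Any, Dict, List, Tuple

Rows = List[Dict[str, Any]]


def scaled_step(
    df: Rows,
    freq: int,
    h: int = 0,
    id_col: str = "unique_id",
    time_col: str = "ds",
    *,
    scale: float,
) -> Tuple[Rows, Rows]:
    """Stand-in feature for a single series with integer times: scaled = scale * time."""
    train = [{**row, "scaled": scale * row[time_col]} for row in df]
    last = df[-1]
    future: Rows = []
    for step in range(1, h + 1):
        t = last[time_col] + step * freq
        future.append({id_col: last[id_col], time_col: t, "scaled": scale * t})
    return train, future


# Fix the extra argument so only df, freq, h, id_col and time_col remain.
feature = partial(scaled_step, scale=0.5)

rows = [{"unique_id": "a", "ds": t} for t in range(3)]
train, future = feature(rows, freq=1, h=2, id_col="unique_id", time_col="ds")
```

Judging by the signatures in this module, the same pattern would presumably apply to the functions documented above, for example partial(fourier, season_length=7, k=2) or partial(time_features, features=['day']), before passing them in the features list.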

This file was automatically generated via lazydocs.