`module` `utilsforecast.preprocessing`

Utilities for processing data before training/analysis

`function` `id_time_grid`

id_time_grid(
    df: ~DFType,
    freq: Union[str, int],
    start: Union[str, int, date, datetime] = 'per_serie',
    end: Union[str, int, date, datetime] = 'global',
    id_col: str = 'unique_id',
    time_col: str = 'ds'
) → ~DFType

Generate all expected combinations of ids and times.

Args:

df (pandas or polars DataFrame): Input data
freq (str or int): Series’ frequency
start (str, int, date or datetime, optional): Initial timestamp for the series. Defaults to ‘per_serie’.
  * ‘per_serie’ uses each series’ first timestamp
  * ‘global’ uses the first timestamp seen in the data
  * Can also be a specific timestamp or integer, e.g. ‘2000-01-01’, 2000 or datetime(2000, 1, 1)
end (str, int, date or datetime, optional): Final timestamp for the series. Defaults to ‘global’.
  * ‘per_serie’ uses each series’ last timestamp
  * ‘global’ uses the last timestamp seen in the data
  * Can also be a specific timestamp or integer, e.g. ‘2000-01-01’, 2000 or datetime(2000, 1, 1)
id_col (str, optional): Column that identifies each series. Defaults to ‘unique_id’.
time_col (str, optional): Column that identifies each timestamp. Defaults to ‘ds’.

Returns:

pandas or polars DataFrame: Dataframe with expected ids and times.
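
To make the `start` and `end` options concrete, here is a minimal pure-Python sketch of the grid they imply, restricted to integer timestamps and an integer `freq`. The helper `expected_grid` and its dictionary input are hypothetical names used only for illustration; this is not the library's implementation, which operates on pandas or polars DataFrames.

```python
from typing import Dict, List, Tuple, Union

Bound = Union[str, int]


def expected_grid(
    times_by_id: Dict[str, List[int]],
    freq: int,
    start: Bound = "per_serie",
    end: Bound = "global",
) -> List[Tuple[str, int]]:
    """Hypothetical stand-in: list every (id, time) pair the options imply."""
    all_times = [t for times in times_by_id.values() for t in times]

    def resolve(bound: Bound, own: int, overall: int) -> int:
        if bound == "per_serie":
            return own       # this series' own first/last timestamp
        if bound == "global":
            return overall   # first/last timestamp across all series
        return int(bound)    # a specific integer timestamp

    grid: List[Tuple[str, int]] = []
    for uid, times in times_by_id.items():
        first = resolve(start, min(times), min(all_times))
        last = resolve(end, max(times), max(all_times))
        grid.extend((uid, t) for t in range(first, last + 1, freq))
    return grid


# Series "a" is missing t=3; series "b" starts later and ends earlier.
observed = {"a": [1, 2, 4], "b": [2, 3]}
default_grid = expected_grid(observed, freq=1)                 # per-series start, global end
global_grid = expected_grid(observed, freq=1, start="global")  # every series starts at t=1
```

With the defaults, series "b" keeps its own start (2) but is extended to the global end (4). With `start="global"` it is also extended back to 1.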

`function` `fill_gaps`

fill_gaps(
    df: ~DFType,
    freq: Union[str, int],
    start: Union[str, int, date, datetime] = 'per_serie',
    end: Union[str, int, date, datetime] = 'global',
    id_col: str = 'unique_id',
    time_col: str = 'ds'
) → ~DFType

Enforce start and end datetimes for the dataframe.

Args:

df (pandas or polars DataFrame): Input data
freq (str or int): Series’ frequency
start (str, int, date or datetime, optional): Initial timestamp for the series. Defaults to ‘per_serie’.
  * ‘per_serie’ uses each series’ first timestamp
  * ‘global’ uses the first timestamp seen in the data
  * Can also be a specific timestamp or integer, e.g. ‘2000-01-01’, 2000 or datetime(2000, 1, 1)
end (str, int, date or datetime, optional): Final timestamp for the series. Defaults to ‘global’.
  * ‘per_serie’ uses each series’ last timestamp
  * ‘global’ uses the last timestamp seen in the data
  * Can also be a specific timestamp or integer, e.g. ‘2000-01-01’, 2000 or datetime(2000, 1, 1)
id_col (str, optional): Column that identifies each series. Defaults to ‘unique_id’.
time_col (str, optional): Column that identifies each timestamp. Defaults to ‘ds’.

Returns:

pandas or polars DataFrame: Dataframe with gaps filled.
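
The effect of gap filling under the default options (`start='per_serie'`, `end='global'`) can be sketched in plain Python as follows. The helper `fill_gaps_sketch` and the tuple-based row format are hypothetical, and the sketch assumes integer timestamps and that inserted rows carry a missing value (`None` here) in the remaining columns. It is an illustration of the idea, not the library's code.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, int, Optional[float]]  # (unique_id, ds, y)


def fill_gaps_sketch(rows: List[Row], freq: int) -> List[Row]:
    """Hypothetical stand-in for the defaults: per-series start, global end."""
    # Index the observed values by (id, time) for quick lookup.
    values: Dict[Tuple[str, int], Optional[float]] = {
        (uid, t): y for uid, t, y in rows
    }
    # 'global' end: the last timestamp seen anywhere in the data.
    global_end = max(t for _, t, _ in rows)
    # 'per_serie' start: each series' own first timestamp.
    first_seen: Dict[str, int] = {}
    for uid, t, _ in rows:
        first_seen[uid] = min(t, first_seen.get(uid, t))

    filled: List[Row] = []
    for uid, first in first_seen.items():
        for t in range(first, global_end + 1, freq):
            # Timestamps absent from the input get a missing value.
            filled.append((uid, t, values.get((uid, t))))
    return filled


rows = [
    ("a", 1, 10.0), ("a", 2, 11.0), ("a", 4, 13.0),  # t=3 is missing
    ("b", 2, 20.0), ("b", 3, 21.0),                  # ends before the global end
]
filled = fill_gaps_sketch(rows, freq=1)
```

Here series "a" gains a row at t=3 and series "b" gains a row at t=4, both with a missing `y`, while the observed rows are left unchanged.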
