fourier

 fourier (df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame],
          freq:str, season_length:int, k:int, h:int=0,
          id_col:str='unique_id', time_col:str='ds')

Compute Fourier seasonal terms for training and forecasting.

| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | str | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| season_length | int | | Number of observations per unit of time. For example, 24 for hourly data. |
| k | int | | Maximum order of the Fourier terms. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
| Returns | Tuple | | Original DataFrame with the computed features, and a DataFrame with the same features for the next h timestamps of each series. |
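import pandas as pd

from utilsforecast.data import generate_series
from utilsforecast.feature_engineering import fourier

# five synthetic daily series that all end on the same date
series = generate_series(5, equal_ends=True)
# k=2 adds sin/cos terms of orders 1 and 2 for a weekly season; h=1 also returns the next step
transformed_df, future_df = fourier(series, freq='D', season_length=7, k=2, h=1)
transformed_df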
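| | unique_id | ds | y | sin1_7 | sin2_7 | cos1_7 | cos2_7 |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | -0.974927 | 0.433894 | -0.222526 | -0.900964 |
| 1 | 0 | 2000-10-06 | 1.423626 | -0.781835 | -0.974926 | 0.623486 | -0.222531 |
| 2 | 0 | 2000-10-07 | 2.311782 | -0.000005 | -0.000009 | 1.000000 | 1.000000 |
| 3 | 0 | 2000-10-08 | 3.192191 | 0.781829 | 0.974930 | 0.623493 | -0.222512 |
| 4 | 0 | 2000-10-09 | 4.148767 | 0.974929 | -0.433877 | -0.222517 | -0.900972 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | -0.974927 | 0.433888 | -0.222523 | -0.900967 |
| 1097 | 4 | 2001-05-11 | 5.178157 | -0.781823 | -0.974934 | 0.623500 | -0.222495 |
| 1098 | 4 | 2001-05-12 | 6.133142 | -0.000002 | -0.000003 | 1.000000 | 1.000000 |
| 1099 | 4 | 2001-05-13 | 0.403709 | 0.781840 | 0.974922 | 0.623479 | -0.222548 |
| 1100 | 4 | 2001-05-14 | 1.081779 | 0.974928 | -0.433882 | -0.222520 | -0.900970 |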
future_df
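| | unique_id | ds | sin1_7 | sin2_7 | cos1_7 | cos2_7 |
|---|---|---|---|---|---|---|
| 0 | 0 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 1 | 1 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 2 | 2 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 3 | 3 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 4 | 4 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |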
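
As a rough illustration of what the `sin{j}_{season_length}` and `cos{j}_{season_length}` columns contain, the sketch below evaluates sine and cosine at the angle 2π·j·t/season_length for orders j = 1..k. The helper `fourier_terms` is hypothetical and is not the library's implementation. It assumes the time index t is the integer step that the `trend` example reports (152 for 2000-10-05); under that assumption the values agree with the first row of the table above to about four decimals.

```python
import numpy as np

def fourier_terms(t, season_length, k):
    """Sketch: sin/cos pairs of orders 1..k evaluated at integer time steps t."""
    t = np.asarray(t, dtype=np.float64)
    terms = {}
    for order in range(1, k + 1):
        # one full cycle every season_length / order steps
        angle = 2 * np.pi * order * t / season_length
        terms[f'sin{order}_{season_length}'] = np.sin(angle)
        terms[f'cos{order}_{season_length}'] = np.cos(angle)
    return terms

# assumption: step 152 corresponds to 2000-10-05 in the example data
terms = fourier_terms([152], season_length=7, k=2)
```

Higher orders oscillate faster, which is why a larger `k` lets a model fit sharper seasonal shapes at the cost of more columns.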

trend

 trend (df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame],
        freq:str, h:int=0, id_col:str='unique_id', time_col:str='ds')

Add a trend column with consecutive integers for training and forecasting

| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | str | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
| Returns | Tuple | | Original DataFrame with the computed features, and a DataFrame with the same features for the next h timestamps of each series. |
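from utilsforecast.feature_engineering import trend

series = generate_series(5, equal_ends=True)
# h=1 also returns the trend value for the step right after each series ends
transformed_df, future_df = trend(series, freq='D', h=1)
transformed_df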
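| | unique_id | ds | y | trend |
|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | 152.0 |
| 1 | 0 | 2000-10-06 | 1.423626 | 153.0 |
| 2 | 0 | 2000-10-07 | 2.311782 | 154.0 |
| 3 | 0 | 2000-10-08 | 3.192191 | 155.0 |
| 4 | 0 | 2000-10-09 | 4.148767 | 156.0 |
| ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | 369.0 |
| 1097 | 4 | 2001-05-11 | 5.178157 | 370.0 |
| 1098 | 4 | 2001-05-12 | 6.133142 | 371.0 |
| 1099 | 4 | 2001-05-13 | 0.403709 | 372.0 |
| 1100 | 4 | 2001-05-14 | 1.081779 | 373.0 |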
future_df
| | unique_id | ds | trend |
|---|---|---|---|
| 0 | 0 | 2001-05-15 | 374.0 |
| 1 | 1 | 2001-05-15 | 374.0 |
| 2 | 2 | 2001-05-15 | 374.0 |
| 3 | 3 | 2001-05-15 | 374.0 |
| 4 | 4 | 2001-05-15 | 374.0 |
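
In the output above, series 0 starts at 152.0 while every series gets 374.0 for the future step, which suggests the counter is aligned on the shared end date. The sketch below builds such a column by hand with pandas on a toy frame. The helper `add_trend` and the toy data are hypothetical, and the alignment rule is an assumption read off the example output; the library's own logic may differ.

```python
import pandas as pd

# toy data: series 'a' has 3 rows and series 'b' has 2, both ending on the same date
df = pd.DataFrame({
    'unique_id': ['a', 'a', 'a', 'b', 'b'],
    'ds': pd.to_datetime(
        ['2001-05-12', '2001-05-13', '2001-05-14', '2001-05-13', '2001-05-14']
    ),
})

def add_trend(df, id_col='unique_id', time_col='ds'):
    """Hypothetical helper: consecutive integers per series, aligned on the last row."""
    out = df.sort_values([id_col, time_col]).reset_index(drop=True)
    sizes = out.groupby(id_col)[time_col].transform('count')
    position = out.groupby(id_col).cumcount() + 1  # 1, 2, ... within each series
    # shift shorter series so that all series end at the same value
    out['trend'] = (position + sizes.max() - sizes).astype('float64')
    return out

with_trend = add_trend(df)
```

Here both toy series end at 3.0, so the next step would receive 4.0 for either id.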

time_features

 time_features (df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame],
                freq:str, features:List[Union[str,Callable]], h:int=0,
                id_col:str='unique_id', time_col:str='ds')

Compute timestamp-based features for training and forecasting

| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | str | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| features | List | | Features to compute. Can be string aliases of timestamp attributes or functions to apply to the times. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
| Returns | Tuple | | Original DataFrame with the computed features, and a DataFrame with the same features for the next h timestamps of each series. |
from utilsforecast.feature_engineering import time_features

transformed_df, future_df = time_features(series, freq='D', features=['month', 'day'], h=1)
transformed_df
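| | unique_id | ds | y | month | day |
|---|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | 10 | 5 |
| 1 | 0 | 2000-10-06 | 1.423626 | 10 | 6 |
| 2 | 0 | 2000-10-07 | 2.311782 | 10 | 7 |
| 3 | 0 | 2000-10-08 | 3.192191 | 10 | 8 |
| 4 | 0 | 2000-10-09 | 4.148767 | 10 | 9 |
| ... | ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | 5 | 10 |
| 1097 | 4 | 2001-05-11 | 5.178157 | 5 | 11 |
| 1098 | 4 | 2001-05-12 | 6.133142 | 5 | 12 |
| 1099 | 4 | 2001-05-13 | 0.403709 | 5 | 13 |
| 1100 | 4 | 2001-05-14 | 1.081779 | 5 | 14 |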
future_df
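| | unique_id | ds | month | day |
|---|---|---|---|---|
| 0 | 0 | 2001-05-15 | 5 | 15 |
| 1 | 1 | 2001-05-15 | 5 | 15 |
| 2 | 2 | 2001-05-15 | 5 | 15 |
| 3 | 3 | 2001-05-15 | 5 | 15 |
| 4 | 4 | 2001-05-15 | 5 | 15 |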
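
The `features` argument mixes string aliases and callables. A minimal sketch of how such a list could be dispatched on a pandas `DatetimeIndex` is shown below: strings are looked up as attributes of the index, and callables are applied to it and named after the function. The helpers `compute_time_features` and `is_month_start` are hypothetical illustrations, not the library's code.

```python
import numpy as np
import pandas as pd

def compute_time_features(times, features):
    """Hypothetical dispatcher: strings are attributes of `times`, callables are applied to it."""
    out = {}
    for feat in features:
        if callable(feat):
            # name the column after the function
            out[feat.__name__] = np.asarray(feat(times))
        else:
            # e.g. 'month' -> times.month
            out[feat] = np.asarray(getattr(times, feat))
    return pd.DataFrame(out, index=times)

def is_month_start(times):
    return times.day == 1

times = pd.date_range('2001-04-30', periods=3, freq='D')
feats = compute_time_features(times, ['month', 'day', is_month_start])
```

Any attribute that `DatetimeIndex` exposes (for instance `year` or `dayofweek`) could be passed as a string in this sketch.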

pipeline

 pipeline (df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame],
           features:List[Callable], freq:str, h:int=0,
           id_col:str='unique_id', time_col:str='ds')

Compute several features for training and forecasting

| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with ids, times and values for the exogenous regressors. |
| features | List | | List of features to compute. Each must take only df, freq, h, id_col and time_col (other arguments must be fixed). |
| freq | str | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
| Returns | Tuple | | Original DataFrame with the computed features, and a DataFrame with the same features for the next h timestamps of each series. |
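
The requirement that other arguments "must be fixed" is usually met with `functools.partial`. The sketch below uses a hypothetical stand-in, `toy_feature`, that only mirrors the argument names of `fourier`; it shows that once `season_length` and `k` are bound, the resulting callable can be invoked with just the shared arguments.

```python
from functools import partial

def toy_feature(df, freq, season_length, k, h=0, id_col='unique_id', time_col='ds'):
    """Hypothetical stand-in that echoes the arguments it received."""
    return {'freq': freq, 'season_length': season_length, 'k': k, 'h': h}

# bind the feature-specific arguments up front
weekly = partial(toy_feature, season_length=7, k=1)

# the caller now only supplies the shared arguments
result = weekly(df=None, freq='D', h=1)
```

Binding different values produces independent callables, so the same function can appear several times in `features` with different settings.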
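from functools import partial

from utilsforecast.feature_engineering import pipeline

def is_weekend(times):
    # pandas input arrives as a pd.Index; otherwise use the polars .dt namespace
    if isinstance(times, pd.Index):
        dow = times.weekday + 1  # monday=0 in pandas and 1 in polars
    else:
        dow = times.dt.weekday()
    return dow >= 6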

def even_days_and_months(times):
    if isinstance(times, pd.Index):
        out = pd.DataFrame(
            {
                'even_day': (times.weekday + 1) % 2 == 0,
                'even_month': times.month % 2 == 0,
            }
        )
    else:
        # for polars you can return a list of expressions
        out = [
            (times.dt.weekday() % 2 == 0).alias('even_day'),
            (times.dt.month() % 2 == 0).alias('even_month'),
        ]
    return out

features = [
    trend,
    partial(fourier, season_length=7, k=1),
    partial(fourier, season_length=28, k=1),
    partial(time_features, features=['day', is_weekend, even_days_and_months]),
]
transformed_df, future_df = pipeline(
    series,
    features=features,
    freq='D',
    h=1,
)
transformed_df
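| | unique_id | ds | y | trend | sin1_7 | cos1_7 | sin1_28 | cos1_28 | day | is_weekend | even_day | even_month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | 152.0 | -0.974927 | -0.222526 | 0.433885 | -9.009683e-01 | 5 | False | True | True |
| 1 | 0 | 2000-10-06 | 1.423626 | 153.0 | -0.781835 | 0.623486 | 0.222522 | -9.749276e-01 | 6 | False | False | True |
| 2 | 0 | 2000-10-07 | 2.311782 | 154.0 | -0.000005 | 1.000000 | 0.000001 | -1.000000e+00 | 7 | True | True | True |
| 3 | 0 | 2000-10-08 | 3.192191 | 155.0 | 0.781829 | 0.623493 | -0.222520 | -9.749281e-01 | 8 | True | False | True |
| 4 | 0 | 2000-10-09 | 4.148767 | 156.0 | 0.974929 | -0.222517 | -0.433883 | -9.009693e-01 | 9 | False | False | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | 369.0 | -0.974927 | -0.222523 | 0.900969 | 4.338843e-01 | 10 | False | True | False |
| 1097 | 4 | 2001-05-11 | 5.178157 | 370.0 | -0.781823 | 0.623500 | 0.974929 | 2.225177e-01 | 11 | False | False | False |
| 1098 | 4 | 2001-05-12 | 6.133142 | 371.0 | -0.000002 | 1.000000 | 1.000000 | 4.251100e-07 | 12 | True | True | False |
| 1099 | 4 | 2001-05-13 | 0.403709 | 372.0 | 0.781840 | 0.623479 | 0.974927 | -2.225243e-01 | 13 | True | False | False |
| 1100 | 4 | 2001-05-14 | 1.081779 | 373.0 | 0.974928 | -0.222520 | 0.900969 | -4.338835e-01 | 14 | False | False | False |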