fourier

 fourier (df:~DFType, freq:Union[str,int], season_length:int, k:int,
          h:int=0, id_col:str='unique_id', time_col:str='ds')

Compute Fourier seasonal terms for training and forecasting.

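| | Type | Default | Details |
|---|---|---|---|
| df | DFType | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | Union[str, int] | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| season_length | int | | Number of observations per unit of time, e.g. 24 for hourly data. |
| k | int | | Maximum order of the Fourier terms. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep. Its values can be timestamps or integers. |
| **Returns** | **Tuple** | | The original DataFrame with the computed features, and a DataFrame with the same features for the next `h` timesteps. |

import pandas as pd

from utilsforecast.data import generate_series
from utilsforecast.feature_engineering import fourier

# five synthetic daily series that all end on the same date
series = generate_series(5, equal_ends=True)
# k=2 adds two sine/cosine pairs; h=1 also returns the features for the next day
transformed_df, future_df = fourier(series, freq='D', season_length=7, k=2, h=1)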
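transformed_df

| | unique_id | ds | y | sin1_7 | sin2_7 | cos1_7 | cos2_7 |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | -0.974927 | 0.433894 | -0.222526 | -0.900964 |
| 1 | 0 | 2000-10-06 | 1.423626 | -0.781835 | -0.974926 | 0.623486 | -0.222531 |
| 2 | 0 | 2000-10-07 | 2.311782 | -0.000005 | -0.000009 | 1.000000 | 1.000000 |
| 3 | 0 | 2000-10-08 | 3.192191 | 0.781829 | 0.974930 | 0.623493 | -0.222512 |
| 4 | 0 | 2000-10-09 | 4.148767 | 0.974929 | -0.433877 | -0.222517 | -0.900972 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | -0.974927 | 0.433888 | -0.222523 | -0.900967 |
| 1097 | 4 | 2001-05-11 | 5.178157 | -0.781823 | -0.974934 | 0.623500 | -0.222495 |
| 1098 | 4 | 2001-05-12 | 6.133142 | -0.000002 | -0.000003 | 1.000000 | 1.000000 |
| 1099 | 4 | 2001-05-13 | 0.403709 | 0.781840 | 0.974922 | 0.623479 | -0.222548 |
| 1100 | 4 | 2001-05-14 | 1.081779 | 0.974928 | -0.433882 | -0.222520 | -0.900970 |

future_df

| | unique_id | ds | sin1_7 | sin2_7 | cos1_7 | cos2_7 |
|---|---|---|---|---|---|---|
| 0 | 0 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 1 | 1 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 2 | 2 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 3 | 3 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |
| 4 | 4 | 2001-05-15 | 0.433871 | -0.781813 | -0.900975 | 0.623513 |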
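
The feature columns are named `sin{i}_{season_length}` and `cos{i}_{season_length}` for each order `i` from 1 to `k`. As a rough sketch of the idea behind them, the snippet below evaluates the standard Fourier pair sin(2πit/m) and cos(2πit/m) using only the standard library. The helper `fourier_terms` and the integer time index `t` are assumptions made for illustration; how the library derives its own time index (and therefore the phase of each series) is not shown on this page.

```python
import math


def fourier_terms(t, season_length, k):
    """Return the sine/cosine pair for each order 1..k at integer time t.

    Hypothetical helper for illustration, not the library's implementation.
    """
    terms = {}
    for i in range(1, k + 1):
        angle = 2 * math.pi * i * t / season_length
        terms[f"sin{i}_{season_length}"] = math.sin(angle)
        terms[f"cos{i}_{season_length}"] = math.cos(angle)
    return terms


# Slightly more than one weekly cycle, mirroring season_length=7 and k=2.
for t in range(8):
    row = fourier_terms(t, season_length=7, k=2)
    print(t, {name: round(value, 6) for name, value in row.items()})
```

With `season_length=7` the terms repeat every 7 steps, which matches the example output, where rows that fall on the same weekday carry the same values up to floating-point noise.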

trend

 trend (df:~DFType, freq:Union[str,int], h:int=0, id_col:str='unique_id',
        time_col:str='ds')

Add a trend column with consecutive integers for training and forecasting

| | Type | Default | Details |
|---|---|---|---|
| df | DFType | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | Union[str, int] | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep. Its values can be timestamps or integers. |
| **Returns** | **Tuple** | | The original DataFrame with the computed features, and a DataFrame with the same features for the next `h` timesteps. |

from utilsforecast.feature_engineering import trend

series = generate_series(5, equal_ends=True)
transformed_df, future_df = trend(series, freq='D', h=1)
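transformed_df

| | unique_id | ds | y | trend |
|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | 152.0 |
| 1 | 0 | 2000-10-06 | 1.423626 | 153.0 |
| 2 | 0 | 2000-10-07 | 2.311782 | 154.0 |
| 3 | 0 | 2000-10-08 | 3.192191 | 155.0 |
| 4 | 0 | 2000-10-09 | 4.148767 | 156.0 |
| ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | 369.0 |
| 1097 | 4 | 2001-05-11 | 5.178157 | 370.0 |
| 1098 | 4 | 2001-05-12 | 6.133142 | 371.0 |
| 1099 | 4 | 2001-05-13 | 0.403709 | 372.0 |
| 1100 | 4 | 2001-05-14 | 1.081779 | 373.0 |

future_df

| | unique_id | ds | trend |
|---|---|---|---|
| 0 | 0 | 2001-05-15 | 374.0 |
| 1 | 1 | 2001-05-15 | 374.0 |
| 2 | 2 | 2001-05-15 | 374.0 |
| 3 | 3 | 2001-05-15 | 374.0 |
| 4 | 4 | 2001-05-15 | 374.0 |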
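
Because the trend keeps counting into the forecast horizon (373 on the last training row, 374 in `future_df`), it can serve as a regressor whose future values are known in advance. The sketch below fits an ordinary least-squares line on a trend column and evaluates it at the next trend value. The target values and the helper `fit_line` are made up for illustration and are unrelated to the generated series above.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative training data: a trend column and a target that grows with it.
train_trend = [370.0, 371.0, 372.0, 373.0]
train_y = [10.0, 12.0, 14.0, 16.0]

# With h=1 the future frame would hold the next trend value.
future_trend = [374.0]

intercept, slope = fit_line(train_trend, train_y)
forecast = [intercept + slope * t for t in future_trend]
print(forecast)
```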

time_features

 time_features (df:~DFType, freq:Union[str,int],
                features:List[Union[str,Callable]], h:int=0,
                id_col:str='unique_id', time_col:str='ds')

Compute timestamp-based features for training and forecasting

| | Type | Default | Details |
|---|---|---|---|
| df | DFType | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | Union[str, int] | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| features | List[Union[str, Callable]] | | Features to compute. Can be string aliases of timestamp attributes or functions to apply to the times. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep. Its values can be timestamps or integers. |
| **Returns** | **Tuple** | | The original DataFrame with the computed features, and a DataFrame with the same features for the next `h` timesteps. |

from utilsforecast.feature_engineering import time_features

transformed_df, future_df = time_features(series, freq='D', features=['month', 'day', 'week'], h=1)
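transformed_df

| | unique_id | ds | y | month | day | week |
|---|---|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | 10 | 5 | 40 |
| 1 | 0 | 2000-10-06 | 1.423626 | 10 | 6 | 40 |
| 2 | 0 | 2000-10-07 | 2.311782 | 10 | 7 | 40 |
| 3 | 0 | 2000-10-08 | 3.192191 | 10 | 8 | 40 |
| 4 | 0 | 2000-10-09 | 4.148767 | 10 | 9 | 41 |
| ... | ... | ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | 5 | 10 | 19 |
| 1097 | 4 | 2001-05-11 | 5.178157 | 5 | 11 | 19 |
| 1098 | 4 | 2001-05-12 | 6.133142 | 5 | 12 | 19 |
| 1099 | 4 | 2001-05-13 | 0.403709 | 5 | 13 | 19 |
| 1100 | 4 | 2001-05-14 | 1.081779 | 5 | 14 | 20 |

future_df

| | unique_id | ds | month | day | week |
|---|---|---|---|---|---|
| 0 | 0 | 2001-05-15 | 5 | 15 | 20 |
| 1 | 1 | 2001-05-15 | 5 | 15 | 20 |
| 2 | 2 | 2001-05-15 | 5 | 15 | 20 |
| 3 | 3 | 2001-05-15 | 5 | 15 | 20 |
| 4 | 4 | 2001-05-15 | 5 | 15 | 20 |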
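
The string aliases in `features` name attributes of the timestamps. For intuition, the values in the example output can be reproduced for a single date with the standard library, assuming `'week'` refers to the ISO 8601 week number (the output above is consistent with that reading). The helper `calendar_features` is hypothetical.

```python
from datetime import date


def calendar_features(day):
    """Month, day of month and ISO week number for one date (illustrative helper)."""
    iso_week = day.isocalendar()[1]  # index 1 of (ISO year, ISO week, ISO weekday)
    return {"month": day.month, "day": day.day, "week": iso_week}


print(calendar_features(date(2000, 10, 5)))  # first training date in the example
print(calendar_features(date(2001, 5, 15)))  # forecast date in the example
```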

future_exog_to_historic

 future_exog_to_historic (df:~DFType, freq:Union[str,int],
                          features:List[str], h:int=0,
                          id_col:str='unique_id', time_col:str='ds')

Turn future exogenous features into historic by shifting them h steps.

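| | Type | Default | Details |
|---|---|---|---|
| df | DFType | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | Union[str, int] | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| features | List[str] | | Features to be converted into historic. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep. Its values can be timestamps or integers. |
| **Returns** | **Tuple** | | The original DataFrame with the shifted features, and a DataFrame with the features for the next `h` timesteps. |

import numpy as np
from utilsforecast.feature_engineering import future_exog_to_historic

# attach a random price to every row, then shuffle the rows
series_with_prices = series.assign(price=np.random.rand(len(series))).sample(frac=1.0)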
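series_with_prices

| | unique_id | ds | y | price |
|---|---|---|---|---|
| 436 | 2 | 2001-03-26 | 2.369113 | 0.774476 |
| 312 | 1 | 2001-05-08 | 4.405212 | 0.557957 |
| 536 | 3 | 2000-11-04 | 4.362074 | 0.745237 |
| 34 | 0 | 2000-11-08 | 6.111161 | 0.809978 |
| 652 | 3 | 2001-02-28 | 1.448291 | 0.685294 |
| ... | ... | ... | ... | ... |
| 609 | 3 | 2001-01-16 | 0.215892 | 0.699703 |
| 873 | 4 | 2000-09-29 | 5.398198 | 0.677651 |
| 268 | 1 | 2001-03-25 | 2.393771 | 0.735438 |
| 171 | 0 | 2001-03-25 | 3.085493 | 0.463871 |
| 931 | 4 | 2000-11-26 | 0.292296 | 0.691377 |

transformed_df, future_df = future_exog_to_historic(
    df=series_with_prices,
    freq='D',
    features=['price'],
    h=2,
)
transformed_df

| | unique_id | ds | y | price |
|---|---|---|---|---|
| 0 | 2 | 2001-03-26 | 2.369113 | 0.870133 |
| 1 | 1 | 2001-05-08 | 4.405212 | 0.869751 |
| 2 | 3 | 2000-11-04 | 4.362074 | 0.877901 |
| 3 | 0 | 2000-11-08 | 6.111161 | 0.629413 |
| 4 | 3 | 2001-02-28 | 1.448291 | 0.088073 |
| ... | ... | ... | ... | ... |
| 1096 | 3 | 2001-01-16 | 0.215892 | 0.472261 |
| 1097 | 4 | 2000-09-29 | 5.398198 | 0.887531 |
| 1098 | 1 | 2001-03-25 | 2.393771 | 0.481712 |
| 1099 | 0 | 2001-03-25 | 3.085493 | 0.433153 |
| 1100 | 4 | 2000-11-26 | 0.292296 | 0.620219 |

future_df

| | unique_id | ds | price |
|---|---|---|---|
| 0 | 0 | 2001-05-15 | 0.874328 |
| 1 | 0 | 2001-05-16 | 0.481385 |
| 2 | 1 | 2001-05-15 | 0.009058 |
| 3 | 1 | 2001-05-16 | 0.083749 |
| 4 | 2 | 2001-05-15 | 0.726212 |
| 5 | 2 | 2001-05-16 | 0.052221 |
| 6 | 3 | 2001-05-15 | 0.942335 |
| 7 | 3 | 2001-05-16 | 0.274816 |
| 8 | 4 | 2001-05-15 | 0.267545 |
| 9 | 4 | 2001-05-16 | 0.112129 |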
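
One way to read "shifting them h steps": within each series, the value observed at step t is moved to step t + h, so a model only sees values that were already known h steps earlier, and the last h observed values become the features for the forecast horizon. The sketch below illustrates that reading on a single series with plain lists. It is an assumption about the behaviour, not the library's code; `shift_to_historic` is a hypothetical helper and `None` stands in for whatever missing-value marker the DataFrame library would use.

```python
def shift_to_historic(values, h):
    """Lag one series' feature by h steps (hypothetical helper).

    Returns (historic, future): `historic` has the same length as `values`,
    with the first h entries unknown; `future` holds the h values that spill
    over into the forecast horizon.
    """
    if h <= 0:
        return list(values), []
    historic = [None] * h + list(values[:-h])
    future = list(values[-h:])
    return historic[: len(values)], future


prices = [10, 11, 12, 13, 14]
historic, future = shift_to_historic(prices, h=2)
print(historic)  # [None, None, 10, 11, 12]
print(future)    # [13, 14]
```

Under this reading, `h=2` yields two future rows per series, which agrees with the shape of `future_df` in the example.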

pipeline

 pipeline (df:~DFType, features:List[Callable], freq:Union[str,int],
           h:int=0, id_col:str='unique_id', time_col:str='ds')

Compute several features for training and forecasting

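| | Type | Default | Details |
|---|---|---|---|
| df | DFType | | DataFrame with ids, times and values for the exogenous regressors. |
| features | List[Callable] | | List of features to compute. Each must take only df, freq, h, id_col and time_col (other arguments must be fixed). |
| freq | Union[str, int] | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| h | int | 0 | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep. Its values can be timestamps or integers. |
| **Returns** | **Tuple** | | The original DataFrame with the computed features, and a DataFrame with the same features for the next `h` timesteps. |

from functools import partial

from utilsforecast.feature_engineering import fourier, pipeline, time_features, trend

def is_weekend(times):
    if isinstance(times, pd.Index):
        dow = times.weekday + 1  # monday=0 in pandas and 1 in polars
    else:
        dow = times.dt.weekday()
    return dow >= 6  # Saturday=6, Sunday=7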

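def even_days_and_months(times):
    # a single callable can return several features at once
    if isinstance(times, pd.Index):
        # for pandas, return a DataFrame with one column per feature
        out = pd.DataFrame(
            {
                # 'even_day' flags an even day-of-week number (Monday=1)
                'even_day': (times.weekday + 1) % 2 == 0,
                'even_month': times.month % 2 == 0,
            }
        )
    else:
        # for polars you can return a list of expressions
        out = [
            (times.dt.weekday() % 2 == 0).alias('even_day'),
            (times.dt.month() % 2 == 0).alias('even_month'),
        ]
    return out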

features = [
    trend,
    partial(fourier, season_length=7, k=1),
    partial(fourier, season_length=28, k=1),
    partial(time_features, features=['day', is_weekend, even_days_and_months]),
]
transformed_df, future_df = pipeline(
    series,
    features=features,
    freq='D',
    h=1,
)
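transformed_df

| | unique_id | ds | y | trend | sin1_7 | cos1_7 | sin1_28 | cos1_28 | day | is_weekend | even_day | even_month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2000-10-05 | 0.428973 | 152.0 | -0.974927 | -0.222526 | 0.433885 | -9.009683e-01 | 5 | False | True | True |
| 1 | 0 | 2000-10-06 | 1.423626 | 153.0 | -0.781835 | 0.623486 | 0.222522 | -9.749276e-01 | 6 | False | False | True |
| 2 | 0 | 2000-10-07 | 2.311782 | 154.0 | -0.000005 | 1.000000 | 0.000001 | -1.000000e+00 | 7 | True | True | True |
| 3 | 0 | 2000-10-08 | 3.192191 | 155.0 | 0.781829 | 0.623493 | -0.222520 | -9.749281e-01 | 8 | True | False | True |
| 4 | 0 | 2000-10-09 | 4.148767 | 156.0 | 0.974929 | -0.222517 | -0.433883 | -9.009693e-01 | 9 | False | False | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1096 | 4 | 2001-05-10 | 4.058910 | 369.0 | -0.974927 | -0.222523 | 0.900969 | 4.338843e-01 | 10 | False | True | False |
| 1097 | 4 | 2001-05-11 | 5.178157 | 370.0 | -0.781823 | 0.623500 | 0.974929 | 2.225177e-01 | 11 | False | False | False |
| 1098 | 4 | 2001-05-12 | 6.133142 | 371.0 | -0.000002 | 1.000000 | 1.000000 | 4.251100e-07 | 12 | True | True | False |
| 1099 | 4 | 2001-05-13 | 0.403709 | 372.0 | 0.781840 | 0.623479 | 0.974927 | -2.225243e-01 | 13 | True | False | False |
| 1100 | 4 | 2001-05-14 | 1.081779 | 373.0 | 0.974928 | -0.222520 | 0.900969 | -4.338835e-01 | 14 | False | False | False |

future_df

| | unique_id | ds | trend | sin1_7 | cos1_7 | sin1_28 | cos1_28 | day | is_weekend | even_day | even_month |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2001-05-15 | 374.0 | 0.433871 | -0.900975 | 0.781829 | -0.623493 | 15 | False | True | False |
| 1 | 1 | 2001-05-15 | 374.0 | 0.433871 | -0.900975 | 0.781829 | -0.623493 | 15 | False | True | False |
| 2 | 2 | 2001-05-15 | 374.0 | 0.433871 | -0.900975 | 0.781829 | -0.623493 | 15 | False | True | False |
| 3 | 3 | 2001-05-15 | 374.0 | 0.433871 | -0.900975 | 0.781829 | -0.623493 | 15 | False | True | False |
| 4 | 4 | 2001-05-15 | 374.0 | 0.433871 | -0.900975 | 0.781829 | -0.623493 | 15 | False | True | False |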
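
The requirement that each feature take only `df`, `freq`, `h`, `id_col` and `time_col` is why the example binds the extra arguments with `functools.partial`. The toy sketch below shows that pattern in isolation: every feature function shares one call signature and returns a pair of (training columns, future columns), and a small loop merges the results. `toy_trend`, `toy_scaled` and `toy_pipeline` are hypothetical stand-ins that operate on plain lists; they are not the library's functions and say nothing about how `pipeline` is implemented.

```python
from functools import partial


def toy_trend(n_rows, h):
    """Consecutive integers for the training rows, continued for h future rows."""
    train = list(range(1, n_rows + 1))
    future = list(range(n_rows + 1, n_rows + h + 1))
    return {"trend": train}, {"trend": future}


def toy_scaled(n_rows, h, factor):
    """Like toy_trend but multiplied by `factor`, an extra argument that must be fixed."""
    name = f"trend_x{factor}"
    train = [factor * i for i in range(1, n_rows + 1)]
    future = [factor * i for i in range(n_rows + 1, n_rows + h + 1)]
    return {name: train}, {name: future}


def toy_pipeline(n_rows, features, h):
    """Call every feature with the shared arguments and merge the columns."""
    train_cols, future_cols = {}, {}
    for feature in features:
        train, future = feature(n_rows, h)
        train_cols.update(train)
        future_cols.update(future)
    return train_cols, future_cols


# partial fixes `factor`, leaving only the shared arguments to be supplied.
features = [toy_trend, partial(toy_scaled, factor=10)]
train_cols, future_cols = toy_pipeline(3, features, h=2)
print(train_cols)
print(future_cols)
```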