
AutoRandomForest

AutoRandomForest(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

AutoElasticNet

AutoElasticNet(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

AutoLasso

AutoLasso(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

AutoRidge

AutoRidge(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

AutoLinearRegression

AutoLinearRegression(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

AutoCatboost

AutoCatboost(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

AutoXGBoost

AutoXGBoost(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

AutoLightGBM

AutoLightGBM(config=None)
Bases: `AutoModel`

Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |

random_forest_space

random_forest_space(trial)

elastic_net_space

elastic_net_space(trial)

lasso_space

lasso_space(trial)

ridge_space

ridge_space(trial)

linear_regression_space

linear_regression_space(trial)

catboost_space

catboost_space(trial)

xgboost_space

xgboost_space(trial)

lightgbm_space

lightgbm_space(trial)
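
Each of these functions takes an optuna trial and returns a dict of hyperparameters for the corresponding model. If the built-in ranges don't fit your use case you can write your own; the following is a minimal sketch of what such a function looks like (the parameter names and ranges are illustrative, not the library's defaults):

```python
import optuna

def my_lgb_space(trial: optuna.Trial):
    # Same pattern as the built-in spaces: take a Trial, return a dict of
    # keyword arguments for the model. Ranges below are only an example.
    return {
        'learning_rate': trial.suggest_float('learning_rate', 1e-3, 0.3, log=True),
        'num_leaves': trial.suggest_int('num_leaves', 8, 256, log=True),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
    }
```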

AutoModel

AutoModel(model, config)
Structure to hold a model and its search space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | BaseEstimator | scikit-learn compatible regressor | required |
| config | callable | function that takes an optuna trial and produces a configuration | required |
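
As a usage sketch, `AutoModel` can pair any scikit-learn compatible regressor with a search-space callable; the estimator and the hyperparameter ranges below are illustrative:

```python
from sklearn.ensemble import HistGradientBoostingRegressor
from mlforecast.auto import AutoModel

# Any callable that takes an optuna Trial and returns a dict works as config.
auto_hgb = AutoModel(
    model=HistGradientBoostingRegressor(),
    config=lambda trial: {
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'max_depth': trial.suggest_int('max_depth', 2, 8),
    },
)
```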

AutoMLForecast

AutoMLForecast(
    models,
    freq,
    season_length=None,
    init_config=None,
    fit_config=None,
    num_threads=1,
)
Hyperparameter optimization helper.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| models | list or dict | Auto models to be optimized. | required |
| freq | str or int | pandas' or polars' offset alias or integer denoting the frequency of the series. | required |
| season_length | int | Length of the seasonal period. This is used for producing the feature space. Only required if init_config is None. | None |
| init_config | callable | Function that takes an optuna trial and produces a configuration passed to the MLForecast constructor. | None |
| fit_config | callable | Function that takes an optuna trial and produces a configuration passed to the MLForecast fit method. | None |
| num_threads | int | Number of threads to use when computing the features. | 1 |
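
A minimal construction sketch, assuming integer-frequency series; the lag sets tried in `init_config` and the static-feature handling in `fit_config` are illustrative choices, not defaults:

```python
from mlforecast.auto import AutoMLForecast, AutoRidge

def my_init_config(trial):
    # Keyword arguments forwarded to the MLForecast constructor.
    # Which lag sets to try is up to you; these are only an example.
    lag_sets = [[1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 8, 13]]
    return {'lags': lag_sets[trial.suggest_int('lag_set', 0, len(lag_sets) - 1)]}

def my_fit_config(trial):
    # Keyword arguments forwarded to MLForecast.fit.
    return {'static_features': []}

auto_mlf = AutoMLForecast(
    models={'ridge': AutoRidge()},
    freq=1,
    init_config=my_init_config,  # season_length is not needed when init_config is given
    fit_config=my_fit_config,
    num_threads=2,
)
```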

AutoMLForecast.fit

fit(
    df,
    n_windows,
    h,
    num_samples,
    step_size=None,
    input_size=None,
    refit=False,
    loss=None,
    id_col="unique_id",
    time_col="ds",
    target_col="y",
    study_kwargs=None,
    optimize_kwargs=None,
    fitted=False,
    prediction_intervals=None,
    weight_col=None,
)
Carry out the optimization process. Each model is optimized independently and the best one is trained on all data.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | pandas or polars DataFrame | Series data in long format. | required |
| n_windows | int | Number of windows to evaluate. | required |
| h | int | Forecast horizon. | required |
| num_samples | int | Number of trials to run. | required |
| step_size | int | Step size between each cross validation window. If None it will be equal to h. | None |
| input_size | int | Maximum training samples per series in each window. If None, will use an expanding window. | None |
| refit | bool or int | Retrain model for each cross validation window. If False, the models are trained at the beginning and then used to predict each window. If a positive int, the models are retrained every refit windows. | False |
| loss | callable | Function that takes the validation and train dataframes and produces a float. If None will use the average SMAPE across series. | None |
| id_col | str | Column that identifies each series. | 'unique_id' |
| time_col | str | Column that identifies each timestep, its values can be timestamps or integers. | 'ds' |
| target_col | str | Column that contains the target. | 'y' |
| study_kwargs | dict | Keyword arguments to be passed to the optuna.Study constructor. | None |
| optimize_kwargs | dict | Keyword arguments to be passed to the optuna.Study.optimize method. | None |
| fitted | bool | Whether to compute the fitted values when retraining the best model. | False |
| prediction_intervals | Optional[PredictionIntervals] | Configuration to calibrate prediction intervals when retraining the best model. | None |

Returns:

| Type | Description |
| --- | --- |
| AutoMLForecast | object with best models and optimization results |
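
The default loss averages SMAPE across series. The following is a minimal sketch of swapping in a different metric; it assumes the validation frame passed to the loss holds the candidate's predictions in a column named 'model' (verify this against the frame your loss actually receives):

```python
from utilsforecast.losses import rmse

def my_loss(cv_df, train_df):
    # cv_df: validation predictions produced during optimization.
    # train_df: the corresponding training data (unused here).
    # The 'model' column name is an assumption, see the note above.
    return rmse(cv_df, models=['model'])['model'].mean()

# passed through the `loss` argument, e.g.
# auto_mlf.fit(df=train, n_windows=2, h=h, num_samples=10, loss=my_loss)
```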

AutoMLForecast.predict

predict(h, X_df=None, level=None)
Compute forecasts.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| h | int | Number of periods to predict. | required |
| X_df | pandas or polars DataFrame | Dataframe with the future exogenous features. Should have the id column and the time column. | None |
| level | list of ints or floats | Confidence levels between 0 and 100 for prediction intervals. | None |

Returns:

| Type | Description |
| --- | --- |
| pandas or polars DataFrame | Predictions for each series and timestep, with one column per model. |
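
When the models were trained with dynamic exogenous features, their future values must be supplied through X_df. A minimal sketch, assuming a single integer-indexed series 'id1' with a hypothetical exogenous column 'price':

```python
import pandas as pd

# Future exogenous values for the next 4 steps of one serie.
# 'price' is a hypothetical feature the models would have been trained with.
X_df = pd.DataFrame({
    'unique_id': ['id1'] * 4,
    'ds': [101, 102, 103, 104],
    'price': [1.2, 1.3, 1.1, 1.4],
})
preds = auto_mlf.predict(h=4, X_df=X_df, level=[80, 95])
```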

AutoMLForecast.save

save(path)
Save AutoMLForecast objects.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| path | str or Path | Directory where artifacts will be stored. | required |
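
A usage sketch; the assumption that each tuned model is written to its own subdirectory and can be read back with MLForecast.load should be verified against your mlforecast version:

```python
from mlforecast import MLForecast

auto_mlf.save('artifacts/auto')  # one subdirectory per model (assumed layout)
ridge_fcst = MLForecast.load('artifacts/auto/ridge')  # load a single tuned model back
```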

AutoMLForecast.forecast_fitted_values

forecast_fitted_values(level=None)
Access in-sample predictions.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| level | list of ints or floats | Confidence levels between 0 and 100 for prediction intervals. | None |

Returns:

| Type | Description |
| --- | --- |
| pandas or polars DataFrame | Dataframe with predictions for the training set |
The example below tunes LightGBM and a Ridge pipeline on the M4 Weekly dataset.

```python
import time

import optuna
import pandas as pd
from datasetsforecast.m4 import M4, M4Evaluation, M4Info
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

from mlforecast.auto import (
    AutoLightGBM,
    AutoMLForecast,
    AutoModel,
    AutoRidge,
    ridge_space,
)
from mlforecast.utils import PredictionIntervals
def train_valid_split(group):
    df, *_ = M4.load(directory='data', group=group)
    df['ds'] = df['ds'].astype('int')
    horizon = M4Info[group].horizon
    valid = df.groupby('unique_id').tail(horizon).copy()
    train = df.drop(valid.index).reset_index(drop=True)
    return train, valid
ridge_pipeline = make_pipeline(
    ColumnTransformer(
        [('encoder', OneHotEncoder(), ['unique_id'])],
        remainder='passthrough',
    ),
    Ridge()
)
auto_ridge = AutoModel(ridge_pipeline, lambda trial: {f'ridge__{k}': v for k, v in ridge_space(trial).items()})
optuna.logging.set_verbosity(optuna.logging.ERROR)
group = 'Weekly'
train, valid = train_valid_split(group)
train['unique_id'] = train['unique_id'].astype('category')
valid['unique_id'] = valid['unique_id'].astype(train['unique_id'].dtype)
info = M4Info[group]
h = info.horizon
season_length = info.seasonality
auto_mlf = AutoMLForecast(
    freq=1,
    season_length=season_length,
    models={
        'lgb': AutoLightGBM(),
        'ridge': auto_ridge,
    },
    fit_config=lambda trial: {'static_features': ['unique_id']},
    num_threads=2,
)
auto_mlf.fit(
    df=train,
    n_windows=2,
    h=h,
    num_samples=2,
    optimize_kwargs={'timeout': 60},
    fitted=True,
    prediction_intervals=PredictionIntervals(n_windows=2, h=h),
)
auto_mlf.predict(h, level=[80])
```
|  | unique_id | ds | lgb | lgb-lo-80 | lgb-hi-80 | ridge | ridge-lo-80 | ridge-hi-80 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | W1 | 2180 | 35529.435224 | 35061.835362 | 35997.035086 | 36110.921202 | 35880.445097 | 36341.397307 |
| 1 | W1 | 2181 | 35521.764894 | 34973.035617 | 36070.494171 | 36195.175757 | 36051.013811 | 36339.337702 |
| 2 | W1 | 2182 | 35537.417268 | 34960.050939 | 36114.783596 | 36107.528852 | 35784.062169 | 36430.995536 |
| 3 | W1 | 2183 | 35538.058206 | 34823.640706 | 36252.475705 | 36027.139248 | 35612.635725 | 36441.642771 |
| 4 | W1 | 2184 | 35614.611211 | 34627.023739 | 36602.198683 | 36092.858489 | 35389.690977 | 36796.026000 |
| … | … | … | … | … | … | … | … | … |
| 4662 | W99 | 2292 | 15071.536978 | 14484.617399 | 15658.456557 | 15319.146221 | 14869.410567 | 15768.881875 |
| 4663 | W99 | 2293 | 15058.145278 | 14229.686322 | 15886.604234 | 15299.549555 | 14584.269352 | 16014.829758 |
| 4664 | W99 | 2294 | 15042.493434 | 14096.380636 | 15988.606232 | 15271.744712 | 14365.349338 | 16178.140086 |
| 4665 | W99 | 2295 | 15042.144846 | 14037.053904 | 16047.235787 | 15250.070504 | 14403.428791 | 16096.712216 |
| 4666 | W99 | 2296 | 15038.729044 | 13944.821480 | 16132.636609 | 15232.127800 | 14325.059776 | 16139.195824 |
```python
auto_mlf.forecast_fitted_values(level=[95])
```
|  | unique_id | ds | y | lgb | lgb-lo-95 | lgb-hi-95 | ridge | ridge-lo-95 | ridge-hi-95 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | W1 | 15 | 1071.06 | 1060.584344 | 599.618355 | 1521.550334 | 1076.990151 | 556.535492 | 1597.444810 |
| 1 | W1 | 16 | 1073.73 | 1072.669242 | 611.703252 | 1533.635232 | 1083.633276 | 563.178617 | 1604.087936 |
| 2 | W1 | 17 | 1066.97 | 1072.452128 | 611.486139 | 1533.418118 | 1084.724311 | 564.269652 | 1605.178970 |
| 3 | W1 | 18 | 1066.17 | 1065.837828 | 604.871838 | 1526.803818 | 1080.127197 | 559.672538 | 1600.581856 |
| 4 | W1 | 19 | 1064.43 | 1065.214681 | 604.248691 | 1526.180671 | 1080.636826 | 560.182167 | 1601.091485 |
| … | … | … | … | … | … | … | … | … | … |
| 361881 | W99 | 2279 | 15738.54 | 15887.661228 | 15721.237195 | 16054.085261 | 15927.918181 | 15723.222760 | 16132.613603 |
| 361882 | W99 | 2280 | 15388.13 | 15755.943789 | 15589.519756 | 15922.367823 | 15841.599064 | 15636.903642 | 16046.294485 |
| 361883 | W99 | 2281 | 15187.62 | 15432.224701 | 15265.800668 | 15598.648735 | 15584.462232 | 15379.766811 | 15789.157654 |
| 361884 | W99 | 2282 | 15172.27 | 15177.040831 | 15010.616797 | 15343.464864 | 15396.243223 | 15191.547801 | 15600.938644 |
| 361885 | W99 | 2283 | 15101.03 | 15162.090803 | 14995.666770 | 15328.514836 | 15335.982465 | 15131.287044 | 15540.677887 |
The same workflow works with polars dataframes.

```python
import polars as pl
train_pl = pl.from_pandas(train.astype({'unique_id': 'str'}))
auto_mlf = AutoMLForecast(
    freq=1,
    season_length=season_length,
    models={'ridge': AutoRidge()},
    num_threads=2,
)
auto_mlf.fit(
    df=train_pl,
    n_windows=2,
    h=h,
    num_samples=2,
    optimize_kwargs={'timeout': 60},
    fitted=True,
    prediction_intervals=PredictionIntervals(n_windows=2, h=h),
)
auto_mlf.predict(h, level=[80])
```
| unique_id | ds | ridge | ridge-lo-80 | ridge-hi-80 |
| --- | --- | --- | --- | --- |
| str | i64 | f64 | f64 | f64 |
| "W1" | 2180 | 35046.096663 | 34046.69521 | 36045.498116 |
| "W1" | 2181 | 34743.269216 | 33325.847975 | 36160.690457 |
| "W1" | 2182 | 34489.591086 | 32591.254559 | 36387.927614 |
| "W1" | 2183 | 34270.768179 | 32076.507727 | 36465.02863 |
| "W1" | 2184 | 34124.021857 | 31352.454121 | 36895.589593 |
| … | … | … | … | … |
| "W99" | 2292 | 14719.457096 | 13983.308582 | 15455.605609 |
| "W99" | 2293 | 14631.552077 | 13928.874336 | 15334.229818 |
| "W99" | 2294 | 14532.905239 | 13642.840118 | 15422.97036 |
| "W99" | 2295 | 14446.065443 | 13665.088667 | 15227.04222 |
| "W99" | 2296 | 14363.049604 | 13654.220051 | 15071.879157 |
```python
auto_mlf.forecast_fitted_values(level=[95])
```
| unique_id | ds | y | ridge | ridge-lo-95 | ridge-hi-95 |
| --- | --- | --- | --- | --- | --- |
| str | i64 | f64 | f64 | f64 | f64 |
| "W1" | 14 | 1061.96 | 1249.326428 | 488.765249 | 2009.887607 |
| "W1" | 15 | 1071.06 | 1246.067836 | 485.506657 | 2006.629015 |
| "W1" | 16 | 1073.73 | 1254.027897 | 493.466718 | 2014.589076 |
| "W1" | 17 | 1066.97 | 1254.475948 | 493.914769 | 2015.037126 |
| "W1" | 18 | 1066.17 | 1248.306754 | 487.745575 | 2008.867933 |
| … | … | … | … | … | … |
| "W99" | 2279 | 15738.54 | 15754.558812 | 15411.968645 | 16097.148979 |
| "W99" | 2280 | 15388.13 | 15655.780865 | 15313.190698 | 15998.371032 |
| "W99" | 2281 | 15187.62 | 15367.498468 | 15024.908301 | 15710.088635 |
| "W99" | 2282 | 15172.27 | 15172.591423 | 14830.001256 | 15515.18159 |
| "W99" | 2283 | 15101.03 | 15141.032886 | 14798.44272 | 15483.623053 |