
evaluate

evaluate(
    df,
    metrics,
    models=None,
    train_df=None,
    level=None,
    id_col="unique_id",
    time_col="ds",
    target_col="y",
    cutoff_col="cutoff",
    agg_fn=None,
)
Evaluate forecasts using different metrics.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | pandas, polars, dask or spark DataFrame | Forecasts to evaluate. Must have id_col, time_col, target_col and the models' predictions. | required |
| metrics | list of callable | Functions with arguments df, models, id_col, target_col and optionally train_df. | required |
| models | list of str | Names of the models to evaluate. If None, every column in the dataframe after removing id, time and target is used. | None |
| train_df | pandas, polars, dask or spark DataFrame | Training set. Used to evaluate metrics such as mase. | None |
| level | list of int | Prediction interval levels. Used to compute losses that rely on quantiles. | None |
| id_col | str | Column that identifies each series. | 'unique_id' |
| time_col | str | Column that identifies each timestep; its values can be timestamps or integers. | 'ds' |
| target_col | str | Column that contains the target. | 'y' |
| cutoff_col | str | Column that identifies the cutoff point for each forecast cross-validation fold. | 'cutoff' |
| agg_fn | str | Statistic to compute on the scores by id to reduce them to a single number. | None |
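As the table notes, each metric is a callable taking df, models, id_col and target_col. A minimal hand-rolled MAE following that signature (a sketch for illustration, not the implementation shipped with the library) could look like:

```python
import pandas as pd

def mae(df, models, id_col="unique_id", target_col="y"):
    """Mean absolute error per series, one column per model.

    Follows the metric callable signature documented above:
    (df, models, id_col, target_col).
    """
    return (
        df[models]
        .sub(df[target_col], axis=0)  # error = prediction - actual
        .abs()
        .groupby(df[id_col])          # one score per series
        .mean()
        .reset_index()
    )

# toy forecasts for two series with a single model column
df = pd.DataFrame({
    "unique_id": ["a", "a", "b", "b"],
    "ds": [1, 2, 1, 2],
    "y": [10.0, 12.0, 5.0, 7.0],
    "model1": [11.0, 11.0, 5.0, 9.0],
})
print(mae(df, models=["model1"]))
```

A metric that needs the training set (such as mase) would additionally accept a train_df argument, which evaluate passes through.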
Returns:

| Type | Description |
| --- | --- |
| pandas, polars, dask or spark DataFrame | Metrics with one row per (id, metric) combination and one column per model. If agg_fn is not None, there is only one row per metric. |
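The return shape can be illustrated with a small pandas stand-in (a sketch of the documented behavior, not the library's implementation): with agg_fn=None there is one row per (id, metric), and with agg_fn set the id dimension collapses to one row per metric.

```python
import pandas as pd

def evaluate_sketch(df, metrics, models, id_col="unique_id",
                    target_col="y", agg_fn=None):
    # Toy stand-in for evaluate(); illustrates the return shape only.
    parts = []
    for metric in metrics:
        res = metric(df, models, id_col=id_col, target_col=target_col)
        res.insert(1, "metric", metric.__name__)  # tag rows with metric name
        parts.append(res)
    out = pd.concat(parts, ignore_index=True)
    if agg_fn is not None:
        # collapse the id dimension: one row per metric
        out = (out.drop(columns=id_col)
                  .groupby("metric", as_index=False)
                  .agg(agg_fn))
    return out

def mae(df, models, id_col="unique_id", target_col="y"):
    return (df[models].sub(df[target_col], axis=0).abs()
              .groupby(df[id_col]).mean().reset_index())

def rmse(df, models, id_col="unique_id", target_col="y"):
    return (df[models].sub(df[target_col], axis=0).pow(2)
              .groupby(df[id_col]).mean().pow(0.5).reset_index())

df = pd.DataFrame({
    "unique_id": ["a", "a", "b", "b"],
    "ds": [1, 2, 1, 2],
    "y": [10.0, 12.0, 5.0, 7.0],
    "model1": [11.0, 11.0, 5.0, 9.0],
})

per_id = evaluate_sketch(df, [mae, rmse], models=["model1"])
# 2 ids x 2 metrics -> 4 rows, one "model1" score column
overall = evaluate_sketch(df, [mae, rmse], models=["model1"], agg_fn="mean")
# 2 rows: one per metric
```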