module statsforecast.distributed.fugue

Global Variables

  • FUGUE_CONF_WORKFLOW_EXCEPTION_INJECT

class FugueBackend

FugueBackend for distributed computation. This class uses the Fugue backend, which can distribute computation on Spark, Dask, and Ray without any rewrites.
Args:
  • engine (fugue.ExecutionEngine): The execution engine to use: Spark, Dask, or Ray.
  • conf (fugue.Config): Engine configuration.
  • **transform_kwargs: Additional kwargs for Fugue’s transform method.
Notes:
A short introduction to Fugue, with examples of how to scale pandas code to Spark, Dask, or Ray, is available in the Fugue documentation.

method __init__

__init__(engine: Any = None, conf: Any = None, **transform_kwargs: Any)
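
As a rough sketch of how the constructor arguments fit together, the hypothetical helper below (make_backend is not part of the library) forwards engine, conf, and any extra keyword arguments to FugueBackend. The import is deferred so the snippet loads even where statsforecast and fugue are not installed. Passing a string alias such as "dask" as the engine is an assumption based on how Fugue resolves engines; an engine object or session may be passed instead.

```python
from typing import Any, Optional


def make_backend(engine: Any = None, conf: Optional[dict] = None, **transform_kwargs: Any):
    """Build a FugueBackend from an engine, an engine config, and transform kwargs.

    Hypothetical helper for illustration only. The import is deferred so that
    this module can be imported without statsforecast or fugue installed.
    """
    from statsforecast.distributed.fugue import FugueBackend

    # engine=None leaves the engine choice to Fugue (assumption: it is
    # resolved when the backend later calls Fugue's transform).
    return FugueBackend(engine=engine, conf=conf, **transform_kwargs)


if __name__ == "__main__":
    # Assumption: "dask" is accepted as an engine alias by Fugue.
    backend = make_backend(engine="dask")
    print(type(backend).__name__)
```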

method cross_validation

cross_validation(
    df: ~AnyDataFrame,
    freq: Union[str, int],
    models: List[Any],
    fallback_model: Optional[Any],
    h: int,
    n_windows: int,
    step_size: int,
    test_size: int,
    input_size: int,
    level: Optional[List[int]],
    refit: bool,
    fitted: bool,
    prediction_intervals: Optional[ConformalIntervals],
    id_col: str,
    time_col: str,
    target_col: str
) → Any
Temporal cross-validation with core.StatsForecast and FugueBackend.
This method uses Fugue’s transform function in combination with core.StatsForecast’s cross-validation to efficiently fit a list of StatsForecast models over multiple training windows, in either a chained or rolled manner. The speed of the StatsForecast models, together with Fugue’s distributed computation, helps overcome the high computational cost of this evaluation technique. Temporal cross-validation gives a better measure of a model’s generalization by increasing the length and diversity of the test set.
Parameters:
  • df (pandas or polars DataFrame): DataFrame with ids, times, targets and exogenous variables.
  • freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
  • models (List[Any]): List of instantiated StatsForecast model objects.
  • fallback_model (Any, optional): Model to be used if a model fails. Only works with the forecast and cross_validation methods. Defaults to None.
  • h (int): Forecast horizon.
  • n_windows (int): Number of windows used for cross-validation. Defaults to 1.
  • step_size (int): Step size between each window. Defaults to 1.
  • test_size (int, optional): Length of the test set. If passed, set n_windows=None. Defaults to None.
  • input_size (int, optional): Input size for each window. If not None, rolled windows are used. Defaults to None.
  • level (List[float], optional): Confidence levels between 0 and 100 for prediction intervals. Defaults to None.
  • refit (bool or int): Whether to refit the model for each window. If an int, retrain the models every refit windows. Defaults to True.
  • fitted (bool): Store in-sample predictions. Defaults to False.
  • prediction_intervals (ConformalIntervals, optional): Configuration to calibrate prediction intervals (Conformal Prediction). Defaults to None.
  • id_col (str): Column that identifies each series. Defaults to ‘unique_id’.
  • time_col (str): Column that identifies each timestep; its values can be timestamps or integers. Defaults to ‘ds’.
  • target_col (str): Column that contains the target. Defaults to ‘y’.
Returns:
  • pandas.DataFrame: DataFrame with model columns for point predictions and probabilistic predictions for all fitted models.
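
To make the roles of h, n_windows, step_size, and input_size concrete, here is a minimal pure-Python sketch of how chained and rolled windows could be laid out over a single series. It is an illustration under stated assumptions, not the library's implementation: the helper cv_windows is hypothetical, positions are plain integer indices, and the last window is assumed to end at the final observation.

```python
from typing import List, Optional, Tuple

# Each window is (train positions, test positions) over one series.
Window = Tuple[range, range]


def cv_windows(
    n_obs: int,
    h: int,
    n_windows: int = 1,
    step_size: int = 1,
    input_size: Optional[int] = None,
) -> List[Window]:
    """Lay out temporal cross-validation windows for a series of n_obs points.

    Hypothetical helper: with input_size=None every training set starts at
    position 0 (chained); otherwise it keeps only the last input_size
    observations before the cutoff (rolled).
    """
    total_test = h + step_size * (n_windows - 1)
    if total_test >= n_obs:
        raise ValueError("series too short for the requested windows")
    windows = []
    for i in range(n_windows):
        # The last window's test set ends at the final observation;
        # earlier windows are shifted back by step_size each.
        test_end = n_obs - (n_windows - 1 - i) * step_size
        cutoff = test_end - h  # first test position
        train_start = 0 if input_size is None else max(0, cutoff - input_size)
        windows.append((range(train_start, cutoff), range(cutoff, test_end)))
    return windows


if __name__ == "__main__":
    for train, test in cv_windows(10, h=2, n_windows=3, step_size=2):
        print(list(train), list(test))
```

With step_size equal to h, as in the usage above, the test sets do not overlap; a smaller step_size would produce overlapping test sets.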

method forecast

forecast(
    df: ~AnyDataFrame,
    freq: Union[str, int],
    models: List[Any],
    fallback_model: Optional[Any],
    X_df: Optional[~AnyDataFrame],
    h: int,
    level: Optional[List[int]],
    fitted: bool,
    prediction_intervals: Optional[ConformalIntervals],
    id_col: str,
    time_col: str,
    target_col: str
) → Any
Memory-efficient core.StatsForecast predictions with FugueBackend.
This method uses Fugue’s transform function in combination with core.StatsForecast’s forecast to efficiently fit a list of StatsForecast models.
Parameters:
  • df (pandas or polars DataFrame): DataFrame with ids, times, targets and exogenous variables.
  • freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
  • models (List[Any]): List of instantiated StatsForecast model objects.
  • fallback_model (Any, optional): Model to be used if a model fails. Only works with the forecast and cross_validation methods. Defaults to None.
  • X_df (pandas or polars DataFrame, optional): DataFrame with ids, times and future exogenous variables. Defaults to None.
  • h (int): Forecast horizon.
  • level (List[float], optional): Confidence levels between 0 and 100 for prediction intervals. Defaults to None.
  • fitted (bool): Store in-sample predictions. Defaults to False.
  • prediction_intervals (ConformalIntervals, optional): Configuration to calibrate prediction intervals (Conformal Prediction). Defaults to None.
  • id_col (str): Column that identifies each series. Defaults to ‘unique_id’.
  • time_col (str): Column that identifies each timestep; its values can be timestamps or integers. Defaults to ‘ds’.
  • target_col (str): Column that contains the target. Defaults to ‘y’.
Returns:
  • pandas.DataFrame: DataFrame with model columns for point predictions and probabilistic predictions for all fitted models.
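
The returned columns depend on the models and on level. The sketch below lists the column names one might expect, assuming the StatsForecast naming convention of a point-forecast column per model plus "-lo-" and "-hi-" columns per level. The helper expected_columns is hypothetical, and the column order it produces is an assumption rather than a guarantee of the library.

```python
from typing import List, Optional, Sequence


def expected_columns(
    model_names: Sequence[str],
    level: Optional[Sequence[int]] = None,
    id_col: str = "unique_id",
    time_col: str = "ds",
) -> List[str]:
    """List the column names a forecast result is expected to contain.

    Hypothetical helper for illustration; the ordering of the interval
    columns (widest lower bound first, widest upper bound last) is assumed.
    """
    levels = sorted(level or [])
    for lv in levels:
        if not 0 <= lv <= 100:
            raise ValueError(f"level must be between 0 and 100, got {lv}")
    cols = [id_col, time_col]
    for name in model_names:
        cols.append(name)  # point prediction
        cols.extend(f"{name}-lo-{lv}" for lv in reversed(levels))
        cols.extend(f"{name}-hi-{lv}" for lv in levels)
    return cols


if __name__ == "__main__":
    print(expected_columns(["Naive"], level=[80, 95]))
```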

method forecast_fitted_values

forecast_fitted_values()
Retrieve in-sample predictions.