Installation

As long as Ray is installed and configured, StatsForecast will be able to use it. If executing on a distributed Ray cluster, make sure the statsforecast library is installed on all the workers.
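One way to check the installation is to ask Python which version of the package it can see. The sketch below is not part of StatsForecast: installed_version and versions_seen_by_ray are hypothetical helper names, and the second one assumes ray.init() has already been called. Ray chooses where tasks run, so this samples the workers and does not guarantee that every node is checked.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def versions_seen_by_ray(package: str = "statsforecast", n_tasks: int = 8) -> set:
    """Run the check inside Ray tasks and collect the distinct answers.

    Assumes ray.init() has already been called. Ray schedules the tasks,
    so this samples the workers rather than visiting each one exactly once.
    """
    import ray  # imported lazily so installed_version works without Ray

    remote_check = ray.remote(installed_version)
    return set(ray.get([remote_check.remote(package) for _ in range(n_tasks)]))
```

If the returned set contains None or more than one version string, at least one sampled worker is missing the library or has a different release installed.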

StatsForecast on Pandas

Before running on Ray, it’s recommended to test on a smaller Pandas dataset to make sure everything is working. This example also helps show the small differences when using Ray.

from statsforecast.core import StatsForecast
from statsforecast.models import AutoARIMA, AutoETS
from statsforecast.utils import generate_series

n_series = 4  # number of synthetic series to generate
horizon = 7   # number of future periods to forecast

series = generate_series(n_series)

sf = StatsForecast(
    models=[AutoETS(season_length=7)],
    freq='D',
)
sf.forecast(df=series, h=horizon).head()
   unique_id          ds   AutoETS
0          0  2000-08-10  5.261609
1          0  2000-08-11  6.196357
2          0  2000-08-12  0.282309
3          0  2000-08-13  1.264195
4          0  2000-08-14  2.262453

Executing on Ray

To distribute the forecasts with Ray, pass a Ray Dataset to forecast instead of a Pandas DataFrame. The rest of the call stays the same.

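import ray
import logging

# Start Ray, showing only error-level log messages
ray.init(logging_level=logging.ERROR)

# Cast the series identifier to string before converting to a Ray Dataset
series['unique_id'] = series['unique_id'].astype(str)
# Turn off Ray Data's streaming executor for this run
ctx = ray.data.context.DatasetContext.get_current()
ctx.use_streaming_executor = False
# Build a Ray Dataset from the Pandas DataFrame and split it into 4 blocks
ray_series = ray.data.from_pandas(series).repartition(4)
# Same forecast call as before; take(5) returns the first 5 rows
sf.forecast(df=ray_series, h=horizon).take(5)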
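
Unlike head() on Pandas, take(5) returns rows rather than a DataFrame. Assuming the rows come back as dictionaries keyed by column name, as in recent Ray releases, a small helper can confirm that each series received the expected number of forecast rows. This is only a sketch: rows_per_series is a hypothetical name, and the sample rows are placeholders, not real forecast output.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def rows_per_series(rows: Iterable[Dict[str, Any]], id_col: str = "unique_id") -> Counter:
    """Count how many forecast rows each series id has."""
    return Counter(row[id_col] for row in rows)


# Placeholder rows for illustration only; not real forecast output.
sample_rows = [
    {"unique_id": "0", "ds": "2000-08-10", "AutoETS": 0.0},
    {"unique_id": "0", "ds": "2000-08-11", "AutoETS": 0.0},
    {"unique_id": "1", "ds": "2000-08-10", "AutoETS": 0.0},
]
counts = rows_per_series(sample_rows)
```

With the objects defined on this page, passing the result of sf.forecast(df=ray_series, h=horizon).take_all() to the helper should report horizon rows for each of the n_series ids, provided the Ray run matches the Pandas one.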