Run StatsForecast in a distributed fashion on top of Spark.

StatsForecast works on top of Spark, Dask, and Ray through Fugue. StatsForecast inspects the input DataFrame and uses the corresponding engine. For example, if the input is a Spark DataFrame, StatsForecast uses the existing Spark session to run the forecast. A benchmark (written with older syntax) is also available, in which one million time series were forecast in under 15 minutes.
Make sure the `statsforecast` library is installed across all the workers.
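The engine selection described above can be sketched as follows. This is a minimal illustration, not the benchmark code: the helper names `make_rows` and `forecast_on_spark`, the synthetic data, and the choice of `AutoETS(season_length=7)` with daily frequency are all assumptions made for the example. It also assumes `pyspark` and `statsforecast` are installed and that the input uses the usual `unique_id`, `ds`, `y` column names.

```python
from datetime import datetime, timedelta
import math


def make_rows(n_series=2, n_obs=60):
    """Build synthetic long-format rows (unique_id, ds, y). Hypothetical helper."""
    start = datetime(2000, 1, 1)
    rows = []
    for uid in range(n_series):
        for t in range(n_obs):
            # Weekly sine pattern plus a per-series offset; purely illustrative data.
            y = 10.0 * (uid + 1) + math.sin(2 * math.pi * t / 7)
            rows.append((str(uid), start + timedelta(days=t), y))
    return rows


def forecast_on_spark(rows, h=7):
    """Forecast every series in `rows` using the active Spark session. Hypothetical helper."""
    # Imported lazily so make_rows can be used without Spark installed.
    from pyspark.sql import SparkSession
    from statsforecast import StatsForecast
    from statsforecast.models import AutoETS

    # Reuse the existing Spark session, or create a local one if none exists.
    spark = SparkSession.builder.getOrCreate()
    sdf = spark.createDataFrame(rows, schema=["unique_id", "ds", "y"])

    # Assumed model and frequency; swap in whatever fits the data.
    sf = StatsForecast(models=[AutoETS(season_length=7)], freq="D")

    # Passing a Spark DataFrame is what routes the computation to Spark.
    return sf.forecast(df=sdf, h=h)
```

Calling `forecast_on_spark(make_rows()).show(5)` on a machine with Spark configured should display the first forecast rows. Because the input here is synthetic, the values will differ from any sample output shown in this document.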
| | unique_id | ds | AutoETS |
|---|---|---|---|
| 0 | 0 | 2000-08-10 | 5.261609 |
| 1 | 0 | 2000-08-11 | 6.196357 |
| 2 | 0 | 2000-08-12 | 0.282309 |
| 3 | 0 | 2000-08-13 | 1.264195 |
| 4 | 0 | 2000-08-14 | 2.262453 |