This notebook was originally executed using Databricks.

The purpose of this notebook is to benchmark scalability, measured as training time and forecast accuracy. To that end, Nixtla’s StatsForecast (using the ETS model) is trained on the M5 dataset, using Spark to distribute the training. Facebook’s Prophet model is used as a comparison. An AWS cluster (mounted on Databricks) of 11 m5.2xlarge instances (8 cores and 32 GB RAM each) with runtime 10.4 LTS was used. This notebook was used as the base case.

The example uses the M5 dataset, which consists of 30,490 bottom-level time series.
Main results
| Method | Time (mins) | Performance (wRMSSE) |
|---|---|---|
| StatsForecast | 7.5 | 0.68 |
| Prophet | 18.23 | 0.77 |
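
To put the two rows side by side, the short sketch below derives the speedup and the relative error reduction from the figures in the table. The dictionary layout and variable names are illustrative only and do not come from the original notebook.

```python
# Figures copied from the results table above (minutes and wRMSSE).
results = {
    "StatsForecast": {"minutes": 7.5, "wrmsse": 0.68},
    "Prophet": {"minutes": 18.23, "wrmsse": 0.77},
}

sf, pr = results["StatsForecast"], results["Prophet"]

# How many times faster StatsForecast finished than Prophet.
speedup = pr["minutes"] / sf["minutes"]

# Relative reduction in wRMSSE (lower is better), with Prophet as the baseline.
error_reduction = (pr["wrmsse"] - sf["wrmsse"]) / pr["wrmsse"]

print(f"Speedup: {speedup:.2f}x")                           # Speedup: 2.43x
print(f"Relative wRMSSE reduction: {error_reduction:.1%}")  # Relative wRMSSE reduction: 11.7%
```

On this cluster, StatsForecast was roughly 2.4 times faster and reached a wRMSSE about 12% lower than Prophet.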
Installing libraries
StatsForecast pipeline
Forecast
With statsforecast you don’t have to download your data. The distributed backend can handle a file with your data.

Evaluating performance
The M5 competition used the weighted root mean squared scaled error. You can find details of the metric here.

| Level | wrmsse |
|---|---|
| Total | 0.682358 |
| Level1 | 0.449115 |
| Level2 | 0.533754 |
| Level3 | 0.592317 |
| Level4 | 0.497086 |
| Level5 | 0.572189 |
| Level6 | 0.593880 |
| Level7 | 0.665358 |
| Level8 | 0.652183 |
| Level9 | 0.734492 |
| Level10 | 1.012633 |
| Level11 | 0.969902 |
| Level12 | 0.915380 |
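
To make the metric concrete, here is a minimal pure-Python sketch of the scaled error behind these numbers. It is not the evaluation code used in the notebook: the function names `rmsse` and `wrmsse`, the toy series, and the equal weights are assumptions made for illustration. Our understanding is that the official M5 evaluation also ignores the periods before a series' first non-zero observation, derives the weights from dollar sales, and averages over the 12 aggregation levels. All of that is left out here.

```python
import math


def rmsse(y_train, y_test, y_pred):
    """Root mean squared scaled error for a single series."""
    # Scale: mean squared error of the one-step naive forecast on the training data.
    diffs = [(b - a) ** 2 for a, b in zip(y_train, y_train[1:])]
    scale = sum(diffs) / len(diffs)
    # Mean squared error of the forecast over the horizon.
    mse = sum((y - f) ** 2 for y, f in zip(y_test, y_pred)) / len(y_test)
    return math.sqrt(mse / scale)


def wrmsse(series, weights):
    """Weighted sum of per-series RMSSE; the weights are assumed to sum to 1."""
    return sum(w * rmsse(*s) for s, w in zip(series, weights))


# Toy data: (training values, actuals over the horizon, forecasts).
series = [
    ([1, 3, 2, 4], [3, 5], [3, 2]),
    ([0, 2, 4, 6], [8, 10], [6, 8]),
]
weights = [0.5, 0.5]  # hypothetical equal weights

score = wrmsse(series, weights)
print(round(score, 4))
```

A value below 1 means the forecast beats the in-sample one-step naive forecast on average, which is why the most disaggregated levels in the table above, with scores near or above 1, are the hardest to forecast.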
Prophet pipeline
Download data
Forecast function using Prophet
Training Prophet on the M5 dataset
Evaluating performance
The M5 competition used the weighted root mean squared scaled error. You can find details of the metric here.

| Level | wrmsse |
|---|---|
| Total | 0.771800 |
| Level1 | 0.507905 |
| Level2 | 0.586328 |
| Level3 | 0.666686 |
| Level4 | 0.549358 |
| Level5 | 0.655003 |
| Level6 | 0.647176 |
| Level7 | 0.747047 |
| Level8 | 0.743422 |
| Level9 | 0.824667 |
| Level10 | 1.207069 |
| Level11 | 1.108780 |
| Level12 | 1.018163 |

