StatsForecast ETS and Facebook Prophet on Spark (M5)
This notebook was originally executed on Databricks.
The purpose of this notebook is to benchmark scalability, measured by training time and forecast accuracy. To that end, Nixtla’s StatsForecast (using the ETS model) is trained on the M5 dataset, with Spark distributing the training. Facebook’s Prophet model is used as a comparison.
An AWS cluster (mounted on Databricks) of 11 m5.2xlarge instances (8 cores and 32 GB RAM each) with runtime 10.4 LTS was used. This notebook was used as the base case.
The example uses the M5 dataset. It consists of 30,490 bottom time series.
Main results
Method | Time (mins) | Performance (wRMSSE) |
---|---|---|
StatsForecast | 7.5 | 0.68 |
Prophet | 18.23 | 0.77 |
Installing libraries
StatsForecast pipeline
Forecast
With StatsForecast you don’t have to download your data first: the distributed backend can read it directly from a file.
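Distributing the training works because each series is fit independently, so the series can be partitioned across workers. The sketch below illustrates only that pattern with the Python standard library. It is not the StatsForecast or Spark API: a thread pool stands in for the Spark executors, a seasonal-naive forecast stands in for ETS, and the names `seasonal_naive`, `partition_by_id` and `distributed_forecast` as well as the toy rows are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def seasonal_naive(values, h, season_length=7):
    """Repeat the last observed season h steps ahead (stand-in for ETS)."""
    last_season = values[-season_length:]
    return [last_season[i % len(last_season)] for i in range(h)]


def partition_by_id(rows):
    """Group (unique_id, y) rows, assumed sorted by time, into one list per series."""
    series = defaultdict(list)
    for unique_id, y in rows:
        series[unique_id].append(y)
    return dict(series)


def distributed_forecast(rows, h, season_length=7, workers=4):
    """Forecast every series independently, one task per series."""
    series = partition_by_id(rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            uid: pool.submit(seasonal_naive, ys, h, season_length)
            for uid, ys in series.items()
        }
        return {uid: future.result() for uid, future in futures.items()}


# Toy data (made up): two daily series with a weekly pattern, 21 days each.
rows = [("item_a", float(d % 7)) for d in range(21)]
rows += [("item_b", 2.0 * (d % 7)) for d in range(21)]

# 28-step horizon, one forecast list per series id.
forecasts = distributed_forecast(rows, h=28)
```

Because no task depends on another series, the same structure scales by adding workers, which is what the cluster does across its instances.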
Evaluating performance
The M5 competition used the weighted root mean squared scaled error. You can find details of the metric here.
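As a rough illustration of the metric, the sketch below computes the RMSSE of a single series and a weighted sum across series in plain Python. The function names `rmsse` and `wrmsse` and the toy numbers are illustrative assumptions. The official M5 evaluation is more involved: it derives the weights from dollar sales, starts the scaling term at each series’ first non-zero observation, and averages the result over the 12 aggregation levels.

```python
import math


def rmsse(train, actual, forecast):
    """Root mean squared scaled error for one series.

    The forecast MSE is scaled by the in-sample MSE of the one-step
    naive forecast (y[t-1] used to predict y[t]).
    """
    h = len(actual)
    forecast_mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / h
    naive_mse = sum(
        (train[t] - train[t - 1]) ** 2 for t in range(1, len(train))
    ) / (len(train) - 1)
    return math.sqrt(forecast_mse / naive_mse)


def wrmsse(series, weights):
    """Weighted sum of per-series RMSSE; weights are assumed to sum to 1."""
    return sum(weights[name] * rmsse(*parts) for name, parts in series.items())


# Toy data (made up): name -> (train, actual, forecast)
series = {
    "a": ([1, 2, 3, 4, 5], [6, 7], [6, 9]),
    "b": ([2, 4, 2, 4], [2, 4], [3, 3]),
}
weights = {"a": 0.5, "b": 0.5}
score = wrmsse(series, weights)
```

A value below 1 means the forecast beats the in-sample naive forecast on that series, which is why lower is better in the table that follows.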
Level | wrmsse |
---|---|
Total | 0.682358 |
Level1 | 0.449115 |
Level2 | 0.533754 |
Level3 | 0.592317 |
Level4 | 0.497086 |
Level5 | 0.572189 |
Level6 | 0.593880 |
Level7 | 0.665358 |
Level8 | 0.652183 |
Level9 | 0.734492 |
Level10 | 1.012633 |
Level11 | 0.969902 |
Level12 | 0.915380 |
Prophet pipeline
Download data
Forecast function using Prophet
Training Prophet on the M5 dataset
Evaluating performance
The M5 competition used the weighted root mean squared scaled error. You can find details of the metric here.
Level | wrmsse |
---|---|
Total | 0.771800 |
Level1 | 0.507905 |
Level2 | 0.586328 |
Level3 | 0.666686 |
Level4 | 0.549358 |
Level5 | 0.655003 |
Level6 | 0.647176 |
Level7 | 0.747047 |
Level8 | 0.743422 |
Level9 | 0.824667 |
Level10 | 1.207069 |
Level11 | 1.108780 |
Level12 | 1.018163 |
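The two per-level tables can be compared directly. The short sketch below copies the values from both tables and computes, for each level, the relative reduction in wRMSSE of StatsForecast with respect to Prophet. The variable names are illustrative. It also checks that each reported Total matches the mean of its 12 level scores.

```python
# wRMSSE per aggregation level, copied from the two tables in this notebook.
statsforecast = {
    "Level1": 0.449115, "Level2": 0.533754, "Level3": 0.592317,
    "Level4": 0.497086, "Level5": 0.572189, "Level6": 0.593880,
    "Level7": 0.665358, "Level8": 0.652183, "Level9": 0.734492,
    "Level10": 1.012633, "Level11": 0.969902, "Level12": 0.915380,
}
prophet = {
    "Level1": 0.507905, "Level2": 0.586328, "Level3": 0.666686,
    "Level4": 0.549358, "Level5": 0.655003, "Level6": 0.647176,
    "Level7": 0.747047, "Level8": 0.743422, "Level9": 0.824667,
    "Level10": 1.207069, "Level11": 1.108780, "Level12": 1.018163,
}

# Mean over the 12 levels, to compare with the reported Total rows.
total_statsforecast = sum(statsforecast.values()) / len(statsforecast)
total_prophet = sum(prophet.values()) / len(prophet)

# Relative reduction in error of StatsForecast with respect to Prophet.
improvement = {
    level: (prophet[level] - statsforecast[level]) / prophet[level]
    for level in statsforecast
}
largest_gap = max(improvement, key=improvement.get)
smallest_gap = min(improvement, key=improvement.get)

for level, gain in improvement.items():
    print(f"{level}: {gain:.1%}")
```

On these numbers StatsForecast has the lower error at every level. The relative gap is largest at Level10 (about 16%) and smallest at Level6 (about 8%).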