MLForecast
Full pipeline encapsulation
Data
This shows an example with just 4 series of the M4 dataset. If you want to run it yourself on all of them, you can refer to this notebook.
| | unique_id | ds | y |
---|---|---|---|
86796 | H196 | 1 | 11.8 |
86797 | H196 | 2 | 11.4 |
86798 | H196 | 3 | 11.1 |
86799 | H196 | 4 | 10.8 |
86800 | H196 | 5 | 10.6 |
… | … | … | … |
325235 | H413 | 1004 | 99.0 |
325236 | H413 | 1005 | 88.0 |
325237 | H413 | 1006 | 47.0 |
325238 | H413 | 1007 | 41.0 |
325239 | H413 | 1008 | 34.0 |
We now split this data into train and validation.
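One way to do this split is sketched below. It assumes the last 48 time steps of each series are held out for validation. The horizon of 48 and the synthetic frame are illustrative assumptions, not the actual M4 data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the long-format data: two series, integer timestamps.
df = pd.DataFrame({
    "unique_id": np.repeat(["H196", "H413"], 100),
    "ds": np.tile(np.arange(1, 101), 2),
    "y": np.arange(200, dtype=float),
})

horizon = 48  # assumed holdout length per series

# Last timestamp of each series, aligned row by row with df.
max_ds = df.groupby("unique_id")["ds"].transform("max")

train = df[df["ds"] <= max_ds - horizon]
valid = df[df["ds"] > max_ds - horizon]
```

Splitting relative to each series' own last timestamp keeps the validation set at the end of every series, even when the series have different lengths.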
MLForecast
Forecasting pipeline
| | Type | Default | Details |
---|---|---|---|
models | Union | | Models that will be trained and used to compute the forecasts. |
freq | Union | | Pandas offset, pandas offset alias, e.g. ‘D’, ‘W-THU’ or integer denoting the frequency of the series. |
lags | Optional | None | Lags of the target to use as features. |
lag_transforms | Optional | None | Mapping of target lags to their transformations. |
date_features | Optional | None | Features computed from the dates. Can be pandas date attributes or functions that will take the dates as input. |
num_threads | int | 1 | Number of threads to use when computing the features. |
target_transforms | Optional | None | Transformations that will be applied to the target before computing the features and restored after the forecasting step. |
lag_transforms_namer | Optional | None | Function that takes a transformation (either function or class), a lag and extra arguments and produces a name. |
The MLForecast object encapsulates feature engineering, model training and forecasting.
Once we have this setup we can compute the features and fit the model.
MLForecast.fit
Apply the feature engineering and train the models.
| | Type | Default | Details |
---|---|---|---|
df | Union | | Series data in long format. |
id_col | str | unique_id | Column that identifies each series. |
time_col | str | ds | Column that identifies each timestep, its values can be timestamps or integers. |
target_col | str | y | Column that contains the target. |
static_features | Optional | None | Names of the features that are static and will be repeated when forecasting. If None, will consider all columns (except id_col and time_col) as static. |
dropna | bool | True | Drop rows with missing values produced by the transformations. |
keep_last_n | Optional | None | Keep only this many records from each series for the forecasting step. Can save time and memory if your features allow it. |
max_horizon | Optional | None | Train this many models, where each model will predict a specific horizon. |
prediction_intervals | Optional | None | Configuration to calibrate prediction intervals (Conformal Prediction). |
fitted | bool | False | Save in-sample predictions. |
as_numpy | bool | False | Cast features to numpy array. |
weight_col | Optional | None | Column that contains the sample weights. |
Returns | MLForecast | | Forecast object with series values and trained models. |
| | unique_id | ds | LGBMRegressor |
---|---|---|---|
0 | H196 | 961 | 16.067907 |
1 | H196 | 962 | 15.667907 |
2 | H196 | 963 | 15.267907 |
3 | H196 | 964 | 14.967907 |
4 | H196 | 965 | 14.667907 |
5 | H256 | 961 | 13.267907 |
6 | H256 | 962 | 12.667907 |
7 | H256 | 963 | 12.367907 |
8 | H256 | 964 | 12.067907 |
9 | H256 | 965 | 11.867907 |
10 | H381 | 961 | 57.786053 |
11 | H381 | 962 | 34.563712 |
12 | H381 | 963 | 34.093324 |
13 | H381 | 964 | 14.028197 |
14 | H381 | 965 | 29.925360 |
15 | H413 | 961 | 26.351542 |
16 | H413 | 962 | 18.476677 |
17 | H413 | 963 | 19.947657 |
18 | H413 | 964 | 22.287800 |
19 | H413 | 965 | 16.458991 |
| | unique_id | ds | cutoff | y | LGBMRegressor |
---|---|---|---|---|---|
0 | H196 | 951 | 950 | 24.4 | 24.284508 |
1 | H196 | 952 | 950 | 24.3 | 24.184508 |
2 | H196 | 953 | 950 | 23.8 | 23.684508 |
3 | H196 | 954 | 950 | 22.8 | 22.684508 |
4 | H196 | 955 | 950 | 21.2 | 21.084508 |
5 | H256 | 951 | 950 | 19.5 | 19.684508 |
6 | H256 | 952 | 950 | 19.4 | 19.484508 |
7 | H256 | 953 | 950 | 18.9 | 19.084508 |
8 | H256 | 954 | 950 | 18.3 | 18.384508 |
9 | H256 | 955 | 950 | 17.0 | 17.084508 |
10 | H381 | 951 | 950 | 182.0 | 183.690023 |
11 | H381 | 952 | 950 | 222.0 | 240.636599 |
12 | H381 | 953 | 950 | 288.0 | 289.609776 |
13 | H381 | 954 | 950 | 264.0 | 312.935070 |
14 | H381 | 955 | 950 | 191.0 | 220.711736 |
15 | H413 | 951 | 950 | 77.0 | 63.058071 |
16 | H413 | 952 | 950 | 91.0 | 55.965923 |
17 | H413 | 953 | 950 | 76.0 | 63.742204 |
18 | H413 | 954 | 950 | 68.0 | 58.993659 |
19 | H413 | 955 | 950 | 68.0 | 78.434237 |
20 | H196 | 956 | 955 | 19.3 | 19.289245 |
21 | H196 | 957 | 955 | 18.2 | 18.189245 |
22 | H196 | 958 | 955 | 17.5 | 17.489245 |
23 | H196 | 959 | 955 | 16.9 | 16.889245 |
24 | H196 | 960 | 955 | 16.5 | 16.489245 |
25 | H256 | 956 | 955 | 15.5 | 15.689245 |
26 | H256 | 957 | 955 | 14.7 | 14.789245 |
27 | H256 | 958 | 955 | 14.1 | 14.289245 |
28 | H256 | 959 | 955 | 13.6 | 13.789245 |
29 | H256 | 960 | 955 | 13.2 | 13.389245 |
30 | H381 | 956 | 955 | 130.0 | 100.397588 |
31 | H381 | 957 | 955 | 113.0 | 122.478620 |
32 | H381 | 958 | 955 | 94.0 | 119.608739 |
33 | H381 | 959 | 955 | 192.0 | 114.323949 |
34 | H381 | 960 | 955 | 87.0 | 96.609912 |
35 | H413 | 956 | 955 | 59.0 | 81.721708 |
36 | H413 | 957 | 955 | 58.0 | 69.453475 |
37 | H413 | 958 | 955 | 53.0 | 41.784503 |
38 | H413 | 959 | 955 | 38.0 | 44.825351 |
39 | H413 | 960 | 955 | 46.0 | 45.590796 |
MLForecast.save
Save forecast object
| | Type | Details |
---|---|---|
path | Union | Directory where artifacts will be stored. |
Returns | None | |
MLForecast.load
Load forecast object
| | Type | Details |
---|---|---|
path | Union | Directory with saved artifacts. |
Returns | MLForecast | |
MLForecast.update
Update the values of the stored series.
| | Type | Details |
---|---|---|
df | Union | Dataframe with new observations. |
Returns | None | |
MLForecast.make_future_dataframe
Create a dataframe with all ids and future times in the forecasting horizon.
| | Type | Details |
---|---|---|
h | int | Number of periods to predict. |
Returns | Union | DataFrame with expected ids and future times |
| | unique_id | ds |
---|---|---|
0 | H196 | 961 |
1 | H256 | 961 |
2 | H381 | 961 |
3 | H413 | 961 |
MLForecast.get_missing_future
Get the missing id and time combinations in X_df.
| | Type | Details |
---|---|---|
h | int | Number of periods to predict. |
X_df | DFType | Dataframe with the future exogenous features. Should have the id column and the time column. |
Returns | DFType | DataFrame with expected ids and future times missing in X_df |
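To illustrate what "missing id and time combinations" means, the sketch below rebuilds the idea with plain pandas: it enumerates the expected future grid and anti-joins it against X_df. This is not the library's implementation. The integer timestamps, the last observed times and the `price` column are assumptions for the example.

```python
import pandas as pd

# Last observed timestamp per series; integer timestamps that advance by 1 (assumed).
last_ds = {"H196": 960, "H256": 960}
h = 2

# Every id/time combination expected in the forecasting horizon.
expected = pd.DataFrame(
    [(uid, last + step) for uid, last in last_ds.items() for step in range(1, h + 1)],
    columns=["unique_id", "ds"],
)

# Future exogenous features supplied by the user; H256 at ds=962 is absent.
X_df = pd.DataFrame({
    "unique_id": ["H196", "H196", "H256"],
    "ds": [961, 962, 961],
    "price": [1.0, 1.1, 0.9],  # hypothetical exogenous feature
})

# Anti-join: keep the expected rows that have no match in X_df.
merged = expected.merge(
    X_df[["unique_id", "ds"]], on=["unique_id", "ds"], how="left", indicator=True
)
missing = merged.loc[merged["_merge"] == "left_only", ["unique_id", "ds"]].reset_index(drop=True)
print(missing)
```

An empty result means X_df covers the whole horizon for every series.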
MLForecast.forecast_fitted_values
Access in-sample predictions.
| | Type | Default | Details |
---|---|---|---|
level | Optional | None | Confidence levels between 0 and 100 for prediction intervals. |
Returns | Union | | Dataframe with predictions for the training set |
| | unique_id | ds | y | LGBMRegressor |
---|---|---|---|---|
0 | H196 | 193 | 12.7 | 12.671271 |
1 | H196 | 194 | 12.3 | 12.271271 |
2 | H196 | 195 | 11.9 | 11.871271 |
3 | H196 | 196 | 11.7 | 11.671271 |
4 | H196 | 197 | 11.4 | 11.471271 |
… | … | … | … | … |
3067 | H413 | 956 | 59.0 | 68.280574 |
3068 | H413 | 957 | 58.0 | 70.427570 |
3069 | H413 | 958 | 53.0 | 44.767965 |
3070 | H413 | 959 | 38.0 | 48.691257 |
3071 | H413 | 960 | 46.0 | 46.652238 |
| | unique_id | ds | y | LGBMRegressor | LGBMRegressor-lo-90 | LGBMRegressor-hi-90 |
---|---|---|---|---|---|---|
0 | H196 | 193 | 12.7 | 12.671271 | 12.540634 | 12.801909 |
1 | H196 | 194 | 12.3 | 12.271271 | 12.140634 | 12.401909 |
2 | H196 | 195 | 11.9 | 11.871271 | 11.740634 | 12.001909 |
3 | H196 | 196 | 11.7 | 11.671271 | 11.540634 | 11.801909 |
4 | H196 | 197 | 11.4 | 11.471271 | 11.340634 | 11.601909 |
… | … | … | … | … | … | … |
3067 | H413 | 956 | 59.0 | 68.280574 | 58.846640 | 77.714509 |
3068 | H413 | 957 | 58.0 | 70.427570 | 60.993636 | 79.861504 |
3069 | H413 | 958 | 53.0 | 44.767965 | 35.334031 | 54.201899 |
3070 | H413 | 959 | 38.0 | 48.691257 | 39.257323 | 58.125191 |
3071 | H413 | 960 | 46.0 | 46.652238 | 37.218304 | 56.086172 |
Once we’ve run this we’re ready to compute our predictions.
MLForecast.predict
Compute the predictions for the next h steps.
| | Type | Default | Details |
---|---|---|---|
h | int | | Number of periods to predict. |
before_predict_callback | Optional | None | Function to call on the features before computing the predictions. This function will take the input dataframe that will be passed to the model for predicting and should return a dataframe with the same structure. The series identifier is on the index. |
after_predict_callback | Optional | None | Function to call on the predictions before updating the targets. This function will take a pandas Series with the predictions and should return another one with the same structure. The series identifier is on the index. |
new_df | Optional | None | Series data of new observations for which forecasts are to be generated. This dataframe should have the same structure as the one used to fit the model, including any features and time series data. If new_df is not None, the method will generate forecasts for the new observations. |
level | Optional | None | Confidence levels between 0 and 100 for prediction intervals. |
X_df | Optional | None | Dataframe with the future exogenous features. Should have the id column and the time column. |
ids | Optional | None | List with subset of ids seen during training for which the forecasts should be computed. |
Returns | DFType | | Predictions for each series and timestep, with one column per model. |
We can look at a couple of results.
Prediction intervals
With MLForecast, you can generate prediction intervals using Conformal Prediction. To configure Conformal Prediction, pass an instance of the PredictionIntervals class to the prediction_intervals argument of the fit method. The class takes three parameters: n_windows, h and method.
- n_windows represents the number of cross-validation windows used to calibrate the intervals.
- h is the forecast horizon.
- method can be conformal_distribution or conformal_error. conformal_distribution (default) creates forecast paths based on the cross-validation errors and calculates quantiles using those paths, whereas conformal_error calculates the error quantiles to produce prediction intervals.
The strategy will adjust the intervals for each horizon step, resulting in different widths for each step. Please note that a minimum of 2 cross-validation windows must be used.
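The idea behind error-quantile intervals can be sketched with NumPy alone. This is a simplified illustration of the general technique, not the library's code: the error values are made up, and the use of `level / 100` as the quantile is an assumption of this sketch.

```python
import numpy as np

# Absolute cross-validation errors for one series:
# rows are windows, columns are horizon steps. Values are made up.
abs_errors = np.array([
    [0.5, 1.0, 2.0],
    [1.5, 1.0, 4.0],
])  # shape (n_windows=2, h=3)

point_forecast = np.array([10.0, 11.0, 12.0])  # one value per horizon step
level = 80  # desired confidence level, between 0 and 100

# Error quantile per horizon step, so each step gets its own interval width.
q = np.quantile(abs_errors, level / 100, axis=0)

lo = point_forecast - q
hi = point_forecast + q
```

Because the quantile is taken separately for each column, steps with larger calibration errors end up with wider intervals.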
After that, pass your desired confidence levels to the predict method using the level argument. Levels must lie between 0 and 100.
| | unique_id | ds | LGBMRegressor | LGBMRegressor-lo-95 | LGBMRegressor-lo-80 | LGBMRegressor-lo-50 | LGBMRegressor-hi-50 | LGBMRegressor-hi-80 | LGBMRegressor-hi-95 |
---|---|---|---|---|---|---|---|---|---|
0 | H196 | 961 | 16.071271 | 15.958042 | 15.971271 | 16.005091 | 16.137452 | 16.171271 | 16.184501 |
1 | H196 | 962 | 15.671271 | 15.553632 | 15.553632 | 15.578632 | 15.763911 | 15.788911 | 15.788911 |
2 | H196 | 963 | 15.271271 | 15.153632 | 15.153632 | 15.162452 | 15.380091 | 15.388911 | 15.388911 |
3 | H196 | 964 | 14.971271 | 14.858042 | 14.871271 | 14.905091 | 15.037452 | 15.071271 | 15.084501 |
4 | H196 | 965 | 14.671271 | 14.553632 | 14.553632 | 14.562452 | 14.780091 | 14.788911 | 14.788911 |
Let’s explore the generated intervals.
If you want to reduce the computational time and produce intervals with the same width for the whole forecast horizon, simply pass h=1 to the PredictionIntervals class. The caveat of this strategy is that in some cases the variance of the absolute residuals may be small (even zero), so the intervals may be too narrow.
Let’s explore the generated intervals.
Forecast using a pretrained model
MLForecast allows you to use a pretrained model to generate forecasts for a new dataset. Simply provide a pandas dataframe containing the new observations as the value for the new_df argument when calling the predict method. The dataframe should have the same structure as the one used to fit the model, including any features and time series data. The function will then use the pretrained model to generate forecasts for the new observations. This allows you to easily apply a pretrained model to a new dataset and generate forecasts without the need to retrain the model.
If you want to take a look at the data that will be used to train the models you can call MLForecast.preprocess.
MLForecast.preprocess
Add the features to data.
| | Type | Default | Details |
---|---|---|---|
df | DFType | | Series data in long format. |
id_col | str | unique_id | Column that identifies each series. |
time_col | str | ds | Column that identifies each timestep, its values can be timestamps or integers. |
target_col | str | y | Column that contains the target. |
static_features | Optional | None | Names of the features that are static and will be repeated when forecasting. |
dropna | bool | True | Drop rows with missing values produced by the transformations. |
keep_last_n | Optional | None | Keep only this many records from each series for the forecasting step. Can save time and memory if your features allow it. |
max_horizon | Optional | None | Train this many models, where each model will predict a specific horizon. |
return_X_y | bool | False | Return a tuple with the features and the target. If False will return a single dataframe. |
as_numpy | bool | False | Cast features to numpy array. Only works for return_X_y=True. |
weight_col | Optional | None | Column that contains the sample weights. |
Returns | Union | | df plus added features and target(s). |
| | unique_id | ds | y | lag24 | lag48 | lag72 | lag96 | lag120 | lag144 | lag168 | exponentially_weighted_mean_lag48_alpha0.3 |
---|---|---|---|---|---|---|---|---|---|---|---|
86988 | H196 | 193 | 0.1 | 0.0 | 0.0 | 0.0 | 0.3 | 0.1 | 0.1 | 0.3 | 0.002810 |
86989 | H196 | 194 | 0.1 | -0.1 | 0.1 | 0.0 | 0.3 | 0.1 | 0.1 | 0.3 | 0.031967 |
86990 | H196 | 195 | 0.1 | -0.1 | 0.1 | 0.0 | 0.3 | 0.1 | 0.2 | 0.1 | 0.052377 |
86991 | H196 | 196 | 0.1 | 0.0 | 0.0 | 0.0 | 0.3 | 0.2 | 0.1 | 0.2 | 0.036664 |
86992 | H196 | 197 | 0.0 | 0.0 | 0.0 | 0.1 | 0.2 | 0.2 | 0.1 | 0.2 | 0.025665 |
… | … | … | … | … | … | … | … | … | … | … | … |
325187 | H413 | 956 | 0.0 | 10.0 | 1.0 | 6.0 | -53.0 | 44.0 | -21.0 | 21.0 | 7.963225 |
325188 | H413 | 957 | 9.0 | 10.0 | 10.0 | -7.0 | -46.0 | 27.0 | -19.0 | 24.0 | 8.574257 |
325189 | H413 | 958 | 16.0 | 8.0 | 5.0 | -9.0 | -36.0 | 32.0 | -13.0 | 8.0 | 7.501980 |
325190 | H413 | 959 | -3.0 | 17.0 | -7.0 | 2.0 | -31.0 | 22.0 | 5.0 | -2.0 | 3.151386 |
325191 | H413 | 960 | 15.0 | 11.0 | -6.0 | -5.0 | -17.0 | 22.0 | -18.0 | 10.0 | 0.405970 |
If we do this we then have to call MLForecast.fit_models, since this only stores the series information.
MLForecast.fit_models
Manually train models. Use this if you called MLForecast.preprocess beforehand.
| | Type | Details |
---|---|---|
X | Union | Features. |
y | ndarray | Target. |
Returns | MLForecast | Forecast object with trained models. |
MLForecast.cross_validation
Perform time series cross validation. Creates n_windows splits where each window has h test periods, trains the models, computes the predictions and merges the actuals.
| | Type | Default | Details |
---|---|---|---|
df | DFType | | Series data in long format. |
n_windows | int | | Number of windows to evaluate. |
h | int | | Forecast horizon. |
id_col | str | unique_id | Column that identifies each series. |
time_col | str | ds | Column that identifies each timestep, its values can be timestamps or integers. |
target_col | str | y | Column that contains the target. |
step_size | Optional | None | Step size between each cross validation window. If None it will be equal to h. |
static_features | Optional | None | Names of the features that are static and will be repeated when forecasting. |
dropna | bool | True | Drop rows with missing values produced by the transformations. |
keep_last_n | Optional | None | Keep only this many records from each series for the forecasting step. Can save time and memory if your features allow it. |
refit | Union | True | Retrain model for each cross validation window. If False, the models are trained at the beginning and then used to predict each window. If positive int, the models are retrained every refit windows. |
max_horizon | Optional | None | |
before_predict_callback | Optional | None | Function to call on the features before computing the predictions. This function will take the input dataframe that will be passed to the model for predicting and should return a dataframe with the same structure. The series identifier is on the index. |
after_predict_callback | Optional | None | Function to call on the predictions before updating the targets. This function will take a pandas Series with the predictions and should return another one with the same structure. The series identifier is on the index. |
prediction_intervals | Optional | None | Configuration to calibrate prediction intervals (Conformal Prediction). |
level | Optional | None | Confidence levels between 0 and 100 for prediction intervals. |
input_size | Optional | None | Maximum training samples per series in each window. If None, will use an expanding window. |
fitted | bool | False | Store the in-sample predictions. |
as_numpy | bool | False | Cast features to numpy array. |
weight_col | Optional | None | Column that contains the sample weights. |
Returns | DFType | | Predictions for each window with the series id, timestamp, last train date, target value and predictions from each model. |
If we would like to know how good our forecast will be for a specific model and set of features, we can perform cross validation. Cross validation takes our data and splits it into two parts, where the first part is used for training and the second one for validation. Since the data is time dependent, we usually take the last x observations from our data as the validation set.
This process is implemented in MLForecast.cross_validation, which takes our data and performs the process described above n_windows times, where each window has h validation samples in it.
For example, if we have 100 samples and we want to perform 2 backtests
each of size 14, the splits will be as follows:
- Train: 1 to 72. Validation: 73 to 86.
- Train: 1 to 86. Validation: 87 to 100.
You can control the size between each cross validation window using the step_size argument. For example, if we have 100 samples and we want to perform 2 backtests each of size 14 and move one step ahead in each fold (step_size=1), the splits will be as follows:
- Train: 1 to 85. Validation: 86 to 99.
- Train: 1 to 86. Validation: 87 to 100.
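The window arithmetic in both examples above can be reproduced with a small helper. The function below is a hypothetical illustration, not part of the library. It assumes an expanding training window, 1-based inclusive positions, and that the last window always ends at the final sample.

```python
def cv_splits(n_samples, n_windows, h, step_size=None):
    """Return (train_end, valid_start, valid_end) per window, 1-based and inclusive."""
    if step_size is None:
        step_size = h  # default: consecutive windows do not overlap
    splits = []
    for i in range(n_windows):
        # The last window ends at the final sample; each earlier
        # window ends step_size samples before the next one.
        valid_end = n_samples - (n_windows - 1 - i) * step_size
        valid_start = valid_end - h + 1
        splits.append((valid_start - 1, valid_start, valid_end))
    return splits


print(cv_splits(100, n_windows=2, h=14))               # [(72, 73, 86), (86, 87, 100)]
print(cv_splits(100, n_windows=2, h=14, step_size=1))  # [(85, 86, 99), (86, 87, 100)]
```

Each tuple reads as "train on 1 to train_end, validate on valid_start to valid_end".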
You can also perform cross validation without refitting your models for each window by setting refit=False. This allows you to evaluate the performance of your models using multiple window sizes without having to retrain them each time.
| | unique_id | ds | cutoff | y | LGBMRegressor |
---|---|---|---|---|---|
0 | H196 | 865 | 864 | 15.5 | 15.373393 |
1 | H196 | 866 | 864 | 15.1 | 14.973393 |
2 | H196 | 867 | 864 | 14.8 | 14.673393 |
3 | H196 | 868 | 864 | 14.4 | 14.373393 |
4 | H196 | 869 | 864 | 14.2 | 14.073393 |
… | … | … | … | … | … |
379 | H413 | 956 | 912 | 59.0 | 64.284167 |
380 | H413 | 957 | 912 | 58.0 | 64.830429 |
381 | H413 | 958 | 912 | 53.0 | 40.726851 |
382 | H413 | 959 | 912 | 38.0 | 42.739657 |
383 | H413 | 960 | 912 | 46.0 | 52.802769 |
Since we set fitted=True we can access the predictions for the training sets as well with the cross_validation_fitted_values method.
| | unique_id | ds | fold | y | LGBMRegressor |
---|---|---|---|---|---|
0 | H196 | 193 | 0 | 12.7 | 12.673393 |
1 | H196 | 194 | 0 | 12.3 | 12.273393 |
2 | H196 | 195 | 0 | 11.9 | 11.873393 |
3 | H196 | 196 | 0 | 11.7 | 11.673393 |
4 | H196 | 197 | 0 | 11.4 | 11.473393 |
… | … | … | … | … | … |
5563 | H413 | 908 | 1 | 49.0 | 50.620196 |
5564 | H413 | 909 | 1 | 39.0 | 35.972331 |
5565 | H413 | 910 | 1 | 29.0 | 29.359678 |
5566 | H413 | 911 | 1 | 24.0 | 25.784563 |
5567 | H413 | 912 | 1 | 20.0 | 23.168413 |
We can also compute prediction intervals by passing a configuration to prediction_intervals as well as values for the width through level.
| | unique_id | ds | cutoff | y | LGBMRegressor | LGBMRegressor-lo-90 | LGBMRegressor-lo-80 | LGBMRegressor-hi-80 | LGBMRegressor-hi-90 |
---|---|---|---|---|---|---|---|---|---|
0 | H196 | 865 | 864 | 15.5 | 15.373393 | 15.311379 | 15.316528 | 15.430258 | 15.435407 |
1 | H196 | 866 | 864 | 15.1 | 14.973393 | 14.940556 | 14.940556 | 15.006230 | 15.006230 |
2 | H196 | 867 | 864 | 14.8 | 14.673393 | 14.606230 | 14.606230 | 14.740556 | 14.740556 |
3 | H196 | 868 | 864 | 14.4 | 14.373393 | 14.306230 | 14.306230 | 14.440556 | 14.440556 |
4 | H196 | 869 | 864 | 14.2 | 14.073393 | 14.006230 | 14.006230 | 14.140556 | 14.140556 |
… | … | … | … | … | … | … | … | … | … |
379 | H413 | 956 | 912 | 59.0 | 64.284167 | 29.890099 | 34.371545 | 94.196788 | 98.678234 |
380 | H413 | 957 | 912 | 58.0 | 64.830429 | 56.874572 | 57.827689 | 71.833169 | 72.786285 |
381 | H413 | 958 | 912 | 53.0 | 40.726851 | 35.296195 | 35.846206 | 45.607495 | 46.157506 |
382 | H413 | 959 | 912 | 38.0 | 42.739657 | 35.292153 | 35.807640 | 49.671674 | 50.187161 |
383 | H413 | 960 | 912 | 46.0 | 52.802769 | 42.465597 | 43.895670 | 61.709869 | 63.139941 |
The refit argument allows us to control whether we want to retrain the models in every window. It can be either:
- A boolean: True will retrain on every window and False only on the first one.
- A positive integer: The models will be trained on the first window and then every refit windows.
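The retraining schedule described above can be written out as a small helper. The function is hypothetical and only mirrors the documented behavior. It assumes windows are indexed from 0 and that an integer refit means "retrain when the window index is a multiple of refit".

```python
def windows_that_refit(n_windows, refit):
    """Return the 0-based indices of the windows where the models are retrained."""
    every = int(refit)  # True -> 1 (every window), False -> 0 (first window only)
    return [i for i in range(n_windows) if i == 0 or (every > 0 and i % every == 0)]


print(windows_that_refit(4, True))   # [0, 1, 2, 3]
print(windows_that_refit(4, False))  # [0]
print(windows_that_refit(4, 2))      # [0, 2]
```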
MLForecast.from_cv
Once you’ve found a set of features and parameters that work for your problem you can build a forecast object from it using MLForecast.from_cv, which takes the trained LightGBMCV object and builds an MLForecast object that will use the same features and parameters. Then you can call fit and predict as you normally would.