Correct horizon-dependent bias in recursive multi-step forecasts by training a small residual model on top of an existing forecaster.
What you’ll learn
- Why recursive multi-step forecasts develop bias as the horizon grows
- How the Rectify strategy combines recursive (low-variance) and direct (low-bias) forecasting
- How to compute horizon-indexed residuals from cross-validation output
- How to align feature matrices with residuals for training correction models
- How to apply fitted correction models to new forecasts using both `per_horizon` and `horizon_aware` modes
The problem: bias accumulates with horizon
Recursive forecasting trains a single one-step-ahead model and feeds its predictions back as inputs for longer horizons. It is fast and produces smooth multi-step trajectories, but for non-linear data-generating processes the bias compounds: each step uses a slightly biased prediction as input, which produces a more biased prediction for the next step. Direct forecasting avoids that problem by training a separate model per horizon. It has lower bias but higher variance, and it produces no continuity between horizons.

The Rectify strategy of Ben Taieb & Hyndman (2012) keeps the recursive base forecast and adds a per-horizon correction. utilsforecast provides three building blocks for this: computing residuals, aligning features, and applying corrections. You bring the base forecaster and the correction regressor.
Install libraries
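The examples assume three packages: `statsforecast` for the base forecaster, `utilsforecast` for the rectify helpers, and `scikit-learn` for the correction regressor. A typical install looks like this (package names assumed from the imports used throughout):

```shell
pip install statsforecast utilsforecast scikit-learn
```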
Generate synthetic time series
We use `generate_series` to create a panel of 8 daily series with weekly seasonality and a positive trend. The trend is what makes the example interesting: our base forecaster will be `SeasonalNaive`, which repeats the last observed week and has no way to capture trend. Its forecasts will drift further behind the actuals as the horizon grows, a textbook setup for rectify to fix.
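If you want to see the data shape without relying on `generate_series` internals, a hand-rolled stand-in produces the same long format (`unique_id`, `ds`, `y`) with weekly seasonality plus trend; all numbers here are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_series, n_days = 8, 150
frames = []
for uid in range(n_series):
    ds = pd.date_range("2000-01-01", periods=n_days, freq="D")
    # weekly seasonality + positive trend + noise
    seasonal = 10 * np.sin(2 * np.pi * np.arange(n_days) / 7)
    trend = 0.5 * np.arange(n_days)
    y = 100 + seasonal + trend + rng.normal(0, 1, n_days)
    frames.append(pd.DataFrame({"unique_id": uid, "ds": ds, "y": y}))
series = pd.concat(frames, ignore_index=True)
```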
Generate cross-validation forecasts
We need cross-validation output (forecasts plus actuals across multiple cutoffs) to train the correction models. `StatsForecast.cross_validation()` produces exactly that, with a `cutoff` column distinguishing each fold.
| | unique_id | ds | cutoff | y | SeasonalNaive |
|---|---|---|---|---|---|
| 0 | 0 | 2000-05-21 | 2000-05-20 | 119.270219 | 113.641010 |
| 1 | 0 | 2000-05-22 | 2000-05-20 | 120.881632 | 115.122690 |
| 2 | 0 | 2000-05-23 | 2000-05-20 | 122.832229 | 117.082437 |
| 3 | 0 | 2000-05-24 | 2000-05-20 | 124.984052 | 118.821265 |
| 4 | 0 | 2000-05-25 | 2000-05-20 | 126.486714 | 120.650260 |
Each row contains both the actual `y` and the `SeasonalNaive` forecast. With 4 cutoffs and a horizon of 7, every series contributes 28 training rows for the correction model.
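To make the mechanics concrete, here is a hand-rolled seasonal-naive cross-validation on a single toy series that produces the same kind of frame, with one `cutoff` value per fold. This is a sketch of what `StatsForecast.cross_validation()` returns, not a substitute for it:

```python
import numpy as np
import pandas as pd

# Toy series: one id, daily data with weekly seasonality and trend
ds = pd.date_range("2000-01-01", periods=60, freq="D")
y = 100 + 0.5 * np.arange(60) + 10 * np.sin(2 * np.pi * np.arange(60) / 7)
df = pd.DataFrame({"unique_id": 0, "ds": ds, "y": y})

h, n_windows, season = 7, 4, 7
folds = []
for w in range(n_windows, 0, -1):
    cutoff_idx = len(df) - w * h  # end of the training slice for this fold
    train, valid = df.iloc[:cutoff_idx], df.iloc[cutoff_idx:cutoff_idx + h]
    # Seasonal naive: repeat the last observed season
    last_season = train["y"].to_numpy()[-season:]
    fcst = np.tile(last_season, h // season + 1)[:h]
    folds.append(pd.DataFrame({
        "unique_id": 0,
        "ds": valid["ds"].to_numpy(),
        "cutoff": train["ds"].iloc[-1],
        "y": valid["y"].to_numpy(),
        "SeasonalNaive": fcst,
    }))
cv_df = pd.concat(folds, ignore_index=True)
```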
Step 1: compute per-horizon residuals
`compute_rectify_residuals` joins actuals and forecasts on `(id_col, time_col, cutoff_col)` and computes `actual - forecast` per row. It also adds a 1-indexed `horizon` column based on the row position within each `(id, cutoff)` group.
| | unique_id | ds | cutoff | horizon | SeasonalNaive |
|---|---|---|---|---|---|
| 0 | 0 | 2000-05-21 | 2000-05-20 | 1 | 5.629209 |
| 1 | 0 | 2000-05-22 | 2000-05-20 | 2 | 5.758942 |
| 2 | 0 | 2000-05-23 | 2000-05-20 | 3 | 5.749792 |
| 3 | 0 | 2000-05-24 | 2000-05-20 | 4 | 6.162787 |
| 4 | 0 | 2000-05-25 | 2000-05-20 | 5 | 5.836454 |
| 5 | 0 | 2000-05-26 | 2000-05-20 | 6 | 5.913097 |
| 6 | 0 | 2000-05-27 | 2000-05-20 | 7 | 5.799038 |
| 7 | 0 | 2000-05-22 | 2000-05-21 | 1 | 5.758942 |
| 8 | 0 | 2000-05-23 | 2000-05-21 | 2 | 5.749792 |
| 9 | 0 | 2000-05-24 | 2000-05-21 | 3 | 6.162787 |
We pass `cv_df` for both `df` and `forecasts_df` because the cross-validation output already contains both columns. The output is sorted by `(unique_id, cutoff, ds)`, the same order we sorted `cv_df` in. Keeping the orderings consistent matters for the next step.
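As a sketch (assuming only the behavior described above; the real helper also handles custom column names), `compute_rectify_residuals` does something equivalent to this pandas code:

```python
import pandas as pd

# Toy CV output: 1 series, 2 cutoffs, horizon 3
cv_df = pd.DataFrame({
    "unique_id": [0] * 6,
    "ds": pd.to_datetime(["2000-01-04", "2000-01-05", "2000-01-06",
                          "2000-01-05", "2000-01-06", "2000-01-07"]),
    "cutoff": pd.to_datetime(["2000-01-03"] * 3 + ["2000-01-04"] * 3),
    "y": [10.0, 11.0, 12.0, 11.0, 12.0, 13.0],
    "SeasonalNaive": [9.0, 9.5, 10.0, 10.0, 10.5, 11.0],
})

residuals_df = cv_df.sort_values(["unique_id", "cutoff", "ds"]).copy()
# Residual: what the base model missed
residuals_df["SeasonalNaive"] = residuals_df["y"] - residuals_df["SeasonalNaive"]
# 1-indexed horizon: row position within each (id, cutoff) group
residuals_df["horizon"] = residuals_df.groupby(["unique_id", "cutoff"]).cumcount() + 1
residuals_df = residuals_df.drop(columns="y")
```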
Step 2: build features
The correction model needs a feature matrix with the same row count and order as `residuals_df`. Anything that explains residual variation is fair game: calendar effects, lags, or exogenous variables. To keep the tutorial focused, we use two simple features:
- `dow`: day of week of the prediction date
- `lag_y`: the most recent observed `y` per series, used as a level proxy
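A minimal sketch of building those two features on a toy residuals frame; the `last_y` mapping here stands in for looking up the most recent training observation per series:

```python
import pandas as pd

# Toy residuals frame (sorted by unique_id, cutoff, ds)
residuals_df = pd.DataFrame({
    "unique_id": [0, 0, 1, 1],
    "ds": pd.to_datetime(["2000-01-04", "2000-01-05"] * 2),
})
# Last observed y per series, e.g. taken from the training data
last_y = {0: 12.0, 1: 20.0}

features = pd.DataFrame({
    "dow": residuals_df["ds"].dt.dayofweek,          # calendar effect
    "lag_y": residuals_df["unique_id"].map(last_y),  # level proxy
})
```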
Step 3: align features with residuals
`align_rectify_features` partitions the residuals by horizon and slices the feature matrix accordingly. In `per_horizon` mode it returns a dict mapping each horizon `h` to an `(X_h, {model: residuals_h})` tuple: one training set per horizon.
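In spirit, `per_horizon` alignment is a groupby over the `horizon` column with positional slicing of the feature matrix. A toy sketch (not the helper's actual implementation):

```python
import numpy as np
import pandas as pd

# Toy: 2 folds of horizon 2, with a matching 4-row feature matrix
residuals_df = pd.DataFrame({
    "horizon": [1, 2, 1, 2],
    "SeasonalNaive": [1.0, 1.5, 1.0, 1.5],
})
X = np.array([[0.0], [1.0], [2.0], [3.0]])

aligned = {}
for h, grp in residuals_df.groupby("horizon"):
    # Positional slicing keeps features row-aligned with residuals
    X_h = X[grp.index.to_numpy()]
    aligned[h] = (X_h, {"SeasonalNaive": grp["SeasonalNaive"].to_numpy()})
```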
Step 4: train one correction model per horizon
Any object with a `.predict(X)` method works. Here we use `LinearRegression`, but `Ridge`, `LightGBM`, or a custom regressor would all be valid.
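Given an aligned dict of that shape, fitting is a short loop; here with scikit-learn's `LinearRegression` on toy data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy aligned training sets: {horizon: (X_h, {model: residuals_h})}
aligned = {
    1: (np.array([[0.0], [1.0]]), {"SeasonalNaive": np.array([1.0, 1.0])}),
    2: (np.array([[0.0], [1.0]]), {"SeasonalNaive": np.array([2.0, 2.0])}),
}

# One fitted regressor per (horizon, model) pair
correction_models = {}
for h, (X_h, res_by_model) in aligned.items():
    correction_models[h] = {
        name: LinearRegression().fit(X_h, res) for name, res in res_by_model.items()
    }
```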
Step 5: forecast on the held-out window and rectify
We refit the base model on the full training set, predict the held-out horizon, then build a matching feature matrix and apply `rectify`. The corrected forecasts add the predicted residual back to each base prediction.
| | unique_id | ds | SeasonalNaive |
|---|---|---|---|
| 0 | 0 | 2000-05-31 | 131.317404 |
| 1 | 0 | 2000-06-01 | 132.994530 |
| 2 | 0 | 2000-06-02 | 134.817132 |
| 3 | 0 | 2000-06-03 | 129.886634 |
| 4 | 0 | 2000-06-04 | 131.599479 |
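Conceptually, the `per_horizon` correction adds each horizon's predicted residual to the matching base forecast row. A toy sketch of that final step, with correctors fitted so that horizon `h` predicts a residual of `h`:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Base forecasts for a 2-step horizon of one series
fcst_df = pd.DataFrame({
    "unique_id": [0, 0],
    "ds": pd.to_datetime(["2000-05-31", "2000-06-01"]),
    "SeasonalNaive": [100.0, 101.0],
})
X_new = np.array([[0.5], [0.5]])  # one feature row per forecast row

# Toy per-horizon correctors: horizon h learns a constant residual of h
correction_models = {
    h: {"SeasonalNaive": LinearRegression().fit(np.array([[0.0], [1.0]]),
                                                np.array([float(h), float(h)]))}
    for h in (1, 2)
}

rectified = fcst_df.copy()
for i in range(len(rectified)):
    h = i + 1  # rows are ordered by horizon within the series
    model = correction_models[h]["SeasonalNaive"]
    rectified.loc[rectified.index[i], "SeasonalNaive"] += model.predict(X_new[i:i + 1])[0]
```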
Step 6: compare base vs rectified MAE
Merge the held-out actuals with both forecast frames and use `evaluate()` to compute MAE per series.
| | metric | base | rectified |
|---|---|---|---|
| 0 | mae | 4.865848 | 0.326756 |
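A plain-pandas equivalent of the comparison (toy numbers; in the tutorial, `evaluate()` does this per series):

```python
import pandas as pd

# Held-out actuals merged with base and rectified forecasts
merged = pd.DataFrame({
    "y": [10.0, 12.0, 14.0],
    "base": [8.0, 10.0, 12.0],
    "rectified": [9.9, 12.1, 14.0],
})

# Mean absolute error for each forecast column
mae = {col: (merged["y"] - merged[col]).abs().mean() for col in ["base", "rectified"]}
```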
Horizon-aware mode: a single corrector for all horizons
`per_horizon` mode trains one correction model per horizon, which is the classic Rectify formulation. When training data is small or you want a single model to share information across horizons, `horizon_aware` mode appends the horizon as an extra feature column and trains one corrector for the full range.
| | unique_id | ds | SeasonalNaive |
|---|---|---|---|
| 0 | 0 | 2000-05-31 | 131.422401 |
| 1 | 0 | 2000-06-01 | 132.950212 |
| 2 | 0 | 2000-06-02 | 134.765042 |
| 3 | 0 | 2000-06-03 | 129.799484 |
| 4 | 0 | 2000-06-04 | 131.909747 |
- `mode='horizon_aware'` is keyword-only, so it must be passed by name
- `correction_models` is flat (`{model_name: regressor}`) instead of nested by horizon
- The horizon column is appended to the feature matrix before fitting, so the corrector sees horizon as just another feature
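The mechanics reduce to stacking all horizons into one training set with `horizon` appended as a feature. A toy sketch with `LinearRegression`, where the residual grows linearly with horizon:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Stacked training data across horizons: base features + horizon column
X = np.array([[0.0], [1.0], [0.0], [1.0]])
horizon = np.array([1, 1, 2, 2])
residuals = np.array([1.0, 1.0, 2.0, 2.0])  # residual grows with horizon

# Append horizon as just another feature, train a single corrector
X_aug = np.column_stack([X, horizon])
corrector = LinearRegression().fit(X_aug, residuals)

# At prediction time, augment new features the same way
X_new = np.column_stack([np.array([[0.5], [0.5]]), np.array([1, 2])])
corrections = corrector.predict(X_new)
```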
Custom column names
All three functions accept `id_col` and `time_col` parameters. `compute_rectify_residuals` additionally accepts `target_col` and `cutoff_col`. Pass them consistently across the pipeline if your data uses non-default names.
Key takeaways
- Rectify is post-processing. Your base forecaster doesn’t need to change; the correction model lives on top of it.
- Use `cutoff_col` when training from cross-validation output. Without it, repeated `(id, ds)` pairs across folds collide on the join.
- Features must be row-aligned with the residuals dataframe. Sort `cv_df` and the forecast frame consistently before extracting features and you’re safe.
- `per_horizon` vs `horizon_aware` is a sample-size call. When you have plenty of folds, separate models per horizon usually win. When data is thin, one shared model with horizon as a feature does better.
- Anything with a `predict(X)` method works: sklearn, lightgbm, xgboost, your own class.

