Detect Demand Peaks
In this example we will show how to perform electricity load forecasting on the ERCOT (Texas) market for detecting daily peaks.
Introduction
Predicting peaks in different markets is useful. In the electricity market, consuming electricity at peak demand is penalized with higher tarifs. When an individual or company consumes electricity when its most demanded, regulators calls that a coincident peak (CP).
In the Texas electricity market (ERCOT), the peak is the monthly 15-minute interval when the ERCOT Grid is at a point of highest capacity. The peak is caused by all consumers’ combined demand on the electrical grid. The coincident peak demand is an important factor used by ERCOT to determine final electricity consumption bills. ERCOT registers the CP demand of each client for 4 months, between June and September, and uses this to adjust electricity prices. Clients can therefore save on electricity bills by reducing the coincident peak demand.
In this example we will train a LightGBM
model on historic load data
to forecast day-ahead peaks on September 2022. Multiple seasonality is
traditionally present in low sampled electricity data. Demand exhibits
daily and weekly seasonality, with clear patterns for specific hours of
the day such as 6:00pm vs 3:00am or for specific days such as Sunday vs
Friday.
First, we will load ERCOT historic demand, then we will use the
MLForecast.cross_validation
method to fit the LightGBM
model and forecast daily load during
September. Finally, we show how to use the forecasts to detect the
coincident peak.
Outline
- Install libraries
- Load and explore the data
- Fit LightGBM model and forecast
- Peak detection
Tip
You can use Colab to run this Notebook interactively
Libraries
We assume you have MLForecast already installed. Check this guide for instructions on how to install MLForecast.
Install the necessary packages using pip install mlforecast
.
Also we have to install LightGBM
using pip install lightgbm
.
Load Data
The input to MLForecast is always a data frame in long
format with
three columns: unique_id
, ds
and y
:
-
The
unique_id
(string, int or category) represents an identifier for the series. -
The
ds
(datestamp or int) column should be either an integer indexing time or a datestamp ideally like YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp. -
The
y
(numeric) represents the measurement we wish to forecast. We will rename the
First, read the 2022 historic total demand of the ERCOT market. We processed the original data (available here), by adding the missing hour due to daylight saving time, parsing the date to datetime format, and filtering columns of interest.
We observe that the time series exhibits seasonal patterns. Moreover,
the time series contains 6,552
observations, so it is necessary to use
computationally efficient methods to deploy them in production.
Fit and Forecast LightGBM model
Import the
MLForecast
class and the models you need.
First, instantiate the model and define the parameters.
Tip
In this example we are using the default parameters of the
lgb.LGBMRegressor
model, but you can change them to improve the forecasting performance.
We fit the model by instantiating a
MLForecast
object with the following required parameters:
-
models
: a list of sklearn-like (fit and predict) models. -
freq
: a string indicating the frequency of the data. (See panda’s available frequencies.) -
target_transforms
: Transformations to apply to the target before computing the features. These are restored at the forecasting step. -
lags
: Lags of the target to use as features.
Tip
In this example, we are only using differences and lags to produce features. See the full documentation to see all available features.
The cross_validation
method allows the user to simulate multiple
historic forecasts, greatly simplifying pipelines by replacing for loops
with fit
and predict
methods. This method re-trains the model and
forecast each window. See this
tutorial
for an animation of how the windows are defined.
Use the cross_validation
method to produce all the daily forecasts for
September. To produce daily forecasts set the forecasting horizon
window_size
as 24. In this example we are simulating deploying the
pipeline during September, so set the number of windows as 30 (one for
each day). Finally, the step size between windows is 24 (equal to the
window_size
). This ensure to only produce one forecast per day.
Additionally,
id_col
: identifies each time series.time_col
: indetifies the temporal column of the time series.target_col
: identifies the column to model.
unique_id | ds | cutoff | y | LGBMRegressor | |
---|---|---|---|---|---|
0 | ERCOT | 2022-09-01 00:00:00 | 2022-08-31 23:00:00 | 45482.471757 | 45685.265537 |
1 | ERCOT | 2022-09-01 01:00:00 | 2022-08-31 23:00:00 | 43602.658043 | 43779.819515 |
2 | ERCOT | 2022-09-01 02:00:00 | 2022-08-31 23:00:00 | 42284.817342 | 42672.470923 |
3 | ERCOT | 2022-09-01 03:00:00 | 2022-08-31 23:00:00 | 41663.156771 | 42091.768192 |
4 | ERCOT | 2022-09-01 04:00:00 | 2022-08-31 23:00:00 | 41710.621904 | 42481.403168 |
Important
When using
cross_validation
make sure the forecasts are produced at the desired timestamps. Check thecutoff
column which specifices the last timestamp before the forecasting window.
Peak Detection
Finally, we use the forecasts in crossvaldation_df
to detect the daily
hourly demand peaks. For each day, we set the detected peaks as the
highest forecasts. In this case, we want to predict one peak (npeaks
);
depending on your setting and goals, this parameter might change. For
example, the number of peaks can correspond to how many hours a battery
can be discharged to reduce demand.
For the ERCOT 4CP detection task we are interested in correctly predicting the highest monthly load. Next, we filter the day in September with the highest hourly demand and predict the peak.
In the following plot we see how the LightGBM model is able to correctly detect the coincident peak for September 2022.
Important
In this example we only include September. However, MLForecast and LightGBM can correctly predict the peaks for the 4 months of 2022. You can try this by increasing the
n_windows
parameter ofcross_validation
or filtering theY_df
dataset.
Next steps
MLForecast and LightGBM in particular are good benchmarking models for peak detection. However, it might be useful to explore further and newer forecasting algorithms or perform hyperparameter optimization.