`module` `mlforecast.lgb_cv`

`class` `LightGBMCV`

`method` `__init__`

__init__(
    freq: Union[int, str],
    lags: Optional[Iterable[int]] = None,
    lag_transforms: Optional[Dict[int, List[Union[Callable, Tuple[Callable, Any]]]]] = None,
    date_features: Optional[Iterable[Union[str, Callable]]] = None,
    num_threads: int = 1,
    target_transforms: Optional[List[Union[BaseTargetTransform, _BaseGroupedArrayTargetTransform]]] = None
)

Create a LightGBM CV object. Args:

freq (str or int): Pandas offset alias, e.g. ‘D’ or ‘W-THU’, or an integer denoting the frequency of the series.
lags (list of int, optional): Lags of the target to use as features. Defaults to None.
lag_transforms (dict of int to list of functions, optional): Mapping of target lags to their transformations. Defaults to None.
date_features (list of str or callable, optional): Features computed from the dates. Can be pandas date attributes or functions that will take the dates as input. Defaults to None.
num_threads (int): Number of threads to use when computing the features. Defaults to 1.
target_transforms (list of transformers, optional): Transformations that will be applied to the target before computing the features and restored after the forecasting step. Defaults to None.
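To make the `lags` and `lag_transforms` arguments concrete, here is a minimal pure-Python sketch of the idea for a single series. The helpers `lag`, `rolling_mean`, and `build_features`, and the feature names they produce, are hypothetical illustrations and not part of mlforecast; the `(function, argument)` tuple form is read from the type hint in the signature above.

```python
from typing import Callable, Dict, List, Optional, Sequence


def lag(values: Sequence[float], k: int) -> List[Optional[float]]:
    """Shift a series k steps forward; the first k positions have no history."""
    n = len(values)
    return [None] * min(k, n) + list(values[: max(n - k, 0)])


def rolling_mean(lagged: Sequence[Optional[float]], window_size: int) -> List[Optional[float]]:
    """Mean over the trailing window; None until a full window of values exists."""
    out: List[Optional[float]] = []
    for i in range(len(lagged)):
        window = lagged[max(0, i - window_size + 1): i + 1]
        if len(window) < window_size or any(v is None for v in window):
            out.append(None)
        else:
            out.append(sum(window) / window_size)
    return out


def build_features(
    y: Sequence[float],
    lags: Optional[Sequence[int]] = None,
    lag_transforms: Optional[Dict[int, list]] = None,
) -> Dict[str, List[Optional[float]]]:
    """Hypothetical helper: one column per lag and per (lag, transform) pair."""
    features: Dict[str, List[Optional[float]]] = {}
    for k in lags or []:
        features[f"lag{k}"] = lag(y, k)
    for k, transforms in (lag_transforms or {}).items():
        lagged = lag(y, k)
        for tfm in transforms:
            # A transform is either a bare callable or a (callable, *args) tuple.
            func, *args = tfm if isinstance(tfm, tuple) else (tfm,)
            name = f"{func.__name__}_lag{k}" + "".join(f"_{a}" for a in args)
            features[name] = func(lagged, *args)
    return features


y = [1.0, 2.0, 3.0, 4.0, 5.0]
lag1 = lag(y, 1)
features = build_features(y, lags=[1, 2], lag_transforms={1: [(rolling_mean, 2)]})
```

Rows that contain `None` here correspond to the missing values that the `dropna` argument of `fit` and `setup` refers to.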

`method` `find_best_iter`

find_best_iter(hist, early_stopping_evals) → int

`method` `fit`

fit(
    df: DataFrame,
    n_windows: int,
    h: int,
    id_col: str = 'unique_id',
    time_col: str = 'ds',
    target_col: str = 'y',
    step_size: Optional[int] = None,
    num_iterations: int = 100,
    params: Optional[Dict[str, Any]] = None,
    static_features: Optional[List[str]] = None,
    dropna: bool = True,
    keep_last_n: Optional[int] = None,
    eval_every: int = 10,
    weights: Optional[Sequence[float]] = None,
    metric: Union[str, Callable] = 'mape',
    verbose_eval: bool = True,
    early_stopping_evals: int = 2,
    early_stopping_pct: float = 0.01,
    compute_cv_preds: bool = False,
    before_predict_callback: Optional[Callable] = None,
    after_predict_callback: Optional[Callable] = None,
    input_size: Optional[int] = None
) → List[Tuple[int, float]]

Train boosters simultaneously and assess their performance on the complete forecasting window. Args:

df (pandas DataFrame): Series data in long format.
n_windows (int): Number of windows to evaluate.
h (int): Forecast horizon.
id_col (str): Column that identifies each series. Defaults to ‘unique_id’.
time_col (str): Column that identifies each timestep; its values can be timestamps or integers. Defaults to ‘ds’.
target_col (str): Column that contains the target. Defaults to ‘y’.
step_size (int, optional): Step size between each cross validation window. If None it will be equal to h. Defaults to None.
num_iterations (int): Maximum number of boosting iterations to run. Defaults to 100.
params (dict, optional): Parameters to be passed to the LightGBM Boosters. Defaults to None.
static_features (list of str, optional): Names of the features that are static and will be repeated when forecasting. Defaults to None.
dropna (bool): Drop rows with missing values produced by the transformations. Defaults to True.
keep_last_n (int, optional): Keep only this many records from each series for the forecasting step. Can save time and memory if your features allow it. Defaults to None.
eval_every (int): Number of boosting iterations to train before evaluating on the whole forecast window. Defaults to 10.
weights (sequence of float, optional): Weights to multiply the metric of each window. If None, all windows have the same weight. Defaults to None.
metric (str or callable): Metric used to assess the performance of the models and perform early stopping. Defaults to ‘mape’.
verbose_eval (bool): Print the metrics of each evaluation. Defaults to True.
early_stopping_evals (int): Maximum number of evaluations to run without improvement. Defaults to 2.
early_stopping_pct (float): Minimum improvement in the metric value, expressed as a fraction (0.01 means 1%), required over early_stopping_evals evaluations. Defaults to 0.01.
compute_cv_preds (bool): Compute predictions for each window after finding the best iteration. Defaults to False.
before_predict_callback (callable, optional): Function to call on the features before computing the predictions. This function will take the input dataframe that will be passed to the model for predicting and should return a dataframe with the same structure. The series identifier is on the index. Defaults to None.
after_predict_callback (callable, optional): Function to call on the predictions before updating the targets. This function will take a pandas Series with the predictions and should return another one with the same structure. The series identifier is on the index. Defaults to None.
input_size (int, optional): Maximum training samples per series in each window. If None, will use an expanding window. Defaults to None.

Returns:

(list of tuple): List of (boosting rounds, metric value) tuples.
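The interaction between `n_windows`, `h`, `step_size`, and `input_size` can be sketched with plain integer positions. This is an illustration under stated assumptions, not the library's code: `window_bounds` is a hypothetical helper, and it assumes that the last validation window ends at the final observation of the series.

```python
from typing import List, Optional, Tuple


def window_bounds(
    n_obs: int,
    n_windows: int,
    h: int,
    step_size: Optional[int] = None,
    input_size: Optional[int] = None,
) -> List[Tuple[int, int, int]]:
    """Return (train_start, train_end, valid_end) per window.

    Training rows are [train_start, train_end) and validation rows are
    [train_end, valid_end), using zero-based positions within one series.
    """
    if step_size is None:
        step_size = h  # documented default: windows do not overlap
    bounds = []
    for i in range(n_windows):
        # Assumption: windows are laid out backwards from the end of the series.
        offset = (n_windows - i - 1) * step_size + h
        train_end = n_obs - offset
        if train_end <= 0:
            raise ValueError("series is too short for the requested windows")
        # input_size=None keeps all history (expanding window).
        train_start = 0 if input_size is None else max(0, train_end - input_size)
        bounds.append((train_start, train_end, train_end + h))
    return bounds


expanding = window_bounds(n_obs=20, n_windows=3, h=4)
rolling = window_bounds(n_obs=20, n_windows=3, h=4, input_size=6)
overlapping = window_bounds(n_obs=20, n_windows=3, h=4, step_size=2)
```

With `step_size` smaller than `h` the validation windows overlap, and with `input_size` set the training span stays fixed in length instead of growing.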

`method` `partial_fit`

partial_fit(
    num_iterations: int,
    before_predict_callback: Optional[Callable] = None,
    after_predict_callback: Optional[Callable] = None
) → float

Train the boosters for some iterations. Args:

num_iterations (int): Number of boosting iterations to run.
before_predict_callback (callable, optional): Function to call on the features before computing the predictions. This function will take the input dataframe that will be passed to the model for predicting and should return a dataframe with the same structure. The series identifier is on the index. Defaults to None.
after_predict_callback (callable, optional): Function to call on the predictions before updating the targets. This function will take a pandas Series with the predictions and should return another one with the same structure. The series identifier is on the index. Defaults to None.

Returns:

(float): Weighted metric after training for num_iterations.
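As a rough illustration of what a weighted metric across windows means, the sketch below computes a MAPE per window and combines the values with the `weights` argument. Both functions are hypothetical helpers; in particular, treating `weights=None` as equal weights that sum to one is an assumption consistent with "all windows have the same weight", not a confirmed detail of the library.

```python
from typing import Optional, Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, returned as a fraction."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_metric(
    window_metrics: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine one metric value per window into a single number."""
    if weights is None:
        # Assumption: equal weights that sum to one, i.e. a plain average.
        weights = [1 / len(window_metrics)] * len(window_metrics)
    if len(weights) != len(window_metrics):
        raise ValueError("need exactly one weight per window")
    return sum(w * m for w, m in zip(weights, window_metrics))


single_window = mape([100.0, 200.0], [110.0, 180.0])
equal = weighted_metric([0.10, 0.20, 0.30])
recent_heavy = weighted_metric([0.10, 0.20, 0.30], weights=[0.2, 0.3, 0.5])
```

Giving the most recent window a larger weight, as in `recent_heavy`, makes the combined score favor models that do well on the latest data.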

`method` `predict`

predict(
    h: int,
    before_predict_callback: Optional[Callable] = None,
    after_predict_callback: Optional[Callable] = None,
    X_df: Optional[DataFrame] = None
) → DataFrame

Compute predictions with each of the trained boosters. Args:

h (int): Forecast horizon.
before_predict_callback (callable, optional): Function to call on the features before computing the predictions. This function will take the input dataframe that will be passed to the model for predicting and should return a dataframe with the same structure. The series identifier is on the index. Defaults to None.
after_predict_callback (callable, optional): Function to call on the predictions before updating the targets. This function will take a pandas Series with the predictions and should return another one with the same structure. The series identifier is on the index. Defaults to None.
X_df (pandas DataFrame, optional): Dataframe with the future exogenous features. Should have the id column and the time column. Defaults to None.

Returns:

(pandas DataFrame): Predictions for each series and timestep, with one column per window.

`method` `setup`

setup(
    df: DataFrame,
    n_windows: int,
    h: int,
    id_col: str = 'unique_id',
    time_col: str = 'ds',
    target_col: str = 'y',
    step_size: Optional[int] = None,
    params: Optional[Dict[str, Any]] = None,
    static_features: Optional[List[str]] = None,
    dropna: bool = True,
    keep_last_n: Optional[int] = None,
    weights: Optional[Sequence[float]] = None,
    metric: Union[str, Callable] = 'mape',
    input_size: Optional[int] = None
)

Initialize internal data structures to iteratively train the boosters. Use this before calling partial_fit. Args:

df (pandas DataFrame): Series data in long format.
n_windows (int): Number of windows to evaluate.
h (int): Forecast horizon.
id_col (str): Column that identifies each series. Defaults to ‘unique_id’.
time_col (str): Column that identifies each timestep; its values can be timestamps or integers. Defaults to ‘ds’.
target_col (str): Column that contains the target. Defaults to ‘y’.
step_size (int, optional): Step size between each cross validation window. If None it will be equal to h. Defaults to None.
params (dict, optional): Parameters to be passed to the LightGBM Boosters. Defaults to None.
static_features (list of str, optional): Names of the features that are static and will be repeated when forecasting. Defaults to None.
dropna (bool): Drop rows with missing values produced by the transformations. Defaults to True.
keep_last_n (int, optional): Keep only this many records from each series for the forecasting step. Can save time and memory if your features allow it. Defaults to None.
weights (sequence of float, optional): Weights to multiply the metric of each window. If None, all windows have the same weight. Defaults to None.
metric (str or callable): Metric used to assess the performance of the models and perform early stopping. Defaults to ‘mape’.
input_size (int, optional): Maximum training samples per series in each window. If None, will use an expanding window. Defaults to None.

Returns:

(LightGBMCV): CV object with internal data structures for partial_fit.
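Calling `setup` and then `partial_fit` repeatedly lets the caller control the training loop, for example to apply a custom stopping rule. The sketch below shows the shape of such a loop. `StubCV` is a hypothetical stand-in that replays pre-set metric values so the snippet runs without data or LightGBM; with the real class, `cv` would be the object returned by `setup`. The `train_with_budget` helper and its target-based stopping rule are also illustrative assumptions.

```python
from typing import Iterable, List, Tuple


class StubCV:
    """Hypothetical stand-in for a LightGBMCV object after setup()."""

    def __init__(self, scores: Iterable[float]):
        self._scores = iter(scores)
        self.trained_iterations = 0

    def partial_fit(self, num_iterations, before_predict_callback=None,
                    after_predict_callback=None) -> float:
        # The real method trains the boosters; this stub only replays scores.
        self.trained_iterations += num_iterations
        return next(self._scores)


def train_with_budget(cv, eval_every: int, max_iterations: int,
                      target_metric: float) -> List[Tuple[int, float]]:
    """Train in chunks of eval_every iterations until the target is reached."""
    hist: List[Tuple[int, float]] = []
    done = 0
    while done < max_iterations:
        step = min(eval_every, max_iterations - done)
        metric = cv.partial_fit(step)
        done += step
        hist.append((done, metric))  # same (iterations, metric) shape fit returns
        if metric <= target_metric:
            break
    return hist


cv = StubCV([0.30, 0.22, 0.18, 0.17])
hist = train_with_budget(cv, eval_every=10, max_iterations=100, target_metric=0.20)
```

Here the loop stops after the third evaluation, because 0.18 is the first value at or below the target.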

`method` `should_stop`

should_stop(hist, early_stopping_evals, early_stopping_pct) → bool
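The page gives only the signatures of `should_stop` and `find_best_iter`. The sketch below is one reading of them, based on the descriptions of `early_stopping_evals` and `early_stopping_pct` under `fit` and on the `(boosting rounds, metric value)` tuples that `fit` returns. The standalone functions `should_stop_sketch` and `find_best_iter_sketch` are hypothetical, assume a metric where lower is better, and should not be taken as the library's implementation.

```python
from typing import List, Tuple

History = List[Tuple[int, float]]  # (boosting rounds, metric value)


def should_stop_sketch(hist: History, early_stopping_evals: int,
                       early_stopping_pct: float) -> bool:
    """Stop when the metric improved by less than early_stopping_pct
    (as a fraction) over the last early_stopping_evals evaluations."""
    if len(hist) < early_stopping_evals + 1:
        return False  # not enough evaluations to compare yet
    reference = hist[-(early_stopping_evals + 1)][1]
    latest = hist[-1][1]
    if reference == 0:
        return True  # an error of zero cannot be improved further
    improvement_pct = 1 - latest / reference
    return improvement_pct < early_stopping_pct


def find_best_iter_sketch(hist: History, early_stopping_evals: int) -> int:
    """Return the boosting rounds with the lowest metric among the most
    recent early_stopping_evals + 1 evaluations."""
    recent = hist[-(early_stopping_evals + 1):]
    return min(recent, key=lambda entry: entry[1])[0]


hist = [(10, 0.30), (20, 0.25), (30, 0.249), (40, 0.2485)]
stop_early = should_stop_sketch(hist[:3], early_stopping_evals=2, early_stopping_pct=0.01)
stop_late = should_stop_sketch(hist, early_stopping_evals=2, early_stopping_pct=0.01)
best = find_best_iter_sketch(hist, early_stopping_evals=2)
```

After three evaluations the metric has dropped from 0.30 to 0.249, well above the 1% threshold, so training continues. After the fourth, the drop from 0.25 to 0.2485 is about 0.6%, so the rule triggers.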
