
module neuralforecast.models.tft


function get_activation_fn

get_activation_fn(activation_str: str) → Callable

class MaybeLayerNorm

method __init__

__init__(output_size, hidden_size, eps)

method forward

forward(x)

class GLU

method __init__

__init__(hidden_size, output_size)

method forward

forward(x: Tensor) → Tensor

class GRN

method __init__

__init__(
    input_size,
    hidden_size,
    output_size=None,
    context_hidden_size=None,
    dropout=0,
    activation='ELU'
)

method forward

forward(a: Tensor, c: Optional[Tensor] = None)

class TFTEmbedding

method __init__

__init__(
    hidden_size,
    stat_input_size,
    futr_input_size,
    hist_input_size,
    tgt_size
)

method forward

forward(target_inp, stat_exog=None, futr_exog=None, hist_exog=None)

class VariableSelectionNetwork

method __init__

__init__(hidden_size, num_inputs, dropout, grn_activation)

method forward

forward(x: Tensor, context: Optional[Tensor] = None)

class InterpretableMultiHeadAttention

method __init__

__init__(n_head, hidden_size, example_length, attn_dropout, dropout)

method forward

forward(x: Tensor, mask_future_timesteps: bool = True) → Tuple[Tensor, Tensor]

class StaticCovariateEncoder

method __init__

__init__(
    hidden_size,
    num_static_vars,
    dropout,
    grn_activation,
    rnn_type='lstm',
    n_rnn_layers=1,
    one_rnn_initial_state=False
)

method forward

forward(x: Tensor) → Tuple[Tensor, Tensor, Tensor, Tensor]

class TemporalCovariateEncoder

method __init__

__init__(
    hidden_size,
    num_historic_vars,
    num_future_vars,
    dropout,
    grn_activation,
    rnn_type='lstm',
    n_rnn_layers=1
)

method forward

forward(historical_inputs, future_inputs, cs, ch, cc)

class TemporalFusionDecoder

method __init__

__init__(
    n_head,
    hidden_size,
    example_length,
    encoder_length,
    attn_dropout,
    dropout,
    grn_activation
)

method forward

forward(temporal_features, ce)

class TFT

The Temporal Fusion Transformer (TFT) is a sequence-to-sequence model that combines static, historic, and future available data to predict a univariate target. The method combines gating layers, an LSTM recurrent encoder, an interpretable multi-head attention layer, and a multi-step forecasting strategy decoder. Args:
  • h (int): Forecast horizon.
  • input_size (int): autoregressive input size, y=[1,2,3,4] input_size=2 -> y_[t-2:t]=[1,2].
  • tgt_size (int): target size.
  • stat_exog_list (str list): static continuous columns.
  • hist_exog_list (str list): historic continuous columns.
  • futr_exog_list (str list): future continuous columns.
  • hidden_size (int): units of embeddings and encoders.
  • n_head (int): number of attention heads in temporal fusion decoder.
  • attn_dropout (float): dropout of fusion decoder’s attention layer.
  • grn_activation (str): activation for the GRN module from [‘ReLU’, ‘Softplus’, ‘Tanh’, ‘SELU’, ‘LeakyReLU’, ‘Sigmoid’, ‘ELU’, ‘GLU’].
  • n_rnn_layers (int): number of RNN layers.
  • rnn_type (str): recurrent neural network (RNN) layer type from [“lstm”,“gru”].
  • one_rnn_initial_state (bool): if True, initialize all RNN layers with the same initial state computed from the static covariates.
  • dropout (float): dropout of inputs VSNs.
  • loss (PyTorch module): instantiated train loss class from losses collection.
  • valid_loss (PyTorch module): instantiated valid loss class from losses collection.
  • max_steps (int): maximum number of training steps.
  • learning_rate (float): Learning rate between (0, 1).
  • num_lr_decays (int): Number of learning rate decays, evenly distributed across max_steps.
  • early_stop_patience_steps (int): Number of validation iterations before early stopping.
  • val_check_steps (int): Number of training steps between every validation loss check.
  • batch_size (int): number of different series in each batch.
  • valid_batch_size (int): number of different series in each validation and test batch.
  • windows_batch_size (int): windows sampled from rolled data, default uses all.
  • inference_windows_batch_size (int): number of windows to sample in each inference batch, -1 uses all.
  • start_padding_enabled (bool): if True, the model will pad the time series with zeros at the beginning, by input size.
  • training_data_availability_threshold (Union[float, List[float]]): minimum fraction of valid data points required for training windows. Single float applies to both insample and outsample; list of two floats specifies [insample_fraction, outsample_fraction]. Default 0.0 allows windows with only 1 valid data point (current behavior).
  • step_size (int): step size between each window of temporal data.
  • scaler_type (str): type of scaler for temporal inputs normalization see temporal scalers.
  • random_seed (int): random seed initialization for replicability.
  • drop_last_loader (bool): if True, TimeSeriesDataLoader drops the last non-full batch.
  • alias (str): optional, custom name of the model.
  • optimizer (Subclass of ‘torch.optim.Optimizer’): optional, user specified optimizer instead of the default choice (Adam).
  • optimizer_kwargs (dict): optional, dictionary of parameters used by the user specified optimizer.
  • lr_scheduler (Subclass of ‘torch.optim.lr_scheduler.LRScheduler’): optional, user specified lr_scheduler instead of the default choice (StepLR).
  • lr_scheduler_kwargs (dict): optional, dictionary of parameters used by the user specified lr_scheduler.
  • dataloader_kwargs (dict): optional, dictionary of parameters passed into the PyTorch Lightning dataloader by the TimeSeriesDataLoader.
  • **trainer_kwargs: keyword trainer arguments inherited from PyTorch Lightning’s Trainer.
References:
  • Bryan Lim, Sercan O. Arik, Nicolas Loeff, Tomas Pfister (2021). “Temporal Fusion Transformers for interpretable multi-horizon time series forecasting”. International Journal of Forecasting.

method __init__

__init__(
    h,
    input_size,
    tgt_size: int = 1,
    stat_exog_list=None,
    hist_exog_list=None,
    futr_exog_list=None,
    hidden_size: int = 128,
    n_head: int = 4,
    attn_dropout: float = 0.0,
    grn_activation: str = 'ELU',
    n_rnn_layers: int = 1,
    rnn_type: str = 'lstm',
    one_rnn_initial_state: bool = False,
    dropout: float = 0.1,
    loss=MAE(),
    valid_loss=None,
    max_steps: int = 1000,
    learning_rate: float = 0.001,
    num_lr_decays: int = -1,
    early_stop_patience_steps: int = -1,
    val_check_steps: int = 100,
    batch_size: int = 32,
    valid_batch_size: Optional[int] = None,
    windows_batch_size: int = 1024,
    inference_windows_batch_size: int = 1024,
    start_padding_enabled=False,
    training_data_availability_threshold=0.0,
    step_size: int = 1,
    scaler_type: str = 'robust',
    random_seed: int = 1,
    drop_last_loader=False,
    alias: Optional[str] = None,
    optimizer=None,
    optimizer_kwargs=None,
    lr_scheduler=None,
    lr_scheduler_kwargs=None,
    dataloader_kwargs=None,
    **trainer_kwargs
)
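A minimal usage sketch, assuming the standard NeuralForecast fit/predict workflow and the AirPassengersDF example dataset shipped with the library; the hyperparameter values below are illustrative, not recommendations:

from neuralforecast import NeuralForecast
from neuralforecast.models import TFT
from neuralforecast.losses.pytorch import MAE
from neuralforecast.utils import AirPassengersDF

# Long-format dataframe with columns ['unique_id', 'ds', 'y'].
Y_df = AirPassengersDF

model = TFT(
    h=12,                  # forecast horizon
    input_size=24,         # autoregressive window fed to the encoder
    hidden_size=64,        # size of embeddings and encoders
    n_head=4,              # attention heads in the temporal fusion decoder
    grn_activation='ELU',  # activation used inside the GRN blocks
    loss=MAE(),
    max_steps=200,
    scaler_type='robust',
)

nf = NeuralForecast(models=[model], freq='M')
nf.fit(df=Y_df)
forecasts = nf.predict()
print(forecasts.head())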

property automatic_optimization

If set to False, you are responsible for calling .backward(), .step(), and .zero_grad().
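A generic LightningModule sketch (not specific to TFT) of what manual optimization looks like when this property is set to False; the module and layer names are hypothetical:

import torch
import pytorch_lightning as pl

class ManualOptimizationModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)
        self.automatic_optimization = False  # take over the optimization loop

    def training_step(self, batch, batch_idx):
        x, y = batch
        opt = self.optimizers()        # optimizer configured below
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.manual_backward(loss)     # replaces loss.backward()
        opt.step()
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)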

property current_epoch

The current epoch in the Trainer, or 0 if not attached.

property device


property device_mesh

Strategies like ModelParallelStrategy will create a device mesh that can be accessed in the configure_model hook to parallelize the LightningModule.

property dtype


property example_input_array

The example input array is a specification of what the module can consume in the forward method. The return type is interpreted as follows:
  • Single tensor: It is assumed the model takes a single argument, i.e., model.forward(model.example_input_array)
  • Tuple: The input array should be interpreted as a sequence of positional arguments, i.e., model.forward(*model.example_input_array)
  • Dict: The input array represents named keyword arguments, i.e., model.forward(**model.example_input_array)
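A hypothetical sketch of the dict form, where the stored array is consumed as keyword arguments to forward:

import torch
import pytorch_lightning as pl

class DictInputModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(4, 2)
        # Dict form: consumed as model.forward(**model.example_input_array),
        # e.g. when Lightning traces the model to log its computational graph.
        self.example_input_array = {"x": torch.zeros(1, 4)}

    def forward(self, x):
        return self.proj(x)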

property fabric


property global_rank

The index of the current process across all nodes and devices.

property global_step

Total training batches seen across all epochs. If no Trainer is attached, this property is 0.

property hparams

The collection of hyperparameters saved with save_hyperparameters. It is mutable by the user. For the frozen set of initial hyperparameters, use hparams_initial. Returns: Mutable hyperparameters dictionary

property hparams_initial

The collection of hyperparameters saved with save_hyperparameters. These contents are read-only. Manual updates to the saved hyperparameters can instead be performed through hparams. Returns:
  • AttributeDict: immutable initial hyperparameters
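A small sketch, using a hypothetical module, of how save_hyperparameters populates both hparams and hparams_initial:

import pytorch_lightning as pl

class HParamsModule(pl.LightningModule):
    def __init__(self, hidden_size: int = 64, dropout: float = 0.1):
        super().__init__()
        # Records the __init__ arguments so they show up in self.hparams
        # and are stored in checkpoints.
        self.save_hyperparameters()

module = HParamsModule(hidden_size=128)
print(module.hparams.hidden_size)   # 128; hparams is mutable
module.hparams.dropout = 0.2        # allowed: manual update
print(module.hparams_initial)       # frozen snapshot of the initial values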

property local_rank

The index of the current process within a single node.

property logger

Reference to the logger object in the Trainer.

property loggers

Reference to the list of loggers in the Trainer.

property on_gpu

Returns True if this model is currently located on a GPU. Useful to set flags around the LightningModule for different CPU vs GPU behavior.

property strict_loading

Determines how Lightning loads this model using .load_state_dict(..., strict=model.strict_loading).

property trainer


method attention_weights

attention_weights()
Batch-averaged attention weights. Returns: np.ndarray: A 1D array containing the attention weights for each time step.
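A hedged sketch of inspecting the weights, assuming the TFT from the constructor example above has been fitted (and, if required by the implementation, run through predict) so that attention weights are populated:

import matplotlib.pyplot as plt

tft = nf.models[0]                 # fitted TFT from the earlier example
weights = tft.attention_weights()  # np.ndarray, one value per time step

plt.plot(weights)
plt.xlabel('time step')
plt.ylabel('mean attention weight')
plt.show()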

method feature_importance_correlations

feature_importance_correlations() → DataFrame
Compute the correlation between the past and future feature importances and the mean attention weights. Returns: pd.DataFrame: A DataFrame containing the correlation coefficients between the past feature importances and the mean attention weights.

method feature_importances

feature_importances()
Compute the feature importances for historical, future, and static features. Returns:
  • dict: A dictionary containing the feature importances for each feature type. The keys are ‘hist_vsn’, ‘future_vsn’, and ‘static_vsn’, and the values are pandas DataFrames with the corresponding feature importances.
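A hedged sketch, continuing the same fitted example, of retrieving the variable-selection importances and relating them to the attention weights via feature_importance_correlations:

tft = nf.models[0]  # fitted TFT from the earlier example

importances = tft.feature_importances()
print(importances['hist_vsn'].head())    # historical feature importances
print(importances['future_vsn'].head())  # future feature importances
print(importances['static_vsn'].head())  # static feature importances

correlations = tft.feature_importance_correlations()
print(correlations)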

method forward

forward(windows_batch)

method mean_on_batch

mean_on_batch(tensor)