
module neuralforecast.models.fedformer


function get_frequency_modes

get_frequency_modes(seq_len, modes=64, mode_select_method='random')
Select modes in the frequency domain: ‘random’ samples modes at random; any other value selects the lowest modes.
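
The selection rule can be sketched without any dependencies. This is an illustrative reimplementation, not the library's code: the name `select_frequency_modes`, the `rng` argument, the cap at `seq_len // 2` candidates, and the sorted return value are all assumptions made for the example.

```python
import random


def select_frequency_modes(seq_len, modes=64, mode_select_method="random", rng=None):
    """Pick the indices of the frequency modes to keep (illustrative sketch).

    Assumption: only the first seq_len // 2 modes are candidates, because the
    spectrum of a real-valued signal is conjugate-symmetric.
    """
    if rng is None:
        rng = random.Random()
    n_candidates = seq_len // 2
    modes = min(modes, n_candidates)
    if mode_select_method == "random":
        # Sample `modes` distinct indices uniformly from the candidates.
        index = rng.sample(range(n_candidates), modes)
    else:
        # Any other value keeps the lowest-frequency modes.
        index = list(range(modes))
    return sorted(index)


lowest = select_frequency_modes(seq_len=96, modes=4, mode_select_method="low")
sampled = select_frequency_modes(seq_len=96, modes=4, rng=random.Random(0))
print(lowest)   # [0, 1, 2, 3]
print(sampled)  # four distinct, sorted indices drawn from range(48)
```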

class LayerNorm

Specially designed layer normalization for the seasonal part

method __init__

__init__(channels)

method forward

forward(x)

class AutoCorrelationLayer

Auto Correlation Layer

method __init__

__init__(correlation, hidden_size, n_head, d_keys=None, d_values=None)

method forward

forward(queries, keys, values, attn_mask)

class EncoderLayer

FEDformer encoder layer with the progressive decomposition architecture

method __init__

__init__(
    attention,
    hidden_size,
    conv_hidden_size=None,
    MovingAvg=25,
    dropout=0.1,
    activation='relu'
)

method forward

forward(x, attn_mask=None)
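
The decomposition step behind the `MovingAvg` parameter can be illustrated in plain Python. This is a minimal sketch, assuming an odd window and edge-value padding so that the trend has the same length as the input; the helper names `moving_average` and `decompose` are hypothetical and the layer's actual padding scheme may differ.

```python
def moving_average(series, window):
    """Centered moving average; pads both ends with the edge values (odd window assumed)."""
    half = (window - 1) // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]


def decompose(series, window=25):
    """Split a series into (seasonal, trend), where trend is the moving average."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return seasonal, trend


# A slow linear drift plus an alternating pattern.
series = [0.1 * i + (1.0 if i % 2 == 0 else -1.0) for i in range(30)]
seasonal, trend = decompose(series, window=5)

# The two components always add back up to the input.
reconstructed = [s + t for s, t in zip(seasonal, trend)]
```

In a progressive scheme, a layer would apply such a split after each sub-block, pass the seasonal part forward, and accumulate the trend parts separately.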

class Encoder

FEDformer encoder

method __init__

__init__(attn_layers, conv_layers=None, norm_layer=None)

method forward

forward(x, attn_mask=None)

class DecoderLayer

FEDformer decoder layer with the progressive decomposition architecture

method __init__

__init__(
    self_attention,
    cross_attention,
    hidden_size,
    c_out,
    conv_hidden_size=None,
    MovingAvg=25,
    dropout=0.1,
    activation='relu'
)

method forward

forward(x, cross, x_mask=None, cross_mask=None)

class Decoder

FEDformer decoder

method __init__

__init__(layers, norm_layer=None, projection=None)

method forward

forward(x, cross, x_mask=None, cross_mask=None, trend=None)

class FourierBlock

Fourier block

method __init__

__init__(
    in_channels,
    out_channels,
    seq_len,
    modes=0,
    mode_select_method='random'
)

method compl_mul1d

compl_mul1d(input, weights)

method forward

forward(q, k, v, mask)
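
The general idea of operating on a few selected frequency modes can be shown on a single real-valued series. This is a dependency-free sketch, not the block's implementation: it uses a naive DFT, fixed scalar `weights` stand in for learned complex weights, and the helper names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform, O(n^2)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def filter_modes(signal, keep, weights=None):
    """Keep only the modes in `keep` (indices below len(signal) // 2), scaled by `weights`."""
    n = len(signal)
    spectrum = dft(signal)
    out = [0j] * n
    for pos, k in enumerate(keep):
        w = 1.0 if weights is None else weights[pos]
        out[k] = spectrum[k] * w
        if k != 0:
            # Mirror the conjugate so the inverse transform stays real-valued.
            out[n - k] = out[k].conjugate()
    return [z.real for z in idft(out)]


n = 16
slow = [math.sin(2 * math.pi * t / n) for t in range(n)]
signal = [s + 0.5 * math.sin(2 * math.pi * 5 * t / n) for t, s in enumerate(slow)]

# Keeping modes 0 and 1 discards the fast component at mode 5.
filtered = filter_modes(signal, keep=[0, 1])
```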

class FourierCrossAttention

Fourier Cross Attention layer

method __init__

__init__(
    in_channels,
    out_channels,
    seq_len_q,
    seq_len_kv,
    modes=64,
    mode_select_method='random',
    activation='tanh',
    policy=0
)

method compl_mul1d

compl_mul1d(input, weights)

method forward

forward(q, k, v, mask)

class FEDformer

The FEDformer model tackles the challenge of finding reliable dependencies in the intricate temporal patterns of long-horizon forecasting. The architecture has the following distinctive features:
  • In-built progressive decomposition into trend and seasonal components based on a moving average filter.
  • Frequency Enhanced Block and Frequency Enhanced Attention, which perform attention on a sparse representation in a basis such as the Fourier transform.
  • Classic encoder-decoder proposed by Vaswani et al. (2017) with a multi-head attention mechanism.
The FEDformer model defines its embedding from the following components:
  • Encoded autoregressive features obtained from a convolution network.
  • Absolute positional embeddings obtained from calendar features.
Args:
  • h (int): forecast horizon.
  • input_size (int): maximum sequence length for truncated train backpropagation.
  • stat_exog_list (List[str]): static exogenous columns.
  • hist_exog_list (List[str]): historic exogenous columns.
  • futr_exog_list (List[str]): future exogenous columns.
  • decoder_input_size_multiplier (float): multiplier for the input size of the decoder.
  • version (str): version of the model.
  • modes (int): number of modes for the Fourier block.
  • mode_select (str): method to select the modes for the Fourier block.
  • hidden_size (int): units of embeddings and encoders.
  • dropout (float): dropout rate applied throughout the FEDformer architecture.
  • n_head (int): number of heads in the multi-head attention.
  • conv_hidden_size (int): channels of the convolutional encoder.
  • activation (str): activation from [‘ReLU’, ‘Softplus’, ‘Tanh’, ‘SELU’, ‘LeakyReLU’, ‘PReLU’, ‘Sigmoid’, ‘GELU’].
  • encoder_layers (int): number of encoder layers.
  • decoder_layers (int): number of decoder layers.
  • MovingAvg_window (int): window size for the moving average filter.
  • loss (PyTorch module): instantiated train loss class from losses collection.
  • valid_loss (PyTorch module): instantiated validation loss class from losses collection.
  • max_steps (int): maximum number of training steps.
  • learning_rate (float): Learning rate between (0, 1).
  • num_lr_decays (int): Number of learning rate decays, evenly distributed across max_steps.
  • early_stop_patience_steps (int): Number of validation iterations before early stopping.
  • val_check_steps (int): Number of training steps between every validation loss check.
  • batch_size (int): number of different series in each batch.
  • valid_batch_size (int): number of different series in each validation and test batch, if None uses batch_size.
  • windows_batch_size (int): number of windows to sample in each training batch, default uses all.
  • inference_windows_batch_size (int): number of windows to sample in each inference batch.
  • start_padding_enabled (bool): if True, the model will pad the time series with zeros at the beginning, by input size.
  • training_data_availability_threshold (Union[float, List[float]]): minimum fraction of valid data points required for training windows. Single float applies to both insample and outsample; list of two floats specifies [insample_fraction, outsample_fraction]. Default 0.0 allows windows with only 1 valid data point (current behavior).
  • step_size (int): step size between each window of temporal data.
  • scaler_type (str): type of scaler used to normalize temporal inputs; see temporal scalers.
  • random_seed (int): random_seed for pytorch initializer and numpy generators.
  • drop_last_loader (bool): if True TimeSeriesDataLoader drops last non-full batch.
  • alias (str): optional, Custom name of the model.
  • optimizer (Subclass of ‘torch.optim.Optimizer’): optional, user specified optimizer instead of the default choice (Adam).
  • optimizer_kwargs (dict): optional, list of parameters used by the user specified optimizer.
  • lr_scheduler (Subclass of ‘torch.optim.lr_scheduler.LRScheduler’): optional, user specified lr_scheduler instead of the default choice (StepLR).
  • lr_scheduler_kwargs (dict): optional, list of parameters used by the user specified lr_scheduler.
  • dataloader_kwargs (dict): optional, list of parameters passed into the PyTorch Lightning dataloader by the TimeSeriesDataLoader.
  • **trainer_kwargs: keyword trainer arguments inherited from PyTorch Lightning’s trainer.

method __init__

__init__(
    h: int,
    input_size: int,
    stat_exog_list=None,
    hist_exog_list=None,
    futr_exog_list=None,
    decoder_input_size_multiplier: float = 0.5,
    version: str = 'Fourier',
    modes: int = 64,
    mode_select: str = 'random',
    hidden_size: int = 128,
    dropout: float = 0.05,
    n_head: int = 8,
    conv_hidden_size: int = 32,
    activation: str = 'gelu',
    encoder_layers: int = 2,
    decoder_layers: int = 1,
    MovingAvg_window: int = 25,
    loss=MAE(),
    valid_loss=None,
    max_steps: int = 5000,
    learning_rate: float = 0.0001,
    num_lr_decays: int = -1,
    early_stop_patience_steps: int = -1,
    val_check_steps: int = 100,
    batch_size: int = 32,
    valid_batch_size: Optional[int] = None,
    windows_batch_size=1024,
    inference_windows_batch_size=1024,
    start_padding_enabled=False,
    training_data_availability_threshold=0.0,
    step_size: int = 1,
    scaler_type: str = 'identity',
    random_seed: int = 1,
    drop_last_loader: bool = False,
    alias: Optional[str] = None,
    optimizer=None,
    optimizer_kwargs=None,
    lr_scheduler=None,
    lr_scheduler_kwargs=None,
    dataloader_kwargs=None,
    **trainer_kwargs
)
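
A minimal usage sketch built from the signature above. The hyperparameter values are illustrative only, the helper `fit_and_forecast` is hypothetical, and the input frame is assumed to be in the long format with `unique_id`, `ds` and `y` columns that NeuralForecast expects.

```python
# Hyperparameters named as in the signature above; values are illustrative.
fedformer_kwargs = dict(
    h=12,                   # forecast horizon
    input_size=24,          # length of the input window
    modes=64,
    mode_select="random",
    hidden_size=64,
    MovingAvg_window=25,
    max_steps=100,
    scaler_type="identity",
)


def fit_and_forecast(df, freq):
    """Train a FEDformer on a long-format frame and forecast `h` steps ahead.

    Assumes `df` has the columns unique_id, ds and y.
    """
    # Imported lazily so the configuration above can be inspected
    # without neuralforecast installed.
    from neuralforecast import NeuralForecast
    from neuralforecast.models.fedformer import FEDformer

    model = FEDformer(**fedformer_kwargs)
    nf = NeuralForecast(models=[model], freq=freq)
    nf.fit(df=df)
    return nf.predict()
```

Any argument not listed in `fedformer_kwargs` keeps the default shown in the signature.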

property automatic_optimization

If set to False you are responsible for calling .backward(), .step(), .zero_grad().

property current_epoch

The current epoch in the Trainer, or 0 if not attached.

property device


property device_mesh

Strategies like ModelParallelStrategy will create a device mesh that can be accessed in the configure_model hook (pytorch_lightning.core.hooks.ModelHooks.configure_model) to parallelize the LightningModule.

property dtype


property example_input_array

The example input array is a specification of what the module can consume in the forward method. The return type is interpreted as follows:
  • Single tensor: It is assumed the model takes a single argument, i.e., model.forward(model.example_input_array)
  • Tuple: The input array should be interpreted as a sequence of positional arguments, i.e., model.forward(*model.example_input_array)
  • Dict: The input array represents named keyword arguments, i.e., model.forward(**model.example_input_array)
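
The three cases amount to a simple dispatch on the type of the example input. The sketch below illustrates that rule with plain Python values instead of tensors; `call_with_example` and the toy `forward` are hypothetical helpers, not part of the library.

```python
def call_with_example(forward, example):
    """Call `forward` with an example input, following the three cases above."""
    if isinstance(example, tuple):
        # Tuple: a sequence of positional arguments.
        return forward(*example)
    if isinstance(example, dict):
        # Dict: named keyword arguments.
        return forward(**example)
    # Anything else is passed as the single argument.
    return forward(example)


def forward(x, scale=1):
    return x * scale


single = call_with_example(forward, 3)
positional = call_with_example(forward, (3, 2))
keyword = call_with_example(forward, {"x": 3, "scale": 4})
print(single, positional, keyword)  # 3 6 12
```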

property fabric


property global_rank

The index of the current process across all nodes and devices.

property global_step

Total training batches seen across all epochs. If no Trainer is attached, this property is 0.

property hparams

The collection of hyperparameters saved with save_hyperparameters. It is mutable by the user. For the frozen set of initial hyperparameters, use hparams_initial. Returns:
  • Mutable hyperparameters dictionary

property hparams_initial

The collection of hyperparameters saved with save_hyperparameters. These contents are read-only. Manual updates to the saved hyperparameters can instead be performed through hparams. Returns:
  • AttributeDict: immutable initial hyperparameters

property local_rank

The index of the current process within a single node.

property logger

Reference to the logger object in the Trainer.

property loggers

Reference to the list of loggers in the Trainer.

property on_gpu

Returns True if this model is currently located on a GPU. Useful to set flags around the LightningModule for different CPU vs GPU behavior.

property strict_loading

Determines how Lightning loads this model using .load_state_dict(..., strict=model.strict_loading).

property trainer


method forward

forward(windows_batch)