Temporal normalization has proven to be essential in neural forecasting tasks, as it enables the network's non-linearities to express themselves. Forecasting scaling methods focus on the temporal dimension, where most of the variance dwells, in contrast to other deep learning techniques such as `BatchNorm`, which normalizes across the batch and temporal dimensions, and `LayerNorm`, which normalizes across the feature dimension. Currently we support the following techniques: `std`, `median`, `norm`, `norm1`, `invariant`, `revin`.
*Masked Median*

Compute the median of tensor `x` along `dim`, ignoring values where `mask` is False. `x` and `mask` need to be broadcastable.

Parameters:

- `x`: torch.Tensor to compute the median of along the `dim` dimension.
- `mask`: torch.Tensor of bool with the same shape as `x`, True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.
- `dim` (int, optional): Dimension to take the median of. Defaults to -1.
- `keepdim` (bool, optional): Whether to keep the reduced dimension of `x`. Defaults to True.

Returns:

- `x_median`: torch.Tensor with the masked median values.
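To make the masking behaviour concrete, here is a minimal sketch of a masked median. It is not the library's implementation: the function name `masked_median_sketch` is hypothetical, and replacing masked entries with NaN before calling `torch.nanmedian` is an assumption about one reasonable way to ignore them.

```python
import torch


def masked_median_sketch(x, mask, dim=-1, keepdim=True):
    """Median of `x` along `dim`, skipping entries where `mask` is False."""
    # Overwrite masked-out entries with NaN so that nanmedian ignores them.
    x_nan = x.float().masked_fill(~mask, float("nan"))
    # With `dim` given, torch.nanmedian returns a (values, indices) pair.
    x_median, _ = torch.nanmedian(x_nan, dim=dim, keepdim=keepdim)
    return x_median


x = torch.tensor([[1.0, 2.0, 100.0, 3.0],
                  [4.0, 5.0, 6.0, 7.0]])
# The outlier 100.0 and the last value of the second row are masked out.
mask = torch.tensor([[True, True, False, True],
                     [True, True, True, False]])

x_median = masked_median_sketch(x, mask)
print(x_median.shape)  # the reduced dimension is kept because keepdim=True
print(x_median)
```

Each row keeps three valid values, so the median is unambiguous. With an even number of valid values, `torch.nanmedian` returns the lower of the two middle values rather than their average.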
*Masked Mean*

Compute the mean of tensor `x` along `dim`, ignoring values where `mask` is False. `x` and `mask` need to be broadcastable.

Parameters:

- `x`: torch.Tensor to compute the mean of along the `dim` dimension.
- `mask`: torch.Tensor of bool with the same shape as `x`, True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.
- `dim` (int, optional): Dimension to take the mean of. Defaults to -1.
- `keepdim` (bool, optional): Whether to keep the reduced dimension of `x`. Defaults to True.

Returns:

- `x_mean`: torch.Tensor with the masked mean values.
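The zero-division warning is easiest to see in code. The sketch below assumes a sum-over-count formulation; the name `masked_mean_sketch` is hypothetical and the library may compute the mean differently.

```python
import torch


def masked_mean_sketch(x, mask, dim=-1, keepdim=True):
    """Mean of `x` along `dim`, skipping entries where `mask` is False."""
    # Zero out masked entries so they do not contribute to the sum.
    x_valid = torch.where(mask, x, torch.zeros_like(x))
    total = x_valid.sum(dim=dim, keepdim=keepdim)
    # Number of valid entries along `dim`.
    count = mask.to(x.dtype).sum(dim=dim, keepdim=keepdim)
    # If a row has no valid entries, this is 0 / 0 and yields NaN.
    return total / count


x = torch.tensor([[1.0, 2.0, 100.0, 3.0],
                  [4.0, 5.0, 6.0, 7.0]])
mask = torch.tensor([[True, True, False, True],       # outlier masked out
                     [False, False, False, False]])   # fully masked row

x_mean = masked_mean_sketch(x, mask)
print(x_mean)
```

The first row averages only its three valid values, while the fully masked second row produces NaN, which is the situation the `mask` description warns about.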
*MinMax Scaler*

Scales temporal features so that their values lie in the [0, 1] range. This transformation is often used as an alternative to the standard scaler. The scaled features are obtained as:

$$\mathbf{z} = \frac{\mathbf{x} - \min(\mathbf{x})}{\max(\mathbf{x}) - \min(\mathbf{x})}$$

where the minimum and maximum are taken along `dim` over the valid (unmasked) values.

Parameters:

- `x`: torch.Tensor, input tensor.
- `mask`: torch.Tensor of bool with the same dimension as `x`, True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.
- `eps` (float, optional): Small value to avoid division by zero. Defaults to 1e-6.
- `dim` (int, optional): Dimension over which to compute the min and max. Defaults to -1.

Returns:

- `z`: torch.Tensor with the same shape as `x`, except scaled.
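As an illustration of the [0, 1] scaling, the following sketch computes a masked min and max and rescales the input. The function name and the handling of constant series (a range smaller than `eps` is replaced by 1) are assumptions made for this example, not a description of the library's code.

```python
import torch


def minmax_scaler_sketch(x, mask, eps=1e-6, dim=-1):
    """Scale the valid values of `x` into [0, 1] along `dim`."""
    # Push masked entries to +inf / -inf so they can never be the min / max.
    x_min = x.masked_fill(~mask, float("inf")).amin(dim=dim, keepdim=True)
    x_max = x.masked_fill(~mask, float("-inf")).amax(dim=dim, keepdim=True)
    x_range = x_max - x_min
    # Assumption: a (near-)constant series is divided by 1 instead of ~0.
    x_range = torch.where(x_range < eps, torch.ones_like(x_range), x_range)
    return (x - x_min) / x_range


x = torch.tensor([[2.0, 4.0, 6.0, 100.0],
                  [5.0, 5.0, 5.0, 5.0]])
mask = torch.tensor([[True, True, True, False],
                     [True, True, True, True]])

z = minmax_scaler_sketch(x, mask)
print(z)
```

Masked positions are still transformed with the statistics of the valid values, so they can fall outside [0, 1], as the masked 100.0 does here.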
*MinMax1 Scaler*

Scales temporal features so that their values lie in the [-1, 1] range. This transformation is often used as an alternative to the standard scaler or the classic MinMax scaler. The scaled features are obtained as:

$$\mathbf{z} = 2 \, \frac{\mathbf{x} - \min(\mathbf{x})}{\max(\mathbf{x}) - \min(\mathbf{x})} - 1$$

where the minimum and maximum are taken along `dim` over the valid (unmasked) values.

Parameters:

- `x`: torch.Tensor, input tensor.
- `mask`: torch.Tensor of bool with the same dimension as `x`, True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.
- `eps` (float, optional): Small value to avoid division by zero. Defaults to 1e-6.
- `dim` (int, optional): Dimension over which to compute the min and max. Defaults to -1.

Returns:

- `z`: torch.Tensor with the same shape as `x`, except scaled.
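The [-1, 1] variant only differs from the [0, 1] scaling by an affine map. The sketch below shows this under the same assumptions as before: the name `minmax1_scaler_sketch` is hypothetical, and treating a range below `eps` as 1 is a choice made for the example.

```python
import torch


def minmax1_scaler_sketch(x, mask, eps=1e-6, dim=-1):
    """Scale the valid values of `x` into [-1, 1] along `dim`."""
    # Masked entries are pushed to +inf / -inf so they never win min / max.
    x_min = x.masked_fill(~mask, float("inf")).amin(dim=dim, keepdim=True)
    x_max = x.masked_fill(~mask, float("-inf")).amax(dim=dim, keepdim=True)
    x_range = x_max - x_min
    # Assumption: guard against a zero range for constant series.
    x_range = torch.where(x_range < eps, torch.ones_like(x_range), x_range)
    # Map [0, 1] to [-1, 1] with z -> 2 z - 1.
    return 2.0 * (x - x_min) / x_range - 1.0


x = torch.tensor([[2.0, 4.0, 6.0, 100.0]])
mask = torch.tensor([[True, True, True, False]])

z = minmax1_scaler_sketch(x, mask)
print(z)
```

The smallest valid value maps to -1, the largest to 1, and the midpoint of the range to 0.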
*Standard Scaler*

Standardizes features by removing the mean and scaling to unit variance along the `dim` dimension. For example, for `base_windows` models, the scaled features are obtained as (with dim=1):

$$\mathbf{z} = (\mathbf{x}_{[B,T,C]} - \bar{\mathbf{x}}_{[B,1,C]}) / \hat{\sigma}_{[B,1,C]}$$

Parameters:

- `x`: torch.Tensor, input tensor.
- `mask`: torch.Tensor of bool with the same dimension as `x`, True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.
- `eps` (float, optional): Small value to avoid division by zero. Defaults to 1e-6.
- `dim` (int, optional): Dimension over which to compute the mean and std. Defaults to -1.

Returns:

- `z`: torch.Tensor with the same shape as `x`, except scaled.
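A short sketch may help show what standardizing along the time dimension of a [batch, time, channels] tensor looks like. The function name is hypothetical, and two details are assumptions of this example: the standard deviation is the population (biased) one computed over valid entries, and `eps` is added to it in the denominator.

```python
import torch


def standard_scaler_sketch(x, mask, eps=1e-6, dim=-1):
    """Remove the masked mean and divide by the masked std along `dim`."""
    mask_f = mask.to(x.dtype)
    count = mask_f.sum(dim=dim, keepdim=True)
    x_mean = (x * mask_f).sum(dim=dim, keepdim=True) / count
    # Assumption: population variance over the valid entries only.
    x_var = (((x - x_mean) ** 2) * mask_f).sum(dim=dim, keepdim=True) / count
    x_std = torch.sqrt(x_var)
    return (x - x_mean) / (x_std + eps)


# One series with 4 time steps and 2 channels: shape [batch=1, time=4, channels=2].
x = torch.tensor([[[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [50.0, 500.0]]])
mask = torch.ones_like(x, dtype=torch.bool)
mask[:, -1, :] = False  # the last time step is treated as invalid

z = standard_scaler_sketch(x, mask, dim=1)  # statistics have shape [1, 1, 2]
z_valid = z[:, :3, :]
print(z_valid.mean(dim=1))                # close to 0 for each channel
print((z_valid ** 2).mean(dim=1).sqrt())  # close to 1 for each channel
```

Both channels end up on the same scale even though the second one is ten times larger, because the statistics are computed per series and per channel.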
*Robust Median Scaler*

Standardizes features by removing the median and scaling with the mean absolute deviation (mad), a robust estimator of variance. This scaler is particularly useful with noisy data, where outliers can heavily distort the sample mean and variance. In these scenarios the median and mad give better results. For example, for `base_windows` models, the scaled features are obtained as (with dim=1):

$$\mathbf{z} = (\mathbf{x}_{[B,T,C]} - \mathrm{median}(\mathbf{x})_{[B,1,C]}) / \mathrm{mad}(\mathbf{x})_{[B,1,C]}$$

Parameters:

- `x`: torch.Tensor, input tensor.
- `mask`: torch.Tensor of bool with the same dimension as `x`, True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.
- `eps` (float, optional): Small value to avoid division by zero. Defaults to 1e-6.
- `dim` (int, optional): Dimension over which to compute the median and mad. Defaults to -1.

Returns:

- `z`: torch.Tensor with the same shape as `x`, except scaled.
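The following sketch illustrates the median and mad statistics on a series with one large value. It follows the description above and takes mad to be the mean absolute deviation around the median; the function name, the use of `torch.nanmedian`, and adding `eps` to the denominator are assumptions of this example.

```python
import torch


def robust_scaler_sketch(x, mask, eps=1e-6, dim=-1):
    """Remove the masked median and divide by the masked mad along `dim`."""
    # Masked median: masked entries become NaN and are skipped by nanmedian.
    x_nan = x.float().masked_fill(~mask, float("nan"))
    x_median, _ = torch.nanmedian(x_nan, dim=dim, keepdim=True)
    # Masked mean of absolute deviations around the median.
    mask_f = mask.to(x.dtype)
    abs_dev = (x - x_median).abs() * mask_f
    x_mad = abs_dev.sum(dim=dim, keepdim=True) / mask_f.sum(dim=dim, keepdim=True)
    return (x - x_median) / (x_mad + eps), x_median, x_mad


x = torch.tensor([[1.0, 2.0, 3.0, 4.0, 10.0]])
mask = torch.ones_like(x, dtype=torch.bool)

z, x_median, x_mad = robust_scaler_sketch(x, mask)
print(x_median, x_mad)  # median 3.0; mad (2 + 1 + 0 + 1 + 7) / 5 = 2.2
print(z)
```

The large value 10.0 shifts the sample mean to 4.0 but leaves the median at 3.0. Some robust scalers use the median of the absolute deviations instead of their mean; in this sketch that would mean replacing the masked mean with a second masked median.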
*Invariant Median Scaler*

Standardizes features by removing the median and scaling with the mean absolute deviation (mad), a robust estimator of variance. Additionally, it complements the transformation with the arcsinh transformation. For example, for `base_windows` models, the scaled features are obtained as (with dim=1):

$$\mathbf{z} = \mathrm{arcsinh}\left( (\mathbf{x}_{[B,T,C]} - \mathrm{median}(\mathbf{x})_{[B,1,C]}) / \mathrm{mad}(\mathbf{x})_{[B,1,C]} \right)$$

Parameters:

- `x`: torch.Tensor, input tensor.
- `mask`: torch.Tensor of bool with the same dimension as `x`, True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.
- `eps` (float, optional): Small value to avoid division by zero. Defaults to 1e-6.
- `dim` (int, optional): Dimension over which to compute the median and mad. Defaults to -1.

Returns:

- `z`: torch.Tensor with the same shape as `x`, except scaled.
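To show the effect of the extra arcsinh step, the sketch below applies a median and mad scaling followed by `torch.asinh`, and then inverts it with `torch.sinh`. As in the description above, mad is taken to be the mean absolute deviation around the median; the function names and the placement of `eps` are assumptions of this example.

```python
import torch


def invariant_scaler_sketch(x, mask, eps=1e-6, dim=-1):
    """Median / mad scaling followed by arcsinh, along `dim`."""
    x_nan = x.float().masked_fill(~mask, float("nan"))
    x_median, _ = torch.nanmedian(x_nan, dim=dim, keepdim=True)
    mask_f = mask.to(x.dtype)
    abs_dev = (x - x_median).abs() * mask_f
    x_mad = abs_dev.sum(dim=dim, keepdim=True) / mask_f.sum(dim=dim, keepdim=True)
    x_scale = x_mad + eps
    # arcsinh is close to the identity near 0 and logarithmic for large inputs.
    z = torch.asinh((x - x_median) / x_scale)
    return z, x_median, x_scale


def invariant_inverse_sketch(z, x_median, x_scale):
    """Undo the arcsinh and the median / mad scaling."""
    return torch.sinh(z) * x_scale + x_median


x = torch.tensor([[1.0, 2.0, 3.0, 4.0, 10.0]])
mask = torch.ones_like(x, dtype=torch.bool)

z, x_median, x_scale = invariant_scaler_sketch(x, mask)
x_back = invariant_inverse_sketch(z, x_median, x_scale)
print(z)
print(x_back)
```

Because arcsinh is odd and defined for all real numbers, the transformation handles negative values and zeros without the shifts that a plain logarithm would need.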
*Identity Scaler*

A placeholder identity scaler that is insensitive to its arguments: `mask`, `eps`, and `dim` are accepted so that its signature matches the other scalers, and `x` is returned unchanged.

Parameters:

- `x`: torch.Tensor, input tensor.
- `mask`: torch.Tensor of bool with the same dimension as `x`, True where `x` is valid and False where `x` should be masked.
- `eps` (float, optional): Small value to avoid division by zero. Defaults to 1e-6.
- `dim` (int, optional): Dimension over which the other scalers compute their statistics. Defaults to -1.

Returns:

- `x`: the original torch.Tensor `x`.
*Temporal Normalization*

Standardization of the features is a common requirement for many machine learning estimators, and it is commonly achieved by removing the level and scaling the variance. The `TemporalNorm` module applies temporal normalization over the batch of inputs as defined by the type of scaler.

If `scaler_type` is `revin`, learnable normalization parameters are added on top of the usual normalization technique; the parameters are learned through scale-decoupled global skip connections. The technique is available for point and probabilistic outputs.

Parameters:

- `scaler_type`: str, defines the type of scaler used by TemporalNorm. Available: [`identity`, `standard`, `robust`, `minmax`, `minmax1`, `invariant`, `revin`].
- `dim` (int, optional): Dimension over which to compute scale and shift. Defaults to -1.
- `eps` (float, optional): Small value to avoid division by zero. Defaults to 1e-6.
- `num_features`: int=None, for RevIN-like learnable affine parameters initialization.

*Center and scale the data.*

Parameters:

- `x`: torch.Tensor of shape [batch, time, channels].
- `mask`: torch.Tensor of bool, shape [batch, time], True where `x` is valid and False where `x` should be masked. The mask should not be all False in any column of dimension `dim`, to avoid NaNs from zero division.

Returns:

- `z`: torch.Tensor with the same shape as `x`, except scaled.
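The idea of adding learnable parameters on top of a standard normalization can be sketched as a small module. This is not `TemporalNorm` itself: the class name, the parameter names `weight` and `bias`, their shapes, and the use of mean and standard deviation as shift and scale are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class RevinLikeNormSketch(nn.Module):
    """Standardize along time, then apply a learnable per-channel affine map."""

    def __init__(self, num_features, dim=1, eps=1e-6):
        super().__init__()
        self.dim = dim
        self.eps = eps
        # Hypothetical parameters: one scale and one offset per channel,
        # initialized so that the affine map starts as the identity.
        self.weight = nn.Parameter(torch.ones(1, 1, num_features))
        self.bias = nn.Parameter(torch.zeros(1, 1, num_features))

    def forward(self, x, mask):
        # x: [batch, time, channels]; mask: [batch, time] -> [batch, time, 1]
        mask_f = mask.unsqueeze(-1).to(x.dtype)
        count = mask_f.sum(dim=self.dim, keepdim=True)
        shift = (x * mask_f).sum(dim=self.dim, keepdim=True) / count
        var = (((x - shift) ** 2) * mask_f).sum(dim=self.dim, keepdim=True) / count
        scale = torch.sqrt(var) + self.eps
        z = (x - shift) / scale
        # Learnable affine transformation on top of the usual normalization.
        return z * self.weight + self.bias


torch.manual_seed(0)
x = torch.randn(2, 6, 3)                    # [batch, time, channels]
mask = torch.ones(2, 6, dtype=torch.bool)   # [batch, time]
mask[:, -1] = False                         # last time step is invalid

norm = RevinLikeNormSketch(num_features=3)
z = norm(x, mask)
print(z.shape)
print(sum(p.numel() for p in norm.parameters()))  # 3 weights + 3 biases
```

At initialization the output equals the plain standardized series; during training the optimizer can adjust `weight` and `bias` together with the rest of the network.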
*Scale back the data to the original representation.*

Parameters:

- `z`: torch.Tensor of shape [batch, time, channels], scaled.

Returns:

- `x`: torch.Tensor with the original data.
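Scaling and scaling back form a round trip: the statistics computed when centering the inputs are stored and reused to map values back to the original scale. The sketch below assumes a standard (mean and std) scaler; the class name `TemporalNormSketch`, the method names, and the stored attributes are hypothetical.

```python
import torch


class TemporalNormSketch:
    """Stores shift and scale during transform and reuses them for the inverse."""

    def __init__(self, dim=1, eps=1e-6):
        self.dim = dim
        self.eps = eps
        self.shift = None
        self.scale = None

    def transform(self, x, mask):
        # x: [batch, time, channels]; mask: [batch, time] -> [batch, time, 1]
        mask_f = mask.unsqueeze(-1).to(x.dtype)
        count = mask_f.sum(dim=self.dim, keepdim=True)
        shift = (x * mask_f).sum(dim=self.dim, keepdim=True) / count
        var = (((x - shift) ** 2) * mask_f).sum(dim=self.dim, keepdim=True) / count
        # Keep the statistics so that outputs can be scaled back later.
        self.shift = shift
        self.scale = torch.sqrt(var) + self.eps
        return (x - self.shift) / self.scale

    def inverse_transform(self, z):
        # Undo the scaling first, then add the level back.
        return z * self.scale + self.shift


torch.manual_seed(0)
x = torch.randn(4, 12, 2) * 50.0 + 200.0    # series far from zero mean, unit scale
mask = torch.ones(4, 12, dtype=torch.bool)

scaler = TemporalNormSketch(dim=1)
z = scaler.transform(x, mask)
x_back = scaler.inverse_transform(z)
print(torch.allclose(x_back, x, atol=1e-3))
```

In a forecasting model the network would operate on `z`, and the inverse step would be applied to its outputs so that forecasts are reported in the original units.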