module neuralforecast.losses.pytorch


function level_to_outputs

level_to_outputs(level)

function quantiles_to_outputs

quantiles_to_outputs(quantiles)

function weighted_average

weighted_average(x: Tensor, weights: Optional[Tensor] = None, dim=None) → Tensor
Computes the weighted average of a given tensor across a given dim. Masks values associated with weight zero, meaning instead of nan * 0 = nan you will get 0 * 0 = 0. Args:
  • x (torch.Tensor): Input tensor, of which the average must be computed.
  • weights (Optional[torch.Tensor], optional): Weights tensor, of the same shape as x. Defaults to None.
  • dim (optional): The dim along which to average x. Defaults to None.
Returns:
  • torch.Tensor: The tensor with values averaged along the specified dim.
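The masking behaviour described above can be illustrated with a minimal plain-Python sketch. The helper `weighted_average_1d` is hypothetical and works on flat lists rather than tensors; it is not the library implementation, only an illustration of why zero-weight entries are masked.

```python
import math


def weighted_average_1d(x, weights=None):
    """Weighted average of a flat list; zero-weight entries are masked out.

    Hypothetical helper for illustration only (not the library function).
    """
    if weights is None:
        return sum(x) / len(x)
    # Replace values whose weight is zero by 0.0, so a nan in that position
    # contributes 0 * 0 = 0 instead of nan * 0 = nan.
    masked = [xi if wi != 0 else 0.0 for xi, wi in zip(x, weights)]
    return sum(xi * wi for xi, wi in zip(masked, weights)) / sum(weights)


values = [1.0, float("nan"), 3.0]
weights = [1.0, 0.0, 1.0]

# Without masking, the nan propagates even though its weight is zero.
naive = sum(v * w for v, w in zip(values, weights)) / sum(weights)

# With masking, the nan entry is ignored.
result = weighted_average_1d(values, weights)
```

Here `naive` is nan, while `result` is the average of the two entries with non-zero weight.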

function bernoulli_scale_decouple

bernoulli_scale_decouple(output, loc=None, scale=None)
Bernoulli Scale Decouple. Stabilizes model’s output optimization, by learning residual variance and residual location based on anchoring loc, scale. Also adds Bernoulli domain protection to the distribution parameters. Args:
  • output: Model output tensor.
  • loc (optional): Location parameter. Defaults to None.
  • scale (optional): Scale parameter. Defaults to None.
Returns:
  • tuple: Processed probabilities.

function student_scale_decouple

student_scale_decouple(output, loc=None, scale=None, eps: float = 0.1)
Student-T Scale Decouple. Stabilizes model’s output optimization, by learning residual variance and residual location based on anchoring loc, scale. Also adds StudentT domain protection to the distribution parameters. Args:
  • output: Model output tensor.
  • loc (optional): Location parameter. Defaults to None.
  • scale (optional): Scale parameter. Defaults to None.
  • eps (float, optional): Epsilon value for numerical stability. Defaults to 0.1.
Returns:
  • tuple: Processed degrees of freedom, mean, and scale parameters.

function normal_scale_decouple

normal_scale_decouple(output, loc=None, scale=None, eps: float = 0.2)
Normal Scale Decouple. Stabilizes model’s output optimization, by learning residual variance and residual location based on anchoring loc, scale. Also adds Normal domain protection to the distribution parameters. Args:
  • output: Model output tensor.
  • loc (optional): Location parameter. Defaults to None.
  • scale (optional): Scale parameter. Defaults to None.
  • eps (float, optional): Epsilon value for numerical stability. Defaults to 0.2.
Returns:
  • tuple: Processed mean and standard deviation parameters.

function poisson_scale_decouple

poisson_scale_decouple(output, loc=None, scale=None)
Poisson Scale Decouple. Stabilizes model’s output optimization, by learning residual variance and residual location based on anchoring loc, scale. Also adds Poisson domain protection to the distribution parameters.

function nbinomial_scale_decouple

nbinomial_scale_decouple(output, loc=None, scale=None)
Negative Binomial Scale Decouple. Stabilizes model’s output optimization, by learning total count and logits based on anchoring loc, scale. Also adds Negative Binomial domain protection to the distribution parameters.

function est_lambda

est_lambda(mu, rho)

function est_alpha

est_alpha(rho)

function est_beta

est_beta(mu, rho)

function tweedie_domain_map

tweedie_domain_map(input: Tensor, rho: float = 1.5)
Maps the output of the neural network to the domain of the distribution loss.

function tweedie_scale_decouple

tweedie_scale_decouple(output, loc=None, scale=None)
Tweedie Scale Decouple. Stabilizes model’s output optimization, by learning total count and logits based on anchoring loc, scale. Also adds Tweedie domain protection to the distribution parameters.

function isqf_domain_map

isqf_domain_map(
    input: Tensor,
    tol: float = 0.0001,
    quantiles: Tensor = tensor([0.1000, 0.5000, 0.9000]),
    num_pieces: int = 5
)
ISQF Domain Map. Maps input into distribution constraints; by construction, the last dimension of the input matches the length of distr_args. Args:
  • input (torch.Tensor): Tensor of dimensions [B, H, N * n_outputs].
  • tol (float, optional): Tolerance. Defaults to 1e-4.
  • quantiles (torch.Tensor, optional): Quantiles used for ISQF (i.e. x-positions for the knots). Defaults to torch.tensor([0.1, 0.5, 0.9], dtype=torch.float32).
  • num_pieces (int, optional): Number of pieces used for each quantile spline. Defaults to 5.
Returns:
  • tuple: Tuple with tensors of ISQF distribution arguments.

function isqf_scale_decouple

isqf_scale_decouple(output, loc=None, scale=None)
ISQF Scale Decouple. Stabilizes model’s output optimization. We simply pass through the location and the scale to the (transformed) distribution constructor.

class BasePointLoss

Base class for point loss functions. Args:
  • horizon_weight (Optional[torch.Tensor]): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
  • outputsize_multiplier (Optional[int]): Multiplier for the output size. Defaults to None.
  • output_names (Optional[List[str]]): Names of the outputs. Defaults to None.

method __init__

__init__(horizon_weight=None, outputsize_multiplier=None, output_names=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class MAE

Mean Absolute Error. Calculates Mean Absolute Error between y and y_hat. MAE measures the relative prediction accuracy of a forecasting method by calculating the deviation between the prediction and the true value at a given time and averaging these deviations over the length of the series.
$$\mathrm{MAE}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} |y_{\tau} - \hat{y}_{\tau}|$$
Args:
  • horizon_weight (Optional[torch.Tensor]): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
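As a worked instance of the MAE formula, the following plain-Python sketch averages absolute errors over a horizon of four steps. The function `mae` and the sample values are illustrative assumptions; the class itself operates on tensors.

```python
def mae(y, y_hat):
    """Mean of absolute errors over the forecast horizon (illustrative)."""
    return sum(abs(actual - predicted) for actual, predicted in zip(y, y_hat)) / len(y)


y = [3.0, 5.0, 2.0, 7.0]      # observed values over H = 4 steps
y_hat = [2.0, 5.0, 4.0, 8.0]  # forecasts for the same steps

# Absolute errors are 1, 0, 2, 1, so the mean is 4 / 4.
mae_value = mae(y, y_hat)
```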

method __init__

__init__(horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class MSE

Mean Squared Error. Calculates Mean Squared Error between y and y_hat. MSE measures the relative prediction accuracy of a forecasting method by calculating the squared deviation between the prediction and the true value at a given time and averaging these deviations over the length of the series.
$$\mathrm{MSE}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} (y_{\tau} - \hat{y}_{\tau})^{2}$$
Args:
  • horizon_weight (Optional[torch.Tensor]): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
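A short plain-Python sketch of the MSE formula, using an illustrative `mse` helper and made-up values. It shows how squaring makes the single largest error dominate the result.

```python
def mse(y, y_hat):
    """Mean of squared errors over the forecast horizon (illustrative)."""
    return sum((actual - predicted) ** 2 for actual, predicted in zip(y, y_hat)) / len(y)


y = [3.0, 5.0, 2.0, 7.0]
y_hat = [2.0, 5.0, 4.0, 8.0]

# Squared errors are 1, 0, 4, 1: the error of size 2 contributes 4 of the 6.
mse_value = mse(y, y_hat)
```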

method __init__

__init__(horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class RMSE

Root Mean Squared Error. Calculates Root Mean Squared Error between y and y_hat. RMSE measures the relative prediction accuracy of a forecasting method by calculating the squared deviation between the prediction and the observed value at a given time and averaging these deviations over the length of the series. RMSE is on the same scale as the original time series, so it can be compared across series only if they share a common scale. RMSE has a direct connection to the L2 norm.
$$\mathrm{RMSE}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}_{\tau}) = \sqrt{\frac{1}{H} \sum^{t+H}_{\tau=t+1} (y_{\tau} - \hat{y}_{\tau})^{2}}$$
Args:
  • horizon_weight (Optional[torch.Tensor]): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
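The scale dependence mentioned above can be seen in a small plain-Python sketch. The helper `rmse` and the numbers are illustrative assumptions: multiplying a series and its forecast by 10 multiplies the RMSE by 10.

```python
import math


def rmse(y, y_hat):
    """Square root of the mean squared error (illustrative)."""
    squared = [(actual - predicted) ** 2 for actual, predicted in zip(y, y_hat)]
    return math.sqrt(sum(squared) / len(squared))


y = [1.0, 2.0, 3.0, 4.0]
y_hat = [3.0, 4.0, 1.0, 2.0]

# Every error has magnitude 2, so the RMSE is 2 in the units of the series.
rmse_value = rmse(y, y_hat)

# Rescaling the series by 10 rescales the RMSE by the same factor.
rmse_scaled = rmse([10 * v for v in y], [10 * v for v in y_hat])
```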

method __init__

__init__(horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class MAPE

Mean Absolute Percentage Error. Calculates Mean Absolute Percentage Error between y and y_hat. MAPE measures the relative prediction accuracy of a forecasting method by calculating the percentage deviation between the prediction and the observed value at a given time and averaging these deviations over the length of the series. The closer to zero an observed value is, the higher the penalty the MAPE loss assigns to the corresponding error.
$$\mathrm{MAPE}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} \frac{|y_{\tau}-\hat{y}_{\tau}|}{|y_{\tau}|}$$
Args:
  • horizon_weight: Tensor of size h, weight for each timestamp of the forecasting window.
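The following plain-Python sketch evaluates the MAPE formula and illustrates the remark about observed values close to zero. The helper `mape` and the data are illustrative assumptions.

```python
def mape(y, y_hat):
    """Mean of absolute errors relative to |y| (illustrative, assumes y != 0)."""
    ratios = [abs(actual - predicted) / abs(actual) for actual, predicted in zip(y, y_hat)]
    return sum(ratios) / len(ratios)


mape_value = mape([100.0, 200.0, 50.0], [110.0, 180.0, 50.0])

# The same absolute error of 1.0 is penalized far more when y is near zero.
penalty_small_target = mape([0.5], [1.5])
penalty_large_target = mape([100.0], [101.0])
```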

method __init__

__init__(horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class SMAPE

Symmetric Mean Absolute Percentage Error. Calculates Symmetric Mean Absolute Percentage Error between y and y_hat. SMAPE measures the relative prediction accuracy of a forecasting method by calculating the relative deviation between the prediction and the observed value, scaled by the sum of the absolute values of the prediction and the observed value at a given time, and then averaging these deviations over the length of the series. This allows the SMAPE to have bounds between 0% and 200%, which is desirable compared to normal MAPE that may be undefined when the target is zero.
$$\mathrm{sMAPE}_{2}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} \frac{|y_{\tau}-\hat{y}_{\tau}|}{|y_{\tau}|+|\hat{y}_{\tau}|}$$
Args:
  • horizon_weight: Tensor of size h, weight for each timestamp of the forecasting window.
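A small plain-Python sketch of the formula as written above, showing that a zero target no longer breaks the computation. The helper `smape_terms_mean` is hypothetical, and treating a 0/0 term as zero is an assumption made here for illustration.

```python
def smape_terms_mean(y, y_hat):
    """Mean of |y - y_hat| / (|y| + |y_hat|) over the horizon (illustrative)."""
    terms = []
    for actual, predicted in zip(y, y_hat):
        scale = abs(actual) + abs(predicted)
        # Assumption: when both values are zero the term is counted as 0.
        terms.append(abs(actual - predicted) / scale if scale != 0 else 0.0)
    return sum(terms) / len(terms)


# The first target is zero, which would make the MAPE term undefined.
smape_value = smape_terms_mean([0.0, 10.0], [1.0, 10.0])
```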

method __init__

__init__(horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class MASE

Mean Absolute Scaled Error. Calculates the Mean Absolute Scaled Error between y and y_hat. MASE measures the relative prediction accuracy of a forecasting method by comparing the mean absolute errors of the prediction and the observed value against the mean absolute errors of the seasonal naive model. MASE is one component of the Overall Weighted Average (OWA) used in the M4 Competition.
$$\mathrm{MASE}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}_{\tau}, \mathbf{\hat{y}}^{season}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} \frac{|y_{\tau}-\hat{y}_{\tau}|}{\mathrm{MAE}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}^{season}_{\tau})}$$
Args:
  • seasonality: Int. Main frequency of the time series; Hourly 24, Daily 7, Weekly 52, Monthly 12, Quarterly 4, Yearly 1.
  • horizon_weight: Tensor of size h, weight for each timestamp of the forecasting window.
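The scaling by a seasonal naive model can be sketched in plain Python as follows. The helper `mase` is hypothetical, and computing the seasonal naive error on the in-sample history (`y_insample`) is an assumption following common practice, not something stated above.

```python
def mase(y, y_hat, y_insample, seasonality):
    """Forecast MAE divided by the MAE of a seasonal naive model (illustrative)."""
    # Seasonal naive forecast: repeat the value observed one season earlier.
    naive_errors = [
        abs(y_insample[i] - y_insample[i - seasonality])
        for i in range(seasonality, len(y_insample))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    forecast_mae = sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)
    return forecast_mae / scale


y_insample = [10.0, 20.0, 12.0, 22.0, 14.0, 24.0]  # history with period 2
y = [16.0, 26.0]
y_hat = [15.0, 27.0]

# Seasonal naive MAE is 2 and forecast MAE is 1, so the forecast is twice as accurate.
mase_value = mase(y, y_hat, y_insample, seasonality=2)
```

Values below 1 indicate that the forecast beats the seasonal naive benchmark on average.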

method __init__

__init__(seasonality: int, horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class relMSE

Relative Mean Squared Error. Computes Relative Mean Squared Error (relMSE), as proposed by Hyndman & Koehler (2006) as an alternative to percentage errors, to avoid measure instability.
$$\mathrm{relMSE}(\mathbf{y}, \mathbf{\hat{y}}, \mathbf{\hat{y}}^{benchmark}) = \frac{\mathrm{MSE}(\mathbf{y}, \mathbf{\hat{y}})}{\mathrm{MSE}(\mathbf{y}, \mathbf{\hat{y}}^{benchmark})}$$
Args:
  • y_train: Numpy array, deprecated.
  • horizon_weight: Tensor of size h, weight for each timestamp of the forecasting window.
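A plain-Python sketch of the ratio above, with hypothetical helpers `mse` and `rel_mse` and made-up forecasts. A value below 1 means the forecast has a lower MSE than the benchmark.

```python
def mse(y, y_hat):
    """Mean squared error (illustrative)."""
    return sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y)


def rel_mse(y, y_hat, y_benchmark):
    """MSE of the forecast divided by MSE of the benchmark (illustrative)."""
    return mse(y, y_hat) / mse(y, y_benchmark)


y = [1.0, 2.0, 3.0]
y_hat = [1.0, 2.0, 5.0]        # MSE = 4 / 3
y_benchmark = [3.0, 4.0, 5.0]  # MSE = 4

rel_mse_value = rel_mse(y, y_hat, y_benchmark)
```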

method __init__

__init__(y_train=None, horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class QuantileLoss

Quantile Loss. Computes the quantile loss between y and y_hat. QL measures the deviation of a quantile forecast. By weighting the absolute deviation asymmetrically, the loss pays more attention to under- or over-estimation. A common value for q is 0.5 for the deviation from the median (Pinball loss).
$$\mathrm{QL}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}^{(q)}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} \Big( (1-q)\,( \hat{y}^{(q)}_{\tau} - y_{\tau} )_{+} + q\,( y_{\tau} - \hat{y}^{(q)}_{\tau} )_{+} \Big)$$
Args:
  • q (float): Between 0 and 1. The slope of the quantile loss, in the context of quantile regression, the q determines the conditional quantile level.
  • horizon_weight (Optional[torch.Tensor]): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
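The asymmetry of the quantile loss can be checked with a minimal plain-Python sketch of the formula. The helper `quantile_loss` and the values are illustrative assumptions.

```python
def quantile_loss(y, y_hat, q):
    """Pinball loss averaged over the horizon (illustrative)."""
    total = 0.0
    for actual, predicted in zip(y, y_hat):
        over = max(predicted - actual, 0.0)   # (y_hat - y)_+
        under = max(actual - predicted, 0.0)  # (y - y_hat)_+
        total += (1 - q) * over + q * under
    return total / len(y)


# With q = 0.9, under-forecasting by 2 costs far more than over-forecasting by 2.
loss_under = quantile_loss([10.0], [8.0], q=0.9)
loss_over = quantile_loss([10.0], [12.0], q=0.9)

# With q = 0.5 both directions are weighted equally.
loss_median = quantile_loss([10.0, 10.0], [8.0, 12.0], q=0.5)
```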

method __init__

__init__(q, horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class MQLoss

Multi-Quantile Loss. Calculates the Multi-Quantile loss (MQL) between y and y_hat. MQL calculates the average multi-quantile loss for a given set of quantiles, based on the absolute difference between predicted quantiles and observed values.
$$\mathrm{MQL}(\mathbf{y}_{\tau},[\mathbf{\hat{y}}^{(q_{1})}_{\tau}, ... ,\hat{y}^{(q_{n})}_{\tau}]) = \frac{1}{n} \sum_{q_{i}} \mathrm{QL}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}^{(q_{i})}_{\tau})$$
The limit behavior of MQL makes it possible to measure the accuracy of a full predictive distribution $\mathbf{\hat{F}}_{\tau}$ with the continuous ranked probability score (CRPS). This can be achieved through a numerical integration technique that discretizes the quantiles and treats the CRPS integral with a left Riemann approximation, averaging over uniformly spaced quantiles.
$$\mathrm{CRPS}(y_{\tau}, \mathbf{\hat{F}}_{\tau}) = \int^{1}_{0} \mathrm{QL}(y_{\tau}, \hat{y}^{(q)}_{\tau}) dq$$
Args:
  • level (List[int], optional): Probability levels for prediction intervals. Defaults to [80, 90].
  • quantiles (Optional[List[float]]): Alternative to level, quantiles to estimate from y distribution. Defaults to None.
  • horizon_weight (Optional[torch.Tensor]): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
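The averaging over quantiles can be sketched in plain Python as follows. The helpers `quantile_loss` and `multi_quantile_loss`, the dictionary layout, and the numbers are illustrative assumptions rather than the tensor-based implementation.

```python
def quantile_loss(y, y_hat, q):
    """Pinball loss for a single quantile level, averaged over the horizon."""
    total = 0.0
    for actual, predicted in zip(y, y_hat):
        total += (1 - q) * max(predicted - actual, 0.0) + q * max(actual - predicted, 0.0)
    return total / len(y)


def multi_quantile_loss(y, forecasts_by_quantile):
    """Average of the per-quantile losses (illustrative)."""
    losses = [quantile_loss(y, y_hat, q) for q, y_hat in forecasts_by_quantile.items()]
    return sum(losses) / len(losses)


y = [10.0, 12.0]
forecasts = {
    0.1: [8.0, 10.0],   # lower quantile forecast
    0.5: [10.0, 12.0],  # median forecast, exactly right here
    0.9: [12.0, 14.0],  # upper quantile forecast
}

# Per-quantile losses are 0.2, 0.0 and 0.2, so the average is 0.4 / 3.
mql_value = multi_quantile_loss(y, forecasts)
```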

method __init__

__init__(level=[80, 90], quantiles=None, horizon_weight=None)

method domain_map

domain_map(y_hat: Tensor)
Reshapes input tensor to match the expected output format. Args:
  • y_hat (torch.Tensor): Input tensor.
    • Univariate: [B, H, 1 * Q]
    • Multivariate: [B, H, N * Q]
Returns:
  • torch.Tensor: Reshaped tensor with shape [B, H, N, Q].

class QuantileLayer

Implicit Quantile Layer from the paper "IQN for Distributional Reinforcement Learning". Code from GluonTS: https://github.com/awslabs/gluonts/blob/61133ef6e2d88177b32ace4afc6843ab9a7bc8cd/src/gluonts/torch/distributions/implicit_quantile_network.py
References:
  • Dabney et al. 2018. https://arxiv.org/abs/1806.06923

method __init__

__init__(num_output: int, cos_embedding_dim: int = 128)

method forward

forward(tau: Tensor) → Tensor

class IQLoss

Implicit Quantile Loss. Computes the quantile loss between y and y_hat, with the quantile q provided as an input to the network. IQL measures the deviation of a quantile forecast. By weighting the absolute deviation asymmetrically, the loss pays more attention to under- or over-estimation.
$$\mathrm{QL}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}^{(q)}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} \Big( (1-q)\,( \hat{y}^{(q)}_{\tau} - y_{\tau} )_{+} + q\,( y_{\tau} - \hat{y}^{(q)}_{\tau} )_{+} \Big)$$
Args:
  • cos_embedding_dim (int, optional): Cosine embedding dimension. Defaults to 64.
  • concentration0 (float, optional): Beta distribution concentration parameter. Defaults to 1.0.
  • concentration1 (float, optional): Beta distribution concentration parameter. Defaults to 1.0.
  • horizon_weight (Optional[torch.Tensor]): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
References:
  • Gouttes, Adèle, Kashif Rasul, Mateusz Koren, Johannes Stephan, and Tofigh Naghibi, "Probabilistic Time Series Forecasting with Implicit Quantile Networks". http://arxiv.org/abs/2107.03743

method __init__

__init__(
    cos_embedding_dim=64,
    concentration0=1.0,
    concentration1=1.0,
    horizon_weight=None
)

method domain_map

domain_map(y_hat)
Adds IQN network to output of network. Args:
  • y_hat (torch.Tensor): Input tensor.
    • Univariate: [B, h, 1]
    • Multivariate: [B, h, N]
Returns:
  • torch.Tensor: Domain mapped tensor.

method update_quantile

update_quantile(q: List[float] = [0.5])

class Tweedie

Tweedie Distribution. The Tweedie distribution is a compound probability distribution, a special case of exponential dispersion models (EDMs) defined by its mean-variance relationship. The distribution is particularly useful for modeling sparse series, as it has positive probability mass at zero but is otherwise continuous.
$$Y \sim \mathrm{ED}(\mu,\sigma^{2}) \qquad \mathbb{P}(y|\mu ,\sigma^{2})=h(\sigma^{2},y) \exp \left({\frac {\theta y-A(\theta )}{\sigma^{2}}}\right)$$
$$\mu =A'(\theta ) \qquad \mathrm{Var}(Y) = \sigma^{2} \mu^{\rho}$$
Cases of the variance relationship include Normal (rho = 0), Poisson (rho = 1), Gamma (rho = 2), inverse Gaussian (rho = 3). Args:
  • log_mu (torch.Tensor): Tensor with log of means.
  • rho (float): Tweedie variance power, in the interval (1, 2). Fixed across all observations.
  • validate_args (optional): Validation arguments. Defaults to None.
Note:
  • sigma2: Tweedie variance. Currently fixed at 1.
References:
  • Tweedie, M. C. K. (1984). An index which distinguishes between some important exponential families. Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference (Eds. J. K. Ghosh and J. Roy), pp. 579-604. Calcutta: Indian Statistical Institute.
  • Jorgensen, B. (1987). Exponential Dispersion Models. Journal of the Royal Statistical Society. Series B (Methodological), 49(2), 127–162. http://www.jstor.org/stable/2345415
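The mean-variance relationship Var(Y) = sigma^2 * mu^rho can be made concrete with a short plain-Python sketch. The helper `tweedie_variance` is hypothetical; it takes the log of the mean, mirroring the `log_mu` argument, and fixes sigma2 at 1 as stated in the note.

```python
import math


def tweedie_variance(log_mu, rho, sigma2=1.0):
    """Variance implied by Var(Y) = sigma2 * mu ** rho (illustrative)."""
    mu = math.exp(log_mu)
    return sigma2 * mu ** rho


log_mu = math.log(4.0)

var_poisson_like = tweedie_variance(log_mu, rho=1.0)  # variance equals the mean
var_tweedie = tweedie_variance(log_mu, rho=1.5)       # between Poisson and Gamma
var_gamma_like = tweedie_variance(log_mu, rho=2.0)    # variance equals mean squared
```

For a mean above 1, the variance grows with rho, so values of rho between 1 and 2 interpolate between Poisson-like and Gamma-like dispersion.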

method __init__

__init__(log_mu, rho, validate_args=None)

property batch_shape

Returns the shape over which parameters are batched.

property event_shape

Returns the shape of a single sample (without batching).

property mean


property mode

Returns the mode of the distribution.

property stddev

Returns the standard deviation of the distribution.

property variance


method log_prob

log_prob(y_true)

method sample

sample(sample_shape=torch.Size([]))

class ISQF

Distribution class for the Incremental (Spline) Quantile Function. Args:
  • spline_knots (torch.Tensor): Tensor parametrizing the x-positions of the spline knots. Shape: (*batch_shape, (num_qk-1), num_pieces)
  • spline_heights (torch.Tensor): Tensor parametrizing the y-positions of the spline knots. Shape: (*batch_shape, (num_qk-1), num_pieces)
  • beta_l (torch.Tensor): Tensor containing the non-negative learnable parameter of the left tail. Shape: (*batch_shape,)
  • beta_r (torch.Tensor): Tensor containing the non-negative learnable parameter of the right tail. Shape: (*batch_shape,)
  • qk_y (torch.Tensor): Tensor containing the increasing y-positions of the quantile knots. Shape: (*batch_shape, num_qk)
  • qk_x (torch.Tensor): Tensor containing the increasing x-positions of the quantile knots. Shape: (*batch_shape, num_qk)
  • loc (torch.Tensor): Tensor containing the location in case of a transformed random variable. Shape: (*batch_shape,)
  • scale (torch.Tensor): Tensor containing the scale in case of a transformed random variable. Shape: (*batch_shape,)
References:
  • Park, Youngsuk, Danielle Maddix, François-Xavier Aubet, Kelvin Kan, Jan Gasthaus, and Yuyang Wang (2022). "Learning Quantile Functions without Quantile Crossing for Distribution-free Time Series Forecasting". https://proceedings.mlr.press/v151/park22a.html

method __init__

__init__(
    spline_knots: Tensor,
    spline_heights: Tensor,
    beta_l: Tensor,
    beta_r: Tensor,
    qk_y: Tensor,
    qk_x: Tensor,
    loc: Tensor,
    scale: Tensor,
    validate_args=None
) → None

property batch_shape

Returns the shape over which parameters are batched.

property event_shape

Returns the shape of a single sample (without batching).

property has_rsample


property mean

Function used to compute the empirical mean

property mode

Returns the mode of the distribution.

property stddev

Returns the standard deviation of the distribution.

property variance

Returns the variance of the distribution.

method crps

crps(y: Tensor) → Tensor

class BaseISQF

Base distribution class for the Incremental (Spline) Quantile Function. Args:
  • spline_knots (torch.Tensor): Tensor parametrizing the x-positions of the spline knots. Shape: (*batch_shape, (num_qk-1), num_pieces)
  • spline_heights (torch.Tensor): Tensor parametrizing the y-positions of the spline knots. Shape: (*batch_shape, (num_qk-1), num_pieces)
  • beta_l (torch.Tensor): Tensor containing the non-negative learnable parameter of the left tail. Shape: (*batch_shape,)
  • beta_r (torch.Tensor): Tensor containing the non-negative learnable parameter of the right tail. Shape: (*batch_shape,)
  • qk_y (torch.Tensor): Tensor containing the increasing y-positions of the quantile knots. Shape: (*batch_shape, num_qk)
  • qk_x (torch.Tensor): Tensor containing the increasing x-positions of the quantile knots. Shape: (*batch_shape, num_qk)
  • tol (float, optional): Tolerance hyperparameter for numerical stability. Defaults to 1e-4.
  • validate_args (bool, optional): Whether to validate arguments. Defaults to False.
References:
  • Park, Youngsuk, Danielle Maddix, François-Xavier Aubet, Kelvin Kan, Jan Gasthaus, and Yuyang Wang (2022). "Learning Quantile Functions without Quantile Crossing for Distribution-free Time Series Forecasting". https://proceedings.mlr.press/v151/park22a.html

method __init__

__init__(
    spline_knots: Tensor,
    spline_heights: Tensor,
    beta_l: Tensor,
    beta_r: Tensor,
    qk_y: Tensor,
    qk_x: Tensor,
    tol: float = 0.0001,
    validate_args: bool = False
) → None

property arg_constraints

Returns a dictionary from argument names to torch.distributions.constraints.Constraint objects that should be satisfied by each argument of this distribution. Args that are not tensors need not appear in this dict.

property batch_shape


property event_shape

Returns the shape of a single sample (without batching).

property mean

Returns the mean of the distribution.

property mode

Returns the mode of the distribution.

property stddev

Returns the standard deviation of the distribution.

property support

Returns a torch.distributions.constraints.Constraint object representing this distribution’s support.

property variance

Returns the variance of the distribution.

method cdf

cdf(z: Tensor) → Tensor
Computes the quantile level alpha_tilde such that q(alpha_tilde) = z. Args:
  • z (torch.Tensor): Tensor of shape = (*batch_shape,)
Returns:
  • torch.Tensor: Quantile level alpha_tilde.

method cdf_spline

cdf_spline(z: Tensor) → Tensor
For observations z and splines defined in [qk_x[k], qk_x[k+1]], computes the quantile level alpha_tilde such that:
  • alpha_tilde = q^{-1}(z) if z is in-between qk_x[k] and qk_x[k+1]
  • alpha_tilde = qk_x[k] if z < qk_x[k]
  • alpha_tilde = qk_x[k+1] if z > qk_x[k+1]
Args:
  • z (torch.Tensor): Observation. Shape: (*batch_shape,)
Returns:
  • torch.Tensor: Corresponding quantile level alpha_tilde. Shape: (*batch_shape, num_qk-1)

method cdf_tail

cdf_tail(z: Tensor, left_tail: bool = True) → Tensor
Computes the quantile level alpha_tilde such that:
  • alpha_tilde = q^{-1}(z) if z is in the tail region
  • alpha_tilde = qk_x_l or qk_x_r if z is in the non-tail region
Args:
  • z (torch.Tensor): Observation. Shape: (*batch_shape,)
  • left_tail (bool, optional): If True, compute alpha_tilde for the left tail. Otherwise, compute alpha_tilde for the right tail. Defaults to True.
Returns:
  • torch.Tensor: Corresponding quantile level alpha_tilde. Shape: (*batch_shape,)

method crps

crps(z: Tensor) → Tensor
Compute CRPS in analytical form. Args:
  • z (torch.Tensor): Observation to evaluate.
Returns:
  • torch.Tensor: CRPS value.

method crps_spline

crps_spline(z: Tensor) → Tensor
Compute CRPS in analytical form for the spline. Args:
  • z (torch.Tensor): Observation to evaluate.
Returns:
  • torch.Tensor: CRPS value for the spline.

method crps_tail

crps_tail(z: Tensor, left_tail: bool = True) → Tensor
Compute CRPS in analytical form for left/right tails. Args:
  • z (torch.Tensor): Observation to evaluate. Shape: (*batch_shape,)
  • left_tail (bool, optional): If True, compute CRPS for the left tail. Otherwise, compute CRPS for the right tail. Defaults to True.
Returns:
  • torch.Tensor: Tensor containing the CRPS, of the same shape as z.

method log_prob

log_prob(z: Tensor) → Tensor

method loss

loss(z: Tensor) → Tensor

method parameterize_qk

parameterize_qk(quantile_knots: Tensor) → Tuple[Tensor, Tensor, Tensor, Tensor]
Function to parameterize the x or y positions of the num_qk quantile knots. Args:
  • quantile_knots (torch.Tensor): x or y positions of the quantile knots. Shape: (*batch_shape, num_qk)
Returns:
  • Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]: A tuple containing:
    • qk: x or y positions of the quantile knots (qk), with index=1, …, num_qk-1. Shape: (*batch_shape, num_qk-1)
    • qk_plus: x or y positions of the quantile knots (qk), with index=2, …, num_qk. Shape: (*batch_shape, num_qk-1)
    • qk_l: x or y positions of the left-most quantile knot (qk). Shape: (*batch_shape)
    • qk_r: x or y positions of the right-most quantile knot (qk). Shape: (*batch_shape)

method parameterize_spline

parameterize_spline(
    spline_knots: Tensor,
    qk: Tensor,
    qk_plus: Tensor,
    tol: float = 0.0001
) → Tuple[Tensor, Tensor]
Function to parameterize the x or y positions of the spline knots. Args:
  • spline_knots (torch.Tensor): Variable that parameterizes the spline knot positions.
  • qk (torch.Tensor): x or y positions of the quantile knots (qk), with index=1, …, num_qk-1. Shape: (*batch_shape, num_qk-1)
  • qk_plus (torch.Tensor): x or y positions of the quantile knots (qk), with index=2, …, num_qk. Shape: (*batch_shape, num_qk-1)
  • tol (float, optional): Tolerance hyperparameter for numerical stability. Defaults to 1e-4.
Returns:
  • Tuple[torch.Tensor, torch.Tensor]: A tuple containing:
    • sk: x or y positions of the spline knots (sk). Shape: (*batch_shape, num_qk-1, num_pieces)
    • delta_sk: difference of x or y positions of the spline knots (sk). Shape: (*batch_shape, num_qk-1, num_pieces)

method parameterize_tail

parameterize_tail(
    beta: Tensor,
    qk_x: Tensor,
    qk_y: Tensor
) → Tuple[Tensor, Tensor]
Function to parameterize the tail parameters. Note that the exponential tails are given by:
  • q(alpha) = a_l * log(alpha) + b_l for the left tail
  • q(alpha) = a_r * log(1 - alpha) + b_r for the right tail
Where:
  • a_l = 1/beta_l, b_l = -a_l * log(qk_x_l) + q(qk_x_l)
  • a_r = 1/beta_r, b_r = a_r * log(1 - qk_x_r) + q(qk_x_r)
Args:
  • beta (torch.Tensor): Parameterizes the left or right tail. Shape: (*batch_shape,)
  • qk_x (torch.Tensor): Left- or right-most x-positions of the quantile knots. Shape: (*batch_shape,)
  • qk_y (torch.Tensor): Left- or right-most y-positions of the quantile knots. Shape: (*batch_shape,)
Returns:
  • Tuple[torch.Tensor, torch.Tensor]: A tuple containing:
    • tail_a: a_l or a_r as described above
    • tail_b: b_l or b_r as described above
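For the left tail, the parameterization above can be checked numerically with a plain-Python sketch. The helpers `left_tail_params` and `left_tail_quantile` are hypothetical scalar versions, and q(qk_x_l) is taken to be the y-position of the left-most quantile knot (`qk_y_l`), which is an assumption based on the argument descriptions.

```python
import math


def left_tail_params(beta_l, qk_x_l, qk_y_l):
    """Return (a_l, b_l) for the left exponential tail (illustrative)."""
    a_l = 1.0 / beta_l
    # b_l is chosen so that the tail passes through the left-most quantile knot.
    b_l = -a_l * math.log(qk_x_l) + qk_y_l
    return a_l, b_l


def left_tail_quantile(alpha, a_l, b_l):
    """Evaluate q(alpha) = a_l * log(alpha) + b_l."""
    return a_l * math.log(alpha) + b_l


a_l, b_l = left_tail_params(beta_l=2.0, qk_x_l=0.1, qk_y_l=-1.0)

at_knot = left_tail_quantile(0.1, a_l, b_l)  # matches qk_y_l at the knot
deeper = left_tail_quantile(0.01, a_l, b_l)  # further into the tail, so smaller
```

With this choice of b_l the tail joins the left-most knot continuously, and the quantile decreases as alpha moves toward zero.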

method quantile

quantile(alpha: Tensor) → Tensor

method quantile_internal

quantile_internal(alpha: Tensor, dim: Optional[int] = None) → Tensor
Evaluates the quantile function at the quantile levels alpha. Args:
  • alpha (torch.Tensor): Tensor of shape = (*batch_shape,) if dim=None, or containing an additional axis on the specified position, otherwise.
  • dim (Optional[int], optional): Index of the axis containing the different quantile levels which are to be computed. Read the description below for detailed information. Defaults to None.
Returns:
  • torch.Tensor: Quantiles tensor, of the same shape as alpha.

method quantile_spline

quantile_spline(alpha: Tensor, dim: Optional[int] = None) → Tensor

method quantile_tail

quantile_tail(
    alpha: Tensor,
    dim: Optional[int] = None,
    left_tail: bool = True
) → Tensor

method rsample

rsample(sample_shape: Size = torch.Size([])) → Tensor
Function used to draw random samples. Args:
  • sample_shape (torch.Size, optional): Shape of the sample. Defaults to torch.Size().
Returns:
  • torch.Tensor: Random samples.

class DistributionLoss

DistributionLoss. This PyTorch module wraps the torch.distributions classes, allowing them to interact with NeuralForecast models modularly. It uses the negative log-likelihood as the optimization objective and provides a sample method to generate empirical quantiles defined by the level list. Additionally, it implements a distribution transformation that factorizes the scale-dependent likelihood parameters into a base scale and a multiplier that is efficiently learnable within the operating ranges of the network’s non-linearities. Available distributions:
  • Poisson
  • Normal
  • StudentT
  • NegativeBinomial
  • Tweedie
  • Bernoulli (Temporal Classifiers)
  • ISQF (Incremental Spline Quantile Function)
Args:
  • distribution (str): Identifier of a torch.distributions.Distribution class.
  • level (float list): Confidence levels for prediction intervals.
  • quantiles (float list): Alternative to level list, target quantiles.
  • num_samples (int): Number of samples for the empirical quantiles.
  • return_params (bool): Whether to return the distribution parameters.
  • horizon_weight (Tensor): Tensor of size h, weight for each timestamp of the forecasting window.
Returns:
  • tuple: Tuple with tensors of ISQF distribution arguments.

method __init__

__init__(
    distribution,
    level=[80, 90],
    quantiles=None,
    num_samples=1000,
    return_params=False,
    horizon_weight=None,
    **distribution_kwargs
)

method get_distribution

get_distribution(distr_args, **distribution_kwargs) → Distribution
Construct the associated PyTorch Distribution, given the collection of constructor arguments and, optionally, location and scale tensors. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
Returns:
  • Distribution: AffineTransformed distribution.

method sample

sample(distr_args: Tensor, num_samples: Optional[int] = None)
Construct the empirical quantiles from the estimated Distribution by drawing num_samples independent samples from it. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
  • num_samples (int, optional): Overwrite number of samples for the empirical quantiles. Defaults to None.
Returns:
  • tuple: Tuple with samples, sample mean, and quantiles.
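
The idea of building empirical quantiles from independent draws can be sketched in plain Python. This is only an illustration of the technique, not the library's tensor implementation: the function name empirical_summary, the nearest-rank quantile rule, and the Gaussian used in the usage lines are all assumptions.

```python
import random
import statistics


def empirical_summary(draw, quantiles, num_samples=1000, seed=0):
    """Draw independent samples and summarize them.

    draw: callable taking a random.Random and returning one sample.
    Returns (sorted samples, sample mean, empirical quantiles).
    """
    rng = random.Random(seed)
    samples = sorted(draw(rng) for _ in range(num_samples))
    mean = statistics.fmean(samples)
    # Nearest-rank rule: pick the order statistic at position q * num_samples.
    quants = [samples[min(num_samples - 1, int(q * num_samples))] for q in quantiles]
    return samples, mean, quants


# Usage: a Gaussian with mean 10 and standard deviation 2 (illustrative values).
samples, mean, quants = empirical_summary(
    lambda rng: rng.gauss(10.0, 2.0), quantiles=[0.1, 0.5, 0.9]
)
```

Larger values of num_samples reduce the Monte Carlo error of the estimated quantiles at the cost of more sampling work.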

method update_quantile

update_quantile(q: Optional[List[float]] = None)

class PMM

Poisson Mixture Mesh. This Poisson Mixture statistical model assumes independence across groups of data $\mathcal{G}=\{[g_{i}]\}$ and estimates relationships within each group.

$$\mathrm{P}\left(\mathbf{y}_{[b][t+1:t+H]}\right) = \prod_{ [g_{i}] \in \mathcal{G}} \mathrm{P} \left(\mathbf{y}_{[g_{i}][\tau]} \right) = \prod_{\beta\in[g_{i}]} \left(\sum_{k=1}^{K} w_k \prod_{(\beta,\tau) \in [g_i][t+1:t+H]} \mathrm{Poisson}(y_{\beta,\tau}, \hat{\lambda}_{\beta,\tau,k}) \right)$$

Args:
  • n_components (int, optional): The number of mixture components. Defaults to 10.
  • level (float list, optional): Confidence levels for prediction intervals. Defaults to [80, 90].
  • quantiles (float list, optional): Alternative to level list, target quantiles. Defaults to None.
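  • return_params (bool, optional): Whether to return the distribution parameters. Defaults to False.
  • batch_correlation (bool, optional): Whether to model batch correlations. Defaults to False.
  • horizon_correlation (bool, optional): Whether to model horizon correlations. Defaults to False.
  • weighted (bool, optional): Whether to model weighted components. Defaults to False.
  • num_samples (int, optional): Number of samples for the empirical quantiles. Defaults to 1000.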

method __init__

__init__(
    n_components=10,
    level=[80, 90],
    quantiles=None,
    num_samples=1000,
    return_params=False,
    batch_correlation=False,
    horizon_correlation=False,
    weighted=False
)
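
The inner term of the mixture likelihood, a weighted sum over components of the product of Poisson probabilities across the horizon, can be sketched for a single series as follows. This is a plain-Python illustration under stated assumptions: the function names are hypothetical, the weights are assumed to sum to one, and the log-sum-exp trick is used here for numerical stability rather than taken from the source.

```python
import math


def poisson_logpmf(y, lam):
    """Log-probability of count y under a Poisson distribution with rate lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def poisson_mixture_loglik(y, weights, rates):
    """Log-likelihood of one series under a K-component Poisson mixture.

    y: list of H observed counts.
    weights: list of K mixture weights (assumed to sum to 1).
    rates: rates[k][t] is the Poisson rate of component k at horizon step t.
    """
    # log(w_k) + sum over the horizon of log Poisson(y_t; lambda_{t,k})
    comp = [
        math.log(w) + sum(poisson_logpmf(yt, lam) for yt, lam in zip(y, lams))
        for w, lams in zip(weights, rates)
    ]
    # log-sum-exp over the K components
    m = max(comp)
    return m + math.log(sum(math.exp(c - m) for c in comp))


# Usage: two components over a horizon of three steps (illustrative numbers).
ll = poisson_mixture_loglik(
    y=[2, 0, 5], weights=[0.3, 0.7], rates=[[1.0, 0.5, 4.0], [3.0, 1.0, 6.0]]
)
```

Minimizing the negative of this quantity is the usual way such a likelihood is turned into a training loss.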

method domain_map

domain_map(output: Tensor)

method get_distribution

get_distribution(distr_args) → Distribution
Construct the associated PyTorch Distribution, given the collection of constructor arguments and, optionally, location and scale tensors. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
Returns:
  • Distribution: AffineTransformed distribution.

method sample

sample(distr_args: Tensor, num_samples: Optional[int] = None)
Construct the empirical quantiles from the estimated Distribution by drawing num_samples independent samples from it. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
  • num_samples (int, optional): Overrides the number of samples used for the empirical quantiles. Defaults to None.
Returns:
  • tuple: Tuple with samples, sample mean, and quantiles.

method scale_decouple

scale_decouple(
    output,
    loc: Optional[Tensor] = None,
    scale: Optional[Tensor] = None
)
Scale Decouple. Stabilizes the model’s output optimization by learning the residual variance and residual location relative to the anchoring loc and scale. Also adds domain protection to the distribution parameters.

method update_quantile

update_quantile(q: Optional[List[float]] = None)

class GMM

Gaussian Mixture Mesh. This Gaussian Mixture statistical model assumes independence across groups of data $\mathcal{G}=\{[g_{i}]\}$ and estimates relationships within each group.

$$\mathrm{P}\left(\mathbf{y}_{[b][t+1:t+H]}\right) = \prod_{ [g_{i}] \in \mathcal{G}} \mathrm{P}\left(\mathbf{y}_{[g_{i}][\tau]}\right) = \prod_{\beta\in[g_{i}]} \left(\sum_{k=1}^{K} w_k \prod_{(\beta,\tau) \in [g_i][t+1:t+H]} \mathrm{Gaussian}(y_{\beta,\tau}, \hat{\mu}_{\beta,\tau,k}, \sigma_{\beta,\tau,k})\right)$$

Args:
  • n_components (int, optional): The number of mixture components. Defaults to 1.
  • level (float list, optional): Confidence levels for prediction intervals. Defaults to [80, 90].
  • quantiles (float list, optional): Alternative to level list, target quantiles. Defaults to None.
  • return_params (bool, optional): Whether to return the distribution parameters. Defaults to False.
  • batch_correlation (bool, optional): Whether to model batch correlations. Defaults to False.
  • horizon_correlation (bool, optional): Whether to model horizon correlations. Defaults to False.
  • weighted (bool, optional): Whether to model weighted components. Defaults to False.
  • num_samples (int, optional): Number of samples for the empirical quantiles. Defaults to 1000.
References:
  • [Kin G. Olivares, O. Nganba Meetei, Ruijun Ma, Rohan Reddy, Mengfei Cao, Lee Dicker. Probabilistic Hierarchical Forecasting with Deep Poisson Mixtures. Submitted to the International Journal of Forecasting. Working paper available at arXiv.](https://arxiv.org/pdf/2110.13179.pdf)

method __init__

__init__(
    n_components=1,
    level=[80, 90],
    quantiles=None,
    num_samples=1000,
    return_params=False,
    batch_correlation=False,
    horizon_correlation=False,
    weighted=False
)

method domain_map

domain_map(output: Tensor)

method get_distribution

get_distribution(distr_args) → Distribution
Construct the associated PyTorch Distribution, given the collection of constructor arguments and, optionally, location and scale tensors. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
Returns:
  • Distribution: AffineTransformed distribution.

method sample

sample(distr_args: Tensor, num_samples: Optional[int] = None)
Construct the empirical quantiles from the estimated Distribution by drawing num_samples independent samples from it. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
  • num_samples (int, optional): Overrides the number of samples used for the empirical quantiles. Defaults to None.
Returns:
  • tuple: Tuple with samples, sample mean, and quantiles.

method scale_decouple

scale_decouple(
    output,
    loc: Optional[Tensor] = None,
    scale: Optional[Tensor] = None,
    eps: float = 0.2
)
Scale Decouple. Stabilizes the model’s output optimization by learning the residual variance and residual location relative to the anchoring loc and scale. Also adds domain protection to the distribution parameters.
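
One way such a decoupling can work for a Gaussian component is sketched below. This is an assumption-laden illustration rather than the library's code: it assumes that positivity of the standard deviation is enforced with a softplus, that the network outputs are residuals expressed in units of scale around loc, and that eps acts as a floor on the standard deviation. The name scale_decouple_sketch is hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x)); always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def scale_decouple_sketch(raw_mean, raw_std, loc=None, scale=None, eps=0.2):
    """Map raw network outputs to a valid (mean, std) pair.

    Assumption: the network predicts residuals relative to the anchors
    loc and scale, so the outputs are re-expressed in the data's units.
    """
    std = softplus(raw_std)  # domain protection: std must be positive
    mean = raw_mean
    if loc is not None and scale is not None:
        mean = raw_mean * scale + loc  # residual location around loc
        std = (std + eps) * scale      # residual scale, floored by eps
    return mean, std


# Usage: a series whose values sit near 100 with a scale of 10 (illustrative).
mean, std = scale_decouple_sketch(0.5, -3.0, loc=100.0, scale=10.0)
```

The point of this design is that the network only needs to output values of moderate magnitude, regardless of the scale of the series.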

method update_quantile

update_quantile(q: Optional[List[float]] = None)

class NBMM

Negative Binomial Mixture Mesh. This Negative Binomial Mixture statistical model assumes independence across groups of data $\mathcal{G}=\{[g_{i}]\}$ and estimates relationships within each group.

$$\mathrm{P}\left(\mathbf{y}_{[b][t+1:t+H]}\right) = \prod_{ [g_{i}] \in \mathcal{G}} \mathrm{P}\left(\mathbf{y}_{[g_{i}][\tau]}\right) = \prod_{\beta\in[g_{i}]} \left(\sum_{k=1}^{K} w_k \prod_{(\beta,\tau) \in [g_i][t+1:t+H]} \mathrm{NBinomial}(y_{\beta,\tau}, \hat{r}_{\beta,\tau,k}, \hat{p}_{\beta,\tau,k})\right)$$

Args:
  • n_components (int, optional): The number of mixture components. Defaults to 1.
  • level (float list, optional): Confidence levels for prediction intervals. Defaults to [80, 90].
  • quantiles (float list, optional): Alternative to level list, target quantiles. Defaults to None.
  • return_params (bool, optional): Whether to return the distribution parameters. Defaults to False.
  • weighted (bool, optional): Whether to model weighted components. Defaults to False.
  • num_samples (int, optional): Number of samples for the empirical quantiles. Defaults to 1000.
References:
  • [Kin G. Olivares, O. Nganba Meetei, Ruijun Ma, Rohan Reddy, Mengfei Cao, Lee Dicker. Probabilistic Hierarchical Forecasting with Deep Poisson Mixtures. Submitted to the International Journal of Forecasting. Working paper available at arXiv.](https://arxiv.org/pdf/2110.13179.pdf)

method __init__

__init__(
    n_components=1,
    level=[80, 90],
    quantiles=None,
    num_samples=1000,
    return_params=False,
    weighted=False
)

method domain_map

domain_map(output: Tensor)

method get_distribution

get_distribution(distr_args) → Distribution
Construct the associated PyTorch Distribution, given the collection of constructor arguments and, optionally, location and scale tensors. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
Returns:
  • Distribution: AffineTransformed distribution.

method sample

sample(distr_args: Tensor, num_samples: Optional[int] = None)
Construct the empirical quantiles from the estimated Distribution by drawing num_samples independent samples from it. Args:
  • distr_args (torch.Tensor): Constructor arguments for the underlying Distribution type.
  • num_samples (int, optional): Overrides the number of samples used for the empirical quantiles. Defaults to None.
Returns:
  • tuple: Tuple with samples, sample mean, and quantiles.

method scale_decouple

scale_decouple(
    output,
    loc: Optional[Tensor] = None,
    scale: Optional[Tensor] = None,
    eps: float = 0.2
)
Scale Decouple. Stabilizes the model’s output optimization by learning the residual variance and residual location relative to the anchoring loc and scale. Also adds domain protection to the distribution parameters.

method update_quantile

update_quantile(q: Optional[List[float]] = None)

class HuberLoss

Huber Loss. The Huber loss, employed in robust regression, is a loss function that is less sensitive to outliers in the data than the squared error loss. This function is also referred to as SmoothL1. The Huber loss function is quadratic for small errors and linear for large errors, with equal values and slopes of the two sections at the points where $|y_{\tau}-\hat{y}_{\tau}| = \delta$.

$$L_{\delta}(y_{\tau},\; \hat{y}_{\tau}) = \begin{cases} \frac{1}{2}(y_{\tau}-\hat{y}_{\tau})^{2} & \text{for } |y_{\tau}-\hat{y}_{\tau}| \leq \delta \\ \delta \cdot \left(|y_{\tau}-\hat{y}_{\tau}| - \frac{1}{2}\delta \right), & \text{otherwise.} \end{cases}$$

where $\delta$ is a threshold parameter that determines the point at which the loss transitions from quadratic to linear, and can be tuned to control the trade-off between robustness and accuracy in the predictions. Args:
  • delta (float, optional): Specifies the threshold at which to change between delta-scaled L1 and L2 loss. Defaults to 1.0.
  • horizon_weight (Union[torch.Tensor, None], optional): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.

method __init__

__init__(delta: float = 1.0, horizon_weight=None)
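
The piecewise definition above can be checked with a scalar sketch in plain Python. It evaluates the formula for a single pair of values and is not the library's batched tensor implementation; the function name huber is chosen here for illustration.

```python
def huber(y, y_hat, delta=1.0):
    """Huber loss for one observation: quadratic up to delta, linear beyond."""
    r = abs(y - y_hat)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


print(huber(0.0, 0.5))  # small error, quadratic branch
print(huber(0.0, 3.0))  # large error, linear branch
```

Both branches give the same value at an absolute error of exactly delta, which is what makes the loss continuous at the transition.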

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class TukeyLoss

Tukey Loss. The Tukey loss function, also known as Tukey’s biweight function, is a robust loss function used in robust statistics. Tukey’s loss exhibits quadratic behavior near the origin, like the Huber loss; however, it is even more robust to outliers because the loss for large residuals remains constant instead of scaling linearly. The parameter $c$ in Tukey’s loss determines the "saturation" point of the function: higher values of $c$ enhance sensitivity, while lower values increase resistance to outliers.

$$L_{c}(y_{\tau},\; \hat{y}_{\tau}) = \begin{cases} \frac{c^{2}}{6} \left[1-\left(\frac{y_{\tau}-\hat{y}_{\tau}}{c}\right)^{2} \right]^{3} & \text{for } |y_{\tau}-\hat{y}_{\tau}|\leq c \\ \frac{c^{2}}{6} & \text{otherwise.} \end{cases}$$

Please note that the Tukey loss function assumes the data to be stationary or normalized beforehand. If the error values are excessively large, the optimization may struggle to converge. It is advisable to employ small learning rates. Args:
  • c (float, optional): Specifies the Tukey loss’ threshold beyond which residuals are no longer considered. Defaults to 4.685.
  • normalize (bool, optional): Whether normalization is performed within the Tukey loss’ computation. Defaults to True.

method __init__

__init__(c: float = 4.685, normalize: bool = True)

method domain_map

domain_map(y_hat: Tensor)
Args:
  • y_hat (torch.Tensor): Predicted values
    • shape: [B, H, 1] for univariate
    • shape: [B, H, N] for multivariate
Returns:
  • torch.Tensor: Transformed values.
    • shape: [B, H, 1] for univariate
    • shape: [B, H, N] for multivariate

method masked_mean

masked_mean(x, mask, dim)

class HuberQLoss

Huberized Quantile Loss. The Huberized quantile loss is a modified version of the quantile loss function that combines the advantages of the quantile loss and the Huber loss. It is commonly used in regression tasks, especially when dealing with data that contains outliers or heavy tails. The Huberized quantile loss between y and y_hat measures the Huber loss in a non-symmetric way. The loss pays more attention to under- or over-estimation depending on the quantile parameter $q$, and controls the trade-off between robustness and accuracy in the predictions with the parameter $\delta$.

$$\mathrm{HuberQL}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}^{(q)}_{\tau}) = (1-q)\, L_{\delta}(y_{\tau},\; \hat{y}^{(q)}_{\tau}) \mathbb{1}\{ \hat{y}^{(q)}_{\tau} \geq y_{\tau} \} + q\, L_{\delta}(y_{\tau},\; \hat{y}^{(q)}_{\tau}) \mathbb{1}\{ \hat{y}^{(q)}_{\tau} < y_{\tau} \}$$

Args:
  • delta (float, optional): Specifies the threshold at which to change between delta-scaled L1 and L2 loss. Defaults to 1.0.
  • q (float): The slope of the quantile loss. In the context of quantile regression, q determines the conditional quantile level. Required; the constructor signature defines no default.
  • horizon_weight (Union[torch.Tensor, None], optional): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.

method __init__

__init__(q, delta: float = 1.0, horizon_weight=None)
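
The asymmetric weighting in the formula above can be sketched for a single observation. This plain-Python version only mirrors the equation; the function names are chosen for illustration and the library itself operates on tensors.

```python
def huber(residual, delta=1.0):
    """Huber loss of a residual: quadratic up to delta, linear beyond."""
    r = abs(residual)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def huber_quantile(y, y_hat, q, delta=1.0):
    """Huberized quantile loss for one observation and one quantile q."""
    # Over-prediction (y_hat >= y) is weighted by 1 - q, under-prediction by q.
    weight = (1 - q) if y_hat >= y else q
    return weight * huber(y - y_hat, delta)


# With q = 0.9, under-predicting by 2 costs far more than over-predicting by 2.
under = huber_quantile(y=0.0, y_hat=-2.0, q=0.9)
over = huber_quantile(y=0.0, y_hat=2.0, q=0.9)
```

Setting q to 0.5 weights both directions equally, which reduces the loss to half of the ordinary Huber loss.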

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].

class HuberMQLoss

Huberized Multi-Quantile Loss. The Huberized Multi-Quantile loss (HuberMQL) is a modified version of the multi-quantile loss function that combines the advantages of the quantile loss and the Huber loss. HuberMQL is commonly used in regression tasks, especially when dealing with data that contains outliers or heavy tails. The loss function pays more attention to under- or over-estimation depending on the quantile list $[q_{1},q_{2},\dots]$ parameter. It controls the trade-off between robustness and prediction accuracy with the parameter $\delta$.

$$\mathrm{HuberMQL}_{\delta}(\mathbf{y}_{\tau},[\mathbf{\hat{y}}^{(q_{1})}_{\tau}, \dots ,\mathbf{\hat{y}}^{(q_{n})}_{\tau}]) = \frac{1}{n} \sum_{q_{i}} \mathrm{HuberQL}_{\delta}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}^{(q_{i})}_{\tau})$$

Args:
  • level (int list, optional): Probability levels for prediction intervals. Defaults to [80, 90].
  • quantiles (float list, optional): Alternative to level, quantiles to estimate from y distribution. Defaults to None.
  • delta (float, optional): Specifies the threshold at which to change between delta-scaled L1 and L2 loss. Defaults to 1.0.
  • horizon_weight (Union[torch.Tensor, None], optional): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.

method __init__

__init__(
    level=[80, 90],
    quantiles=None,
    delta: float = 1.0,
    horizon_weight=None
)
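
The averaging over quantiles can be sketched for a single observation as follows. This is a plain-Python reading of the formula above, with hypothetical function names and a dictionary from quantile to forecast as an assumed input format.

```python
def huber(residual, delta=1.0):
    """Huber loss of a residual: quadratic up to delta, linear beyond."""
    r = abs(residual)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def huber_mq(y, y_hat_by_q, delta=1.0):
    """Average Huberized quantile loss over all quantile forecasts.

    y: observed value.
    y_hat_by_q: dict mapping each quantile q to its forecast (assumed format).
    """
    total = 0.0
    for q, y_hat in y_hat_by_q.items():
        # Over-prediction is weighted by 1 - q, under-prediction by q.
        weight = (1 - q) if y_hat >= y else q
        total += weight * huber(y - y_hat, delta)
    return total / len(y_hat_by_q)


# Usage: three quantile forecasts around an observed value of 0.
loss = huber_mq(0.0, {0.1: -2.0, 0.5: 0.0, 0.9: 2.0})
```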

method domain_map

domain_map(y_hat: Tensor)
Args:
  • y_hat (torch.Tensor): Predicted values.
Returns:
  • torch.Tensor: Transformed values.
    • shape: [B, H, 1 * Q] for univariate
    • shape: [B, H, N * Q] for multivariate

class HuberIQLoss

Implicit Huber Quantile Loss. Computes the Huberized quantile loss between y and y_hat, with the quantile q provided as an input to the network. HuberIQLoss measures the deviation of a Huberized quantile forecast. By weighting the absolute deviation in a non-symmetric way, the loss pays more attention to under- or over-estimation.

$$\mathrm{HuberIQL}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}^{(q)}_{\tau}) = (1-q)\, L_{\delta}(y_{\tau},\; \hat{y}^{(q)}_{\tau}) \mathbb{1}\{ \hat{y}^{(q)}_{\tau} \geq y_{\tau} \} + q\, L_{\delta}(y_{\tau},\; \hat{y}^{(q)}_{\tau}) \mathbb{1}\{ \hat{y}^{(q)}_{\tau} < y_{\tau} \}$$

Args:
  • quantile_sampling (str, optional): Sampling distribution used to sample the quantiles during training. Choose from [‘uniform’, ‘beta’]. Defaults to ‘uniform’.
  • horizon_weight (Union[torch.Tensor, None], optional): Tensor of size h, weight for each timestamp of the forecasting window. Defaults to None.
  • delta (float, optional): Specifies the threshold at which to change between delta-scaled L1 and L2 loss. Defaults to 1.0.

method __init__

__init__(
    cos_embedding_dim=64,
    concentration0=1.0,
    concentration1=1.0,
    delta=1.0,
    horizon_weight=None
)

method domain_map

domain_map(y_hat)
Adds the IQN network to the output of the network. Args:
  • y_hat (torch.Tensor): Predicted values.
    • shape: [B, h, 1] for univariate
    • shape: [B, h, N] for multivariate

method update_quantile

update_quantile(q: List[float] = [0.5])

class Accuracy

Accuracy. Computes the accuracy between categorical y and y_hat. This metric is intended for evaluation only, as it is not differentiable.

$$\mathrm{Accuracy}(\mathbf{y}_{\tau}, \mathbf{\hat{y}}_{\tau}) = \frac{1}{H} \sum^{t+H}_{\tau=t+1} \mathbb{1}\{\mathbf{y}_{\tau}==\mathbf{\hat{y}}_{\tau}\}$$

method __init__

__init__()
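
The accuracy formula is the fraction of horizon steps where the predicted category equals the observed one. A minimal plain-Python sketch, with a hypothetical function name and lists standing in for tensors:

```python
def accuracy(y, y_hat):
    """Fraction of positions where the predicted category matches the target."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    return sum(a == b for a, b in zip(y, y_hat)) / len(y)


print(accuracy([1, 0, 1, 1], [1, 1, 1, 0]))
```

Because the indicator function is piecewise constant, its gradient is zero wherever it exists, which is why this quantity cannot serve as a training loss.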

method domain_map

domain_map(y_hat: Tensor)
Args:
  • y_hat (torch.Tensor): Predicted values.
    • shape: [B, H, 1] for univariate
    • shape: [B, H, N] for multivariate
Returns:
  • torch.Tensor: Transformed values.
    • shape: [B, H, 1] for univariate
    • shape: [B, H, N] for multivariate

class sCRPS

Scaled Continuous Ranked Probability Score. Calculates a scaled variation of the CRPS, as proposed by Rangapuram (2021), to measure the accuracy of the predicted quantiles y_hat compared to the observation y. This metric averages percentage-weighted absolute deviations as defined by the quantile losses.

$$\mathrm{sCRPS}(\mathbf{\hat{y}}^{(q)}_{\tau}, \mathbf{y}_{\tau}) = \frac{2}{N} \sum_{i} \int^{1}_{0} \frac{\mathrm{QL}(\mathbf{\hat{y}}^{(q)}_{\tau}, y_{i,\tau})_{q}}{\sum_{i} | y_{i,\tau} |} dq$$

where $\mathbf{\hat{y}}^{(q)}_{\tau}$ is the estimated quantile, and $y_{i,\tau}$ are the target variable realizations. Args:
  • level (int list, optional): Probability levels for prediction intervals. Defaults to [80, 90].
  • quantiles (float list, optional): Alternative to level, quantiles to estimate from y distribution. Defaults to None.

method __init__

__init__(level=[80, 90], quantiles=None)
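
A discretized reading of the sCRPS formula is sketched below for a single time step. It rests on several assumptions that are not confirmed by the source: QL is taken to be the standard pinball (quantile) loss, the integral over q is approximated by a plain average over the supplied quantile grid, and the input format (a dictionary from quantile to per-series forecasts) is hypothetical.

```python
def pinball(y, y_hat, q):
    """Standard quantile (pinball) loss for one observation."""
    diff = y - y_hat
    return q * diff if diff >= 0 else (q - 1) * diff


def scrps(y, y_hat_by_q):
    """Approximate sCRPS at one time step.

    y: list of N observations, one per series.
    y_hat_by_q: dict mapping each quantile q to a list of N forecasts.
    """
    n = len(y)
    denom = sum(abs(v) for v in y)  # scaling by the total absolute target
    # The mean over the quantile grid stands in for the integral over q.
    integral = sum(
        sum(pinball(yi, fi, q) for yi, fi in zip(y, forecasts))
        for q, forecasts in y_hat_by_q.items()
    ) / len(y_hat_by_q)
    return 2.0 / n * integral / denom


# Usage: one series observed at 10 with a median forecast of 8.
score = scrps([10.0], {0.5: [8.0]})
```

A finer quantile grid gives a closer approximation of the integral, and a score of zero is reached only when every quantile forecast equals the observation.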

method domain_map

domain_map(y_hat: Tensor)
Domain mapping for predicted values. Args:
  • y_hat (torch.Tensor): Predicted values tensor.
    • Univariate: [B, H, 1]
    • Multivariate: [B, H, N]
Returns:
  • torch.Tensor: Mapped values tensor with shape [B, H, N].