1. Synthetic Panel Data

generate_series

 generate_series (n_series:int, freq:str='D', min_length:int=50,
                  max_length:int=500, n_temporal_features:int=0,
                  n_static_features:int=0, equal_ends:bool=False,
                  seed:int=0)

*Generate Synthetic Panel Series. Generates n_series of frequency freq with different lengths in the interval [min_length, max_length]. If n_temporal_features > 0, each series gets temporal features with random values. If n_static_features > 0, a static dataframe is returned along with the temporal dataframe. If equal_ends == True, all series end on the same date.
Parameters:
n_series: int, number of series for synthetic panel.
min_length: int, minimum length of synthetic panel’s series.
max_length: int, maximum length of synthetic panel’s series.
n_temporal_features: int, default=0, number of temporal exogenous variables for synthetic panel’s series.
n_static_features: int, default=0, number of static exogenous variables for synthetic panel’s series.
equal_ends: bool, if True, all series end on the same date stamp ds.
freq: str, frequency of the data, any of pandas’ available frequencies.
seed: int, default=0, random seed.
Returns:
pandas.DataFrame, synthetic panel with columns [unique_id, ds, y] and exogenous features.*

synthetic_panel = generate_series(n_series=2)
synthetic_panel.groupby('unique_id').head(4)

temporal_df, static_df = generate_series(n_series=1000, n_static_features=2,
                                         n_temporal_features=4, equal_ends=False)
static_df.head(2)
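
To make the effect of min_length, max_length and equal_ends concrete, here is a minimal sketch written with plain numpy and pandas. It is not the library's implementation: the helper name sketch_panel, the anchor dates and the normal noise used for y are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def sketch_panel(n_series, freq="D", min_length=50, max_length=500,
                 equal_ends=False, seed=0):
    """Hypothetical helper: build a toy panel with columns unique_id, ds, y."""
    rng = np.random.default_rng(seed)
    # Draw one length per series, inclusive of both bounds.
    lengths = rng.integers(min_length, max_length + 1, size=n_series)
    frames = []
    for uid, length in enumerate(lengths):
        length = int(length)
        if equal_ends:
            # Every series stops on the same (arbitrary) timestamp.
            ds = pd.date_range(end="2001-05-14", periods=length, freq=freq)
        else:
            # Every series starts on the same (arbitrary) timestamp,
            # so series of different lengths end on different dates.
            ds = pd.date_range(start="2000-01-01", periods=length, freq=freq)
        frames.append(pd.DataFrame({"unique_id": uid, "ds": ds,
                                    "y": rng.normal(size=length)}))
    return pd.concat(frames, ignore_index=True)


panel = sketch_panel(n_series=3, min_length=5, max_length=10, equal_ends=True)
print(panel.groupby("unique_id")["ds"].agg(["count", "max"]))
```

With equal_ends=True the printed max timestamp is identical for every unique_id, while the counts vary between min_length and max_length.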

2. AirPassengers Data

The classic Box & Jenkins airline data: monthly totals of international airline passengers from 1949 to 1960. It is used as a reference dataset in several forecasting libraries. Because the series shows a clear trend and seasonality, it offers a quick way to showcase a model’s predictive performance.

AirPassengersDF.head(12)

import matplotlib.pyplot as plt

# Plot the observed AirPassengers series.
fig, ax = plt.subplots(1, 1, figsize = (20, 7))
plot_df = AirPassengersDF.set_index('ds')

plot_df[['y']].plot(ax=ax, linewidth=2)
ax.set_title('AirPassengers Forecast', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()

import numpy as np
import pandas as pd

n_static_features = 3
n_series = 5

static_features = np.random.uniform(low=0.0, high=1.0,
                                    size=(n_series, n_static_features))
static_df = pd.DataFrame.from_records(static_features,
                                      columns=[f'static_{i}' for i in range(n_static_features)])
static_df['unique_id'] = np.arange(n_series)

static_df

3. Panel AirPassengers Data

An extension of the classic Box & Jenkins airline data (monthly totals of international airline passengers, 1949 to 1960). It includes two series with static, temporal, and future exogenous variables, which help explore the performance of models like NBEATSx and TFT.

fig, ax = plt.subplots(1, 1, figsize = (20, 7))
plot_df = AirPassengersPanel.set_index('ds')

plot_df.groupby('unique_id')['y'].plot(ax=ax, legend=True)
ax.set_title('AirPassengers Panel Data', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(title='unique_id', prop={'size': 15})
ax.grid()

fig, ax = plt.subplots(1, 1, figsize = (20, 7))
plot_df = AirPassengersPanel[AirPassengersPanel.unique_id=='Airline1'].set_index('ds')

plot_df[['y', 'trend', 'y_[lag12]']].plot(ax=ax, linewidth=2)
ax.set_title('Box-Cox AirPassengers Data', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()
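
The plot above uses a column named y_[lag12]. Assuming that column holds the target shifted by twelve periods within each series (the source does not spell this out), a lag feature of that kind can be sketched on a toy panel as follows. The toy data and its values are made up for illustration.

```python
import pandas as pd

# Toy monthly panel with two series; values are synthetic.
ds = pd.date_range("2000-01-01", periods=24, freq="MS")
toy = pd.concat(
    [
        pd.DataFrame({"unique_id": uid, "ds": ds,
                      "y": list(range(offset, offset + 24))})
        for uid, offset in [("A", 0), ("B", 100)]
    ],
    ignore_index=True,
)

# Shift within each series so values never leak across unique_id boundaries.
toy["y_[lag12]"] = toy.groupby("unique_id")["y"].shift(12)

print(toy[toy["unique_id"] == "B"].iloc[10:14])
```

The first twelve rows of each series have no lagged value and are left as NaN; grouping by unique_id before shifting is what keeps series "B" from borrowing the tail of series "A".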

4. Time Features

We have developed a utility that generates normalized calendar features for use as absolute positional embeddings in Transformer-based models. These embeddings capture seasonal patterns in time series data and can be easily incorporated into the model architecture. Additionally, the features can be used as exogenous variables in other models to inform them of calendar patterns in the data.

References
- Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang. “Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting”
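
As a rough illustration of what a normalized calendar feature can look like, the sketch below maps month and day of week linearly onto a fixed range. The function names and the [-0.5, 0.5] target range are assumptions for this example; the text above only states that the features are normalized.

```python
import pandas as pd


def month_of_year(index):
    """Hypothetical encoder: map months 1..12 linearly onto [-0.5, 0.5]."""
    return (index.month - 1) / 11.0 - 0.5


def day_of_week(index):
    """Hypothetical encoder: map Monday..Sunday linearly onto [-0.5, 0.5]."""
    return index.dayofweek / 6.0 - 0.5


# One timestamp per month of 2021.
ds = pd.date_range("2021-01-01", periods=12, freq="MS")
features = pd.DataFrame({
    "ds": ds,
    "month": month_of_year(ds),
    "day_of_week": day_of_week(ds),
})
print(features.head())
```

Columns like these can be joined to a panel on ds and passed to a model as exogenous variables, since their values are known for any future timestamp.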
