
1. Synthetic Panel Data



generate_series

 generate_series (n_series:int, freq:str='D', min_length:int=50,
                  max_length:int=500, n_static_features:int=0,
                  equal_ends:bool=False, engine:str='pandas', seed:int=0)
*Generate Synthetic Panel Series. Generates n_series of frequency freq of different lengths in the interval [min_length, max_length]. If n_static_features > 0, then each series gets static features with random values. If equal_ends == True then all series end at the same date.*
| | Type | Default | Details |
|---|---|---|---|
| n_series | int | | Number of series for synthetic panel. |
| freq | str | D | Frequency of the data, ‘D’ or ‘M’. |
| min_length | int | 50 | Minimum length of synthetic panel’s series. |
| max_length | int | 500 | Maximum length of synthetic panel’s series. |
| n_static_features | int | 0 | Number of static exogenous variables for synthetic panel’s series. |
| equal_ends | bool | False | Series should end in the same date stamp ds. |
| engine | str | pandas | Output Dataframe type (‘pandas’ or ‘polars’). |
| seed | int | 0 | Random seed used for generating the data. |
| Returns | Union | | Synthetic panel with columns [unique_id, ds, y] and exogenous. |
from statsforecast.utils import generate_series
synthetic_panel = generate_series(n_series=2)
synthetic_panel.groupby('unique_id', observed=True).head(4)
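
The description above says that each series gets a length in [min_length, max_length] and that equal_ends aligns the final date of every series. The following standard-library sketch illustrates that behaviour only. It is not the library's implementation: the function name sketch_panel, the anchor date, the daily-only frequency, and the uniform random values are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta


def sketch_panel(n_series, min_length=50, max_length=500,
                 equal_ends=False, seed=0):
    """Hypothetical stand-in for the documented behaviour (daily data only)."""
    rng = random.Random(seed)
    anchor = date(2000, 1, 1)  # arbitrary anchor date (assumption)
    rows = []
    for uid in range(n_series):
        # Each series gets its own length, inclusive on both ends.
        length = rng.randint(min_length, max_length)
        if equal_ends:
            # Shift the start back so every series ends on the anchor date.
            start = anchor - timedelta(days=length - 1)
        else:
            # Assumed here: series share a start date, so end dates differ.
            start = anchor
        for step in range(length):
            rows.append((uid, start + timedelta(days=step), rng.random()))
    return rows


panel = sketch_panel(n_series=3, min_length=5, max_length=10, equal_ends=True)

# Collect the last date of each series to confirm that they coincide.
last_dates = {}
for uid, ds, _ in panel:
    last_dates[uid] = max(ds, last_dates.get(uid, ds))
```

With equal_ends=True every entry in last_dates is the same date. With equal_ends=False the series in this sketch share a start date instead and end on different dates whenever their sampled lengths differ.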

2. AirPassengers Data

The classic Box & Jenkins airline data: monthly totals of international airline passengers from 1949 to 1960. Several forecasting libraries use it as a reference dataset. Because the series shows clear trend and seasonality, it offers a convenient way to quickly showcase a model’s predictive performance.
from statsforecast.utils import AirPassengersDF
AirPassengersDF.head(12)
import matplotlib.pyplot as plt

# Plot the AirPassengers series (observed values only; no model is fitted here).
fig, ax = plt.subplots(1, 1, figsize=(20, 7))
plot_df = AirPassengersDF.set_index('ds')

plot_df[['y']].plot(ax=ax, linewidth=2)
ax.set_title('AirPassengers Forecast', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()

Model utils