This example walks through the input data requirements of the core.NeuralForecast class.

The core.NeuralForecast methods operate as global models: they receive a set of time series rather than a single series. The class uses cross-learning to fit flexible, shared models such as neural networks, which improves their generalization, as shown in the M4 international forecasting competition (Smyl 2019, Semenoglou 2021).

You can run these experiments on a GPU with Google Colab.

Long format

Multiple time series

Store your time series in a pandas dataframe in long format, that is, each row represents an observation for a specific series and timestamp. Let’s see an example using the datasetsforecast library.

Y_df = pd.concat([series1, series2, ...])
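As a minimal sketch of the schematic line above, assuming two small made-up series (the identifiers series_a and series_b and their values are hypothetical and do not come from any dataset), each series is first put in the three-column layout and then stacked row-wise with pd.concat:

```python
import pandas as pd

# Two hypothetical yearly series; names and values are illustrative only.
series1 = pd.DataFrame({
    'unique_id': 'series_a',
    'ds': pd.to_datetime(['2020-12-31', '2021-12-31', '2022-12-31']),
    'y': [10.0, 12.5, 11.0],
})
series2 = pd.DataFrame({
    'unique_id': 'series_b',
    'ds': pd.to_datetime(['2021-12-31', '2022-12-31']),
    'y': [3.0, 4.5],
})

# Stack the series row-wise; ignore_index gives a clean 0..n-1 index.
Y_df = pd.concat([series1, series2], ignore_index=True)
print(Y_df)
```

Note that the two series do not need to have the same length or cover the same dates; in long format each row only has to say which series and timestamp it belongs to.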

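!pip install datasetsforecast
import pandas as pd
from datasetsforecast.m3 import M3

# Load the yearly M3 series from ./data; keep only the first returned dataframe
Y_df, *_ = M3.load('./data', group='Yearly')
# Show the first two observations of each series
Y_df.groupby('unique_id').head(2)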
        unique_id  ds          y
0       Y1         1975-12-31  940.66
1       Y1         1976-12-31  1084.86
20      Y10        1975-12-31  2160.04
21      Y10        1976-12-31  2553.48
40      Y100       1975-12-31  1424.70
...     ...        ...         ...
18260   Y97        1976-12-31  1618.91
18279   Y98        1975-12-31  1164.97
18280   Y98        1976-12-31  1277.87
18299   Y99        1975-12-31  1870.00
18300   Y99        1976-12-31  1307.20

Y_df is a dataframe with three columns: unique_id, a unique identifier for each time series; ds, the datestamp of each observation; and y, the value of the series at that datestamp.
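The column layout described above can be sanity-checked before fitting. The following is only a sketch: check_long_format is a hypothetical helper written for this example, and the specific checks (no duplicated series/timestamp pairs, numeric y without missing values) are assumptions about what a reasonable check looks like, not a validation performed by the library. The toy rows reuse the first values shown in the table.

```python
import pandas as pd

def check_long_format(df):
    """Return a list of problems found in a long-format dataframe (hypothetical helper)."""
    problems = []
    missing = [c for c in ('unique_id', 'ds', 'y') if c not in df.columns]
    if missing:
        # Without the three columns the remaining checks cannot run.
        problems.append(f'missing columns: {missing}')
        return problems
    if df[['unique_id', 'ds']].duplicated().any():
        problems.append('duplicated (unique_id, ds) pairs')
    if df['y'].isna().any():
        problems.append('missing values in y')
    if not pd.api.types.is_numeric_dtype(df['y']):
        problems.append('y is not numeric')
    return problems

toy = pd.DataFrame({
    'unique_id': ['Y1', 'Y1', 'Y10'],
    'ds': pd.to_datetime(['1975-12-31', '1976-12-31', '1975-12-31']),
    'y': [940.66, 1084.86, 2160.04],
})
print(check_long_format(toy))                    # well-formed toy frame
print(check_long_format(toy.drop(columns='y')))  # frame with y removed
```

Returning a list of messages rather than raising on the first problem makes it easier to see everything that needs fixing in one pass.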

Single time series

If you have only one time series, you still have to include the unique_id column. Consider, for example, the AirPassengers dataset.

Y_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
Y_df
      timestamp   value
0     1949-01-01  112
1     1949-02-01  118
2     1949-03-01  132
3     1949-04-01  129
4     1949-05-01  121
...   ...         ...
139   1960-08-01  606
140   1960-09-01  508
141   1960-10-01  461
142   1960-11-01  390
143   1960-12-01  432

In this example Y_df contains only two columns: timestamp and value. To use NeuralForecast we have to add the unique_id column and rename the existing ones to ds and y.

Y_df['unique_id'] = 1.  # Any constant works as the identifier; here it is the float 1.0
Y_df = Y_df.rename(columns={'timestamp': 'ds', 'value': 'y'})
Y_df = Y_df[['unique_id', 'ds', 'y']]
Y_df
      unique_id  ds          y
0     1.0        1949-01-01  112
1     1.0        1949-02-01  118
2     1.0        1949-03-01  132
3     1.0        1949-04-01  129
4     1.0        1949-05-01  121
...   ...        ...         ...
139   1.0        1960-08-01  606
140   1.0        1960-09-01  508
141   1.0        1960-10-01  461
142   1.0        1960-11-01  390
143   1.0        1960-12-01  432
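The same conversion can be wrapped in a small function when it has to be repeated for several files. This is a sketch under stated assumptions: to_long_format is a hypothetical helper, the string identifier 'AirPassengers' is just an illustrative choice, and parsing ds into datetimes is an extra step added here because timestamps read from a CSV usually arrive as strings. The three rows are the first values from the table above, typed in by hand so the example runs without a download.

```python
import pandas as pd

def to_long_format(df, time_col, value_col, uid):
    """Hypothetical helper: reshape a two-column single series into unique_id/ds/y."""
    out = df.rename(columns={time_col: 'ds', value_col: 'y'})
    out['ds'] = pd.to_datetime(out['ds'])  # parse strings into datetime64
    out.insert(0, 'unique_id', uid)        # same identifier on every row
    return out[['unique_id', 'ds', 'y']]

raw = pd.DataFrame({
    'timestamp': ['1949-01-01', '1949-02-01', '1949-03-01'],
    'value': [112, 118, 132],
})
Y_df = to_long_format(raw, 'timestamp', 'value', uid='AirPassengers')
print(Y_df.dtypes)
```

Because rename returns a new dataframe, the input frame keeps its original column names, so the helper can be called on the same raw data more than once.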

References