In this example we will go through the dataset input requirements of the core.NeuralForecast class.

The core.NeuralForecast methods operate as global models that receive a set of time series rather than a single series. The class uses a cross-learning technique to fit flexible, shared models such as neural networks, which improves their generalization capabilities, as shown in the M4 international forecasting competition (Smyl 2019, Semenoglou 2021).

You can run these experiments on a GPU with Google Colab.

Long format

Multiple time series

Store your time series in a pandas dataframe in long format, that is, each row represents an observation for a specific series and timestamp. Let’s see an example using the datasetsforecast library.

Y_df = pd.concat([series1, series2, ...])
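
The stacking pattern above can be sketched with two small series. This is only an illustration: the identifiers 'A' and 'B', the dates, and the values are made up, and the name toy_df is used so it does not clash with Y_df. Note that the two series do not need to have the same length.

```python
import pandas as pd

# Two hypothetical series; identifiers, dates and values are made up.
series1 = pd.DataFrame({
    'unique_id': 'A',
    'ds': pd.to_datetime(['2000-12-31', '2001-12-31', '2002-12-31']),
    'y': [10.0, 12.0, 15.0],
})
series2 = pd.DataFrame({
    'unique_id': 'B',
    'ds': pd.to_datetime(['2001-12-31', '2002-12-31']),
    'y': [3.0, 4.0],
})

# Stack the series on top of each other: one row per (series, timestamp).
toy_df = pd.concat([series1, series2], ignore_index=True)
print(toy_df)
```

Each row of toy_df is one observation, so adding another series only adds rows, never columns.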

!pip install datasetsforecast
import pandas as pd
from datasetsforecast.m3 import M3
Y_df, *_ = M3.load('./data', group='Yearly')
Y_df.groupby('unique_id').head(2)
      unique_id          ds        y
0            Y1  1975-12-31   940.66
1            Y1  1976-12-31  1084.86
20          Y10  1975-12-31  2160.04
21          Y10  1976-12-31  2553.48
40         Y100  1975-12-31  1424.70
...         ...         ...      ...
18260       Y97  1976-12-31  1618.91
18279       Y98  1975-12-31  1164.97
18280       Y98  1976-12-31  1277.87
18299       Y99  1975-12-31  1870.00
18300       Y99  1976-12-31  1307.20
Y_df.groupby('unique_id').tail(2)
      unique_id          ds        y
18           Y1  1993-12-31  8407.84
19           Y1  1994-12-31  9156.01
38          Y10  1993-12-31  3187.00
39          Y10  1994-12-31  3058.00
58         Y100  1993-12-31  3539.00
...         ...         ...      ...
18278       Y97  1994-12-31  4507.00
18297       Y98  1993-12-31  1801.00
18298       Y98  1994-12-31  1710.00
18317       Y99  1993-12-31  2379.30
18318       Y99  1994-12-31  2723.00

Y_df is a dataframe with three columns: unique_id, a unique identifier for each time series; ds, the datestamp of each observation; and y, the values of the series.
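
A quick sanity check of this layout can be sketched as follows. The helper check_long_format is hypothetical (it is not part of NeuralForecast), and the specific checks, such as one row per (unique_id, ds) pair and a numeric y, are assumptions drawn from the description above rather than the library's own validation.

```python
import pandas as pd

def check_long_format(df):
    """Hypothetical helper: raise ValueError if df breaks the assumed layout."""
    # The three expected columns must all be present.
    missing = {'unique_id', 'ds', 'y'} - set(df.columns)
    if missing:
        raise ValueError(f'missing columns: {sorted(missing)}')
    # Assumption: each (series, timestamp) pair appears at most once.
    if df.duplicated(subset=['unique_id', 'ds']).any():
        raise ValueError('duplicated (unique_id, ds) pairs')
    # Assumption: the target values are numeric.
    if not pd.api.types.is_numeric_dtype(df['y']):
        raise ValueError('y must be numeric')
    return True

# Made-up data used only to exercise the helper.
example = pd.DataFrame({
    'unique_id': ['A', 'A', 'B'],
    'ds': pd.to_datetime(['2000-12-31', '2001-12-31', '2000-12-31']),
    'y': [1.0, 2.0, 3.0],
})
check_long_format(example)
```

Running a check like this before fitting tends to surface renaming mistakes early, while the dataframe is still small enough to inspect by hand.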

Single time series

If you have only one time series, you have to include the unique_id column. Consider, for example, the AirPassengers dataset.

Y_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')

In this example Y_df contains only two columns: timestamp and value. To use NeuralForecast we have to add the unique_id column and rename the existing ones.

Y_df['unique_id'] = 1.  # Any constant works as the identifier; note that 1. is a float, so it is displayed as 1.0
Y_df = Y_df.rename(columns={'timestamp': 'ds', 'value': 'y'})
Y_df = Y_df[['unique_id', 'ds', 'y']]
Y_df
     unique_id          ds    y
0          1.0  1949-01-01  112
1          1.0  1949-02-01  118
2          1.0  1949-03-01  132
3          1.0  1949-04-01  129
4          1.0  1949-05-01  121
...        ...         ...  ...
139        1.0  1960-08-01  606
140        1.0  1960-09-01  508
141        1.0  1960-10-01  461
142        1.0  1960-11-01  390
143        1.0  1960-12-01  432
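
One detail worth checking after these steps is the type of ds: pd.read_csv loads date columns as plain strings unless told otherwise. The sketch below repeats the renaming on the first three AirPassengers rows shown above, inlined so it runs offline, and converts ds with pd.to_datetime. The string label 'AirPassengers' is an illustrative choice for the identifier, and whether the conversion is needed depends on what the downstream code expects.

```python
import io
import pandas as pd

# First three rows of the table above, inlined so the sketch runs offline.
csv_text = "timestamp,value\n1949-01-01,112\n1949-02-01,118\n1949-03-01,132\n"
df = pd.read_csv(io.StringIO(csv_text))

# Rename to the expected column names.
df = df.rename(columns={'timestamp': 'ds', 'value': 'y'})

# Parse the date strings into datetime64 values.
df['ds'] = pd.to_datetime(df['ds'])

# Add a constant identifier as the first column (illustrative label).
df.insert(0, 'unique_id', 'AirPassengers')

print(df.dtypes)
```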

References